Transcript of "Quicksort"

1. Sorting
   • What makes it hard?
   • Chapter 7 in DS&AA
   • Chapter 8 in DS&PS
2. Insertion Sort
   • Algorithm
     – Conceptually, incrementally add each element to an already-sorted array or list, starting with an empty array (list).
     – Works as an incremental or a batch algorithm.
   • Analysis
     – Best case: input already sorted, time is O(N).
     – Worst case: input reverse sorted, time is O(N^2).
     – Average case (loose argument) is O(N^2).
   • Inversion: a pair of elements out of order
     – The critical variable for determining the algorithm's time cost.
     – Each swap of adjacent elements removes exactly 1 inversion.
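A minimal Java sketch of the insertion sort described above (the helper name insertionSort and the in-place array form are assumptions, not from the slide):

    // Insertion sort sketch: grow a sorted prefix, inserting a[i] into place.
    static void insertionSort(int[] a) {
        for (int i = 1; i < a.length; i++) {
            int key = a[i];
            int j = i - 1;
            while (j >= 0 && a[j] > key) {   // each shift removes exactly one inversion
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }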
3. Inversions
   • What is the average number of inversions, over all inputs?
   • Let A be any array of integers.
   • Let revA be the reverse of A.
   • Note: if the pair (i,j) is in order in A, it is out of order in revA, and vice versa.
   • The total number of pairs (i,j) is N*(N-1)/2, so the average number of inversions is N*(N-1)/4, which is O(N^2).
   • Corollary: any algorithm that removes only a single inversion per swap takes at least N*(N-1)/4 steps on average, i.e., Ω(N^2)!
   • To do better, we need to remove more than one inversion at a time.
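A brute-force Java check of the pairing argument above (an illustration only, under the assumption that all entries are distinct; inversions is a hypothetical helper):

    // O(N^2) brute-force inversion count, for illustration only.
    static int inversions(int[] a) {
        int count = 0;
        for (int i = 0; i < a.length; i++)
            for (int j = i + 1; j < a.length; j++)
                if (a[i] > a[j]) count++;
        return count;
    }
    // For an array a of n distinct values and its reverse rev:
    // inversions(a) + inversions(rev) == n*(n-1)/2, so the average is n*(n-1)/4.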
4. BubbleSort
   • Most frequently used sorting algorithm
   • Algorithm:
       for j = n-1 down to 1            .... O(n) iterations
         for i = 0 to j-1               .... O(j) iterations
           if A[i] and A[i+1] are out of order, swap them (that's the bubble)   .... O(1)
   • Analysis
     – Bubblesort is O(n^2)
   • Appropriate for small arrays
   • Appropriate for nearly sorted arrays
   • Comparisons versus swaps?
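A runnable Java version of the bubble sort pseudocode above (the early-exit flag is a common refinement for nearly sorted input, not stated on the slide):

    // Bubble sort matching the pseudocode above.
    static void bubbleSort(int[] a) {
        for (int j = a.length - 1; j >= 1; j--) {
            boolean swapped = false;              // early exit helps nearly sorted input
            for (int i = 0; i < j; i++) {
                if (a[i] > a[i + 1]) {            // out of order: bubble the larger one up
                    int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t;
                    swapped = true;
                }
            }
            if (!swapped) return;                 // no swaps in a pass: already sorted
        }
    }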
5. Shell Sort: 1959, by Shell
   • Motivated by the inversion result: need to move far-apart elements
   • Still quadratic
   • Only in textbooks
   • Of historical and theoretical interest; not fully understood
   • Algorithm (with gap schedule 5, 3, 1):
     – bubble sort elements spaced 5 apart
     – bubble sort elements spaced 3 apart
     – bubble sort elements spaced 1 apart
   • Faster than insertion sort in practice, but still O(N^2)
   • No one knows the best gap schedule
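A Java sketch of Shell sort with the gap schedule from the slide; each pass is written as a gapped insertion sort, the usual way of "sorting elements k apart" (that inner choice is an assumption here):

    // Shell sort with the fixed gap schedule 5, 3, 1 from the slide.
    static void shellSort(int[] a) {
        int[] gaps = {5, 3, 1};                          // gap 1 must come last
        for (int gap : gaps) {
            for (int i = gap; i < a.length; i++) {
                int key = a[i];
                int j = i;
                while (j >= gap && a[j - gap] > key) {   // sort the elements gap apart
                    a[j] = a[j - gap];
                    j -= gap;
                }
                a[j] = key;
            }
        }
    }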
6. Divide and Conquer: Merge Sort
   • Let A be an array of integers of length N
   • Define Sort(A) as auxSort(A, 0, N), where auxSort works on the half-open range [low, high)
   • Define int[] auxSort(A, low, high)   (a runnable sketch follows the next slide)
     – if (high - low <= 1) return that (empty or one-element) range: it is already sorted
     – else
       • mid = (low + high) / 2
       • temp1 = auxSort(A, low, mid)
       • temp2 = auxSort(A, mid, high)
       • return merge(temp1, temp2)
7. Merge
   • int[] merge(int[] temp1, int[] temp2)
     – int[] temp = new int[temp1.length + temp2.length]
     – int i = 0, j = 0, k = 0
     – while both inputs have elements left:
       • if (temp1[i] < temp2[j]) temp[k++] = temp1[i++]
       • else temp[k++] = temp2[j++]
     – copy whatever remains of temp1 or temp2 into temp, then return temp
   • Analysis of merge:
     – time: O(temp1.length + temp2.length)
     – memory: O(temp1.length + temp2.length)
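A self-contained Java sketch of the merge sort from the last two slides; returning freshly allocated arrays and the names auxSort/merge follow the slides, while the copy of the base-case range is an assumed detail:

    import java.util.Arrays;

    class MergeSortSketch {
        // Sort the half-open range A[low, high); returns a new sorted array.
        static int[] auxSort(int[] A, int low, int high) {
            if (high - low <= 1) return Arrays.copyOfRange(A, low, high);  // already sorted
            int mid = (low + high) / 2;
            return merge(auxSort(A, low, mid), auxSort(A, mid, high));
        }

        // Merge two sorted arrays into one sorted array: O(n1 + n2) time and space.
        static int[] merge(int[] t1, int[] t2) {
            int[] out = new int[t1.length + t2.length];
            int i = 0, j = 0, k = 0;
            while (i < t1.length && j < t2.length)
                out[k++] = (t1[i] < t2[j]) ? t1[i++] : t2[j++];
            while (i < t1.length) out[k++] = t1[i++];   // leftovers from t1
            while (j < t2.length) out[k++] = t2[j++];   // leftovers from t2
            return out;
        }

        static int[] sort(int[] A) { return auxSort(A, 0, A.length); }
    }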
8. Analysis of Merge Sort
   • Time
     – Let N be the number of elements
     – Number of levels of recursion is O(logN)
     – At each level, O(N) work
     – Total is O(N logN)
     – This is the best possible for comparison-based sorting (see the lower-bound slide).
   • Space
     – At each level, O(N) temporary space
     – The space can be freed, but the repeated calls to new are costly
     – Needs O(N) extra space
     – Bad: better to have an in-place sort
     – Quicksort (chapter 8) is the sort of choice.
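The time bound can also be read off the usual recurrence; a short sketch, assuming N is a power of 2 and a unit cost per merged element:

    % Merge sort recurrence and its telescoped solution (N a power of 2).
    T(1) = 0, \qquad T(N) = 2\,T(N/2) + cN
    \;\Longrightarrow\;
    \frac{T(N)}{N} = \frac{T(N/2)}{N/2} + c = \cdots = \frac{T(1)}{1} + c\log_2 N
    \;\Longrightarrow\;
    T(N) = cN\log_2 N = O(N\log N).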
9. Quicksort: Algorithm
   • QuickSort: in practice the fastest of the comparison sorts covered here
   • QuickSort(S)
     – 1. If the size of S is 0 or 1, return S
     – 2. Pick an element v in S (the pivot)
     – 3. Construct L = all elements less than v and R = all elements greater than v (elements equal to v can go in either)
     – 4. Return QuickSort(L), then v, then QuickSort(R)
   • The algorithm can be done in situ (in place).
   • On average runs in O(NlogN), but can take O(N^2) time; it depends on the choice of pivot.
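An in-place Java sketch of the above; the choice of the last element as pivot and the Lomuto-style partition are assumed details the slide leaves open (a better pivot choice is discussed two slides later):

    // In-place quicksort sketch for a[lo..hi] inclusive.
    static void quickSort(int[] a, int lo, int hi) {
        if (lo >= hi) return;                             // size 0 or 1: already sorted
        int pivot = a[hi];                                // step 2: pick a pivot
        int i = lo;                                       // a[lo..i-1] holds elements < pivot
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
            }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;            // place pivot between L and R
        quickSort(a, lo, i - 1);                          // step 4: sort L, then R
        quickSort(a, i + 1, hi);
    }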
10. Quicksort: Analysis
   • Worst Case:
     – T(N) = worst-case sorting time
     – T(1) = 1
     – If the pivot is always bad (the smallest or largest element), T(N) = T(N-1) + N
     – Via a telescoping argument (expand and add), T(N) = O(N^2)
   • Average Case (the text's argument)
     – Assume all subproblem sizes are equally likely
       • Note: the chance of picking the ith smallest element as pivot is 1/N
     – T(N) = average cost to sort N elements
11. Analysis continued
   – On average, T(left branch) = T(right branch), so
   – T(N) = 2*(T(0) + T(1) + … + T(N-1))/N + N, where the +N term is the cost of partitioning
   – Multiply by N:
     • NT(N) = 2(T(0) + … + T(N-1)) + N^2    (*)
   – Subtract the N-1 case of (*):
     • NT(N) - (N-1)T(N-1) = 2T(N-1) + 2N - 1
   – Rearrange and drop the -1:
     • NT(N) = (N+1)T(N-1) + 2N
   – Divide by N(N+1):
     • T(N)/(N+1) = T(N-1)/N + 2/(N+1)
12. Last Step
   • Substitute N-1, N-2, ..., 3 for N:
     – T(N-1)/N = T(N-2)/(N-1) + 2/N
     – …
     – T(2)/3 = T(1)/2 + 2/3
   • Add all the equations (the intermediate terms telescope):
     – T(N)/(N+1) = T(1)/2 + 2(1/3 + 1/4 + … + 1/(N+1))
     – = 2(1 + 1/2 + … + 1/(N+1)) - 5/2, since T(1) = 1
     – = O(logN), since the harmonic sum is O(logN)
   • Hence T(N) = O(N logN)
   • The literature has a more careful proof.
   • For better results, choose the pivot as the median of 3 (random) values.
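For reference, the average-case derivation of the last three slides in compact form (a sketch; constants and the dropped -1 term are ignored):

    % Average-case quicksort recurrence and its telescoped solution.
    T(N) = \frac{2}{N}\sum_{i=0}^{N-1} T(i) + N
    \;\Longrightarrow\;
    \frac{T(N)}{N+1} = \frac{T(N-1)}{N} + \frac{2}{N+1}
    = \frac{T(1)}{2} + 2\sum_{k=3}^{N+1}\frac{1}{k} = O(\log N),
    \qquad\text{so } T(N) = O(N\log N).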
13. Quickselect: Algorithm
   • Problem: find the kth smallest item
   • Algorithm: modify Quicksort
     – let |S| be the number of elements in S
   • QuickSelect(S, k)
     – if |S| = 1, return the element in S
     – Pick an element p in S (the pivot)
     – Partition S via p, as in QuickSort, into L and R
     – if k <= |L|, return QuickSelect(L, k)
     – if k = |L| + 1, return the pivot
     – otherwise return QuickSelect(R, k - |L| - 1)
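An array-based Java sketch of QuickSelect with k counted from 1; reusing the last-element pivot and Lomuto-style partition from the quicksort sketch is an assumption:

    // Return the kth smallest (1-indexed) element of a[lo..hi].
    static int quickSelect(int[] a, int lo, int hi, int k) {
        if (lo == hi) return a[lo];                       // |S| = 1
        int pivot = a[hi];
        int i = lo;
        for (int j = lo; j < hi; j++) {
            if (a[j] < pivot) {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++;
            }
        }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;            // pivot now sits at index i
        int sizeL = i - lo;                               // |L|
        if (k <= sizeL) return quickSelect(a, lo, i - 1, k);
        if (k == sizeL + 1) return a[i];                  // the pivot is the answer
        return quickSelect(a, i + 1, hi, k - sizeL - 1);
    }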
14. Quickselect: Analysis
   • Worst case is O(N^2)
   • Average case: the analysis is similar to quicksort's, but only one branch is recursed on.
   • Here T(N) = 1*(T(0) + T(1) + … + T(N-1))/N + N
   • Multiply by N:
     – NT(N) = T(0) + T(1) + … + T(N-1) + N^2
   • Substitute N-1 for N and subtract:
     – NT(N) - (N-1)T(N-1) = T(N-1) + 2N - 1
   • Rearrange and divide by N (dropping the small -1/N term):
     – T(N) = T(N-1) + 2
     – T(N) = T(N-2) + 4 = … = T(1) + 2(N-1) = O(N)
   • Average case: linear.
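The same recurrence written compactly (again a sketch, ignoring the -1/N term):

    % Average-case quickselect: only one subproblem is recursed on.
    T(N) = \frac{1}{N}\sum_{i=0}^{N-1} T(i) + N
    \;\Longrightarrow\; T(N) \approx T(N-1) + 2
    \;\Longrightarrow\; T(N) = O(N).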
15. Bucket Sort
   • A linear-time sort algorithm!
   • Need to know the range of possible values.
   • Example 1: sort N integers, each less than M.
     – Make an array A of counts, of size M
     – Read each integer i and update: A[i]++
     – Then output each value i, A[i] times, in order
   • Example 2: 200 names
     – Make an array of size 26*26 = 676
     – Using the first 2 letters of each name, put it in the [char-char] bucket (usually a short ordered linked list)
     – Collect the buckets up in order
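A Java sketch of Example 1 as a counting sort; the method name and the restriction to non-negative integers below M are assumptions:

    // Bucket (counting) sort for N non-negative integers, each less than M.
    static int[] bucketSort(int[] input, int M) {
        int[] count = new int[M];                 // the array A of size M
        for (int v : input) count[v]++;           // A[i]++ for each integer i read
        int[] out = new int[input.length];
        int k = 0;
        for (int i = 0; i < M; i++)               // collect: output i, count[i] times
            for (int c = 0; c < count[i]; c++)
                out[k++] = i;
        return out;                               // O(N + M) time overall
    }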
16. Radix Sorting (card sorting)
   • Uses linked lists
   • Idea: multiple passes of Bucket Sort
   • Trick: iteratively sort by the last character, then the next-to-last, etc.
   • Example: ed ca xa cd xd bd
     – Pass 1 (last letter):  a:{ca, xa}  d:{ed, cd, xd, bd}  ->  ca xa ed cd xd bd
     – Pass 2 (first letter): b:{bd}  c:{ca, cd}  e:{ed}  x:{xa, xd}  ->  bd ca cd ed xa xd
   • Complexity: O(N * number of passes)
     – number of passes = length of the key
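A Java sketch of the pass structure for fixed-length lowercase keys; using ArrayList buckets instead of hand-built linked lists, and the fixed 'a'..'z' alphabet, are assumptions:

    import java.util.ArrayList;
    import java.util.List;

    // LSD radix sort for strings of equal length over 'a'..'z'.
    static void radixSort(String[] a, int keyLength) {
        for (int pos = keyLength - 1; pos >= 0; pos--) {      // last character first
            List<List<String>> buckets = new ArrayList<>();
            for (int b = 0; b < 26; b++) buckets.add(new ArrayList<>());
            for (String s : a)                                // one pass of bucket sort
                buckets.get(s.charAt(pos) - 'a').add(s);
            int k = 0;
            for (List<String> bucket : buckets)               // collect buckets in order
                for (String s : bucket) a[k++] = s;
        }
    }

Calling radixSort on {"ed", "ca", "xa", "cd", "xd", "bd"} with keyLength 2 reproduces the two passes in the example above.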
17. External Sorting (Tape or CD)
   • Idea: merge sort (2-way)
   • Suppose the memory size is M records (enough to sort M records internally)
   • Ta1, Ta2, Tb1, Tb2 are tape drives
   • Data is on Ta1 initially
   • Pass 1:
     – read M records at a time
     – sort them in memory and write the runs to Tb1 and Tb2 alternately (each run of M records on Tb1, Tb2 is sorted)
   • Pass 2:
     – merge the runs on Tb1 and Tb2 onto Ta1 and Ta2, alternately
     – Note: merging takes O(1) memory beyond the input/output buffers
     – Each resulting run of 2*M records is sorted
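A small Java sketch of one merge step, reading two sorted runs strictly sequentially (as a tape drive must) and buffering only one record per input; the Iterator/List stand-ins for tape drives are assumptions for illustration:

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;

    // Merge two sorted runs with sequential reads and O(1) extra memory
    // beyond one buffered record per input run.
    static List<Integer> mergeRuns(Iterator<Integer> runA, Iterator<Integer> runB) {
        List<Integer> out = new ArrayList<>();
        Integer a = runA.hasNext() ? runA.next() : null;
        Integer b = runB.hasNext() ? runB.next() : null;
        while (a != null || b != null) {
            if (b == null || (a != null && a <= b)) {
                out.add(a);
                a = runA.hasNext() ? runA.next() : null;
            } else {
                out.add(b);
                b = runB.hasNext() ? runB.next() : null;
            }
        }
        return out;
    }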
18. External Sorting
   • Continue merging, alternating writes between the Ta and Tb drives.
   • Number of merge passes is log(N/M)
   • Time complexity is O(N logM) for the first (internal sort) pass: N/M runs, each sorted in O(M logM)
   • O(N) for each subsequent merge pass
   • Total: O(N logM + N log(N/M)) = O(N logN)
   • With more tapes, can reduce the number of passes by doing a k-way merge rather than a 2-way merge
     – Replace log base 2 with log base k
   • A trickier algorithm (polyphase merge) can do it with fewer tapes.
   • Who uses tapes? The algorithm works for CDs and other sequential storage.
19. Lower Bound for Sorting
   • Theorem: any sort that works by comparisons must use at least log(N!) comparisons in the worst case. Hence no comparison sort can beat N logN.
   • Proof:
     – N items can be arranged in N! ways.
     – Consider a decision tree in which each internal node is a comparison.
     – Each possible input ordering goes down one path.
     – The number of leaves is N!
     – The minimum depth of a tree with N! leaves is log(N!).
     – log(N!) = log1 + log2 + … + logN is Θ(N logN)
     – Proof: use the partition trick
       • log(N/2) + log(N/2 + 1) + … + log(N) > (N/2)*log(N/2)
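The counting step written out (a sketch):

    % A comparison sort's decision tree has at least N! leaves, so its
    % worst-case depth (number of comparisons) is at least
    \log_2(N!) = \sum_{k=1}^{N}\log_2 k \;\ge\; \sum_{k=N/2}^{N}\log_2 k
    \;\ge\; \tfrac{N}{2}\log_2\tfrac{N}{2} \;=\; \Omega(N\log N).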
20. Summary
   • For online sorting, use heapsort.
     – Online: elements arrive one at a time
     – Offline or batch: all elements are available up front
   • For small collections, bubble sort is fine
   • For large collections, use quicksort
   • You may hybridize the algorithms, e.g. (see the sketch below):
     – use quicksort until the subproblem size is below some cutoff k
     – then use bubble sort
   • Sorting is important, well-studied, and often done inefficiently.
   • Libraries often contain sorting routines, but beware: the quicksort routine in Visual C++ seems to run in quadratic time. The Java sorts in Collections are fine.
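A Java sketch of the hybrid suggested above; the cutoff value 16 is an arbitrary assumption (tune it by measurement), and the partition matches the earlier quicksort sketch:

    // Hybrid sort: quicksort down to small ranges, then finish with bubble sort.
    static final int CUTOFF = 16;                          // assumed cutoff

    static void hybridSort(int[] a, int lo, int hi) {      // sorts a[lo..hi] inclusive
        if (hi - lo + 1 <= CUTOFF) {                       // small range: bubble sort it
            for (int j = hi; j > lo; j--)
                for (int i = lo; i < j; i++)
                    if (a[i] > a[i + 1]) { int t = a[i]; a[i] = a[i + 1]; a[i + 1] = t; }
            return;
        }
        int pivot = a[hi];                                 // partition as in the quicksort sketch
        int i = lo;
        for (int j = lo; j < hi; j++)
            if (a[j] < pivot) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; }
        int t = a[i]; a[i] = a[hi]; a[hi] = t;
        hybridSort(a, lo, i - 1);
        hybridSort(a, i + 1, hi);
    }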