
# Sorting Algorithms in Java


Sorting Algorithms in Java. Complexity. Algorithms. Interfaces.

Topics:
Problem definition
Insertion Sort
Selection Sort
Counting Sort
Merge Sort
Collections.sort

Teaching material for the course of "Tecniche di Programmazione" at Politecnico di Torino in year 2012/2013. More information: http://bit.ly/tecn-progr

Published in: Education, Technology

### Sorting Algorithms in Java

1. 1. Sorting. Tecniche di Programmazione – A.A. 2012/2013
2. 2. Summary: (1) Problem definition, (2) Insertion Sort, (3) Selection Sort, (4) Counting Sort, (5) Merge Sort, (6) Quicksort, (7) Collections.sort
3. 3. Summary (annotated): (1) Problem definition, (2) Insertion Sort and (3) Selection Sort are iterative, (4) Counting Sort is a special case, (5) Merge Sort and (6) Quicksort are recursive, (7) Collections.sort
4. 4. Problem definition Sorting
5. 5. Formal problem definition: Sorting
    - Input: a sequence of $n$ numbers $\langle a_1, a_2, \ldots, a_n \rangle$
    - Output: a permutation $\langle a'_1, a'_2, \ldots, a'_n \rangle$ of the original elements, such that $a'_1 \le a'_2 \le \ldots \le a'_n$
6. 6. Types of sorting approaches
    - Internal sorting
        - Data to be sorted are all within the main computer memory (RAM)
        - Direct access to all element values
    - External sorting
        - Data to be sorted may not all be loaded in memory at the same time
        - We must work directly on data stored on file
        - Typically, sequential access to data
7. 7. Sorting objects
    - Textbook algorithms always refer to sorting sequences of numbers
    - In practice, we need to sort the elements of a collection, of some class type
    - The objects to be sorted must implement the Comparable interface
8. 8. Comparable
    - `public interface Comparable<T>` (package java.lang)
    - Must implement: `int compareTo(T other)`
        - Returns a negative integer, zero, or a positive integer as this object is less than, equal to, or greater than the specified other object.
    - It is strongly recommended, but not strictly required, that `(x.compareTo(y)==0) == (x.equals(y))`
    - http://docs.oracle.com/javase/7/docs/api/java/lang/Comparable.html
9. 9. Sorting Comparable objects
    - Given a class, usually:
        - A subset of the fields is used for sorting
        - The fields used for sorting are called the «key» of the objects
        - .equals and .compareTo are defined according to the key fields
        - Other fields are regarded as «additional data»
    - Different types of keys (and thus ordering criteria) may be defined
    - The Comparable interface specifies the «natural» ordering
        - Other orderings may be achieved with Comparator helper classes
10. 10. Comparator
    - `public interface Comparator<T>` (package java.util)
    - Must implement: `int compare(T obj1, T obj2)`
        - Returns a negative integer, zero, or a positive integer as the first argument is less than, equal to, or greater than the second.
    - It is generally the case, but not strictly required, that `(compare(x, y)==0) == (x.equals(y))`
    - Comparators can be passed to a sort method
    - http://docs.oracle.com/javase/7/docs/api/java/util/Comparator.html
11. 11. Example

        public class Studente implements Comparable<Studente> {
            private int matricola;
            private String cognome;
            private String nome;
            private int voto;

            // Integer.compare avoids the overflow that a plain
            // subtraction of the two keys could cause.
            @Override
            public int compareTo(Studente other) {
                return Integer.compare(this.matricola, other.matricola);
            }
            // ... the class continues on the following slides ...
12. 12. Example (annotated): in the Studente class, the «natural» ordering is by the matricola field, as implemented by compareTo.
13. 13. Example (continued): equals and hashCode are based on the same «key» fields

            // Since we define compareTo, we should also redefine equals and hashCode!
            @Override
            public boolean equals(Object other) {
                // Guard against null and unrelated types before casting
                if (!(other instanceof Studente))
                    return false;
                return this.matricola == ((Studente) other).matricola;
            }

            @Override
            public int hashCode() {
                return ((Integer) this.matricola).hashCode();
            }

            // ... getters & setters ...
        }
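The key-based contract described in the slides above can be sketched in a compact, self-contained form. This is only an illustration: `StudentKey` and `NaturalOrderDemo` are hypothetical names (not part of the course material), and the class is reduced to the single key field. It uses `Integer.compare` because a subtraction-based `compareTo` can overflow when the keys are far apart.

```java
import java.util.Arrays;

// Hypothetical stand-in for the slides' Studente class, reduced to its key field.
final class StudentKey implements Comparable<StudentKey> {
    private final int matricola;

    StudentKey(int matricola) {
        this.matricola = matricola;
    }

    int getMatricola() {
        return matricola;
    }

    // Natural ordering by key; Integer.compare never overflows.
    @Override
    public int compareTo(StudentKey other) {
        return Integer.compare(this.matricola, other.matricola);
    }

    // equals and hashCode use the same key, so they agree with compareTo.
    @Override
    public boolean equals(Object o) {
        return o instanceof StudentKey && ((StudentKey) o).matricola == matricola;
    }

    @Override
    public int hashCode() {
        return Integer.valueOf(matricola).hashCode();
    }
}

final class NaturalOrderDemo {
    // Builds keys, sorts them by natural ordering, and returns the sorted values.
    static int[] sortedKeys(int... keys) {
        StudentKey[] v = new StudentKey[keys.length];
        for (int i = 0; i < keys.length; i++) {
            v[i] = new StudentKey(keys[i]);
        }
        Arrays.sort(v); // relies on compareTo
        int[] out = new int[v.length];
        for (int i = 0; i < v.length; i++) {
            out[i] = v[i].getMatricola();
        }
        return out;
    }
}
```

With subtraction, comparing `Integer.MIN_VALUE` with `1` would wrap around to a positive result and report the wrong order; `Integer.compare` does not have this problem.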
14. 14. Comparator for sorting by name

        import java.util.Comparator;

        public class StudenteByName implements Comparator<Studente> {
            @Override
            public int compare(Studente arg0, Studente arg1) {
                // Compare surnames first
                int cmp = arg0.getCognome().compareTo(arg1.getCognome());
                if (cmp != 0)
                    return cmp;
                // Check names only if surnames are equal
                return arg0.getNome().compareTo(arg1.getNome());
            }
        }

15. 15. Comparator for sorting by voto

        import java.util.Comparator;

        public class StudenteByVoto implements Comparator<Studente> {
            @Override
            public int compare(Studente o1, Studente o2) {
                // Integer.compare avoids the overflow a subtraction could cause
                return Integer.compare(o1.getVoto(), o2.getVoto());
            }
        }

    Note: repeated values for the voto field are possible.
16. 16. Stability: A sorting algorithm is said to be stable when elements that share the same key value appear in the sorted sequence in the same relative order as in the original sequence.
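Stability matters as soon as keys repeat, as with a grade field. The following sketch shows what it means in practice. `Exam`, `ByVoto` and `StabilityDemo` are hypothetical names introduced for illustration, and the sample data is invented. It relies on `Collections.sort`, which the Java documentation describes as stable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical record: a surname plus a grade that may repeat.
final class Exam {
    final String name;
    final int voto;

    Exam(String name, int voto) {
        this.name = name;
        this.voto = voto;
    }
}

// Orders exams by grade only; names are ignored by the comparison.
final class ByVoto implements Comparator<Exam> {
    @Override
    public int compare(Exam a, Exam b) {
        return Integer.compare(a.voto, b.voto);
    }
}

final class StabilityDemo {
    // Sorts by grade and returns the names in the resulting order.
    static List<String> namesSortedByVoto() {
        List<Exam> list = new ArrayList<>();
        list.add(new Exam("Rossi", 28));
        list.add(new Exam("Bianchi", 25));
        list.add(new Exam("Verdi", 28));
        list.add(new Exam("Neri", 25));

        Collections.sort(list, new ByVoto()); // stable sort

        List<String> names = new ArrayList<>();
        for (Exam e : list) {
            names.add(e.name);
        }
        return names;
    }
}
```

Because the sort is stable, Bianchi stays ahead of Neri (both 25) and Rossi stays ahead of Verdi (both 28), exactly as in the input.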
17. 17. Algorithms
    - Various sorting algorithms are known, with differing complexity:
        - $O(n^2)$: simple, iterative. Insertion sort, Selection sort, Bubble sort, …
        - $O(n)$: applicable in special cases only. Counting sort, Radix sort, Bin (or Bucket) sort, …
        - $O(n \log n)$: more complex, recursive. Merge sort, Quicksort, Heapsort
18. 18. Insertion Sort Sorting
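19. 19. Insertion sort
    - The vector is split into an already ordered part and a part not considered yet; v[j] is the first element not considered yet
    - Example: the already ordered part is 2 3 6 12 16 21 and v[j] = 8
    - Move right by one cell all elements i for which v[i] > v[j], then store v[j] in the freed cell: 2 3 6 8 12 16 21
    - The next element to be considered is 5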
20. 20. Quick reference (slide content not captured in the transcript)
21. 21. Running example (slide content not captured in the transcript)
22. 22. Complexity
    - Number of comparisons: $C_{min} = n-1$, $C_{avg} = \tfrac{1}{4}(n^2+n-2)$, $C_{max} = \tfrac{1}{2}(n^2+n)-1$
    - Number of data copies: $M_{min} = 2(n-1)$, $M_{avg} = \tfrac{1}{4}(n^2+9n-10)$, $M_{max} = \tfrac{1}{2}(n^2+3n-4)$
    - Hence $C = O(n^2)$ and $M = O(n^2)$
    - Best case: already sorted vector. Worst case: inversely sorted vector
    - $T(n) = O(n^2)$, but $T(n)$ is not $\Theta(n^2)$, because the best case is linear
    - $T_{worst\ case}(n) = \Theta(n^2)$
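The insertion sort slides describe the shifting step but give no code in this transcript. A minimal sketch follows, written in the same style as the selection sort implementation in the deck. The class name `InsertionSortSketch` is an assumption made for illustration.

```java
import java.util.Comparator;

final class InsertionSortSketch {
    // Sorts the vector in place. Elements v[0..j-1] are already ordered;
    // v[j] is inserted among them at each iteration.
    static <T> void sort(T[] v, Comparator<? super T> comp) {
        for (int j = 1; j < v.length; j++) {
            T key = v[j];
            int i = j - 1;
            // Move right by one cell all elements greater than the key.
            while (i >= 0 && comp.compare(v[i], key) > 0) {
                v[i + 1] = v[i];
                i--;
            }
            // Store the key in the freed cell.
            v[i + 1] = key;
        }
    }
}
```

On an already sorted vector the inner loop never runs, which gives the $n-1$ comparisons of the best case. The strict `> 0` test leaves equal elements in their original order, so this version is stable.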
23. 23. Selection Sort Sorting
24. 24. Selection Sort
    - At every iteration, find the minimum of the yet-unsorted part of the vector
    - Swap the minimum with the current position in the vector
    - Example, before the swap: already ordered 2 3 6 12 16 21, not ordered 34 81 25 28 41 27 60 (v[j] = 34, minimum = 25)
    - After the swap: 2 3 6 12 16 21 25 81 34 28 41 27 60
25. 25. Complexity
    - The loops don't depend on the data stored in the array: complexity is independent of the values to be sorted
    - Worst case performance: $O(n^2)$
    - Best case performance: $O(n^2)$
    - Average case performance: $O(n^2)$
26. 26. Implementation

        // Requires: import java.util.Comparator;
        public <T> void sort(T[] vector, Comparator<T> comp) {
            for (int j = 0; j < vector.length; j++) {
                // find minimum of the unsorted part vector[j..]
                int pos = j;
                for (int i = j + 1; i < vector.length; i++) {
                    if (comp.compare(vector[i], vector[pos]) < 0)
                        pos = i;
                }
                // swap positions
                if (j != pos) {
                    T temp = vector[pos];
                    vector[pos] = vector[j];
                    vector[j] = temp;
                }
            }
        }
27. 27. Counting Sort Sorting
28. 28. Counting sort
    - Not applicable in general
    - Precondition (hypothesis for applicability): the n elements to be sorted are integer numbers ranging from 1 to k, for some positive integer k
    - With this hypothesis, if $k = O(n)$, then the algorithm has complexity of only $O(n)$
29. 29. Basic idea
    - Find, for each element x to be sorted, how many elements are less than x
    - This information allows us to directly deposit x into its final destination position
30. 30. Data structures
    - We need 3 vectors:
        - Starting vector: A[1..n]
        - Result vector: B[1..n]
        - Support vector: C[1..k]
    - Vector C keeps track of the number of elements in A that have a certain value: C[i] = how many elements in A have value i
    - The sum of the first i elements in C equals the number of elements in A with value <= i
31. 31. Pseudo-code (slide content not captured in the transcript)
32. 32. Analysis
    - For each j, C[A[j]] is the number of elements <= A[j], and also represents the final position of A[j] in B: B[ C[A[j]] ] = A[j]
    - The corrective term C[A[j]] ← C[A[j]] – 1 handles the presence of duplicate items
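33. 33. Example (n=8, k=6); a · marks a cell of B not yet filled
    - A = 3 6 4 1 3 4 1 4
    - C (counts) = 2 0 2 3 0 1
    - C (running sums) = 2 2 4 7 7 8
    - After placing A[8] = 4: B = · · · · · · 4 ·, C = 2 2 4 6 7 8
    - After placing A[7] = 1: B = · 1 · · · · 4 ·, C = 1 2 4 6 7 8
34. 34. Example: full trace for A = 3 6 4 1 3 4 1 4, with j going from 8 down to 1 (a · marks a cell of B not yet filled)

    | j | A[j] | B after the step | C after the step |
    |---|------|------------------|------------------|
    | 8 | 4 | · · · · · · 4 · | 2 2 4 6 7 8 |
    | 7 | 1 | · 1 · · · · 4 · | 1 2 4 6 7 8 |
    | 6 | 4 | · 1 · · · 4 4 · | 1 2 4 5 7 8 |
    | 5 | 3 | · 1 · 3 · 4 4 · | 1 2 3 5 7 8 |
    | 4 | 1 | 1 1 · 3 · 4 4 · | 0 2 3 5 7 8 |
    | 3 | 4 | 1 1 · 3 4 4 4 · | 0 2 3 4 7 8 |
    | 2 | 6 | 1 1 · 3 4 4 4 6 | 0 2 3 4 7 7 |
    | 1 | 3 | 1 1 3 3 4 4 4 6 | 0 2 2 4 7 7 |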
35. 35. Complexity (line numbers refer to the pseudo-code slide)
    - Lines 1-2: initialization of C: $O(k)$
    - Lines 3-4: computation of C: $O(n)$
    - Lines 6-7: running sum in C: $O(k)$
    - Lines 9-11: copy to B: $O(n)$
    - Total complexity is therefore $O(n+k)$
    - The algorithm is useful only with $k = O(n)$; in such a case, the overall complexity is $O(n)$
36. 36. Quick Reference (slide content not captured in the transcript)
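Since the pseudo-code slide is missing from this transcript, here is a minimal Java sketch of counting sort that follows the data structures and the four phases listed above. The class name `CountingSortSketch` is an assumption, and Java's 0-based arrays replace the 1-based vectors of the slides, so positions taken from C are shifted by one when writing into B.

```java
final class CountingSortSketch {
    // Sorts integers in the range 1..k and returns a new sorted array.
    static int[] sort(int[] a, int k) {
        int[] c = new int[k + 1]; // c[0] is unused; Java zero-initializes the array

        // Count how many elements of a have each value.
        for (int x : a) {
            c[x]++;
        }
        // Running sum: c[i] becomes the number of elements <= i.
        for (int i = 2; i <= k; i++) {
            c[i] += c[i - 1];
        }
        // Place elements from last to first, which keeps the sort stable.
        int[] b = new int[a.length];
        for (int j = a.length - 1; j >= 0; j--) {
            b[c[a[j]] - 1] = a[j]; // 1-based final position -> 0-based index
            c[a[j]]--;             // corrective term for duplicates
        }
        return b;
    }
}
```

Calling `CountingSortSketch.sort(new int[] {3, 6, 4, 1, 3, 4, 1, 4}, 6)` reproduces the example with n=8 and k=6 from the slides.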
37. 37. Merge Sort Sorting
38. 38. Merge Sort
    - The Merge Sort algorithm is a direct application of the Divide et Impera (divide and conquer) approach
    - Start: 6 12 4 5 2 9 5 12
    - Divide: 6 12 4 5 | 2 9 5 12
    - Solve (each half): 4 5 6 12 | 2 5 9 12
    - Combine: 2 4 5 5 6 9 12 12
39. 39. Merge Sort: Divide
    - The vector is simply partitioned into two sub-vectors, according to a splitting point
    - The splitting point is usually chosen at the middle of the vector
    - Example: the vector 6 12 4 5 2 9 5 12, with p = 1 and r = 8, is divided into 6 12 4 5 (indices p = 1 to q = 4) and 2 9 5 12 (indices q+1 = 5 to r = 8)
40. 40. Merge Sort: Termination
    - Recursion terminates when the sub-vector:
        - Has only one element: p = r
        - Has no elements: p > r
41. 41. Merge Sort: Combine
    - The combining step implies merging two sorted sub-vectors
        - Recursion guarantees that the sub-vectors are sorted
        - The merging approach compares the first element of each of the two vectors, and copies the smaller one
        - The result of the merging is saved in a different vector
    - Such an algorithm may be realized in $\Theta(n)$
    - Example: 4 5 6 12 and 2 5 9 12 combine into 2 4 5 5 6 9 12 12
42. 42. Pseudo-code

        MERGE-SORT(A, p, r)
        1  if p < r                        // Termination
        2    then q ← ⌊(p+r)/2⌋            // Divide
        3         MERGE-SORT(A, p, q)      // Solve
        4         MERGE-SORT(A, q+1, r)    // Solve
        5         MERGE(A, p, q, r)        // Combine
43. 43. Note
    - We often use the following symbols:
        - $\lfloor x \rfloor$ = integer part of x, i.e. the largest integer not greater than x (floor function)
        - $\lceil x \rceil$ = the smallest integer not less than x (ceiling function)
    - Examples:
        - $\lfloor 3 \rfloor = \lceil 3 \rceil = 3$
        - $\lfloor 3.1 \rfloor = 3$; $\lceil 3.1 \rceil = 4$
44. 44. The Merge procedure. Complexity: $\Theta(n)$

        MERGE(A, p, q, r)
        1  i ← p ; j ← q+1 ; k ← 1
        2  while( i ≤ q and j ≤ r )
        3      if( A[i] < A[j] ) B[k] ← A[i] ; i ← i+1
        4      else B[k] ← A[j] ; j ← j+1
        5      k ← k+1
        6  while( i ≤ q ) B[k] ← A[i] ; i ← i+1 ; k ← k+1
        7  while( j ≤ r ) B[k] ← A[j] ; j ← j+1 ; k ← k+1
        8  A[p..r] ← B[1..k-1]
45. 45. The Merge procedure (annotated; line numbers refer to the MERGE pseudo-code)
    - Lines 2-5: at each iteration, the smallest number between the heads of the two vectors is copied to B
    - Lines 6-7: the «tail» of one of the vectors is emptied
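The MERGE-SORT and MERGE pseudo-code can be translated to Java fairly directly. The sketch below is one possible rendering, not the course's own implementation. The class name `MergeSortSketch` is an assumption, indices are 0-based, the temporary vector B is allocated inside `merge`, and the comparison uses `<=` rather than `<` so that equal elements are taken from the left half first, which keeps the sort stable.

```java
final class MergeSortSketch {
    // Sorts a[p..r] (both bounds inclusive) in place.
    static void mergeSort(int[] a, int p, int r) {
        if (p < r) {                      // termination test
            int q = p + (r - p) / 2;      // divide: floor of the mid-point, overflow-safe
            mergeSort(a, p, q);           // solve left half
            mergeSort(a, q + 1, r);       // solve right half
            merge(a, p, q, r);            // combine
        }
    }

    // Merges the sorted ranges a[p..q] and a[q+1..r].
    static void merge(int[] a, int p, int q, int r) {
        int[] b = new int[r - p + 1];
        int i = p, j = q + 1, k = 0;
        // Copy the smaller of the two heads until one range is exhausted.
        while (i <= q && j <= r) {
            if (a[i] <= a[j]) {
                b[k++] = a[i++];
            } else {
                b[k++] = a[j++];
            }
        }
        // Empty the tail of whichever range still has elements.
        while (i <= q) {
            b[k++] = a[i++];
        }
        while (j <= r) {
            b[k++] = a[j++];
        }
        // Copy the merged result back into a[p..r].
        System.arraycopy(b, 0, a, p, b.length);
    }
}
```

A whole array `v` is sorted with `MergeSortSketch.mergeSort(v, 0, v.length - 1)`; an empty array gives p > r, so the call returns immediately.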
46. 46. Complexity analysis (line numbers refer to the MERGE-SORT pseudo-code)
    - Termination: a simple test, $\Theta(1)$
    - Divide (line 2): find the mid-point of the vector, $D(n) = \Theta(1)$
    - Solve (lines 3-4): solves 2 sub-problems of size $n/2$ each, $2T(n/2)$
    - Combine (line 5): based on the Merge algorithm, $C(n) = \Theta(n)$
47. 47. Complexity analysis (note): one sub-problem has size $\lceil n/2 \rceil$, the other $\lfloor n/2 \rfloor$. This detail does not change the complexity result.
48. 48. Complexity
    - $T(n) = \Theta(1)$ for $n \le 1$
    - $T(n) = 2T(n/2) + \Theta(n)$ for $n > 1$
    - The solution (proof omitted…) is $T(n) = \Theta(n \log n)$
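49. 49. Intuitive understanding (n=16)

    | Sub-vector sizes at each recursion level | Operations at that level |
    |------------------------------------------|--------------------------|
    | 16 | 1 × 16 = n |
    | 8 8 | 2 × 8 = n |
    | 4 4 4 4 | 4 × 4 = n |
    | 2 2 2 2 2 2 2 2 | 8 × 2 = n |
    | 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | 16 × 1 = n |

    - Recursion levels: $\log_2 n$
    - Operations per level: $n$
    - Total operations: $n \log_2 n$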
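The level-by-level picture corresponds to unrolling the recurrence. The derivation below is a sketch under two simplifying assumptions that are not in the slides: n is a power of 2, and the combine step costs exactly $c\,n$ for some constant $c$.

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + c\,n \\
     &= 4\,T(n/4) + 2\,c\,n \\
     &= \dots \\
     &= 2^{i}\,T(n/2^{i}) + i\,c\,n \\
     &= n\,T(1) + c\,n\log_2 n \qquad \text{(taking } i = \log_2 n\text{)} \\
     &= \Theta(n \log n)
\end{align*}
```

Each unrolling step adds one more $c\,n$ term, which matches the "n operations per level" reading of the table.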
50. 50. Warning: Not all recursive implementations have $\Theta(n \log n)$ complexity. For example, if merge sort is used with asymmetric partitioning (q=p+1), it degrades to an insertion sort, yielding $\Theta(n^2)$.
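The quadratic behaviour can be checked with a simplified model of the asymmetric case. The sketch below assumes, for illustration only, that every split leaves one element on one side and $n-1$ on the other, and that merging costs exactly $c\,n$.

```latex
\begin{align*}
T(n) &= T(n-1) + T(1) + c\,n \qquad (n \ge 2) \\
     &= T(1) + \sum_{i=2}^{n} \bigl( T(1) + c\,i \bigr) \\
     &= n\,T(1) + c\left( \frac{n(n+1)}{2} - 1 \right) \\
     &= \Theta(n^2)
\end{align*}
```

The recursion depth grows linearly instead of logarithmically, so the linear merge cost is paid about n times.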
51. 51. Quicksort Sorting
52. 52. Collections.sort Sorting
53. 53. Sorting, in practice, in Java
    - A programmer's motto says: use the system sort
        - i.e., the sorting algorithm already provided by your libraries
        - In other words, don't re-implement your own sorting functions
    - The Collections framework provides `public class Collections`
        - This class consists exclusively of static methods that operate on or return collections
        - `public static <T extends Comparable<? super T>> void sort(List<T> list)`
        - `public static <T> void sort(List<T> list, Comparator<? super T> c)`
54. 54. Collections.sort(list)
    - Sorts the specified list into ascending order, according to the natural ordering of its elements.
        - All elements in the list must implement the Comparable interface.
        - Furthermore, all elements in the list must be mutually comparable (that is, e1.compareTo(e2) must not throw a ClassCastException for any elements e1 and e2 in the list).
    - This sort is guaranteed to be stable: equal elements will not be reordered as a result of the sort.
    - The specified list must be modifiable, but need not be resizable.
    - http://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#sort(java.util.List)
55. 55. Implementation of Collections.sort
    - This implementation is a stable, adaptive, iterative mergesort that requires far fewer than n lg(n) comparisons when the input array is partially sorted, while offering the performance of a traditional mergesort when the input array is randomly ordered.
    - If the input array is nearly sorted, the implementation requires approximately n comparisons.
    - Temporary storage requirements vary from a small constant for nearly sorted input arrays to n/2 object references for randomly ordered input arrays.
    - http://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#sort(java.util.List)
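The two `Collections.sort` overloads can be tried with a short example. This is a sketch with invented sample data; `CollectionsSortDemo` is a hypothetical class name. The first method uses the natural ordering of `String`, the second passes an explicit `Comparator`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class CollectionsSortDemo {
    private static List<String> sample() {
        // A modifiable copy: Collections.sort reorders the list in place.
        return new ArrayList<>(Arrays.asList("merge", "quick", "insertion", "counting"));
    }

    // sort(List<T>): natural (alphabetical) ordering of String.
    static List<String> naturalOrder() {
        List<String> words = sample();
        Collections.sort(words);
        return words;
    }

    // sort(List<T>, Comparator): ordering by word length only.
    static List<String> byLength() {
        List<String> words = sample();
        Collections.sort(words, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return words;
    }
}
```

In `byLength`, "merge" and "quick" have the same length and keep their input order, which is the stability guarantee quoted from the documentation.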
56. 56. Resources
    - Algorithms in a Nutshell, by George T. Heineman, Gary Pollice, Stanley Selkow, O'Reilly Media
    - http://docs.oracle.com/javase/7/docs/api/java/lang/Comparable.html
    - http://www.sorting-algorithms.com/
57. 57. License
    - These slides are distributed under a Creative Commons license “Attribution - NonCommercial - ShareAlike (CC BY-NC-SA)”
    - You are free:
        - to reproduce, distribute, communicate to the public, exhibit in public, represent, perform and recite this work
        - to modify this work
    - Under the following conditions:
        - Attribution: you must attribute the work to the original authors, in a way that does not suggest that they endorse you or your use of the work.
        - NonCommercial: you may not use this work for commercial purposes.
        - ShareAlike: if you alter or transform this work, or use it to create another one, you may distribute the resulting work only under a license identical or equivalent to this one.
    - http://creativecommons.org/licenses/by-nc-sa/3.0/