Parallel sorting
Overview of Parallel Sorting
Odd–Even Sorting
Overview
Algorithm
Example
Complexity
Bitonic Sort
Overview
Binary Split
Example
Complexity
References


Transcript

  • 1. Dept. of Computer Science. Course: Advance Data Algorithm. Vikram Singh Slathia. Course Id: MAI312. 2011MAI025. Central University of Rajasthan. MSc CS III sem.
  • 2. Overview of Parallel Sorting. Odd–Even Sorting: Overview, Algorithm, Example, Complexity. Bitonic Sort: Overview, Binary Split, Example, Complexity. References.
  • 3. What is a parallel sorted sequence? The sorted list is partitioned with the property that each partitioned list is sorted and each element in processor P_i's list is less than each element in P_j's list if i < j.
  • 4. What is the parallel counterpart to a sequential comparator? If each processor has one element, the compare-exchange operation stores the smaller element at the processor with the smaller id. This can be done in t_s + t_w time. If we have more than one element per processor, we call this operation a compare-split. Assume each of the two processors has n/p elements.
  • 5. After the compare-split operation, the smaller n/p elements are at processor P_i and the larger n/p elements at P_j, where i < j. The time for a compare-split operation is Θ(t_s + t_w·n/p), assuming that the two partial lists were initially sorted.
  • 6. A parallel compare-exchange operation. Processes P_i and P_j send their elements to each other. Process P_i keeps min{a_i, a_j}, and P_j keeps max{a_i, a_j}.
  • 7. A compare-split operation. Each process sends its block of size n/p to the other process. Each process merges the received block with its own block and retains only the appropriate half of the merged block. In this example, process P_i retains the smaller elements and process P_j retains the larger elements.
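The two operations on slides 6 and 7 can be sketched sequentially in C++. This is only an illustration under stated assumptions: the function names `compare_exchange` and `compare_split` are hypothetical, the elements are `int`, and the message exchange between P_i and P_j is replaced by direct access to both blocks.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Compare-exchange for one element per process: after the call,
// lo holds min{a_i, a_j} and hi holds max{a_i, a_j}.
void compare_exchange(int& lo, int& hi) {
    if (hi < lo) std::swap(lo, hi);
}

// Compare-split for two sorted blocks of equal size n/p.
// The blocks are merged; `lower` keeps the smaller half (process P_i)
// and `upper` keeps the larger half (process P_j). Both stay sorted.
void compare_split(std::vector<int>& lower, std::vector<int>& upper) {
    assert(lower.size() == upper.size());
    std::vector<int> merged;
    merged.reserve(lower.size() + upper.size());
    std::merge(lower.begin(), lower.end(), upper.begin(), upper.end(),
               std::back_inserter(merged));
    const auto mid = merged.begin() + static_cast<std::ptrdiff_t>(lower.size());
    lower.assign(merged.begin(), mid);
    upper.assign(mid, merged.end());
}
```

In a real message-passing program each process would hold only its own block and perform the merge locally after receiving its partner's block; the merge step is what makes the local work proportional to n/p.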
  • 8. Odd–even sort, also called odd–even transposition sort or brick sort, repeatedly compares and exchanges adjacent pairs, alternating between the odd-indexed and the even-indexed pairs.
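  • 10. #include <utility>  // std::swap

        // Sequential odd–even transposition sort: n phases that alternate between
        // the pairs (a[0],a[1]), (a[2],a[3]), ... and (a[1],a[2]), (a[3],a[4]), ...
        template <typename T>
        void OddEvenSort(T a[], int n) {
            for (int i = 0; i < n; ++i) {
                // Odd i starts at j = 2, even i at j = 1;
                // j is the index of the right-hand element of each pair.
                for (int j = (i & 1) ? 2 : 1; j < n; j += 2) {
                    if (a[j] < a[j - 1]) std::swap(a[j - 1], a[j]);
                }
            }
        }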
  • 11. Odd-Even Transposition Sort: example (figure: steps 0 to 7 along the time axis). Parallel time complexity: T_par = O(n) (for P = n).
  • 12. Unsorted elements: 3 2 3 8 5 6 4 1. Solution: ▪ Sorting n = 8 elements, using the odd-even transposition sort algorithm. ▪ During each phase, n = 8 elements are compared.
  • 20. Input             3 2 3 8 5 6 4 1
        Phase 1 (Odd)     2 3 3 8 5 6 1 4
        Phase 2 (Even)    2 3 3 5 8 1 6 4
        Phase 3 (Odd)     2 3 3 5 1 8 4 6
        Phase 4 (Even)    2 3 3 1 5 4 8 6
        Phase 5 (Odd)     2 3 1 3 4 5 6 8
        Phase 6 (Even)    2 1 3 3 4 5 6 8
        Phase 7 (Odd)     1 2 3 3 4 5 6 8
        Phase 8 (Even)    1 2 3 3 4 5 6 8
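The phase-by-phase example can be reproduced with a small C++ sketch that records the sequence after every phase. The helper names `odd_even_phase` and `odd_even_trace` are hypothetical and not part of the slides; phases are numbered from 1 as in the example, and within one phase the compared pairs are disjoint, which is why a parallel machine can handle them all at once.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// One phase of odd-even transposition sort on 0-based storage.
// `phase` is 1-based: odd phases compare (a[0],a[1]), (a[2],a[3]), ...;
// even phases compare (a[1],a[2]), (a[3],a[4]), ...
void odd_even_phase(std::vector<int>& a, int phase) {
    const std::size_t start = (phase % 2 == 1) ? 0 : 1;
    for (std::size_t j = start; j + 1 < a.size(); j += 2) {
        if (a[j + 1] < a[j]) std::swap(a[j], a[j + 1]);
    }
}

// Runs n phases and returns the sequence as it looks after each phase.
std::vector<std::vector<int>> odd_even_trace(std::vector<int> a) {
    std::vector<std::vector<int>> trace;
    const int n = static_cast<int>(a.size());
    for (int phase = 1; phase <= n; ++phase) {
        odd_even_phase(a, phase);
        trace.push_back(a);
    }
    return trace;
}
```

Calling `odd_even_trace({3, 2, 3, 8, 5, 6, 4, 1})` returns eight rows, one per phase, which can be compared directly with the rows of the example.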
  • 21. After n phases of odd-even exchanges, the sequence is sorted. Each phase of the algorithm (either odd or even) requires Θ(n) comparisons. Serial complexity is Θ(n²).
  • 22. Consider the one item per processor case. There are n iterations, and in each iteration each processor does one compare-exchange. The parallel run time of this formulation is Θ(n).
  • 24. Consider a block of n/p elements per processor. The first step is a local sort. In each subsequent step, the compare-exchange operation is replaced by the compare-split operation.
  • 25.               P0          P1          P2          P3
        Input         13  7 12     8  5  4     6  1  3     9  2 10
        Local sort     7 12 13     4  5  8     1  3  6     2  9 10
        O-E            4  5  7     8 12 13     1  2  3     6  9 10
        E-O            4  5  7     1  2  3     8 12 13     6  9 10
        O-E            1  2  3     4  5  7     6  8  9    10 12 13
        E-O            1  2  3     4  5  6     7  8  9    10 12 13
        SORTED: 1 2 3 4 5 6 7 8 9 10 12 13
  • 26. Time complexity:
        T_par = (Local Sort) + (p merge-splits) + (p exchanges)
        T_par = (n/p) log(n/p) + n + n = (n/p) log(n/p) + 2n
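The block formulation (local sort, then p phases of compare-split between neighbours) can be simulated sequentially. The sketch below is an assumption-laden illustration: `Block`, `compare_split` and `block_odd_even_sort` are hypothetical names, `blocks[i]` stands in for processor P_i, and all blocks are assumed to have the same size.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

using Block = std::vector<int>;

// Merge two sorted blocks; `lower` keeps the smaller half, `upper` the larger.
void compare_split(Block& lower, Block& upper) {
    Block merged;
    merged.reserve(lower.size() + upper.size());
    std::merge(lower.begin(), lower.end(), upper.begin(), upper.end(),
               std::back_inserter(merged));
    const auto mid = merged.begin() + static_cast<std::ptrdiff_t>(lower.size());
    lower.assign(merged.begin(), mid);
    upper.assign(mid, merged.end());
}

// Sequential simulation of block odd-even sort on p "processors".
void block_odd_even_sort(std::vector<Block>& blocks) {
    for (Block& b : blocks) std::sort(b.begin(), b.end());  // local sort
    const std::size_t p = blocks.size();
    for (std::size_t phase = 0; phase < p; ++phase) {
        // O-E phases pair (P0,P1), (P2,P3), ...; E-O phases pair (P1,P2), ...
        for (std::size_t i = phase % 2; i + 1 < p; i += 2) {
            compare_split(blocks[i], blocks[i + 1]);
        }
    }
}
```

Running it on the four blocks {13, 7, 12}, {8, 5, 4}, {6, 1, 3}, {9, 2, 10} goes through the same O-E, E-O, O-E, E-O sequence as slide 25.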
  • 27. A bitonic sequence is defined as a list with no more than one LOCAL MAXIMUM and no more than one LOCAL MINIMUM.
  • 28. A bitonic sequence is a list with no more than one LOCAL MAXIMUM and no more than one LOCAL MINIMUM. (Endpoints must be considered: wraparound.) First figure: this is ok! 1 local MAX, 1 local MIN. The list is bitonic! Second figure: this is NOT bitonic! Why? 1 local MAX, 2 local MINs.
  • 29. 1. Divide the bitonic list into two equal halves. 2. Compare-exchange each item in the first half with the corresponding item in the second half. Result: two bitonic sequences where the numbers in one sequence are all less than the numbers in the other sequence.
  • 30. Bitonic list:
        24 20 15 9 4 2 5 8 | 10 11 12 13 22 30 32 45
        Result after binary split:
        10 11 12 9 4 2 5 8 | 24 20 15 13 22 30 32 45
        If you keep applying the BINARY-SPLIT to each half repeatedly, you will get a SORTED LIST!
        10 11 12 9 . 4 2 5 8 | 24 20 15 13 . 22 30 32 45
        4 2 . 5 8   10 11 . 12 9 | 22 20 . 15 13   24 30 . 32 45
        4 . 2   5 . 8   10 . 9   12 . 11   15 . 13   22 . 20   24 . 30   32 . 45
        2 4 5 8 9 10 11 12 13 15 20 22 24 30 32 45
        Q: How many parallel steps does it take to sort? A: log n
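The repeated binary split can be written as a short recursive routine. This is a minimal sketch, assuming the input range is already bitonic and its length is a power of two; the names `binary_split` and `bitonic_merge` are chosen here for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// One binary split on a[lo, lo + len): compare-exchange each element of the
// first half with the corresponding element of the second half, so that the
// smaller value of each pair ends up in the first half.
void binary_split(std::vector<int>& a, std::size_t lo, std::size_t len) {
    const std::size_t half = len / 2;
    for (std::size_t i = lo; i < lo + half; ++i) {
        if (a[i + half] < a[i]) std::swap(a[i], a[i + half]);
    }
}

// Sorts a bitonic range in ascending order by splitting it and then
// recursing into both halves. len is assumed to be a power of two.
void bitonic_merge(std::vector<int>& a, std::size_t lo, std::size_t len) {
    if (len < 2) return;
    binary_split(a, lo, len);
    bitonic_merge(a, lo, len / 2);
    bitonic_merge(a, lo + len / 2, len / 2);
}
```

The recursion depth is log n, and all compare-exchanges at the same depth touch disjoint pairs, which is where the "log n parallel steps" answer comes from.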
  • 31. A bitonic sorting network sorts n elements in Θ(log² n) time. A bitonic sequence has two tones: increasing and decreasing, or vice versa. Any cyclic rotation of such a sequence is also considered bitonic. 1,2,4,7,6,0 is a bitonic sequence, because it first increases and then decreases. 8,9,2,1,0,4 is another bitonic sequence, because it is a cyclic shift of 0,4,8,9,2,1.
  • 32. Let s = a_0, a_1, …, a_{n-1} be a bitonic sequence such that a_0 ≤ a_1 ≤ ··· ≤ a_{n/2-1} and a_{n/2} ≥ a_{n/2+1} ≥ ··· ≥ a_{n-1}. Consider the following subsequences of s: s1 = min{a_0, a_{n/2}}, min{a_1, a_{n/2+1}}, …, min{a_{n/2-1}, a_{n-1}} and s2 = max{a_0, a_{n/2}}, max{a_1, a_{n/2+1}}, …, max{a_{n/2-1}, a_{n-1}}. Note that s1 and s2 are both bitonic and each element of s1 is less than or equal to every element in s2. We can apply the procedure recursively on s1 and s2 to get the sorted sequence.
  • 33. We can easily build a sorting network to implement this bitonic merge algorithm. Such a network is called a bitonic merging network. The network contains log n columns. Each column contains n/2 comparators and performs one step of the bitonic merge. We denote a bitonic merging network with n inputs by ⊕BM[n]. Replacing the ⊕ comparators by ⊖ comparators results in a decreasing output sequence; such a network is denoted by ⊖BM[n].
  • 34. How do we sort an unsorted sequence using a bitonic merge? We must first build a single bitonic sequence from the given sequence. A sequence of length 2 is a bitonic sequence. A bitonic sequence of length 4 can be built by sorting the first two elements using ⊕BM[2] and the next two using ⊖BM[2]. This process can be repeated to generate larger bitonic sequences.
  • 35. A bitonic merging network for n = 16. The input wires are numbered 0, 1, …, n - 1, and the binary representation of these numbers is shown. Each column of comparators is drawn separately; the entire figure represents a ⊕BM[16] bitonic merging network. The network takes a bitonic sequence and outputs it in sorted order.
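The construction described on slide 34 (sort one half ascending, the other descending, then merge the resulting bitonic sequence) can be sketched recursively. This is an illustrative sequential version, not the network itself: the `ascending` flag stands in for the choice between an increasing and a decreasing merging network, the function names are hypothetical, and the length is assumed to be a power of two.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bitonic merge of a[lo, lo + len). With ascending == true the output is
// increasing, with ascending == false it is decreasing.
void bitonic_merge(std::vector<int>& a, std::size_t lo, std::size_t len,
                   bool ascending) {
    if (len < 2) return;
    const std::size_t half = len / 2;
    for (std::size_t i = lo; i < lo + half; ++i) {
        const bool out_of_order =
            ascending ? (a[i + half] < a[i]) : (a[i] < a[i + half]);
        if (out_of_order) std::swap(a[i], a[i + half]);
    }
    bitonic_merge(a, lo, half, ascending);
    bitonic_merge(a, lo + half, half, ascending);
}

// Bitonic sort of a[lo, lo + len); len is assumed to be a power of two.
void bitonic_sort(std::vector<int>& a, std::size_t lo, std::size_t len,
                  bool ascending) {
    if (len < 2) return;
    const std::size_t half = len / 2;
    bitonic_sort(a, lo, half, true);           // first half increasing
    bitonic_sort(a, lo + half, half, false);   // second half decreasing
    bitonic_merge(a, lo, len, ascending);      // the whole range is now bitonic
}
```

A call such as `bitonic_sort(v, 0, v.size(), true)` sorts `v` in increasing order when `v.size()` is a power of two.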
  • 37. The comparator network that transforms an input sequence of 16 unordered numbers into a bitonic sequence.
  • 39. A schematic representation of a network that converts an input sequence into a bitonic sequence. In this example, ⊕BM[k] and ⊖BM[k] denote bitonic merging networks of input size k that use ⊕ and ⊖ comparators, respectively. The last merging network (⊕BM[16]) sorts the input. In this example, n = 16.
  • 40. The six steps of bitonic sort on a hypercube of dimension 3 (L = the processor keeps the low element, H = it keeps the high element):
        Step No.   Processor No.
                   000  001  010  011  100  101  110  111
        1           L    H    H    L    L    H    H    L
        2           L    L    H    H    H    H    L    L
        3           L    H    L    H    H    L    H    L
        4           L    L    L    L    H    H    H    H
        5           L    L    H    H    L    L    H    H
        6           L    H    L    H    L    H    L    H
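The L/H pattern in this table can be generated from the processor labels. The sketch below assumes the usual hypercube formulation, which the slide does not state explicitly: in phase i (i = 1, …, d) the processors exchange along dimensions j = i-1 down to 0, and a processor keeps the high element when bit i of its label differs from bit j. The function name `bitonic_schedule` is hypothetical.

```cpp
#include <string>
#include <vector>

// Returns one row of 'L'/'H' characters per step for a hypercube of
// dimension d; character number `id` in a row is the role of processor id.
std::vector<std::string> bitonic_schedule(int d) {
    std::vector<std::string> steps;
    const unsigned p = 1u << d;
    for (int i = 1; i <= d; ++i) {            // phase i
        for (int j = i - 1; j >= 0; --j) {    // exchange along dimension j
            std::string row;
            for (unsigned id = 0; id < p; ++id) {
                const bool bit_i = ((id >> i) & 1u) != 0;
                const bool bit_j = ((id >> j) & 1u) != 0;
                row += (bit_i != bit_j) ? 'H' : 'L';
            }
            steps.push_back(row);
        }
    }
    return steps;
}
```

For d = 3 the function returns six rows, one per step, which can be checked against the table.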
  • 42. The depth of the network is Θ(log² n). Each stage of the network contains n/2 comparators. A serial implementation of the network would have complexity Θ(n log² n).
  • 49. Bitonic sort (for N = P)
        P0   P1   P2   P3   P4   P5   P6   P7
        000  001  010  011  100  101  110  111
        K    G    J    M    C    A    N    F
        L    H    H    L    L    H    H    L
        G    K    M    J    A    C    N    F
        L    L    H    H    H    H    L    L
        G    J    M    K    N    F    A    C
        L    H    L    H    H    L    H    L
        G    J    K    M    N    F    C    A
        L    L    L    L    H    H    H    H
        G    F    C    A    N    J    K    M
        L    L    H    H    L    L    H    H
        C    A    G    F    K    J    N    M
        L    H    L    H    L    H    L    H
        A    C    F    G    J    K    M    N
  • 50. In general, with n = 2^k, there are k phases, consisting of 1, 2, 3, …, k steps respectively. Hence the total number of steps is: $T_{par}^{bitonic} = \sum_{i=1}^{\log n} i = \frac{\log n\,(\log n + 1)}{2} = O(\log^2 n)$
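As a quick check of this count, the sum can be evaluated for the eight-processor case used elsewhere in the slides (n = 8, so k = log n = 3):

```latex
\[
T_{par}^{bitonic}\Big|_{n=8}
  = \sum_{i=1}^{\log 8} i
  = 1 + 2 + 3
  = \frac{3\,(3+1)}{2}
  = 6
\]
```

This agrees with the six compare-exchange steps listed for the hypercube of dimension 3.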
  • 57. Bitonic sort (for N >> P)
        P0         P1         P2         P3         P4         P5         P6         P7
        000        001        010        011        100        101        110        111
        2 7 4      13 6 9     4 18 5     12 1 7     6 3 14     11 6 8     4 10 5     2 15 17
        Local sort (ascending):
        2 4 7      6 9 13     4 5 18     1 7 12     3 6 14     6 8 11     4 5 10     2 15 17
        L          H          H          L          L          H          H          L
        2 4 6      7 9 13     7 12 18    1 4 5      3 6 6      8 11 14    10 15 17   2 4 5
        L          L          H          H          H          H          L          L
        2 4 6      1 4 5      7 12 18    7 9 13     10 15 17   8 11 14    3 6 6      2 4 5
        L          H          L          H          H          L          H          L
        1 2 4      4 5 6      7 7 9      12 13 18   14 15 17   8 10 11    5 6 6      2 3 4
        L          L          L          L          H          H          H          H
        1 2 4      4 5 6      5 6 6      2 3 4      14 15 17   8 10 11    7 7 9      12 13 18
        L          L          H          H          L          L          H          H
        1 2 4      2 3 4      5 6 6      4 5 6      7 7 9      8 10 11    14 15 17   12 13 18
        L          H          L          H          L          H          L          H
        1 2 2      3 4 4      4 5 5      6 6 6      7 7 8      9 10 11    12 13 14   15 17 18
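The N >> P case can be simulated sequentially by combining a local sort with compare-split along the hypercube dimensions. This is a sketch under assumptions not spelled out on the slides: the L/H role is derived from the processor label (a processor keeps the high half when bit i of its label differs from bit j), every block has the same size, and the number of blocks is 2^d. All names are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

using Block = std::vector<int>;

// Merge two sorted blocks; `low` keeps the smaller half, `high` the larger.
void compare_split(Block& low, Block& high) {
    Block merged;
    merged.reserve(low.size() + high.size());
    std::merge(low.begin(), low.end(), high.begin(), high.end(),
               std::back_inserter(merged));
    const auto mid = merged.begin() + static_cast<std::ptrdiff_t>(low.size());
    low.assign(merged.begin(), mid);
    high.assign(mid, merged.end());
}

// Sequential simulation of bitonic sort on a hypercube of dimension d.
// blocks[id] plays the role of processor id; blocks.size() must be 2^d.
void hypercube_bitonic_sort(std::vector<Block>& blocks, int d) {
    for (Block& b : blocks) std::sort(b.begin(), b.end());  // local sort
    const std::size_t p = blocks.size();
    for (int i = 1; i <= d; ++i) {              // phase i
        for (int j = i - 1; j >= 0; --j) {      // exchange along dimension j
            for (std::size_t id = 0; id < p; ++id) {
                const std::size_t partner = id ^ (std::size_t{1} << j);
                if (id > partner) continue;     // handle each pair once
                // id has bit j == 0, so it keeps the low half exactly when
                // bit i of its label is also 0.
                const bool id_keeps_low =
                    ((id >> i) & 1u) == ((id >> j) & 1u);
                if (id_keeps_low) compare_split(blocks[id], blocks[partner]);
                else              compare_split(blocks[partner], blocks[id]);
            }
        }
    }
}
```

With one element per block the same routine covers the N = P case, since compare-split on blocks of size one is a compare-exchange.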
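  • 58. Complexity (for N >> P)
        $T_{par}^{bitonic} = \text{Local Sort} + \text{Parallel Bitonic Merge}$
        $= \frac{N}{P}\log\frac{N}{P} + 2\,\frac{N}{P}\,(1 + 2 + 3 + \dots + \log P)$
        $= \frac{N}{P}\left\{\log\frac{N}{P} + 2\left(\frac{\log P\,(1 + \log P)}{2}\right)\right\}$
        $= \frac{N}{P}\left(\log N - \log P + \log P + \log^2 P\right)$
        $T_{par}^{bitonic} = \frac{N}{P}\left(\log N + \log^2 P\right)$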
  • 59. Computational time complexity using P = n processors: • Odd-even transposition sort: O(n) • Bitonic mergesort: O(log² n) (** BEST! **)
  • 60. Books: Parallel Programming in C with MPI and OpenMP, Michael J. Quinn, McGraw-Hill Higher Education, 2003. Introduction to Parallel Processing: Algorithms and Architectures, Behrooz Parhami, Springer. The Art of Concurrency: A Thread Monkey's Guide to Writing Parallel Applications, Clay Breshears, O'Reilly Media. Links: http://www-users.cs.umn.edu/~karypis/parbook/Lectures/AG/chap9_slides.pdf ; A Library of Parallel Algorithms, www.cs.cmu.edu/~scandal/nesl/algorithms.html. Image source: http://www-users.cs.umn.edu/~karypis/parbook/Lectures/AG/chap9_slides.pdf