NON-COMPARISON BASED SORTING ALGORITHM (BUCKET SORT)
Presented By :- Krupali Mistry
INDEX
• Why Non-Comparison Based Sorting Algorithms?
• Introduction to bucket sort
• Working of bucket sort
• Algorithm
• Complexity
• Literature review
• Applications
WHY NON-COMPARISON BASED SORTING ALGORITHMS?
• Selection sort, bubble sort, insertion sort: O(n^2).
• Heap sort and merge sort: O(n log n).
• Quick sort: O(n log n) on average, O(n^2) in the worst case.
• Can we do better than O(n log n) using comparison-based sorting algorithms?
  No.
• It can be proven that any comparison-based sorting algorithm needs on the order of n log n comparisons in the worst case, i.e. its running time is Ω(n log n).
NON-COMPARISON BASED SORTING ALGORITHMS
• Bucket sort or Bin Sort
• Counting Sort
• Radix Sort
INTRODUCTION
• Bucket sort works best when the input is uniformly distributed over a range, for example a large set of floating-point numbers uniformly distributed between 0.0 and 1.0.
• It is used in conjunction with another sorting algorithm, which sorts the contents of each bucket.
• Bucket sort can be applied to any array of non-negative integers, but the number of buckets then depends on the maximum integer value.
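WORKING OF BUCKET SORT
Input array A (positions 1 to 10): .78 .17 .39 .26 .72 .94 .21 .12 .23 .68
Bucket array B (indices 0 to 9); / marks an empty bucket.
Distribute into buckets (contents shown in arrival order):
B[0]: /
B[1]: .17 .12
B[2]: .26 .21 .23
B[3]: .39
B[4]: /
B[5]: /
B[6]: .68
B[7]: .78 .72
B[8]: /
B[9]: .94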
Sort each bucket internally using insertion sort, or recursively apply the bucket sort algorithm.
Buckets after sorting:
B[1]: .12 .17
B[2]: .21 .23 .26
B[3]: .39
B[6]: .68
B[7]: .72 .78
B[9]: .94
(B[0], B[4], B[5] and B[8] are empty.)
Concatenate the buckets from 0 to n - 1 together, in order:
.12 .17 .21 .23 .26 .39 .68 .72 .78 .94
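BUCKET SORT ON AN ARRAY OF NON-NEGATIVE INTEGERS
Input:                  6 9 7 1 5 4 2 8 3
Buckets (value ranges): 1-3 | 4-6 | 7-9
After distribution:     1 2 3 | 6 5 4 | 9 7 8
After sorting buckets:  1 2 3 | 4 5 6 | 7 8 9
Concatenated result:    1 2 3 4 5 6 7 8 9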
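The integer example depends on mapping each value to a bucket that covers a range of values. A minimal sketch of one such mapping follows. The function name `bucket_index` and the equal-width split are assumptions for illustration, not something the slides specify.

```c
#include <stddef.h>

/* Map a non-negative integer x in [0, max_value] to one of k buckets
   that cover equal-width value ranges. Hypothetical helper: the name
   and the equal-width rule are assumptions made for this sketch. */
size_t bucket_index(unsigned x, unsigned max_value, size_t k)
{
    /* Widen before multiplying so that x * k cannot overflow
       for typical inputs. */
    unsigned long long scaled = (unsigned long long)x * k;
    return (size_t)(scaled / ((unsigned long long)max_value + 1));
}
```

With max_value = 9 and k = 3, the values 1 to 3 map to bucket 0, 4 to 6 to bucket 1, and 7 to 9 to bucket 2. These are the ranges used in the example above.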
ALGORITHM
1. n ← length[A]
2. for i ← 1 to n
3.     do insert A[i] into bucket B[⌊n·A[i]⌋]
4. for i ← 0 to n-1
5.     do sort bucket B[i] with insertion sort
6. concatenate buckets B[0], B[1], …, B[n-1] together in order
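The pseudocode can be sketched in C as follows, assuming every input value lies in [0, 1). The names `bucket_sort`, `bucket_of` and `insertion_sort` are illustrative. One deliberate difference from the pseudocode: the buckets are laid out as contiguous slices of a temporary array (found by counting) rather than as linked lists, so the final concatenation is a single copy.

```c
#include <stdlib.h>
#include <string.h>

/* Bucket index floor(n * x), clamped so that rounding cannot return n. */
static size_t bucket_of(double x, size_t n)
{
    size_t b = (size_t)((double)n * x);
    return b < n ? b : n - 1;
}

/* Sort one bucket in place with insertion sort. */
static void insertion_sort(double *v, size_t len)
{
    for (size_t i = 1; i < len; i++) {
        double key = v[i];
        size_t j = i;
        while (j > 0 && v[j - 1] > key) {
            v[j] = v[j - 1];
            j--;
        }
        v[j] = key;
    }
}

/* Bucket sort for n values assumed to lie in [0, 1).
   Returns 0 on success, -1 if memory allocation fails. */
int bucket_sort(double *a, size_t n)
{
    if (n < 2)
        return 0;
    size_t *start = calloc(n + 1, sizeof *start);
    double *tmp = malloc(n * sizeof *tmp);
    if (start == NULL || tmp == NULL) {
        free(start);
        free(tmp);
        return -1;
    }
    /* Count the elements that fall into each bucket. */
    for (size_t i = 0; i < n; i++)
        start[bucket_of(a[i], n)]++;
    /* Inclusive prefix sums: start[b] becomes the end of bucket b. */
    for (size_t b = 1; b < n; b++)
        start[b] += start[b - 1];
    start[n] = n;
    /* Scatter backwards; afterwards start[b] is the first slot of bucket b. */
    for (size_t i = n; i-- > 0; )
        tmp[--start[bucket_of(a[i], n)]] = a[i];
    /* Steps 4-5: sort every bucket. */
    for (size_t b = 0; b < n; b++)
        insertion_sort(tmp + start[b], start[b + 1] - start[b]);
    /* Step 6: the buckets are already adjacent, so concatenation is a copy. */
    memcpy(a, tmp, n * sizeof *a);
    free(start);
    free(tmp);
    return 0;
}
```

Calling `bucket_sort(a, 10)` on the ten values from the worked example sorts them in place.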
COMPLEXITY
1. n ← length[A]                                   Θ(1)
2. for i ← 1 to n                                  O(n)
3.     do insert A[i] into bucket B[⌊n·A[i]⌋]      Θ(1) each (i.e. O(n) in total)
4. for i ← 0 to n-1                                O(n)
5.     do sort bucket B[i] with insertion sort     O(n_i^2) each (i.e. $\sum_{i=0}^{n-1} O(n_i^2)$ in total)
6. concatenate buckets B[0], B[1], …, B[n-1]       O(n)
where n_i is the size of bucket B[i].
BEST CASE
• In the best case every element falls into a different bucket, so no bucket needs any real sorting and the complexity is O(n+k),
• where n = number of elements and k = number of buckets.
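AVERAGE CASE
• In the average case the running time of bucket sort is
\begin{align*}
T(n) &= \Theta(n) + \sum_{i=0}^{n-1} O(n_i^2) \\
E[T(n)] &= E\left[\Theta(n) + \sum_{i=0}^{n-1} O(n_i^2)\right] \\
&= \Theta(n) + \sum_{i=0}^{n-1} E\left[O(n_i^2)\right] \quad \text{(by linearity of expectation)} \\
&= \Theta(n) + \sum_{i=0}^{n-1} O\left(E[n_i^2]\right)
\end{align*}
• For uniformly distributed input each E[n_i^2] is bounded by a constant, so the sum is O(n).
• So the average complexity is O(n+k), which is O(n) when the number of buckets k equals n.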
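The step from the sum over E[n_i^2] to a linear total can be made explicit. The following is a sketch under two assumptions that the slides only imply: there are n buckets, and each of the n inputs lands in bucket i independently with probability 1/n, so n_i follows a binomial distribution.

```latex
\begin{align*}
E[n_i] &= n \cdot \tfrac{1}{n} = 1, \\
\operatorname{Var}[n_i] &= n \cdot \tfrac{1}{n}\left(1 - \tfrac{1}{n}\right) = 1 - \tfrac{1}{n}, \\
E[n_i^2] &= \operatorname{Var}[n_i] + E[n_i]^2 = 2 - \tfrac{1}{n}, \\
E[T(n)] &= \Theta(n) + \sum_{i=0}^{n-1} O\!\left(2 - \tfrac{1}{n}\right)
         = \Theta(n) + O(n) = \Theta(n).
\end{align*}
```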
WORST CASE
• In the worst case the complexity is O(n^2): every element falls into the same bucket, so insertion sort runs on all n elements.
• Merge sort or heap sort could be used inside the buckets instead, which lowers the worst case to O(n log n).
• Insertion sort is used because the buckets are expected to be small, and insertion sort is much faster on small arrays.
LITERATURE REVIEW
Title | Year of publication | Published in | Method
Analysis of Non-Comparison Based Sorting Algorithms | December 2013 | International Journal of Emerging Research in Management & Technology | Solving buckets with the help of insertion sort
GPU bucket sort algorithm with applications to nearest-neighbour search | - | AGH University of Science and Technology | Bucket sort
Bucket Sort on the GPU | May 2012 | Report | Recursive and non-recursive bucket sort
Fast median filtering based on bucket sort | 2009 | Information Technologies and Control | Bucket sort
APPLICATIONS
• Nearest-neighbour search in particle-based simulation.
• Median filtering of images.
FAST MEDIAN FILTERING BASED ON BUCKET SORT
Introduction
• The paper compares sequential and parallel implementations of its bucket-sort-based algorithms with Huang, Yang and Tang's algorithm.
• Image processing algorithms are divided into three hierarchical groups:
  - low-level (primary) image processing, such as noise reduction and image enhancement;
  - intermediate-level image processing, used for image segmentation, skeleton extraction, edge detection, etc.;
  - high-level image processing, which covers the various algorithms for detecting and recognising objects.
• Primary image processing can be done in either the spatial domain or the frequency domain.
• Depending on the chosen domain, the filter is applied as:
  -> the filter function (its Fourier transform), in the frequency domain;
  -> the mask (kernel, window) that represents the filtering function, in the spatial domain.
MEDIAN FILTER
• The median filter is often used for primary image processing.
• It does not blur edges.
• The median filter is non-linear: if two images are summed and the result is median filtered, the output generally differs from the sum of the two images filtered separately.
median[A(x) + B(x)] ≠ median[A(x)] + median[B(x)]
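A small numeric check makes the non-linearity concrete. The sketch below uses three-element signals, and the helper names `median3` and `median_is_additive` are made up for the illustration. For A = {0, 1, 5} and B = {5, 1, 0} the element-wise sum is {5, 2, 5}, whose median is 5, while median(A) + median(B) = 1 + 1 = 2.

```c
#include <stdlib.h>

/* Three-way comparison of two ints, for use with qsort. */
static int cmp_int(const void *p, const void *q)
{
    int x = *(const int *)p, y = *(const int *)q;
    return (x > y) - (x < y);
}

/* Median of three values, computed on a sorted copy. */
int median3(const int v[3])
{
    int s[3] = { v[0], v[1], v[2] };
    qsort(s, 3, sizeof s[0], cmp_int);
    return s[1];
}

/* Returns 1 if median(a + b) == median(a) + median(b) for these inputs,
   0 otherwise. */
int median_is_additive(const int a[3], const int b[3])
{
    int sum[3] = { a[0] + b[0], a[1] + b[1], a[2] + b[2] };
    return median3(sum) == median3(a) + median3(b);
}
```

The equality can still hold for particular inputs, for example when both signals are sorted the same way. The inequality on the slide is a statement about the general case.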
ALGORITHM OF HUANG, YANG AND TANG
• The algorithm proposed by Thomas Huang, George Yang and Gregory Tang is significantly faster than "brute force" algorithms.
• It keeps a partial histogram of the pixels under the mask; when the mask moves, the pixel values that leave the mask are removed from the histogram and the new pixel values are added.
[Figure: movement of a 3x3 mask. Grey hatched pixels must be removed from the partial histogram; black hatched pixels must be added to it.]
• For a 3x3 mask the median is the 5th of the 9 sorted values.
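The sliding partial histogram can be sketched as below for one image row and a 3x3 mask. This is a simplified illustration, not the published algorithm in full: it rescans the histogram after every move rather than tracking a running median, and it assumes an 8-bit image stored row by row in a flat array. The names `median_row_3x3` and `hist_rank` are hypothetical.

```c
/* Smallest grey level at which the cumulative count reaches rank. */
static int hist_rank(const int hist[256], int rank)
{
    int sum = 0;
    for (int k = 0; k < 256; k++) {
        sum += hist[k];
        if (sum >= rank)
            return k;
    }
    return 255;
}

/* Median-filter the interior pixels of row y with a 3x3 mask.
   img holds 8-bit pixels row by row, `width` pixels per row.
   The caller must make sure rows y-1 and y+1 exist.
   Results go to out[1] .. out[width-2]. */
void median_row_3x3(const unsigned char *img, int width, int y,
                    unsigned char *out)
{
    int hist[256] = {0};
    if (width < 3)
        return;
    /* Build the full histogram once, for the window centred on column 1. */
    for (int dy = -1; dy <= 1; dy++)
        for (int x = 0; x < 3; x++)
            hist[img[(y + dy) * width + x]]++;
    out[1] = (unsigned char)hist_rank(hist, 5);

    /* Slide right: drop the column that leaves, add the one that enters. */
    for (int x = 2; x < width - 1; x++) {
        for (int dy = -1; dy <= 1; dy++) {
            hist[img[(y + dy) * width + (x - 2)]]--;
            hist[img[(y + dy) * width + (x + 1)]]++;
        }
        out[x] = (unsigned char)hist_rank(hist, 5);
    }
}
```

Each move touches only 6 histogram entries (3 removed, 3 added) instead of rebuilding all 9.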
MEDIAN FILTERING BASED ON BUCKET SORT
• The complexity of bucket sort here is O(n+m).
• n is the range of the numbers to be sorted (n = 256 for 8-bit black-and-white images).
• m is the number of elements in the array, in our case the number of pixels under the mask (9, 25, 49, 81, ..., (2r+1)^2, where r is a positive integer).
ALGORITHM:
int colour[256] = {0};              // histogram: colour[v] counts the pixels with value v
int i, j, k, sum, median = 0;

for (i = 0; i < 3; i++)
    for (j = 0; j < 3; j++)
        colour[pixel[i][j]]++;      // count each value of the 3x3 array pixel

sum = 0;
for (k = 0; k < 256; k++)
{
    median = k;
    if (colour[k] != 0)
    {
        sum = sum + colour[k];      // running count of pixels with value <= k
        if (sum >= 5)               // the 5th smallest of the 9 values is the median
            break;
    }
}
pixel[1][1] = median;               // write the median into the centre pixel
FAST MEDIAN FILTERING BASED ON BUCKET SORT
• The essence of this algorithm is to modify the partial-histogram method proposed by Thomas Huang, George Yang and Gregory Tang in order to speed up processing.
• Processing is accelerated by introducing a new variable, index, which stores the index of the lowest element of colour with a non-zero value.
• The search for the median then starts from the position saved in index instead of from the beginning of the array.
ALGORITHM:
int colour[256] = {0};              // histogram: colour[v] counts the pixels with value v
int i, j, k, sum, median = 0;
int index = 255;                    // lowest pixel value seen so far

for (i = 0; i < 3; i++)
    for (j = 0; j < 3; j++)
    {
        colour[pixel[i][j]]++;      // count how often each value of the 3x3 array pixel occurs
        if (pixel[i][j] < index)
            index = pixel[i][j];    // remember the index of the darkest colour
    }

sum = 0;
for (k = index; k < 256; k++)       // the scan of colour starts from position index
{
    median = k;
    if (colour[k] != 0)
    {
        sum = sum + colour[k];      // running count of pixels with value <= k
        if (sum >= 5)               // the 5th smallest of the 9 values is the median
            break;
    }
}
pixel[1][1] = median;               // write the median into the centre pixel
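The snippet hard-codes a 3x3 mask and the rank 5. As a sketch of how the same idea might extend to a (2r+1) x (2r+1) mask, the function below computes the rank as (m+1)/2 for m pixels under the mask. The name `window_median`, the flat row-major image layout and the parameter list are assumptions for illustration. Unlike the snippet it returns the median instead of writing it back into the image.

```c
/* Median of the (2r+1) x (2r+1) window centred on (cx, cy).
   img holds 8-bit pixels row by row, `width` pixels per row.
   The caller must make sure the whole window lies inside the image. */
int window_median(const unsigned char *img, int width, int cx, int cy, int r)
{
    int colour[256] = {0};      /* histogram of the values under the mask */
    int index = 255;            /* lowest value seen under the mask */
    int m = (2 * r + 1) * (2 * r + 1);
    int rank = (m + 1) / 2;     /* e.g. 5 for 3x3, 13 for 5x5 */

    for (int dy = -r; dy <= r; dy++)
        for (int dx = -r; dx <= r; dx++) {
            int v = img[(cy + dy) * width + (cx + dx)];
            colour[v]++;
            if (v < index)
                index = v;
        }

    /* Start the scan at the darkest value present, not at 0. */
    int sum = 0;
    for (int k = index; k < 256; k++) {
        sum += colour[k];
        if (sum >= rank)
            return k;
    }
    return 255;                 /* not reached when the window is non-empty */
}
```

Starting at index skips the empty low end of the histogram, which matters most for bright image regions.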
[Figure: 8-bit BMP image before and after median filtering with a 7x7 mask.]
PARALLEL ALGORITHMS FOR ANISOTROPIC MEDIAN FILTERING
o The image is divided by rows and distributed over the parallel processors.
o Every processor receives the rows of the image that it must process, plus the neighbouring rows it needs in order to process the pixels in the first and last rows of its part of the image.
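The row distribution can be sketched as a small helper that tells each process which rows it filters and which rows it must receive. The slides do not say how the rows are split. This sketch assumes contiguous blocks of nearly equal size and a mask of radius r, so r extra rows are needed on each side. `RowBlock` and `row_block` are hypothetical names.

```c
typedef struct {
    int first_row, last_row;    /* rows this process filters (inclusive) */
    int recv_first, recv_last;  /* rows it must receive, including the
                                   extra border rows (inclusive) */
} RowBlock;

/* Block distribution of `height` image rows over `nprocs` processes.
   `rank` is the process number in [0, nprocs); r is the mask radius. */
RowBlock row_block(int height, int nprocs, int rank, int r)
{
    RowBlock b;
    int base = height / nprocs;     /* rows every process gets */
    int extra = height % nprocs;    /* the first `extra` processes get one more */

    b.first_row = rank * base + (rank < extra ? rank : extra);
    b.last_row = b.first_row + base + (rank < extra ? 1 : 0) - 1;

    /* Add r border rows on each side, clipped to the image. */
    b.recv_first = b.first_row - r < 0 ? 0 : b.first_row - r;
    b.recv_last = b.last_row + r > height - 1 ? height - 1 : b.last_row + r;
    return b;
}
```

For a 10-row image, 3 processes and a 3x3 mask (r = 1), process 1 filters rows 4 to 6 and receives rows 3 to 7.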
CONCLUSION
Chart legend:
Blue curve: Huang, Yang and Tang algorithm
Pink curve: Median filtering based on bucket sort
Yellow curve: Fast median filtering based on bucket sort
Best results: Fast median filtering based on bucket sort
• Image processing becomes faster when a cluster of parallel processors is used.
• For all three algorithms, the acceleration increases with the number of processes and with the size of the mask.
• The highest acceleration is reached with the fast median filtering algorithm based on bucket sort.