International Journal of Computer Mathematics
Vol. 00, No. 00, Month 200x, 1–12




                                   RESEARCH ARTICLE

An in-place heapsort algorithm requiring
n log n + n log∗ n − 0.546871n comparisons

Md. Mahbubul Hasan (a)†, Md. Shahjalal (b)‡ and M. Kaykobad (a)§

(a) CSE Department, Bangladesh University of Engineering and Technology
(b) Therap (BD) Ltd., Banani, Dhaka 1213

(Received 00 Month 200x; in final form 00 Month 200x)


      In this paper we present an interesting modification to the heap structure which yields a
      better comparison-based in-place heapsort algorithm. The number of comparisons is shown
      to be bounded by n log n + n log∗ n − 0.546871n, which is 0.853129n + n log∗ n away from
      the optimal theoretical bound of n log n − 1.44n.

      Keywords: Algorithms, Data structure, Heapsort.



1.   Introduction

The heapsort algorithm is one of the best sorting algorithms and was introduced
by Williams in 1964. It achieves both worst-case and average-case time complexity
of O(n log n). Lower bound theory asserts that any comparison-based sorting
algorithm will require log(n!) comparisons, which is approximately n log n − 1.44n.
   There are many promising variants of the heapsort algorithm, such as MDR-Heapsort [10, 12],
Generalized Heapsort [7, 11], Weak Heapsort [4], Bottom Up Heapsort [15],
Ultimate Heapsort [8] and Heapsort in 3-heaps [9].
   Carlsson’s variant of heapsort [1] does not need extra space and requires
n log n + n log log n + 1.82n comparisons. Dutton showed that sorting using a Weak
Heap requires comparisons bounded by n log n + 0.086013n, but it uses n extra
bits [4]. In [5], Edelkamp and Stiegeler modified Dutton’s weak heap and designed
several heapsort algorithms requiring n log n − 0.9n comparisons but using an
array of size n to store indices. Xiao Dong Wang and Ying Jie Wu [14] gave another
algorithm bounding the number of comparisons by n log n − 0.788928n, but using an
additional array of size n to store indices. The Bottom Up Heapsort [15] requires
1.5n log n comparisons in the worst case. The Ultimate Heapsort [8] is another
variant of the heapsort algorithm which requires fewer than n log n + 32n + O(log n)
comparisons to sort n elements in-place. In [6], Gonnet and Munro gave an
algorithm to construct a heap of n elements which takes 1.625n comparisons in
the worst case. In the same paper they gave another algorithm, subsequently
improved by Carlsson [2], for extraction of the maximum element requiring around
log n + log∗ n + O(1) comparisons, where log∗ denotes the iterated logarithm. The
algorithm presented by McDiarmid and Reed [10] requires 1.5212n comparisons in
the average case for heap construction. But both of these algorithms require extra

† Email: shanto86@yahoo.com
‡ Email: shahjalal@msn.com
§ Email: kaykobad@cse.buet.ac.bd


ISSN: 0020-7160 print/ISSN 1029-0265 online
© 200x Taylor & Francis
DOI: 10.1080/0020716YYxxxxxxxx
http://www.informaworld.com

space. Carlsson and Chen gave two algorithms for heap construction in [3]. The first
achieves 1.625n + O(log² n) comparisons in the worst case and 1.5642n + O(log² n)
on average but requires extra space. The second is an in-place algorithm which
requires 1.528n + O(log² n) comparisons on average and 2n comparisons in the worst
case.
   In this paper we introduce a new data structure that reduces the number of comparisons
by pairing the elements to be sorted, and that does not use any extra space. Carlsson’s
variant of heapsort has been applied on this data structure. This has restricted
the number of comparisons to n log n + n log log n + 0.40821n. This data structure
can be applied to other heap-related algorithms as well. For example, in this paper
we have shown how the memory requirement of Dutton’s Weak Heapsort algorithm
can be reduced using this data structure. Moreover, even the algorithms of Gonnet
and Munro [6] or Carlsson [2], when applied to this data structure, yield better
performance than the corresponding original algorithms. It restricts the number
of comparisons to n log n + n log∗ n − 0.546871n. While Ultimate Heapsort [8] is
asymptotically faster than the proposed algorithm, ours will outperform it since
the linear term in Ultimate Heapsort (32n) is more than n log∗ n for all practical
purposes (even for large values of n). A preliminary version of this paper appeared
in WALCOM 2007 [13].
   In Section 2 we present the modified data structure for heapsort. Section 3 contains
the analysis, followed by a theoretical comparison between Carlsson’s variant
and the proposed algorithm in Section 4. In Section 5 we show how Gonnet and Munro’s
or Carlsson’s deletion algorithm yields better performance on the proposed data
structure, followed by a generalization of the data structure in Section 6. In
Section 7 we apply our pairing trick to the Weak Heap. The practical performance of the
proposed algorithm is presented in Section 8.


2.    A modified data structure for heapsort

Independent pairwise comparisons reduce uncertainty the most. This fact encourages
the creation of heaps with a group of ordered elements in each node. In the
following we present our data structure with two elements in each node. However,
in a later section we analyze the performance of a more generalized data structure to
find the best number of elements to store in a node.


2.1    Preprocessing
For n given elements we will make ⌊n/2⌋ pairs. If there is an odd number of elements,
a dummy element can be suitably added.
  For simplicity of our discussion, from now on we will call the first element of a
pair the Primary element, and the second one the Secondary element. For
each pair we compare the elements and arrange them in non-increasing order. That
means that after rearranging, the Primary element will be greater than or equal to the
Secondary element of the pair. Figure 1 shows some possible pairings.

Figure 1. In a) the Primary element is greater than the Secondary element, so we do not swap; in b) the
Primary element is smaller than the Secondary element, so we swap the elements. (Diagram omitted:
pair (10, 3) stays as is; pair (2, 8) becomes (8, 2).)


2.2    Heap Construction Phase
Now we construct a (max)heap according to the Primary elements of the pairs.
The heap is constructed in a bottom-up fashion. That is, we start from the last pair
of the heap and try to push it down. To push a pair, we find the path of elder
sons and then do a binary search on that path. The implementation is similar to
that of Carlsson’s variant [1].


Figure 2 shows a sample heap.


Figure 2. A (max)heap constructed based on the Primary elements of the pairs. (Diagram omitted:
the root pair is (12, 3); its children are (9, 9) and (10, 9); the leaf pairs are (8, 2), (7, 6) and (8, 4).)
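A minimal sketch of this push-down is given below, under the simplifying assumption that
n is even (a dummy has been added) so that every pair is full; the extra bookkeeping for a
partial last pair, needed during the sorting phase, is omitted. The layout matches the earlier
sketch and all names are illustrative.

/* Push the pair at pair-index r down: follow the path of elder sons
 * (one comparison per level), then binary-search the displaced Primary
 * into that path (~ floor(log len) + 1 comparisons), shifting pairs up.
 * Pair i occupies a[2i], a[2i+1] and has children pairs 2i+1 and 2i+2. */
static void push_down(int a[], int npairs, int r)
{
    int path[64], len = 0, i = r;
    while (2 * i + 1 < npairs) {                 /* path of elder sons */
        int c = 2 * i + 1;
        if (c + 1 < npairs && a[2 * (c + 1)] > a[2 * c])
            c++;                                 /* the elder son */
        path[len++] = c;
        i = c;
    }
    int xp = a[2 * r], xs = a[2 * r + 1];
    /* Primaries are non-increasing along the path; count how many of
     * them exceed the displaced Primary xp. */
    int lo = 0, hi = len;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (a[2 * path[mid]] > xp) lo = mid + 1; else hi = mid;
    }
    int dst = r;                                 /* shift those pairs up */
    for (int j = 0; j < lo; j++) {
        a[2 * dst]     = a[2 * path[j]];
        a[2 * dst + 1] = a[2 * path[j] + 1];
        dst = path[j];
    }
    a[2 * dst]     = xp;
    a[2 * dst + 1] = xs;
}

/* Bottom-up construction: start from the last pair, as in the text;
 * pushing a leaf down costs no comparisons. */
static void build_heap(int a[], int npairs)
{
    for (int r = npairs - 1; r >= 0; r--)
        push_down(a, npairs, r);
}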




2.3   Sorting Phase
The construction phase yields a heap where each of the Primary elements is greater
than or equal to the Primary elements of its children pairs. As the Primary elements
are always greater than or equal to the corresponding Secondary elements, the
Primary element of the root will be the largest element of the heap.
  In each iteration of the sorting phase we extract the Primary element of the root
and adjust the heap, decreasing the number of elements in the heap by one.
  Suppose 1 is the root and 2 is the last pair (with a possibly empty Secondary
element). Let 1P and 1S be the Primary and Secondary elements of the root, and let 2P
and 2S be defined similarly. If 1 and 2 are the same pair, then we swap the Primary and
Secondary elements (1P and 1S). If they are not the same pair, then we remove the
Primary element of the root (1P), which is the largest element in the heap, and
place it in temp (a temporary variable to store an element). Then there can be
four possible cases:
Case 1. 2S is empty (if there is an odd number of elements in the heap)
        and 1S ≥ 2P: we place 1S at 1P, 2P at 1S and temp at 2P. (Figure 3)
Case 2. 2S is empty (if there is an odd number of elements in the heap)
        and 1S < 2P: we place 2P at 1P, 1S remains in the same place and temp
        at 2P. (Figure 4)
Case 3. 2S is not empty (if there is an even number of elements in the
        heap) and 1S ≥ 2P: we place 1S at 1P, 2P at 1S, 2S at 2P and temp at
        2S. (Figure 5)
Case 4. 2S is not empty (if there is an even number of elements in the
        heap) and 1S < 2P: we place 2P at 1P, 1S remains in the same place,
        2S at 2P and temp at 2S. (Figure 6)

Figure 3. Case 1. 2S is empty and 1S ≥ 2P. So a) 1P → temp, b) 1S → 1P, c) 2P → 1S and d)
temp → 2P. (Diagram omitted.)




                                   a

                                               temp


                       8 3                                         5      3
                                       c
                           b


             6 1                   5                      6 1                 8
    Figure 4. Case 2. 2S is empty and 1S < 2P . So a) 1P → temp, b) 2P → 1P and c) temp → 2P .




Figure 5. Case 3. 2S is not empty and 1S ≥ 2P. So a) 1P → temp, b) 1S → 1P, c) 2P → 1S, d) 2S → 2P
and e) temp → 2S. (Diagram omitted.)





  All of these cases require a single comparison. After adjusting the order of the
pair, we fix the heap by pushing down the root node in the same way as in the heap
construction phase; a code sketch of one full extraction step is given after Figure 6.
We also remove the last element and mark it empty.

Figure 6. Case 4. 2S is not empty and 1S < 2P. So a) 1P → temp, b) 2P → 1P, c) 2S → 2P and d)
temp → 2S. (Diagram omitted.)
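The following sketch combines the four cases of one extraction step, assuming the flat layout
of the earlier sketches, with the heap occupying a[0..k − 1] and the extracted maximum parked
in the slot that leaves the heap, so the sorted output accumulates at the tail as in ordinary
in-place heapsort. Again the names are ours.

/* One extraction step (Cases 1-4).  Exactly one comparison selects the
 * case.  Restoring the heap afterwards uses the push-down of the
 * construction sketch (with the partial-pair handling added). */
static void extract_step(int a[], int k)
{
    if (k < 2) return;                       /* nothing left to adjust */
    if (k == 2) {                            /* root is also the last pair */
        int t = a[0]; a[0] = a[1]; a[1] = t; /* swap 1P and 1S */
        return;
    }
    int temp = a[0];                         /* 1P: the maximum */
    if (k % 2) {                             /* 2S empty: Cases 1 and 2 */
        if (a[1] >= a[k - 1]) {              /* Case 1: 1S >= 2P */
            a[0] = a[1];                     /* 1S -> 1P */
            a[1] = a[k - 1];                 /* 2P -> 1S */
        } else {                             /* Case 2: 2P -> 1P */
            a[0] = a[k - 1];
        }
    } else {                                 /* 2S present: Cases 3 and 4 */
        if (a[1] >= a[k - 2]) {              /* Case 3: 1S >= 2P */
            a[0] = a[1];                     /* 1S -> 1P */
            a[1] = a[k - 2];                 /* 2P -> 1S */
        } else {                             /* Case 4: 2P -> 1P */
            a[0] = a[k - 2];
        }
        a[k - 2] = a[k - 1];                 /* 2S -> 2P */
    }
    a[k - 1] = temp;                         /* temp parked at the end */
    /* The heap now has k - 1 elements; push the root pair down to fix it. */
}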




3.    Analysis

For simplicity of analysis, let us assume that the heap in our new data structure is full.
That is,

\[
\frac{n}{2} \;=\; \sum_{i=0}^{h} 2^i \;=\; 2^{h+1} - 1,
\qquad\text{so}\qquad
h = \log\Bigl(\frac{n}{2} + 1\Bigr) - 1 .
\]

  However, had it not been a full heap, the formula would have been
h = ⌈log(n/2 + 1)⌉ − 1.




3.1    Preprocessing
We require ⌊n/2⌋ comparisons to pair the n elements, with a possible leftover.




3.2    Heap Construction Phase
At every level i (0 ≤ i < h) we have 2^i pairs. To find the path of elder sons we
need h − i comparisons, and to insert the pair into the path we need a binary
search on a path of length h − i. So the number of comparisons required for heap
construction is

\[
\begin{aligned}
\sum_{i=0}^{h-1} 2^i \bigl((h-i) + \lfloor \log (h-i) \rfloor + 1\bigr)
&= \sum_{i=1}^{h} 2^{h-i} \bigl(i + \lfloor \log i \rfloor + 1\bigr) \\
&= \sum_{i=1}^{h} 2^{h-i}\, i + \sum_{i=1}^{h} 2^{h-i} \lfloor \log i \rfloor + \sum_{i=1}^{h} 2^{h-i} \\
&= \bigl(2^{h+1} - 2 - h\bigr) + 2^h \sum_{i=1}^{h} \frac{\lfloor \log i \rfloor}{2^i} + \bigl(2^h - 1\bigr) \\
&\le 2^{h+1} - 2 - h + 2^h \cdot 0.632843 + 2^h - 1 \\
&= 0.90821\,n - \log n + o(1)
\end{aligned}
\]



    Here,


\[
\begin{aligned}
\sum_{i=1}^{h} \frac{\lfloor \log i \rfloor}{2^i}
&< \sum_{i=1}^{\infty} \frac{\lfloor \log i \rfloor}{2^i}
 = \sum_{i=1}^{50} \frac{\lfloor \log i \rfloor}{2^i} + \sum_{i=51}^{\infty} \frac{\lfloor \log i \rfloor}{2^i} \\
&< \sum_{i=1}^{50} \frac{\lfloor \log i \rfloor}{2^i} + \frac{\lfloor \log 51 \rfloor}{2^{51}} + \int_{51}^{\infty} \frac{\log x}{2^x}\,dx \\
&< 0.632843
\end{aligned}
\]



  The integral and the summand are calculated using www.wolframalpha.com.
  Note that if there are x elements then we require ⌊log x⌋ + 1 comparisons for a
binary search in the worst case.
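The constant 0.632843 can also be checked with a few lines of C; this sketch (ours, not
from the paper) sums the series directly.

#include <stdio.h>

/* flog2(x) = floor(log2 x), computed with integer shifts. */
static int flog2(int x) { int r = -1; while (x > 0) { x >>= 1; r++; } return r; }

/* Sum floor(log2 i) / 2^i; terms beyond i = 60 are below 1e-17. */
int main(void)
{
    double s = 0.0, p = 1.0;
    for (int i = 1; i <= 60; i++) {
        p *= 0.5;                              /* p = 2^-i */
        s += flog2(i) * p;
    }
    printf("%.6f\n", s);                       /* prints 0.632843 */
    return 0;
}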




3.3       Sorting Phase
In the sorting phase, we will encounter a path of elder sons of length i (1 ≤ i ≤ h)
exactly 2^{i+1} times, because we have 2^{i+1} numbers at the ith level. For every
such path we require i comparisons for determining the path of elder sons and
⌊log i⌋ + 1 comparisons for inserting a pair into the path. So for pushing down we require

\[
\begin{aligned}
\sum_{i=1}^{h} 2^{i+1} \bigl(i + \lfloor \log i \rfloor + 1\bigr)
&= \sum_{i=1}^{h} 2^{i+1}(i+1) + \sum_{i=1}^{h} 2^{i+1} \lfloor \log i \rfloor \\
&\le 2^{h+2}\,h + \lfloor \log h \rfloor \sum_{i=1}^{h} 2^{i+1} \\
&= 2^{h+2}\,h + \lfloor \log h \rfloor \bigl(2^{h+2} - 4\bigr) \\
&= 2\,(n/2 + 1)\,h + (n-2)\,\lfloor \log\log (n/2 + 1) \rfloor \\
&\le (n+2)\bigl(\log (n/2+1) - 1\bigr) + (n-2)\,\lfloor \log\log n \rfloor \\
&= (n+2)\log (n/2+1) - n - 2 + (n-2)\,\lfloor \log\log n \rfloor \\
&= (n+2)\log n - 2n + (n-2)\,\lfloor \log\log n \rfloor + o(1)
\end{aligned}
\]

We also require n − 2 extra comparisons for adjustment.
So the total number of comparisons is

\[
\begin{aligned}
&\tfrac{n}{2} + 0.90821\,n - \log n + n - 2 + (n+2)\log n - 2n + (n-2)\lfloor \log\log n \rfloor + o(1) \\
&\qquad = n\log n + n\log\log n + 0.40821\,n + o(1).
\end{aligned}
\]



                 4.     Theoretical comparison between Carlsson’s variant and the proposed
                        algorithm

Table 1 shows the results of the comparison between the two algorithms. We have
analyzed the worst cases here.

Table 1.   Theoretical comparison between Carlsson’s algorithm and the proposed algorithm
    Number of elements          CH1       NH2          CS3           NS4    DH = CH - NH        DS = CS - NS   Total difference(DH + DS)
                     7             8         9           20            17              -1                  3                          2
                     8            13        10           25            22               3                  3                          6
                     9            13        10           30            27               3                  3                          6
                    10            15        11           35            32               4                  3                          7
                    14            21        15           55            52               6                  3                          9
                    15            21        20           60            58               1                  2                          3
                    16            28        21           67            64               7                  3                         10
                   100           178       137          761           746              41                 15                         56
                  1000          1809      1401        11713         11446             408                267                        675
                 10000         18159     14076       153357        153094           4083                 263                       4346
                100000        181635    140814      1903137       1868412          40821               34725                      75546
               1000000       1816414   1408203     22885636      22819844         408211               65792                     474003
1   Comparisons required by Carlsson’s heap construction in the worst case
2   Comparisons required by the proposed heap construction in the worst case
3   Comparisons required by Carlsson’s sorting in the worst case
4   Comparisons required by the proposed sorting algorithm in the worst case




In Section 3 we considered our heap to be full, and the number of comparisons

required for the sorting phase of our proposed algorithm was

\[
n - 2 + \sum_{i=1}^{h} 2^{i+1} \bigl(i + \lfloor \log i \rfloor + 1\bigr).
\]


  However, for our experimental data we considered an arbitrary heap. Let the height of
the heap be h. That means we have a full heap of height h − 1 and t = n − 2(2^h − 1)
extra elements in the last level. So for the full heap of height h − 1 we apply the formula
of Section 3.3. For the last level we will have t paths of elder sons of length h, and we
need h comparisons to determine the path and ⌊log h⌋ + 1 comparisons for inserting a
pair into the path. So the number of comparisons for an n-element heap is:

\[
n - 2 + \sum_{i=1}^{h-1} 2^{i+1} \bigl(i + \lfloor \log i \rfloor + 1\bigr)
      + \bigl(n - 2(2^h - 1)\bigr)\,\bigl(h + \lfloor \log h \rfloor + 1\bigr)
\]
              i=1


   Similarly, the number of comparisons for Carlsson’s algorithm in the sorting phase
is:

\[
\sum_{i=1}^{h-1} 2^i \bigl(i + \lfloor \log i \rfloor + 1\bigr)
      + \bigl(n - (2^h - 1)\bigr)\,\bigl(h + \lfloor \log h \rfloor + 1\bigr)
\]


   We computed the number of comparisons required for the heap construction
phase of Carlsson’s variant and of the proposed algorithm by Algorithms 1 and 2,
respectively.

Algorithm 1 The number of comparisons required for the heap construction phase
of Carlsson’s variant
Require: n ≥ 1
 1: h ← ⌊log n⌋
 2: CH ← 0
 3: active ← n − 2^h + 1
 4: for i = 1 to h do
 5:   current ← 2^(h−i)
 6:   active ← ⌈active/2⌉
 7:   passive ← current − active
 8:   CH ← CH + active ∗ (i + ⌊log i⌋ + 1)
 9:   CH ← CH + passive ∗ ((i − 1) + ⌊log (i − 1)⌋ + 1)
10: end for

Algorithm 2 The number of comparisons required for the heap construction phase
of the proposed algorithm
Require: n ≥ 1
 1: h ← ⌊log ⌈n/2⌉⌋
 2: NH ← ⌊n/2⌋
 3: active ← ⌈n/2⌉ − 2^h + 1
 4: for i = 1 to h do
 5:   current ← 2^(h−i)
 6:   active ← ⌈active/2⌉
 7:   passive ← current − active
 8:   NH ← NH + active ∗ (i + ⌊log i⌋ + 1)
 9:   NH ← NH + passive ∗ ((i − 1) + ⌊log (i − 1)⌋ + 1)
10: end for
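A direct C transcription of both counters is sketched below (ours). Note that line 9 touches
⌊log 0⌋ when i = 1; taking that passive term to contribute nothing is the reading that
reproduces the CH and NH columns of Table 1.

/* flog(x) = floor(log2 x), as in the earlier sketch. */
static int flog(long x) { int r = -1; while (x > 0) { x >>= 1; r++; } return r; }

/* Worst-case heap-construction comparisons: Algorithm 1 (proposed = 0,
 * giving CH) and Algorithm 2 (proposed = 1, giving NH). */
long construction_count(long n, int proposed)
{
    long m      = proposed ? (n + 1) / 2 : n;   /* number of heap nodes */
    int  h      = flog(m);
    long cnt    = proposed ? n / 2 : 0;
    long active = m - (1L << h) + 1;
    for (int i = 1; i <= h; i++) {
        long current = 1L << (h - i);
        active = (active + 1) / 2;
        long passive = current - active;
        cnt += active * (i + flog(i) + 1);
        if (i > 1)                              /* the i = 1 passive term is 0 */
            cnt += passive * ((i - 1) + flog(i - 1) + 1);
    }
    return cnt;
}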




5.   Further improvement

Using Gonnet and Munro’s or Carlsson’s [2, 6] algorithm we can further improve the heap
construction and sorting phases. To replace the maximum in a heap, log n + log∗ n =
h + log∗ h + 1 comparisons are necessary and sufficient, as h = log n.
  So the number of comparisons required for construction in our proposed data
structure is:





\[
\begin{aligned}
\frac{n}{2} + \sum_{i=1}^{h} 2^{h-i} \bigl(i + \log^* i + 1\bigr)
&= \frac{n}{2} + \sum_{i=1}^{h} 2^{h-i}(i+1) + \sum_{i=1}^{h} 2^{h-i} \log^* i \\
&= \frac{n}{2} + \bigl(3 \cdot 2^h - h\bigr) + 2^h \sum_{i=1}^{h} \frac{\log^* i}{2^i} + o(1) \\
&\le \frac{n}{2} + \bigl(3 \cdot 2^h - h\bigr) + 0.812515 \cdot 2^h + o(1) \\
&= \frac{n}{2} + 3.812515 \cdot 2^h - h + o(1) \\
&= 1.453129\,n - \log n + o(1)
\end{aligned}
\]

 Here,

\[
\begin{aligned}
\sum_{i=1}^{h} \frac{\log^* i}{2^i}
&< \sum_{i=1}^{\infty} \frac{\log^* i}{2^i}
 = \sum_{i=1}^{50} \frac{\log^* i}{2^i} + \sum_{i=51}^{\infty} \frac{\log^* i}{2^i} \\
&< \sum_{i=1}^{50} \frac{\log^* i}{2^i} + \frac{\log^* 51}{2^{51}} + \sum_{i=52}^{\infty} \frac{\lfloor \log i \rfloor}{2^i} \\
&< \sum_{i=1}^{50} \frac{\log^* i}{2^i} + \frac{\log^* 51}{2^{51}} + \int_{51}^{\infty} \frac{\log x}{2^x}\,dx \\
&< 0.812515
\end{aligned}
\]

  The integral and the summand are calculated using www.wolframalpha.com and
a simple C program, respectively.
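The simple C program can look like the following sketch (ours, not the authors' original);
it sums the series for the constant 0.812515.

#include <stdio.h>
#include <math.h>

/* Iterated logarithm: how many times log2 must be applied until <= 1. */
static int log_star(double x)
{
    int k = 0;
    while (x > 1.0) { x = log2(x); k++; }
    return k;
}

/* Sum log*(i) / 2^i; terms beyond i = 60 are negligible. */
int main(void)
{
    double s = 0.0, p = 1.0;
    for (int i = 1; i <= 60; i++) {
        p *= 0.5;                      /* p = 2^-i */
        s += log_star((double)i) * p;
    }
    printf("%.6f\n", s);               /* prints 0.812515 */
    return 0;
}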

   Similarly, the number of comparisons required for sorting is:

\[
\begin{aligned}
n - 2 + \sum_{i=1}^{h} 2^{i+1} \bigl(i + \log^* i + 1\bigr)
&= n - 2 + \sum_{i=1}^{h} 2^{i+1}(i+1) + \sum_{i=1}^{h} 2^{i+1} \log^* i \\
&\le n - 2 + 2^{h+2}\,h + \log^* h \,\bigl(2^{h+2} - 3\bigr) \\
&= n - 2 + (n+2)(\log n - 2) + (\log^* n - 1)(n - 2) \\
&= (n+2)\log n + (n-2)\log^* n - 2n + o(1) \\
&= n\log n + n\log^* n - 2n + O(\log n)
\end{aligned}
\]

   So the total number of comparisons is:

\[
n\log n + n\log^* n - 0.546871\,n + O(\log n)
\]


6.     Generalization of the data structure

Let us now generalize the data structure and keep c elements per node. So there
will be n/c nodes and, if the height of the heap is h, then 2^{h+1} − 1 = n/c. We sort
the elements at each node and construct the heap as described above. We sort
the elements at a node by the traditional in-place heapsort algorithm, requiring
2c log c comparisons. So to create the heap the required number of comparisons is:

\[
2n\log c + \sum_{i=1}^{h} 2^{h-i} \bigl(i + \log^* i + 1\bigr).
\]
  In the sorting phase, when there are only c elements left in the heap we do not
need further comparisons. So we need to adjust the root n − c times. For each adjustment
we take the largest element of the last node and insert it into the root of the heap,
where there are c − 1 elements. Since we require ⌊log i⌋ + 1 comparisons to insert an
element into a sorted list of i elements, the required number of comparisons is:

\[
(n - c)\,\bigl(\lfloor \log (c-1) \rfloor + 1\bigr) + \sum_{i=1}^{h} c\,2^i \bigl(i + \log^* i + 1\bigr).
\]

  Asymptotic analysis shows that the optimal value of c is O(log n). However,
experimental results do not always support these theoretical findings. One of the
reasons is that the cost function contains integral functions that are not so easy to
analyze by differential calculus.
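Since the cost function is hard to treat with differential calculus, it can simply be evaluated
numerically. The sketch below is our own reading of the two displayed formulas, with the
height approximated as h = ⌊log(n/c)⌋; it scans candidate values of c and reports the minimizer.

#include <stdio.h>
#include <math.h>

static int flog(long x) { int r = -1; while (x > 0) { x >>= 1; r++; } return r; }
static int log_star(double x) { int k = 0; while (x > 1.0) { x = log2(x); k++; } return k; }

/* Approximate total comparisons for node size c, per the two formulas above. */
static double cost(long n, long c)
{
    int h = flog(n / c);                        /* ~ height of the node heap */
    double total = 2.0 * n * log2((double)c)    /* sorting each node */
                 + (double)(n - c) * (flog(c - 1) + 1);  /* root insertions */
    for (int i = 1; i <= h; i++) {
        total += (double)(1L << (h - i)) * (i + log_star(i) + 1);  /* build */
        total += (double)c * (1L << i)   * (i + log_star(i) + 1);  /* sort  */
    }
    return total;
}

int main(void)
{
    long n = 1L << 20, best = 2;
    for (long c = 2; c <= 1024; c *= 2)         /* powers of two, for brevity */
        if (cost(n, c) < cost(n, best)) best = c;
    printf("empirically best c for n = %ld: %ld\n", n, best);
    return 0;
}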


7.     Pairing trick in Weak Heapsort algorithm

Similar to the previous construction, we pair up the numbers and construct a Weak Heap [4]
according to the Primary elements of these pairs. Each pair acts as a node in
the Weak Heap, and each node has an extra bit just like a normal Weak Heap.

We keep an extra variable, say temp, which is initially set to empty. After finding
the pair with the largest Primary element in the heap (we call this node the head)
we check temp. If temp is empty, the Primary element of the head is the
next largest number, and the Secondary element of the head is assigned to temp.
If temp is not empty and temp is larger than the Primary element of the head,
the next two largest numbers are the number in temp and the Primary element
of the head, and the Secondary element is assigned to temp. And if temp is not
empty and temp is smaller than the Primary element of the head, the next largest
number is the Primary element of the head, and we make a pair by combining
the number in temp and the Secondary element of the head. Each of these scenarios
requires at most two extra comparisons; a sketch of this bookkeeping follows.
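A minimal sketch of just this temp bookkeeping, assuming the surrounding Weak Heap
machinery (locating the head and restoring the heap after a deletion) exists separately;
hp and hs denote the head's Primary and Secondary elements, and all names are ours.

/* Emit the next largest number(s), given the head pair (hp, hs) and the
 * parked value *temp (*has_temp says whether temp holds a number).
 * out[] receives the emitted numbers in decreasing order; when the head
 * survives, its rebuilt pair is returned in newpair[0..1].  At most two
 * extra comparisons are used per call. */
static int pop_head(int hp, int hs, int *temp, int *has_temp,
                    int out[], int newpair[2])
{
    int k = 0;
    if (!*has_temp) {                     /* temp empty */
        out[k++] = hp;                    /* hp is the next largest */
        *temp = hs;                       /* park hs in temp */
        *has_temp = 1;
    } else if (*temp > hp) {              /* first extra comparison */
        out[k++] = *temp;                 /* temp, then hp, go out */
        out[k++] = hp;
        *temp = hs;                       /* park hs in temp */
    } else {
        out[k++] = hp;                    /* hp is the next largest */
        /* temp and hs become the head's new pair: second comparison */
        int bigger_is_temp = (*temp > hs);
        newpair[0] = bigger_is_temp ? *temp : hs;   /* new Primary   */
        newpair[1] = bigger_is_temp ? hs : *temp;   /* new Secondary */
        *has_temp = 0;
    }
    return k;                             /* how many numbers came out */
}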


So the total number of comparisons is:

\[
\begin{aligned}
&\frac{n}{2} + \frac{n}{2} - 1 + 2\Bigl(\frac{n}{2}(k-1) - 2^{k-1} - \frac{n}{2} + 2\Bigr) + n \\
&\qquad = n - 1 + nk - n - 2^k - n + 4 + n \\
&\qquad = nk - 2^k + 3
\end{aligned}
\]

  where k = ⌈log n⌉. This is the same expression as the number of comparisons of the
Weak Heapsort algorithm [4]. But here we required only n/2 bits to sort n numbers
in (n − 1) log n + 0.086013n comparisons.
  We can also apply this trick to the Weak Heapsort algorithm modified
by Edelkamp and Stiegeler [5]. It gives a sorting algorithm with n log n − 0.9n
comparisons, requiring n/2 bits and an array of size n to store indices.


8.     Experimental results

In Table 2, the practical performance of the three algorithms is presented. The
algorithms were run on the same random sets of floating-point data several times,
and the average time is presented in Table 2. In the implementation we used bitwise
operations instead of normal arithmetic operations where possible. We also
avoided recomputing some expressions by storing them in temporary variables
where applicable. These two optimizations give us a better runtime.
     Table 2.   Experimental results
                                                         Runtime in milliseconds
      Number of elements       The proposed algorithm     Carlsson’s variant Dutton’s Weak Heapsort
                   10000                             0                      0                     0
                   50000                            31                    31                     78
                  100000                            47                    47                   156
                  500000                           313                   375                   969
                 1000000                           580                   907                  2234
                 5000000                         3340                  5601                  15554
               10000000                          7078                 12385                  35297
               30000000                         22765                 49938                 138937




9.     Conclusion

We have presented a new approach to reduce the number of comparisons for in-place
heapsort. The main idea of the paper is to consider groups of elements at
each node of a heap. This reduces the height of the tree, and hence results in a
slight reduction in time for each of the deletemax operations, despite an increase
of some comparisons to ensure that every node (except the last) has a sorted
group of elements. This is because of the reduction in the cost due to lower order
terms of the deletemax operations. So overall there is a reduction in the number of
comparisons made in the lower order (more precisely the linear) term for heapsort.
However, the optimum number of elements to be stored in a node of our proposed
data structure is still inconclusive and requires further investigation.



  Acknowledgement: The authors profusely thank anonymous referees for sug-
gesting significant improvement and for pointing out errors and inaccuracies of an
earlier version of the manuscript.


References

 [1] S. Carlsson, A variant of heapsort with almost optimal number of comparisons, Inf. Process. Lett. 24
     (1987), pp. 247–250.
 [2] ———, An optimal algorithm for deleting the root of a heap, Inf. Process. Lett. 37 (1991), pp. 117–120.
 [3] S. Carlsson and J. Chen, Heap Construction: Optimal in Both Worst and Average Cases?, in ISAAC,
     Lecture Notes in Computer Science, vol. 1004, Springer, 1995, pp. 254–263.
 [4] R.D. Dutton, Weak-heap sort, BIT 33 (1993), pp. 372–381.
 [5] S. Edelkamp and P. Stiegeler, Implementing heapsort with n log n − 0.9n and quicksort with n log n
     + 0.2n comparisons, ACM Journal of Experimental Algorithmics 7 (2002), p. 5.
 [6] G.H. Gonnet and J.I. Munro, Heaps on heaps, SIAM J. Comput. 15 (1986), pp. 964–971.
 [7] T.M. Islam and M. Kaykobad, Worst-case analysis of generalized heapsort algorithm revisited, Int.
     J. Comput. Math. 83 (2006), pp. 59–67.
 [8] J. Katajainen, The Ultimate Heapsort, in CATS, 1998, pp. 87–96.
 [9] M. Kaykobad, M.M. Islam, M.E. Amyeen, and M.M. Murshed, 3 is a more promising algorithmic
     parameter than 2, Comput. Math. Appl. 36 (1998), pp. 19–24.
[10] C. McDiarmid and B.A. Reed, Building heaps fast, J. Algorithms 10 (1989), pp. 352–365.
[11] A. Paulik, Worst-case analysis of a generalized heapsort algorithm, Inf. Process. Lett. 36 (1990), pp.
     159–165.
[12] S.K.N. Rezaul Alam Chowdhury and M. Kaykobad, A simplified complexity analysis of McDiarmid
     and Reed’s variant of bottom-up heapsort algorithm, Int. J. Comput. Math. 73 (2000), pp. 293–297.
[13] M. Shahjalal and M. Kaykobad, A New Data Structure for Heapsort with Improved Number of
     Comparisons, in WALCOM, 2007, pp. 88–96.
[14] X.D. Wang and Y.J. Wu, An improved heapsort algorithm with n log n − 0.788928n comparisons in
     the worst case, J. Comput. Sci. Technol. 22 (2007), pp. 898–903.
[15] I. Wegener, Bottom-up-heapsort, a new variant of heapsort, beating, on an average, quicksort (if n
     is not very small), Theor. Comput. Sci. 118 (1993), pp. 81–98.

More Related Content

What's hot

C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...guest3f9c6b
 
Introduction to density functional theory
Introduction to density functional theory Introduction to density functional theory
Introduction to density functional theory Sarthak Hajirnis
 
Density Functional Theory
Density Functional TheoryDensity Functional Theory
Density Functional TheoryWesley Chen
 
Problems and solutions statistical physics 1
Problems and solutions   statistical physics 1Problems and solutions   statistical physics 1
Problems and solutions statistical physics 1Alberto de Mesquita
 
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...BRNSS Publication Hub
 
Exact Exchange in Density Functional Theory
Exact Exchange in Density Functional TheoryExact Exchange in Density Functional Theory
Exact Exchange in Density Functional TheoryABDERRAHMANE REGGAD
 
Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Marco Frasca
 

What's hot (16)

C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
 
NANO266 - Lecture 7 - QM Modeling of Periodic Structures
NANO266 - Lecture 7 - QM Modeling of Periodic StructuresNANO266 - Lecture 7 - QM Modeling of Periodic Structures
NANO266 - Lecture 7 - QM Modeling of Periodic Structures
 
Introduction to density functional theory
Introduction to density functional theory Introduction to density functional theory
Introduction to density functional theory
 
The monoatomic ideal gas
The monoatomic ideal gasThe monoatomic ideal gas
The monoatomic ideal gas
 
Exercises with DFT+U
Exercises with DFT+UExercises with DFT+U
Exercises with DFT+U
 
Density Functional Theory
Density Functional TheoryDensity Functional Theory
Density Functional Theory
 
A0730103
A0730103A0730103
A0730103
 
Problems and solutions statistical physics 1
Problems and solutions   statistical physics 1Problems and solutions   statistical physics 1
Problems and solutions statistical physics 1
 
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
 
Peskin chap02
Peskin chap02Peskin chap02
Peskin chap02
 
Physics Assignment Help
Physics Assignment HelpPhysics Assignment Help
Physics Assignment Help
 
Exact Exchange in Density Functional Theory
Exact Exchange in Density Functional TheoryExact Exchange in Density Functional Theory
Exact Exchange in Density Functional Theory
 
E04933745
E04933745E04933745
E04933745
 
Introduction to DFT Part 2
Introduction to DFT Part 2Introduction to DFT Part 2
Introduction to DFT Part 2
 
Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University
 
Powder
PowderPowder
Powder
 

Viewers also liked

Formació "Iniciació al Google Analytics", març 2011
Formació "Iniciació al Google Analytics", març 2011Formació "Iniciació al Google Analytics", març 2011
Formació "Iniciació al Google Analytics", març 2011Fernando Blanco
 
1:1 Unconference + Tech in our PDHPE Faculty
1:1 Unconference + Tech in our PDHPE Faculty1:1 Unconference + Tech in our PDHPE Faculty
1:1 Unconference + Tech in our PDHPE Facultyleonore13
 
Introduction - SA Print Media
Introduction - SA Print MediaIntroduction - SA Print Media
Introduction - SA Print MediaRohan Shahane
 
Tawatinaw watershed stewards introduction
Tawatinaw watershed stewards introduction Tawatinaw watershed stewards introduction
Tawatinaw watershed stewards introduction tawatinaw
 
BPM - A Practitioners Playbook
BPM -  A Practitioners PlaybookBPM -  A Practitioners Playbook
BPM - A Practitioners PlaybookAniruddha Paul
 

Viewers also liked (7)

Formació "Iniciació al Google Analytics", març 2011
Formació "Iniciació al Google Analytics", març 2011Formació "Iniciació al Google Analytics", març 2011
Formació "Iniciació al Google Analytics", març 2011
 
1:1 Unconference + Tech in our PDHPE Faculty
1:1 Unconference + Tech in our PDHPE Faculty1:1 Unconference + Tech in our PDHPE Faculty
1:1 Unconference + Tech in our PDHPE Faculty
 
Presentación1
Presentación1Presentación1
Presentación1
 
Presentacio
PresentacioPresentacio
Presentacio
 
Introduction - SA Print Media
Introduction - SA Print MediaIntroduction - SA Print Media
Introduction - SA Print Media
 
Tawatinaw watershed stewards introduction
Tawatinaw watershed stewards introduction Tawatinaw watershed stewards introduction
Tawatinaw watershed stewards introduction
 
BPM - A Practitioners Playbook
BPM -  A Practitioners PlaybookBPM -  A Practitioners Playbook
BPM - A Practitioners Playbook
 

Similar to Ijcm

Advanced s and s algorithm.ppt
Advanced s and s algorithm.pptAdvanced s and s algorithm.ppt
Advanced s and s algorithm.pptLegesseSamuel
 
Staircases_in_Fluid_Dynamics_JacobGreenhalgh
Staircases_in_Fluid_Dynamics_JacobGreenhalghStaircases_in_Fluid_Dynamics_JacobGreenhalgh
Staircases_in_Fluid_Dynamics_JacobGreenhalghJacob Greenhalgh
 
Lego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawingsLego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawingsMathieu Dutour Sikiric
 
7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930ea7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930eaAlvaro
 
Chapter 8 advanced sorting and hashing for print
Chapter 8 advanced sorting and hashing for printChapter 8 advanced sorting and hashing for print
Chapter 8 advanced sorting and hashing for printAbdii Rashid
 
Sienna 4 divideandconquer
Sienna 4 divideandconquerSienna 4 divideandconquer
Sienna 4 divideandconquerchidabdu
 
Heap Hand note
Heap Hand noteHeap Hand note
Heap Hand noteAbdur Rouf
 
Bin packing problem two approximation
Bin packing problem two approximationBin packing problem two approximation
Bin packing problem two approximationijfcstjournal
 
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMSBIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMSijfcstjournal
 
M269 Data Structures And Computability.docx
M269 Data Structures And Computability.docxM269 Data Structures And Computability.docx
M269 Data Structures And Computability.docxstirlingvwriters
 
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...endokayle
 
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHMBIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHMijcsa
 
Ee693 sept2014quiz1
Ee693 sept2014quiz1Ee693 sept2014quiz1
Ee693 sept2014quiz1Gopi Saiteja
 
Master of Computer Application (MCA) – Semester 4 MC0080
Master of Computer Application (MCA) – Semester 4  MC0080Master of Computer Application (MCA) – Semester 4  MC0080
Master of Computer Application (MCA) – Semester 4 MC0080Aravind NC
 
Probabilistic group theory, combinatorics, and computing
Probabilistic group theory, combinatorics, and computingProbabilistic group theory, combinatorics, and computing
Probabilistic group theory, combinatorics, and computingSpringer
 
one main advantage of bubble sort as compared to others
one main advantage of bubble sort as compared to othersone main advantage of bubble sort as compared to others
one main advantage of bubble sort as compared to othersAjay Chimmani
 

Similar to Ijcm (20)

Advanced s and s algorithm.ppt
Advanced s and s algorithm.pptAdvanced s and s algorithm.ppt
Advanced s and s algorithm.ppt
 
Staircases_in_Fluid_Dynamics_JacobGreenhalgh
Staircases_in_Fluid_Dynamics_JacobGreenhalghStaircases_in_Fluid_Dynamics_JacobGreenhalgh
Staircases_in_Fluid_Dynamics_JacobGreenhalgh
 
Lego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawingsLego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawings
 
7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930ea7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930ea
 
Sortsearch
SortsearchSortsearch
Sortsearch
 
Chapter 8 advanced sorting and hashing for print
Chapter 8 advanced sorting and hashing for printChapter 8 advanced sorting and hashing for print
Chapter 8 advanced sorting and hashing for print
 
Sienna 4 divideandconquer
Sienna 4 divideandconquerSienna 4 divideandconquer
Sienna 4 divideandconquer
 
Heap Hand note
Heap Hand noteHeap Hand note
Heap Hand note
 
Bin packing problem two approximation
Bin packing problem two approximationBin packing problem two approximation
Bin packing problem two approximation
 
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMSBIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
 
Algorithm Assignment Help
Algorithm Assignment HelpAlgorithm Assignment Help
Algorithm Assignment Help
 
M269 Data Structures And Computability.docx
M269 Data Structures And Computability.docxM269 Data Structures And Computability.docx
M269 Data Structures And Computability.docx
 
Bin packing
Bin packingBin packing
Bin packing
 
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
 
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHMBIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
 
Ee693 sept2014quiz1
Ee693 sept2014quiz1Ee693 sept2014quiz1
Ee693 sept2014quiz1
 
Master of Computer Application (MCA) – Semester 4 MC0080
Master of Computer Application (MCA) – Semester 4  MC0080Master of Computer Application (MCA) – Semester 4  MC0080
Master of Computer Application (MCA) – Semester 4 MC0080
 
Heapsort
HeapsortHeapsort
Heapsort
 
Probabilistic group theory, combinatorics, and computing
Probabilistic group theory, combinatorics, and computingProbabilistic group theory, combinatorics, and computing
Probabilistic group theory, combinatorics, and computing
 
one main advantage of bubble sort as compared to others
one main advantage of bubble sort as compared to othersone main advantage of bubble sort as compared to others
one main advantage of bubble sort as compared to others
 

Ijcm

  • 1. International Journal of Computer Mathematics Vol. 00, No. 00, Month 200x, 1–12 RESEARCH ARTICLE An in-place heapsort algorithm requiring n log n + n log∗ n − 0.546871n comparisons Md. Mahbubul Hasana† , Md. Shahjalalb‡ and M. Kaykobada§ a CSE Department, Bangladesh University of Engineering and Technology b Therap (BD) Ltd., Banani, Dhaka 1213 (Received 00 Month 200x; in final form 00 Month 200x) In this paper we present an interesting modification to the heap structure which yields a better comparison based in-place heapsort algorithm. The number of comparisons is shown to be bounded by n log n + n log∗ n − 0.546871n which is 0.853129n + n log∗ n away from the optimal theoretical bound of n log n − 1.44n. Keywords: Algorithms, Data structure, Heapsort. 1. Introduction The heapsort algorithm is one of the best sorting algorithms and was introduced by Williams in 1964. It achieves both worst case and average case time complex- ity of O(n log n). Lower bound theory asserts that any comparison based sorting algorithm will require log (n!) comparisons which is approximately n log n − 1.44n. There are many promising variants of heapsort algorithm such as MDR- Heapsort [10, 12], Generalized Heapsort [7, 11], Weak Heapsort [4], Bottom Up Heapsort [15], Ultimate Heapsort [8], Heapsort in 3-heaps [9] etc. Carlsson’s variant of heapsort [1] does not need extra space and requires n log n + n log log n + 1.82n comparisons. Dutton showed that sorting using Weak Heap requires comparisons bounded by n log n + 0.086013n, but it uses n extra bits [4]. In [5], Edelkamp and Stiegeler modified Dutton’s weak heap and designed several algorithms of heapsort requiring n log n − 0.9n comparisons but using an array of size n to store indices. Xiao Dong Wang and Ying Jie Wu [14] gave another algorithm bounding the number of comparisons to n log n−0.788928n, but using an additional array of size n to store indices. The Bottom Up Heapsort [15] requires 1.5n log n comparisons in the worst case. The Ultimate Heapsort [8] is another variant of heapsort algorithm which requires fewer than n log n + 32n + O(log n) comparisons to sort n elements in-place. In [6], Gonnet and Munro gave an al- gorithm to construct a heap of n elements which takes 1.625n comparisons in the worst case. In the same paper they gave another algorithm, subsequently im- proved by Carlsson [2], for extraction of the maximum element requiring around log n + log ∗ n + O(1) comparisons, where log ∗ denotes the iterated logarithm. Al- gorithm presented by McDiarmid and Reed [10] requires 1.5212n comparisons in the average case for heap construction. But both of these algorithms require extra † Email: shanto86@yahoo.com ‡ Email: shahjalal@msn.com § Email: kaykobad@cse.buet.ac.bd ISSN: 0020-7160 print/ISSN 1029-0265 online c 200x Taylor & Francis DOI: 10.1080/0020716YYxxxxxxxx http://www.informaworld.com
  • 2. 2 Md. Mahbubul Hasan, Md. Shahjalal and M. Kaykobad space. Carlsson and Chen gave two algorithms for heap construction in [3]. First one achieves 1.625n + O(log2 n) comparisons in the worst case and 1.5642n + O(log 2 n) on average but requires extra space. Second one is in-place algorithm which re- quires 1.528n + O(log 2 n) comparisons on average and 2n comparisons in the worst case. In this paper we introduce a new data structure to reduce number of comparisons by pairing elements to be sorted that does not use any extra space. Carlsson’s variant of heapsort has been applied on this data structure. This has restricted the number of comparisons to n log n + n log log n + 0.40821n. This data structure can be applied to other heap related algorithms as well. For example, in this paper we have shown how memory requirement of Dutton’s Weak Heapsort algorithm can be reduced using this data structure. Moreover, even algorithms of Gonnet and Munro [6] or Carlsson [2] when applied to this data structure yields better performance than that of the corresponding algorithms. It restricts the number of comparisons to n log n + n log∗ n − 0.546871n. While Ultimate Heapsort [8] is asymptotically faster than the proposed algorithm, ours will outperform it since the linear term in Ultimate Heapsort (32n) is more than n log∗ n for all practical purposes (even for large values of n). A preliminary version of this paper appeared in WALCOM 2007 [13]. In Section 2 we present the modified data structure for heapsort. Section 3 con- tains analysis followed by the theoretical comparison between Carlsson’s variant and the proposed algorithm in Section 4. In Section 5 we show how Gonnet, Munro or Carlsson’s deletion algorithm yields better performance on the proposed data structure, followed by generalization of the data structure in Section 6. In Sec- tion 7 we apply our pairing trick in Weak Heap. The practical performance of the proposed algorithm is presented in Section 8. 2. A modified data structure for heapsort Independent pairwise comparisons reduce uncertainty by the most. This fact en- courages creation of heaps with group of elements ordered in each node. In the following we present our data structure with two elements in each node. However, in a later section we analyze performance of a more generalized data structure to find the best number of elements to be stored in a node. 2.1 Preprocessing For n given elements we will make n/2 pairs. If there are odd number of elements a dummy element can be suitably added. For simplicity of our discussion from now on we will call the first element of a pair as the Primary element, and the second one as the Secondary element. For each pair we compare the elements and arrange them in non-increasing order. That means after rearranging, the Primary element will be greater than or equal to the Secondary element of the pair. Figure 1 shows some possible pairings. 2.2 Heap Construction Phase Now we construct a (max)heap according to the Primary elements of the pairs. The heap is constructed in bottom up fashion. That is, we start from the last pair of the heap and we try to push it down. To push a pair, we find the path of elder sons and then we do binary search on the path. The implementation is similar to
  • 3. International Journal of Computer Mathematics 3 a) 10 3 10 3 b) 2 8 8 2 Figure 1. In a) Primary element is greater than Secondary element. So we do not swap. In b) Primary element is smaller than Secondary element. So we swap the elements. that of the Carlsson’s variant [1]. Figure 2 shows a sample heap. 12 3 9 9 10 9 8 2 7 6 8 4 Figure 2. A (max)heap constructed based on the Primary element of the pairs. 2.3 Sorting Phase Construction phase yields a heap where each of the Primary elements is greater than or equal to the Primary element of its children pairs. As the Primary elements are always greater than or equal to the corresponding Secondary elements, the Primary element of the root will be the largest element of the heap. In each iteration of the sorting phase we extract the Primary element of the root and adjust the heap, decreasing number of elements in the heap by one. Suppose 1 is the root and 2 is the last pair (with possible empty secondary element). Let 1P and 1S be Primary and Secondary elements of the root, and 2P , 2S are defined similarly. If 1 and 2 are the same pair then we swap Primary and Secondary elements(1P and 1S). If they are not the same pair then we remove the Primary element of the root(1P ) (which is the largest element in the heap) and place it at temp (a temporary variable to store an element). Then there can be four possible cases: Case 1. 2S is empty (if there is an odd number of elements in the heap) and 1S ≥ 2P: we place 1S at 1P , 2P at 1S and temp at 2P . (Figure 3) Case 2. 2S is empty (if there is an odd number of elements in the heap) and 1S < 2P: we place 2P at 1P , 1S remains at the same place and temp at 2P . (Figure 4) Case 3. 2S is not empty (if there is an even number of elements in the heap) and 1S ≥ 2P: we place 1S at 1P , 2P at 1S, 2S at 2P and temp at 2S. (Figure 5) Case 4. 2S is not empty (if there is an even number of elements in the heap) and 1S < 2P: we place 2P at 1P , 1S remains at the same place,
  • 4. 4 Md. Mahbubul Hasan, Md. Shahjalal and M. Kaykobad a b temp 8 5 5 3 d c 6 1 3 6 1 8 Figure 3. Case 1. 2S is empty and 1S ≥ 2P . So a) 1P → temp, b) 1S → 1P , c) 2P → 1S and d) temp → 2P . a temp 8 3 5 3 c b 6 1 5 6 1 8 Figure 4. Case 2. 2S is empty and 1S < 2P . So a) 1P → temp, b) 2P → 1P and c) temp → 2P . a b temp 8 5 5 3 e c 6 1 3 2 6 1 2 8 d Figure 5. Case 3. 2S is not empty and 1S ≥ 2P . So a) 1P → temp, b) 1S → 1P , c) 2P → 1S d) 2S → 2P and d) temp → 2S. 2S at 2P and temp at 2S. (Figure 6) All of these cases require a single comparison. After adjusting the order of the pair we fix the heap by pushing down the root node in the same way as we did in the heap construction phase. We also remove the last element and mark it empty.
  • 5. International Journal of Computer Mathematics 5 a temp 8 3 5 3 d b 6 1 5 2 6 1 2 8 c Figure 6. Case 4. 2S is not empty and 1S < 2P . So a) 1P → temp, b) 2P → 1P , c) 2S → 2P and d) temp → 2S. 3. Analysis For simplicity of analysis, let us assume that heap in our new data structure is full. That is, h n/2 = 2i i=0 = 2h+1 − 1 n ∴ h = log +1 −1 2 However, had it not been a full heap the formula would have been h = ⌈log n + 1 ⌉ − 1 2 3.1 Preprocessing We require n/2 comparisons to pair the n elements with a possible left over. 3.2 Heap Construction Phase At every level i (0 ≤ i < h) we have 2i pairs. To find the path of elder sons we need h − i comparisons and for inserting the pair into the path we need binary search on a path of length h − i. So the number of comparisons required for heap construction is,
  • 6. 6 Md. Mahbubul Hasan, Md. Shahjalal and M. Kaykobad h−1 2i ((h − i) + ⌊log(h − i)⌋ + 1) i=0 h = 2h−i (i + ⌊log i⌋ + 1) i=1 h h h = 2h−i i + 2h−i ⌊log i⌋ + 2h−i i=1 i=1 i=1 h ⌊log i⌋ = 2h+1 − 2 − h + 2h + 2h − 1 2i i=1 ≤ 2h+1 − 2 − h + 2h ∗ 0.632843 + 2h − 1 = 0.90821n − log n + o(1) Here, h ⌊log i⌋ 2i i=1 ∞ ⌊log i⌋ < 2i i=1 50 ∞ ⌊log i⌋ ⌊log i⌋ < + 2i 2i i=1 i=51 50 ∞ ⌊log i⌋ ⌊log 51⌋ ⌊log i⌋ < + + 2i 251 i=51 2i i=1 < 0.632843 The integral and summand are calculated using www.wolframalpha.com. Note that, if there are x elements then we require ⌊log x⌋ + 1 comparisons for binary search in the worst case. 3.3 Sorting Phase In sorting phase, we will have i (1 ≤ i ≤ h) length path of elder sons 2i+1 times, because we have 2i+1 numbers at the ith level. For every such path we require i comparisons for determining the path of elder sons and ⌊log i⌋ + 1 comparisons for inserting a pair into the path. So for pushing down, we require
  • 7. International Journal of Computer Mathematics 7 h 2i+1 (i + ⌊log i⌋ + 1) i=1 h h = 2i+1 (i + 1) + 2i+1 ⌊log i⌋ i=1 i=1 h ≤ 2h+2 h + ⌊log h⌋ 2i+1 i=1 = 2h+2 h + ⌊log h⌋ 2h+2 − 4 = 2 (n/2 + 1) h + (n − 2) ⌊log log (n/2 + 1)⌋ ≤ (n + 2) (log (n/2 + 1) − 1) + (n − 2) ⌊log log n⌋ = (n + 2) log (n/2 + 1) − n − 2 + (n − 2) ⌊log log n⌋ = (n + 2) log n − 2n + (n − 2) ⌊log log n⌋ + o(1) We also require n − 2 extra comparisons for adjustment. So the total number of comparisons is, n/2 + 0.90821n − log n + n − 2 + (n + 2) log n − 2n + (n − 2)⌊log log n⌋ + o(1) = n log n + n log log n + 0.40821n + o(1) 4. Theoretical comparison between Carlsson’s variant and the proposed algorithm Table 1 shows results of comparison between the two algorithms. We have analyzed the worst cases here. Table 1. Theoretical comparison between Carlsson’s algorithm and the proposed algorithm Number of elements CH1 NH2 CS3 NS4 DH = CH - NH DS = CS - NS Total difference(DH + DS) 7 8 9 20 17 -1 3 2 8 13 10 25 22 3 3 6 9 13 10 30 27 3 3 6 10 15 11 35 32 4 3 7 14 21 15 55 52 6 3 9 15 21 20 60 58 1 2 3 16 28 21 67 64 7 3 10 100 178 137 761 746 41 15 56 1000 1809 1401 11713 11446 408 267 675 10000 18159 14076 153357 153094 4083 263 4346 100000 181635 140814 1903137 1868412 40821 34725 75546 1000000 1816414 1408203 22885636 22819844 408211 65792 474003 1 Comparison required in Carlsson’s Heap Construction in the worst case 2 Comparison required in the proposed Heap Construction in the worst case 3 Comparison required in Carlsson’s Sorting in the worst case 4 Comparison required in the proposed sorting algorithm in the worst case In section 3, we considered our heap to be full and the number of comparisons
In Section 3 we considered our heap to be full, and the number of comparisons required for the sorting phase of our proposed algorithm was

\[
n - 2 + \sum_{i=1}^{h} 2^{i+1} \bigl( i + \lfloor \log i \rfloor + 1 \bigr).
\]

For the experimental data, however, we considered arbitrary heaps. Let the height of the heap be h; then we have a full heap of height h − 1 together with t = n − 2(2^h − 1) extra elements in the last level. To the full heap of height h − 1 we apply the formula of Section 3.3. The last level contributes t paths of elder sons of length h, each needing h comparisons to determine the path and ⌊log h⌋ + 1 comparisons to insert a pair into it. So the number of comparisons for an n-element heap is

\[
n - 2 + \sum_{i=1}^{h-1} 2^{i+1} \bigl( i + \lfloor \log i \rfloor + 1 \bigr)
+ \bigl( n - 2(2^{h} - 1) \bigr) \bigl( h + \lfloor \log h \rfloor + 1 \bigr).
\]

Similarly, the number of comparisons for the sorting phase of Carlsson's algorithm is

\[
\sum_{i=1}^{h-1} 2^{i} \bigl( i + \lfloor \log i \rfloor + 1 \bigr)
+ \bigl( n - (2^{h} - 1) \bigr) \bigl( h + \lfloor \log h \rfloor + 1 \bigr).
\]

For instance, for n = 7 (so h = 2 in both formulas) the first gives 5 + 8 + 4 = 17 and the second gives 4 + 16 = 20, matching the NS and CS columns of Table 1.

We computed the number of comparisons required for the heap construction phase of Carlsson's variant and of the proposed algorithm by Algorithm 1 and Algorithm 2, respectively (with the convention ⌊log 0⌋ = −1, so that the passive term contributes nothing when i = 1).

Algorithm 1 The number of comparisons required for the heap construction phase of Carlsson's variant
Require: n ≥ 1
 1: h ← ⌊log n⌋
 2: CH ← 0
 3: active ← n − 2^h + 1
 4: for i = 1 to h do
 5:     current ← 2^(h−i)
 6:     active ← ⌈active/2⌉
 7:     passive ← current − active
 8:     CH ← CH + active · (i + ⌊log i⌋ + 1)
 9:     CH ← CH + passive · ((i − 1) + ⌊log (i − 1)⌋ + 1)
10: end for

Algorithm 2 The number of comparisons required for the heap construction phase of the proposed algorithm
Require: n ≥ 1
 1: h ← ⌊log ⌈n/2⌉⌋
 2: NH ← ⌊n/2⌋
 3: active ← ⌈n/2⌉ − 2^h + 1
 4: for i = 1 to h do
 5:     current ← 2^(h−i)
 6:     active ← ⌈active/2⌉
 7:     passive ← current − active
 8:     NH ← NH + active · (i + ⌊log i⌋ + 1)
 9:     NH ← NH + passive · ((i − 1) + ⌊log (i − 1)⌋ + 1)
10: end for
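The following self-contained C transcription of Algorithms 1 and 2 (our code, using the ⌊log 0⌋ = −1 convention noted above) reproduces the CH and NH columns of Table 1:

    #include <stdio.h>

    /* floor(log2 x) by integer halving; flog2(0) = -1, so the passive
     * term vanishes at i = 1.                                           */
    static long flog2(long x)
    {
        long r = -1;
        while (x > 0) { x >>= 1; r++; }
        return r;
    }

    /* Algorithm 1: worst-case construction comparisons, Carlsson's variant */
    static long CH(long n)
    {
        long h = flog2(n), c = 0, active = n - (1L << h) + 1;
        for (long i = 1; i <= h; i++) {
            long current = 1L << (h - i);
            active = (active + 1) / 2;              /* ceil(active/2)    */
            long passive = current - active;
            c += active  * (i + flog2(i) + 1);
            c += passive * ((i - 1) + flog2(i - 1) + 1);
        }
        return c;
    }

    /* Algorithm 2: worst-case construction comparisons, proposed algorithm */
    static long NH(long n)
    {
        long m = (n + 1) / 2;                       /* ceil(n/2) pairs   */
        long h = flog2(m), c = n / 2, active = m - (1L << h) + 1;
        for (long i = 1; i <= h; i++) {
            long current = 1L << (h - i);
            active = (active + 1) / 2;
            long passive = current - active;
            c += active  * (i + flog2(i) + 1);
            c += passive * ((i - 1) + flog2(i - 1) + 1);
        }
        return c;
    }

    int main(void)
    {
        long ns[] = { 7, 8, 9, 10, 14, 15, 16, 100, 1000, 10000 };
        for (int k = 0; k < 10; k++)                /* compare to Table 1 */
            printf("%6ld %8ld %8ld\n", ns[k], CH(ns[k]), NH(ns[k]));
        return 0;
    }

For example, it prints CH = 8, NH = 9 for n = 7 and CH = 13, NH = 10 for n = 8, exactly as tabulated.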
5. Further improvement

Using the algorithms of Gonnet and Munro [6] and of Carlsson [2], we can further improve the heap construction and sorting phases: to replace the maximum element of a heap, log n + log∗ n = h + log∗ h + 1 comparisons are necessary and sufficient, where h = log n. So the number of comparisons required for construction in our proposed data structure becomes

\[
\frac{n}{2} + \sum_{i=1}^{h} 2^{h-i} \bigl( i + \log^{*} i + 1 \bigr)
= \frac{n}{2} + \sum_{i=1}^{h} 2^{h-i} (i + 1) + \sum_{i=1}^{h} 2^{h-i} \log^{*} i
\]
\[
= \frac{n}{2} + \bigl( 3 \cdot 2^{h} - h \bigr) + 2^{h} \sum_{i=1}^{h} \frac{\log^{*} i}{2^{i}} + o(1)
\le \frac{n}{2} + \bigl( 3 \cdot 2^{h} - h \bigr) + 0.812515 \cdot 2^{h} + o(1)
\]
\[
= \frac{n}{2} + 3.812515 \cdot 2^{h} - h + o(1)
= 1.453129\,n - \log n + o(1).
\]

Here

\[
\sum_{i=1}^{h} \frac{\log^{*} i}{2^{i}}
< \sum_{i=1}^{\infty} \frac{\log^{*} i}{2^{i}}
< \sum_{i=1}^{50} \frac{\log^{*} i}{2^{i}}
  + \frac{\log^{*} 51}{2^{51}}
  + \sum_{i=51}^{\infty} \frac{\lfloor \log i \rfloor}{2^{i}}
< 0.812515,
\]

where the tail (bounded by an integral) was calculated using www.wolframalpha.com and the partial sum with a simple C program.
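In the spirit of that simple C program, the sketch below (our code; truncating both series at i = 60, beyond which the terms are negligible, is an assumption we make) reproduces the constants used in Sections 3.2 and 5:

    #include <stdio.h>
    #include <math.h>

    /* floor(log2 x), with flog2(0) = -1 */
    static int flog2(long x) { int r = -1; while (x > 0) { x >>= 1; r++; } return r; }

    /* iterated logarithm: how often log2 must be applied to fall to <= 1 */
    static int log_star(double x) { int r = 0; while (x > 1.0) { x = log2(x); r++; } return r; }

    int main(void)
    {
        double s1 = 0.0, s2 = 0.0, pow2 = 1.0;
        for (int i = 1; i <= 60; i++) {   /* tail beyond 60 is < 1e-16 */
            pow2 *= 2.0;                  /* pow2 = 2^i                */
            s1 += flog2(i) / pow2;
            s2 += log_star(i) / pow2;
        }
        printf("%.6f %.6f\n", s1, s2);    /* 0.632843 0.812515         */
        return 0;
    }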
Similarly, the number of comparisons required for sorting is

\[
n - 2 + \sum_{i=1}^{h} 2^{i+1} \bigl( i + \log^{*} i + 1 \bigr)
= n - 2 + \sum_{i=1}^{h} 2^{i+1} (i + 1) + \sum_{i=1}^{h} 2^{i+1} \log^{*} i
\]
\[
\le n - 2 + 2^{h+2} h + \log^{*} h \bigl( 2^{h+2} - 4 \bigr)
= n - 2 + (n+2)(\log n - 2) + (\log^{*} n - 1)(n - 2)
\]
\[
= (n+2)\log n + (n-2)\log^{*} n - 2n + o(1)
= n\log n + n\log^{*} n - 2n + O(\log n).
\]

So the total number of comparisons is

\[
n\log n + n\log^{*} n - 0.546871\,n + O(\log n).
\]

6. Generalization of the data structure

Let us now generalize the data structure and keep c elements per node. There will then be n/c nodes, and if the height of the heap is h then 2^{h+1} − 1 = n/c. We sort the elements at each node and construct the heap as described above. Sorting the elements within a node by the traditional in-place heapsort algorithm requires 2c log c comparisons, so creating the heap requires

\[
2n\log c + \sum_{i=1}^{h} 2^{h-i} \bigl( i + \log^{*} i + 1 \bigr)
\]

comparisons. In the sorting phase, once only c elements are left in the heap no further comparisons are needed, so we must adjust the root n − c times. For an adjustment we take the largest element of the last node and insert it into the root node, which then holds c − 1 elements. Since inserting an element into a sorted list of i elements requires ⌊log i⌋ + 1 comparisons, the required number of comparisons is

\[
(n - c) \bigl( \lfloor \log (c-1) \rfloor + 1 \bigr)
+ \sum_{i=1}^{h} c\,2^{i} \bigl( i + \log^{*} i + 1 \bigr).
\]

Asymptotic analysis shows that the optimal value of c is O(log n). However, experimental results do not always support this theoretical finding; one reason is that the cost function contains integer-valued (floor and ceiling) functions that are not easy to analyze by differential calculus. A numerical probe such as the sketch below is therefore a practical alternative.
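Such a probe might look as follows (a rough sketch of ours only: it transcribes the two cost formulas above, treats the heap of n/c nodes as full, and brute-forces c over powers of two; the function names and the search range are assumptions):

    #include <stdio.h>
    #include <math.h>

    static long flog2(long x) { long r = -1; while (x > 0) { x >>= 1; r++; } return r; }
    static long log_star(double x) { long r = 0; while (x > 1.0) { x = log2(x); r++; } return r; }

    /* Transcribed cost for node capacity c: node sorting + construction
     * + root re-insertions + push-downs, with h from 2^{h+1} - 1 = n/c. */
    static double gen_cost(double n, long c)
    {
        long h = flog2((long)(n / c) + 1) - 1;
        double cost = 2.0 * n * log2((double)c);       /* sort each node */
        for (long i = 1; i <= h; i++)                  /* construction   */
            cost += (double)(1L << (h - i)) * (i + log_star((double)i) + 1);
        cost += (n - c) * (flog2(c - 1) + 1);          /* root refills   */
        for (long i = 1; i <= h; i++)                  /* push-downs     */
            cost += (double)c * (double)(1L << i) * (i + log_star((double)i) + 1);
        return cost;
    }

    int main(void)
    {
        double n = 1000000.0;
        long best = 2;
        for (long c = 4; c <= 1024; c *= 2)            /* powers of two  */
            if (gen_cost(n, c) < gen_cost(n, best)) best = c;
        printf("best c (among powers of two) = %ld\n", best);
        return 0;
    }

Because the transcribed formula ignores floors and non-full levels, its output is only indicative, which is consistent with the mismatch between theory and experiment noted above.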
7. Pairing trick in Weak Heapsort algorithm

As before, we pair up the numbers and construct a Weak Heap [4] according to the Primary elements of these pairs. Each pair acts as a node of the Weak Heap, and each node carries an extra bit, just like a node of a normal Weak Heap. We keep an extra variable, say temp, which is initially set to empty. After finding the pair with the largest Primary element in the heap (we call this node the head) we check temp:

- If temp is empty, the Primary element of the head is the next largest number, and the Secondary element of the head is assigned to temp.
- If temp is not empty and temp is larger than the Primary element of the head, the next two largest numbers are the number in temp and the Primary element of the head; the Secondary element is assigned to temp.
- If temp is not empty and temp is smaller than the Primary element of the head, the next largest number is the Primary element of the head, and we make a new pair by combining the number in temp with the Secondary element of the head.

Each of these scenarios requires at most two extra comparisons.
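To make the three scenarios concrete, here is a minimal C sketch (our illustration only; out() stands for emitting the next element of the sorted output, and re-establishing the weak-heap order on the Primaries after each step is elided):

    #include <stdio.h>

    typedef struct { int primary, secondary; } Pair;   /* one heap node  */

    static int temp_val, temp_empty = 1;

    static void out(int x) { printf("%d ", x); }       /* sorted output  */

    static void extract_step(Pair *head)    /* head: largest Primary     */
    {
        if (temp_empty) {                      /* scenario 1             */
            out(head->primary);                /* next largest           */
            temp_val = head->secondary;        /* park the Secondary     */
            temp_empty = 0;                    /* head leaves the heap   */
        } else if (temp_val > head->primary) { /* scenario 2: 1 compare  */
            out(temp_val);                     /* next two largest:      */
            out(head->primary);                /*   temp, then Primary   */
            temp_val = head->secondary;        /* head leaves the heap   */
        } else {                               /* scenario 3: 2 compares */
            out(head->primary);                /* next largest           */
            if (temp_val >= head->secondary)   /* re-pair temp with the  */
                head->primary = temp_val;      /* Secondary, larger one  */
            else {                             /* becoming the Primary   */
                head->primary = head->secondary;
                head->secondary = temp_val;
            }
            temp_empty = 1;                    /* head keeps the new pair */
        }
        /* afterwards, restore the weak-heap order on the Primaries      */
    }

    int main(void)                             /* toy run of all 3 cases */
    {
        Pair p = { 8, 5 }, q = { 3, 2 }, r = { 7, 1 };
        extract_step(&p);                      /* prints 8, temp = 5     */
        extract_step(&q);                      /* prints 5 3, temp = 2   */
        extract_step(&r);                      /* prints 7, pair (2, 1)  */
        printf("\n");
        return 0;
    }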
So the total number of comparisons is

\[
\frac{n}{2} + \frac{n}{2} - 1 + 2\left( \frac{n}{2}(k - 1) - 2^{k-1} - \frac{n}{2} + 2 \right) + n
= n - 1 + nk - n - 2^{k} - n + 4 + n
= nk - 2^{k} + 3,
\]

where k = ⌈log n⌉. This is the same expression as for the number of comparisons of the Weak Heapsort algorithm [4], but here we require only n/2 extra bits to sort n numbers in (n − 1) log n + 0.086013n comparisons. We can also apply this trick to the Weak Heapsort variant of Edelkamp and Stiegeler [5]; it gives a sorting algorithm with n log n − 0.9n comparisons, requiring n/2 bits and an array of size n to store indices.

8. Experimental results

Table 2 presents the practical performance of the three algorithms. The algorithms were run several times on the same random sets of floating-point data, and the average time is reported in Table 2. In the implementation we used bitwise operations instead of ordinary arithmetic operations where possible, and we avoided recomputing common subexpressions by storing them in temporary variables where applicable. These two optimizations give better runtimes.

Table 2. Experimental results (runtime in milliseconds)

    Number of elements   Proposed algorithm   Carlsson's variant   Dutton's Weak Heapsort
                 10000                    0                    0                        0
                 50000                   31                   31                       78
                100000                   47                   47                      156
                500000                  313                  375                      969
               1000000                  580                  907                     2234
               5000000                 3340                 5601                    15554
              10000000                 7078                12385                    35297
              30000000                22765                49938                   138937

9. Conclusion

We have presented a new approach to reducing the number of comparisons for in-place heapsort. The main idea of the paper is to store a group of elements in each node of a heap. This reduces the height of the tree, and hence slightly reduces the cost of each deletemax operation, despite some additional comparisons needed to keep the group of elements in every node (except possibly the last) sorted. The overall saving appears in the lower-order (more precisely, the linear) term of the comparison count for heapsort. However, the optimal number of elements to be stored in a node of our proposed data structure is still inconclusive and requires further investigation.

Acknowledgement: The authors profusely thank the anonymous referees for suggesting significant improvements and for pointing out errors and inaccuracies in an earlier version of the manuscript.

References

[1] S. Carlsson, A variant of heapsort with almost optimal number of comparisons, Inf. Process. Lett. 24 (1987), pp. 247–250.
[2] S. Carlsson, An optimal algorithm for deleting the root of a heap, Inf. Process. Lett. 37 (1991), pp. 117–120.
[3] S. Carlsson and J. Chen, Heap Construction: Optimal in Both Worst and Average Cases?, in ISAAC, Lecture Notes in Computer Science, vol. 1004, Springer, 1995, pp. 254–263.
[4] R.D. Dutton, Weak-heap sort, BIT 33 (1993), pp. 372–381.
[5] S. Edelkamp and P. Stiegeler, Implementing heapsort with n log n − 0.9n and quicksort with n log n + 0.2n comparisons, ACM Journal of Experimental Algorithmics 7 (2002), p. 5.
[6] G.H. Gonnet and J.I. Munro, Heaps on heaps, SIAM J. Comput. 15 (1986), pp. 964–971.
[7] T.M. Islam and M. Kaykobad, Worst-case analysis of generalized heapsort algorithm revisited, Int. J. Comput. Math. 83 (2006), pp. 59–67.
[8] J. Katajainen, The Ultimate Heapsort, in CATS, 1998, pp. 87–96.
[9] M. Kaykobad, M.M. Islam, M.E. Amyeen, and M.M. Murshed, 3 is a more promising algorithmic parameter than 2, Comput. Math. Appl. 36 (1998), pp. 19–24.
[10] C. McDiarmid and B.A. Reed, Building heaps fast, J. Algorithms 10 (1989), pp. 352–365.
[11] A. Paulik, Worst-case analysis of a generalized heapsort algorithm, Inf. Process. Lett. 36 (1990), pp. 159–165.
[12] R.A. Chowdhury, S.K. Nath, and M. Kaykobad, A simplified complexity analysis of McDiarmid and Reed's variant of bottom-up heapsort algorithm, Int. J. Comput. Math. 73 (2000), pp. 293–297.
[13] M. Shahjalal and M. Kaykobad, A New Data Structure for Heapsort with Improved Number of Comparisons, in WALCOM, 2007, pp. 88–96.
[14] X.D. Wang and Y.J. Wu, An improved heapsort algorithm with n log n − 0.788928n comparisons in the worst case, J. Comput. Sci. Technol. 22 (2007), pp. 898–903.
[15] I. Wegener, Bottom-up-heapsort, a new variant of heapsort, beating, on an average, quicksort (if n is not very small), Theor. Comput. Sci. 118 (1993), pp. 81–98.