International Journal of Computer Mathematics
Vol. 00, No. 00, Month 200x, 1–12




                                   RESEARCH ARTICLE

An in-place heapsort algorithm requiring
n log n + n log∗ n − 0.546871n comparisons

Md. Mahbubul Hasan (a)†, Md. Shahjalal (b)‡ and M. Kaykobad (a)§

(a) CSE Department, Bangladesh University of Engineering and Technology
(b) Therap (BD) Ltd., Banani, Dhaka 1213

(Received 00 Month 200x; in final form 00 Month 200x)


      In this paper we present an interesting modification to the heap structure which yields a
      better comparison-based in-place heapsort algorithm. The number of comparisons is shown
      to be bounded by n log n + n log∗ n − 0.546871n, which is 0.853129n + n log∗ n away from
      the optimal theoretical bound of n log n − 1.44n.

      Keywords: Algorithms, Data structure, Heapsort.



1.   Introduction

The heapsort algorithm is one of the best sorting algorithms and was introduced
by Williams in 1964. It achieves both worst-case and average-case time complexity
of O(n log n). Lower bound theory asserts that any comparison-based sorting
algorithm will require log(n!) comparisons, which is approximately n log n − 1.44n.
   There are many promising variants of the heapsort algorithm, such as MDR-Heapsort [10, 12],
Generalized Heapsort [7, 11], Weak Heapsort [4], Bottom Up Heapsort [15],
Ultimate Heapsort [8] and Heapsort in 3-heaps [9].
   Carlsson’s variant of heapsort [1] does not need extra space and requires
n log n + n log log n + 1.82n comparisons. Dutton showed that sorting using a Weak
Heap requires comparisons bounded by n log n + 0.086013n, but it uses n extra
bits [4]. In [5], Edelkamp and Stiegeler modified Dutton’s weak heap and designed
several heapsort algorithms requiring n log n − 0.9n comparisons but using an
array of size n to store indices. Xiao Dong Wang and Ying Jie Wu [14] gave another
algorithm bounding the number of comparisons by n log n − 0.788928n, but using an
additional array of size n to store indices. The Bottom Up Heapsort [15] requires
1.5n log n comparisons in the worst case. The Ultimate Heapsort [8] is another
variant of the heapsort algorithm which requires fewer than n log n + 32n + O(log n)
comparisons to sort n elements in-place. In [6], Gonnet and Munro gave an
algorithm to construct a heap of n elements which takes 1.625n comparisons in
the worst case. In the same paper they gave another algorithm, subsequently
improved by Carlsson [2], for extraction of the maximum element requiring around
log n + log∗ n + O(1) comparisons, where log∗ denotes the iterated logarithm. The
algorithm presented by McDiarmid and Reed [10] requires 1.5212n comparisons in
the average case for heap construction. But both of these algorithms require extra

† Email: shanto86@yahoo.com
‡ Email: shahjalal@msn.com
§ Email: kaykobad@cse.buet.ac.bd


ISSN: 0020-7160 print/ISSN 1029-0265 online
© 200x Taylor & Francis
DOI: 10.1080/0020716YYxxxxxxxx
http://www.informaworld.com

space. Carlsson and Chen gave two algorithms for heap construction in [3]. The first
achieves 1.625n + O(log² n) comparisons in the worst case and 1.5642n + O(log² n)
on average but requires extra space. The second is an in-place algorithm which
requires 1.528n + O(log² n) comparisons on average and 2n comparisons in the worst
case.
   In this paper we introduce a new data structure that reduces the number of comparisons
by pairing the elements to be sorted, and that does not use any extra space. Carlsson’s
variant of heapsort has been applied on this data structure. This has restricted
the number of comparisons to n log n + n log log n + 0.40821n. This data structure
can be applied to other heap-related algorithms as well. For example, in this paper
we have shown how the memory requirement of Dutton’s Weak Heapsort algorithm
can be reduced using this data structure. Moreover, even the algorithms of Gonnet
and Munro [6] or Carlsson [2], when applied to this data structure, yield better
performance than the corresponding original algorithms. It restricts the number
of comparisons to n log n + n log∗ n − 0.546871n. While Ultimate Heapsort [8] is
asymptotically faster than the proposed algorithm, ours will outperform it since
the linear term in Ultimate Heapsort (32n) is more than n log∗ n for all practical
purposes (even for large values of n). A preliminary version of this paper appeared
in WALCOM 2007 [13].
   In Section 2 we present the modified data structure for heapsort. Section 3 contains
the analysis, followed by a theoretical comparison between Carlsson’s variant
and the proposed algorithm in Section 4. In Section 5 we show how Gonnet and Munro’s
or Carlsson’s deletion algorithm yields better performance on the proposed data
structure, followed by a generalization of the data structure in Section 6. In
Section 7 we apply our pairing trick to the Weak Heap. The practical performance of the
proposed algorithm is presented in Section 8.


2.    A modified data structure for heapsort

Independent pairwise comparisons reduce uncertainty the most. This fact encourages
the creation of heaps with a group of ordered elements in each node. In the
following we present our data structure with two elements in each node. However,
in a later section we analyze the performance of a more generalized data structure to
find the best number of elements to store in a node.


2.1    Preprocessing
For n given elements we will make ⌊n/2⌋ pairs. If there is an odd number of elements,
a dummy element can be suitably added.
  For simplicity of our discussion, from now on we will call the first element of a
pair the Primary element, and the second one the Secondary element. For
each pair we compare the elements and arrange them in non-increasing order. That
means that after rearranging, the Primary element will be greater than or equal to the
Secondary element of the pair. Figure 1 shows some possible pairings.

Figure 1. In a) the Primary element is greater than the Secondary element, so we do not swap; in b) the
Primary element is smaller than the Secondary element, so we swap the elements. (Diagram omitted:
pair (10, 3) stays as is; pair (2, 8) becomes (8, 2).)


2.2    Heap Construction Phase
Now we construct a (max)heap according to the Primary elements of the pairs.
The heap is constructed in a bottom-up fashion. That is, we start from the last pair
of the heap and try to push it down. To push a pair, we find the path of elder
sons and then do a binary search on that path. The implementation is similar to
that of Carlsson’s variant [1].


Figure 2 shows a sample heap.


Figure 2. A (max)heap constructed based on the Primary elements of the pairs. (Diagram omitted:
the root pair is (12, 3); its children are (9, 9) and (10, 9); the leaf pairs are (8, 2), (7, 6) and (8, 4).)
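A minimal sketch of this push-down is given below, under the simplifying assumption that
n is even (a dummy has been added) so that every pair is full; the extra bookkeeping for a
partial last pair, needed during the sorting phase, is omitted. The layout matches the earlier
sketch and all names are illustrative.

/* Push the pair at pair-index r down: follow the path of elder sons
 * (one comparison per level), then binary-search the displaced Primary
 * into that path (~ floor(log len) + 1 comparisons), shifting pairs up.
 * Pair i occupies a[2i], a[2i+1] and has children pairs 2i+1 and 2i+2. */
static void push_down(int a[], int npairs, int r)
{
    int path[64], len = 0, i = r;
    while (2 * i + 1 < npairs) {                 /* path of elder sons */
        int c = 2 * i + 1;
        if (c + 1 < npairs && a[2 * (c + 1)] > a[2 * c])
            c++;                                 /* the elder son */
        path[len++] = c;
        i = c;
    }
    int xp = a[2 * r], xs = a[2 * r + 1];
    /* Primaries are non-increasing along the path; count how many of
     * them exceed the displaced Primary xp. */
    int lo = 0, hi = len;
    while (lo < hi) {
        int mid = (lo + hi) / 2;
        if (a[2 * path[mid]] > xp) lo = mid + 1; else hi = mid;
    }
    int dst = r;                                 /* shift those pairs up */
    for (int j = 0; j < lo; j++) {
        a[2 * dst]     = a[2 * path[j]];
        a[2 * dst + 1] = a[2 * path[j] + 1];
        dst = path[j];
    }
    a[2 * dst]     = xp;
    a[2 * dst + 1] = xs;
}

/* Bottom-up construction: start from the last pair, as in the text;
 * pushing a leaf down costs no comparisons. */
static void build_heap(int a[], int npairs)
{
    for (int r = npairs - 1; r >= 0; r--)
        push_down(a, npairs, r);
}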




2.3   Sorting Phase
The construction phase yields a heap where each of the Primary elements is greater
than or equal to the Primary elements of its children pairs. As the Primary elements
are always greater than or equal to the corresponding Secondary elements, the
Primary element of the root will be the largest element of the heap.
  In each iteration of the sorting phase we extract the Primary element of the root
and adjust the heap, decreasing the number of elements in the heap by one.
  Suppose 1 is the root and 2 is the last pair (with a possibly empty Secondary
element). Let 1P and 1S be the Primary and Secondary elements of the root, and let 2P
and 2S be defined similarly. If 1 and 2 are the same pair, then we swap the Primary and
Secondary elements (1P and 1S). If they are not the same pair, then we remove the
Primary element of the root (1P), which is the largest element in the heap, and
place it in temp (a temporary variable to store an element). Then there can be
four possible cases:
Case 1. 2S is empty (if there is an odd number of elements in the heap)
        and 1S ≥ 2P: we place 1S at 1P, 2P at 1S and temp at 2P. (Figure 3)
Case 2. 2S is empty (if there is an odd number of elements in the heap)
        and 1S < 2P: we place 2P at 1P, 1S remains in the same place and temp
        at 2P. (Figure 4)
Case 3. 2S is not empty (if there is an even number of elements in the
        heap) and 1S ≥ 2P: we place 1S at 1P, 2P at 1S, 2S at 2P and temp at
        2S. (Figure 5)
Case 4. 2S is not empty (if there is an even number of elements in the
        heap) and 1S < 2P: we place 2P at 1P, 1S remains in the same place,
        2S at 2P and temp at 2S. (Figure 6)

Figure 3. Case 1. 2S is empty and 1S ≥ 2P. So a) 1P → temp, b) 1S → 1P, c) 2P → 1S and d)
temp → 2P. (Diagram omitted.)




                                   a

                                               temp


                       8 3                                         5      3
                                       c
                           b


             6 1                   5                      6 1                 8
    Figure 4. Case 2. 2S is empty and 1S < 2P . So a) 1P → temp, b) 2P → 1P and c) temp → 2P .




Figure 5. Case 3. 2S is not empty and 1S ≥ 2P. So a) 1P → temp, b) 1S → 1P, c) 2P → 1S, d) 2S → 2P
and e) temp → 2S. (Diagram omitted.)





  All of these cases require a single comparison. After adjusting the order of the
pair, we fix the heap by pushing down the root node in the same way as in the heap
construction phase; a code sketch of one full extraction step is given after Figure 6.
We also remove the last element and mark it empty.

Figure 6. Case 4. 2S is not empty and 1S < 2P. So a) 1P → temp, b) 2P → 1P, c) 2S → 2P and d)
temp → 2S. (Diagram omitted.)
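The following sketch combines the four cases of one extraction step, assuming the flat layout
of the earlier sketches, with the heap occupying a[0..k − 1] and the extracted maximum parked
in the slot that leaves the heap, so the sorted output accumulates at the tail as in ordinary
in-place heapsort. Again the names are ours.

/* One extraction step (Cases 1-4).  Exactly one comparison selects the
 * case.  Restoring the heap afterwards uses the push-down of the
 * construction sketch (with the partial-pair handling added). */
static void extract_step(int a[], int k)
{
    if (k < 2) return;                       /* nothing left to adjust */
    if (k == 2) {                            /* root is also the last pair */
        int t = a[0]; a[0] = a[1]; a[1] = t; /* swap 1P and 1S */
        return;
    }
    int temp = a[0];                         /* 1P: the maximum */
    if (k % 2) {                             /* 2S empty: Cases 1 and 2 */
        if (a[1] >= a[k - 1]) {              /* Case 1: 1S >= 2P */
            a[0] = a[1];                     /* 1S -> 1P */
            a[1] = a[k - 1];                 /* 2P -> 1S */
        } else {                             /* Case 2: 2P -> 1P */
            a[0] = a[k - 1];
        }
    } else {                                 /* 2S present: Cases 3 and 4 */
        if (a[1] >= a[k - 2]) {              /* Case 3: 1S >= 2P */
            a[0] = a[1];                     /* 1S -> 1P */
            a[1] = a[k - 2];                 /* 2P -> 1S */
        } else {                             /* Case 4: 2P -> 1P */
            a[0] = a[k - 2];
        }
        a[k - 2] = a[k - 1];                 /* 2S -> 2P */
    }
    a[k - 1] = temp;                         /* temp parked at the end */
    /* The heap now has k - 1 elements; push the root pair down to fix it. */
}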




3.    Analysis

For simplicity of analysis, let us assume that the heap in our new data structure is full.
That is,

\[
\frac{n}{2} \;=\; \sum_{i=0}^{h} 2^i \;=\; 2^{h+1} - 1,
\qquad\text{so}\qquad
h = \log\Bigl(\frac{n}{2} + 1\Bigr) - 1 .
\]

  However, had it not been a full heap, the formula would have been
h = ⌈log(n/2 + 1)⌉ − 1.




3.1    Preprocessing
We require ⌊n/2⌋ comparisons to pair the n elements, with a possible leftover.




3.2    Heap Construction Phase
At every level i (0 ≤ i < h) we have 2^i pairs. To find the path of elder sons we
need h − i comparisons, and to insert the pair into the path we need a binary
search on a path of length h − i. So the number of comparisons required for heap
construction is

\[
\begin{aligned}
\sum_{i=0}^{h-1} 2^i \bigl((h-i) + \lfloor \log (h-i) \rfloor + 1\bigr)
&= \sum_{i=1}^{h} 2^{h-i} \bigl(i + \lfloor \log i \rfloor + 1\bigr) \\
&= \sum_{i=1}^{h} 2^{h-i}\, i + \sum_{i=1}^{h} 2^{h-i} \lfloor \log i \rfloor + \sum_{i=1}^{h} 2^{h-i} \\
&= \bigl(2^{h+1} - 2 - h\bigr) + 2^h \sum_{i=1}^{h} \frac{\lfloor \log i \rfloor}{2^i} + \bigl(2^h - 1\bigr) \\
&\le 2^{h+1} - 2 - h + 2^h \cdot 0.632843 + 2^h - 1 \\
&= 0.90821\,n - \log n + o(1)
\end{aligned}
\]



    Here,


\[
\begin{aligned}
\sum_{i=1}^{h} \frac{\lfloor \log i \rfloor}{2^i}
&< \sum_{i=1}^{\infty} \frac{\lfloor \log i \rfloor}{2^i}
 = \sum_{i=1}^{50} \frac{\lfloor \log i \rfloor}{2^i} + \sum_{i=51}^{\infty} \frac{\lfloor \log i \rfloor}{2^i} \\
&< \sum_{i=1}^{50} \frac{\lfloor \log i \rfloor}{2^i} + \frac{\lfloor \log 51 \rfloor}{2^{51}} + \int_{51}^{\infty} \frac{\log x}{2^x}\,dx \\
&< 0.632843
\end{aligned}
\]



  The integral and the summand are calculated using www.wolframalpha.com.
  Note that if there are x elements then we require ⌊log x⌋ + 1 comparisons for a
binary search in the worst case.
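The constant 0.632843 can also be checked with a few lines of C; this sketch (ours, not
from the paper) sums the series directly.

#include <stdio.h>

/* flog2(x) = floor(log2 x), computed with integer shifts. */
static int flog2(int x) { int r = -1; while (x > 0) { x >>= 1; r++; } return r; }

/* Sum floor(log2 i) / 2^i; terms beyond i = 60 are below 1e-17. */
int main(void)
{
    double s = 0.0, p = 1.0;
    for (int i = 1; i <= 60; i++) {
        p *= 0.5;                              /* p = 2^-i */
        s += flog2(i) * p;
    }
    printf("%.6f\n", s);                       /* prints 0.632843 */
    return 0;
}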




3.3       Sorting Phase
In the sorting phase, we will encounter a path of elder sons of length i (1 ≤ i ≤ h)
exactly 2^{i+1} times, because we have 2^{i+1} numbers at the ith level. For every
such path we require i comparisons for determining the path of elder sons and
⌊log i⌋ + 1 comparisons for inserting a pair into the path. So for pushing down we require

\[
\begin{aligned}
\sum_{i=1}^{h} 2^{i+1} \bigl(i + \lfloor \log i \rfloor + 1\bigr)
&= \sum_{i=1}^{h} 2^{i+1}(i+1) + \sum_{i=1}^{h} 2^{i+1} \lfloor \log i \rfloor \\
&\le 2^{h+2}\,h + \lfloor \log h \rfloor \sum_{i=1}^{h} 2^{i+1} \\
&= 2^{h+2}\,h + \lfloor \log h \rfloor \bigl(2^{h+2} - 4\bigr) \\
&= 2\,(n/2 + 1)\,h + (n-2)\,\lfloor \log\log (n/2 + 1) \rfloor \\
&\le (n+2)\bigl(\log (n/2+1) - 1\bigr) + (n-2)\,\lfloor \log\log n \rfloor \\
&= (n+2)\log (n/2+1) - n - 2 + (n-2)\,\lfloor \log\log n \rfloor \\
&= (n+2)\log n - 2n + (n-2)\,\lfloor \log\log n \rfloor + o(1)
\end{aligned}
\]

We also require n − 2 extra comparisons for adjustment.
So the total number of comparisons is

\[
\begin{aligned}
&\tfrac{n}{2} + 0.90821\,n - \log n + n - 2 + (n+2)\log n - 2n + (n-2)\lfloor \log\log n \rfloor + o(1) \\
&\qquad = n\log n + n\log\log n + 0.40821\,n + o(1).
\end{aligned}
\]



                 4.     Theoretical comparison between Carlsson’s variant and the proposed
                        algorithm

Table 1 shows the results of the comparison between the two algorithms. We have
analyzed the worst cases here.

Table 1.   Theoretical comparison between Carlsson’s algorithm and the proposed algorithm
    Number of elements          CH1       NH2          CS3           NS4    DH = CH - NH        DS = CS - NS   Total difference(DH + DS)
                     7             8         9           20            17              -1                  3                          2
                     8            13        10           25            22               3                  3                          6
                     9            13        10           30            27               3                  3                          6
                    10            15        11           35            32               4                  3                          7
                    14            21        15           55            52               6                  3                          9
                    15            21        20           60            58               1                  2                          3
                    16            28        21           67            64               7                  3                         10
                   100           178       137          761           746              41                 15                         56
                  1000          1809      1401        11713         11446             408                267                        675
                 10000         18159     14076       153357        153094           4083                 263                       4346
                100000        181635    140814      1903137       1868412          40821               34725                      75546
               1000000       1816414   1408203     22885636      22819844         408211               65792                     474003
1   Comparisons required by Carlsson’s heap construction in the worst case
2   Comparisons required by the proposed heap construction in the worst case
3   Comparisons required by Carlsson’s sorting in the worst case
4   Comparisons required by the proposed sorting algorithm in the worst case




In Section 3 we considered our heap to be full, and the number of comparisons

required for the sorting phase of our proposed algorithm was

\[
n - 2 + \sum_{i=1}^{h} 2^{i+1} \bigl(i + \lfloor \log i \rfloor + 1\bigr).
\]


  However, for our experimental data we considered an arbitrary heap. Let the height of
the heap be h. That means we have a full heap of height h − 1 and t = n − 2(2^h − 1)
extra elements in the last level. So for the full heap of height h − 1 we apply the formula
of Section 3.3. For the last level we will have t paths of elder sons of length h, and we
need h comparisons to determine the path and ⌊log h⌋ + 1 comparisons for inserting a
pair into the path. So the number of comparisons for an n-element heap is:

\[
n - 2 + \sum_{i=1}^{h-1} 2^{i+1} \bigl(i + \lfloor \log i \rfloor + 1\bigr)
      + \bigl(n - 2(2^h - 1)\bigr)\,\bigl(h + \lfloor \log h \rfloor + 1\bigr)
\]
              i=1


   Similarly, the number of comparisons for Carlsson’s algorithm in the sorting phase
is:

\[
\sum_{i=1}^{h-1} 2^i \bigl(i + \lfloor \log i \rfloor + 1\bigr)
      + \bigl(n - (2^h - 1)\bigr)\,\bigl(h + \lfloor \log h \rfloor + 1\bigr)
\]


   We computed the number of comparisons required for the heap construction
phase of Carlsson’s variant and of the proposed algorithm by Algorithms 1 and 2,
respectively.

Algorithm 1 The number of comparisons required for the heap construction phase
of Carlsson’s variant
Require: n ≥ 1
 1: h ← ⌊log n⌋
 2: CH ← 0
 3: active ← n − 2^h + 1
 4: for i = 1 to h do
 5:   current ← 2^(h−i)
 6:   active ← ⌈active/2⌉
 7:   passive ← current − active
 8:   CH ← CH + active ∗ (i + ⌊log i⌋ + 1)
 9:   CH ← CH + passive ∗ ((i − 1) + ⌊log (i − 1)⌋ + 1)
10: end for

Algorithm 2 The number of comparisons required for the heap construction phase
of the proposed algorithm
Require: n ≥ 1
 1: h ← ⌊log ⌈n/2⌉⌋
 2: NH ← ⌊n/2⌋
 3: active ← ⌈n/2⌉ − 2^h + 1
 4: for i = 1 to h do
 5:   current ← 2^(h−i)
 6:   active ← ⌈active/2⌉
 7:   passive ← current − active
 8:   NH ← NH + active ∗ (i + ⌊log i⌋ + 1)
 9:   NH ← NH + passive ∗ ((i − 1) + ⌊log (i − 1)⌋ + 1)
10: end for
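A direct C transcription of both counters is sketched below (ours). Note that line 9 touches
⌊log 0⌋ when i = 1; taking that passive term to contribute nothing is the reading that
reproduces the CH and NH columns of Table 1.

/* flog(x) = floor(log2 x), as in the earlier sketch. */
static int flog(long x) { int r = -1; while (x > 0) { x >>= 1; r++; } return r; }

/* Worst-case heap-construction comparisons: Algorithm 1 (proposed = 0,
 * giving CH) and Algorithm 2 (proposed = 1, giving NH). */
long construction_count(long n, int proposed)
{
    long m      = proposed ? (n + 1) / 2 : n;   /* number of heap nodes */
    int  h      = flog(m);
    long cnt    = proposed ? n / 2 : 0;
    long active = m - (1L << h) + 1;
    for (int i = 1; i <= h; i++) {
        long current = 1L << (h - i);
        active = (active + 1) / 2;
        long passive = current - active;
        cnt += active * (i + flog(i) + 1);
        if (i > 1)                              /* the i = 1 passive term is 0 */
            cnt += passive * ((i - 1) + flog(i - 1) + 1);
    }
    return cnt;
}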




5.   Further improvement

Using Gonnet and Munro’s or Carlsson’s [2, 6] algorithm we can further improve the heap
construction and sorting phases. To replace the maximum in a heap, log n + log∗ n =
h + log∗ h + 1 comparisons are necessary and sufficient, as h = log n.
  So the number of comparisons required for construction in our proposed data
structure is:





\[
\begin{aligned}
\frac{n}{2} + \sum_{i=1}^{h} 2^{h-i} \bigl(i + \log^* i + 1\bigr)
&= \frac{n}{2} + \sum_{i=1}^{h} 2^{h-i}(i+1) + \sum_{i=1}^{h} 2^{h-i} \log^* i \\
&= \frac{n}{2} + \bigl(3 \cdot 2^h - h\bigr) + 2^h \sum_{i=1}^{h} \frac{\log^* i}{2^i} + o(1) \\
&\le \frac{n}{2} + \bigl(3 \cdot 2^h - h\bigr) + 0.812515 \cdot 2^h + o(1) \\
&= \frac{n}{2} + 3.812515 \cdot 2^h - h + o(1) \\
&= 1.453129\,n - \log n + o(1)
\end{aligned}
\]

 Here,

\[
\begin{aligned}
\sum_{i=1}^{h} \frac{\log^* i}{2^i}
&< \sum_{i=1}^{\infty} \frac{\log^* i}{2^i}
 = \sum_{i=1}^{50} \frac{\log^* i}{2^i} + \sum_{i=51}^{\infty} \frac{\log^* i}{2^i} \\
&< \sum_{i=1}^{50} \frac{\log^* i}{2^i} + \frac{\log^* 51}{2^{51}} + \sum_{i=52}^{\infty} \frac{\lfloor \log i \rfloor}{2^i} \\
&< \sum_{i=1}^{50} \frac{\log^* i}{2^i} + \frac{\log^* 51}{2^{51}} + \int_{51}^{\infty} \frac{\log x}{2^x}\,dx \\
&< 0.812515
\end{aligned}
\]

  The integral and the summand are calculated using www.wolframalpha.com and
a simple C program, respectively.
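The simple C program can look like the following sketch (ours, not the authors' original);
it sums the series for the constant 0.812515.

#include <stdio.h>
#include <math.h>

/* Iterated logarithm: how many times log2 must be applied until <= 1. */
static int log_star(double x)
{
    int k = 0;
    while (x > 1.0) { x = log2(x); k++; }
    return k;
}

/* Sum log*(i) / 2^i; terms beyond i = 60 are negligible. */
int main(void)
{
    double s = 0.0, p = 1.0;
    for (int i = 1; i <= 60; i++) {
        p *= 0.5;                      /* p = 2^-i */
        s += log_star((double)i) * p;
    }
    printf("%.6f\n", s);               /* prints 0.812515 */
    return 0;
}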

   Similarly, the number of comparisons required for sorting is:

\[
\begin{aligned}
n - 2 + \sum_{i=1}^{h} 2^{i+1} \bigl(i + \log^* i + 1\bigr)
&= n - 2 + \sum_{i=1}^{h} 2^{i+1}(i+1) + \sum_{i=1}^{h} 2^{i+1} \log^* i \\
&\le n - 2 + 2^{h+2}\,h + \log^* h \,\bigl(2^{h+2} - 3\bigr) \\
&= n - 2 + (n+2)(\log n - 2) + (\log^* n - 1)(n - 2) \\
&= (n+2)\log n + (n-2)\log^* n - 2n + o(1) \\
&= n\log n + n\log^* n - 2n + O(\log n)
\end{aligned}
\]

   So the total number of comparisons is:

\[
n\log n + n\log^* n - 0.546871\,n + O(\log n)
\]


6.     Generalization of the data structure

Let us now generalize the data structure and keep c elements per node. So there
will be n/c nodes and, if the height of the heap is h, then 2^{h+1} − 1 = n/c. We sort
the elements at each node and construct the heap as described above. We sort
the elements at a node by the traditional in-place heapsort algorithm, requiring
2c log c comparisons. So to create the heap the required number of comparisons is:

\[
2n\log c + \sum_{i=1}^{h} 2^{h-i} \bigl(i + \log^* i + 1\bigr).
\]
  In the sorting phase, when there are only c elements left in the heap we do not
need further comparisons. So we need to adjust the root n − c times. For each adjustment
we take the largest element of the last node and insert it into the root of the heap,
where there are c − 1 elements. Since we require ⌊log i⌋ + 1 comparisons to insert an
element into a sorted list of i elements, the required number of comparisons is:

\[
(n - c)\,\bigl(\lfloor \log (c-1) \rfloor + 1\bigr) + \sum_{i=1}^{h} c\,2^i \bigl(i + \log^* i + 1\bigr).
\]

  Asymptotic analysis shows that the optimal value of c is O(log n). However,
experimental results do not always support these theoretical findings. One of the
reasons is that the cost function contains integral functions that are not so easy to
analyze by differential calculus.
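Since the cost function is hard to treat with differential calculus, it can simply be evaluated
numerically. The sketch below is our own reading of the two displayed formulas, with the
height approximated as h = ⌊log(n/c)⌋; it scans candidate values of c and reports the minimizer.

#include <stdio.h>
#include <math.h>

static int flog(long x) { int r = -1; while (x > 0) { x >>= 1; r++; } return r; }
static int log_star(double x) { int k = 0; while (x > 1.0) { x = log2(x); k++; } return k; }

/* Approximate total comparisons for node size c, per the two formulas above. */
static double cost(long n, long c)
{
    int h = flog(n / c);                        /* ~ height of the node heap */
    double total = 2.0 * n * log2((double)c)    /* sorting each node */
                 + (double)(n - c) * (flog(c - 1) + 1);  /* root insertions */
    for (int i = 1; i <= h; i++) {
        total += (double)(1L << (h - i)) * (i + log_star(i) + 1);  /* build */
        total += (double)c * (1L << i)   * (i + log_star(i) + 1);  /* sort  */
    }
    return total;
}

int main(void)
{
    long n = 1L << 20, best = 2;
    for (long c = 2; c <= 1024; c *= 2)         /* powers of two, for brevity */
        if (cost(n, c) < cost(n, best)) best = c;
    printf("empirically best c for n = %ld: %ld\n", n, best);
    return 0;
}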


7.     Pairing trick in Weak Heapsort algorithm

Similar to the previous construction, we pair up the numbers and construct a Weak Heap [4]
according to the Primary elements of these pairs. Each pair acts as a node in
the Weak Heap, and each node has an extra bit just like a normal Weak Heap.

We keep an extra variable, say temp, which is initially set to empty. After finding
the pair with the largest Primary element in the heap (we call this node the head)
we check temp. If temp is empty, the Primary element of the head is the
next largest number, and the Secondary element of the head is assigned to temp.
If temp is not empty and temp is larger than the Primary element of the head,
the next two largest numbers are the number in temp and the Primary element
of the head, and the Secondary element is assigned to temp. And if temp is not
empty and temp is smaller than the Primary element of the head, the next largest
number is the Primary element of the head, and we make a pair by combining
the number in temp and the Secondary element of the head. Each of these scenarios
requires at most two extra comparisons; a sketch of this bookkeeping follows.
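A minimal sketch of just this temp bookkeeping, assuming the surrounding Weak Heap
machinery (locating the head and restoring the heap after a deletion) exists separately;
hp and hs denote the head's Primary and Secondary elements, and all names are ours.

/* Emit the next largest number(s), given the head pair (hp, hs) and the
 * parked value *temp (*has_temp says whether temp holds a number).
 * out[] receives the emitted numbers in decreasing order; when the head
 * survives, its rebuilt pair is returned in newpair[0..1].  At most two
 * extra comparisons are used per call. */
static int pop_head(int hp, int hs, int *temp, int *has_temp,
                    int out[], int newpair[2])
{
    int k = 0;
    if (!*has_temp) {                     /* temp empty */
        out[k++] = hp;                    /* hp is the next largest */
        *temp = hs;                       /* park hs in temp */
        *has_temp = 1;
    } else if (*temp > hp) {              /* first extra comparison */
        out[k++] = *temp;                 /* temp, then hp, go out */
        out[k++] = hp;
        *temp = hs;                       /* park hs in temp */
    } else {
        out[k++] = hp;                    /* hp is the next largest */
        /* temp and hs become the head's new pair: second comparison */
        int bigger_is_temp = (*temp > hs);
        newpair[0] = bigger_is_temp ? *temp : hs;   /* new Primary   */
        newpair[1] = bigger_is_temp ? hs : *temp;   /* new Secondary */
        *has_temp = 0;
    }
    return k;                             /* how many numbers came out */
}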


So the total number of comparisons is:

\[
\begin{aligned}
&\frac{n}{2} + \frac{n}{2} - 1 + 2\Bigl(\frac{n}{2}(k-1) - 2^{k-1} - \frac{n}{2} + 2\Bigr) + n \\
&\qquad = n - 1 + nk - n - 2^k - n + 4 + n \\
&\qquad = nk - 2^k + 3
\end{aligned}
\]

  where k = ⌈log n⌉. This is the same expression as the number of comparisons of the
Weak Heapsort algorithm [4]. But here we required only n/2 bits to sort n numbers
in (n − 1) log n + 0.086013n comparisons.
  We can also apply this trick to the Weak Heapsort algorithm modified
by Edelkamp and Stiegeler [5]. It gives a sorting algorithm with n log n − 0.9n
comparisons, requiring n/2 bits and an array of size n to store indices.


8.     Experimental results

In Table 2, the practical performance of the three algorithms is presented. The
algorithms were run on the same random sets of floating-point data several times,
and the average time is presented in Table 2. In the implementation we used bitwise
operations instead of normal arithmetic operations where possible. We also
avoided recomputing some expressions by storing them in temporary variables
where applicable. These two optimizations give us a better runtime.
     Table 2.   Experimental results
                                                         Runtime in milliseconds
      Number of elements       The proposed algorithm     Carlsson’s variant Dutton’s Weak Heapsort
                   10000                             0                      0                     0
                   50000                            31                    31                     78
                  100000                            47                    47                   156
                  500000                           313                   375                   969
                 1000000                           580                   907                  2234
                 5000000                         3340                  5601                  15554
               10000000                          7078                 12385                  35297
               30000000                         22765                 49938                 138937




9.     Conclusion

We have presented a new approach to reduce the number of comparisons for in-place
heapsort. The main idea of the paper is to consider groups of elements at
each node of a heap. This reduces the height of the tree, and hence results in a
slight reduction in time for each of the deletemax operations, despite an increase
of some comparisons to ensure that every node (except the last) has a sorted
group of elements. This is because of the reduction in the cost due to lower order
terms of the deletemax operations. So overall there is a reduction in the number of
comparisons made in the lower order (more precisely the linear) term for heapsort.
However, the optimum number of elements to be stored in a node of our proposed
data structure is still inconclusive and requires further investigation.



  Acknowledgement: The authors profusely thank anonymous referees for sug-
gesting significant improvement and for pointing out errors and inaccuracies of an
earlier version of the manuscript.


References

 [1] S. Carlsson, A variant of heapsort with almost optimal number of comparisons, Inf. Process. Lett. 24
     (1987), pp. 247–250.
 [2] ———, An optimal algorithm for deleting the root of a heap, Inf. Process. Lett. 37 (1991), pp. 117–120.
 [3] S. Carlsson and J. Chen, Heap Construction: Optimal in Both Worst and Average Cases?, in ISAAC,
     Lecture Notes in Computer Science, vol. 1004, Springer, 1995, pp. 254–263.
 [4] R.D. Dutton, Weak-heap sort, BIT 33 (1993), pp. 372–381.
 [5] S. Edelkamp and P. Stiegeler, Implementing heapsort with n log n − 0.9n and quicksort with n log n
     + 0.2n comparisons, ACM Journal of Experimental Algorithmics 7 (2002), p. 5.
 [6] G.H. Gonnet and J.I. Munro, Heaps on heaps, SIAM J. Comput. 15 (1986), pp. 964–971.
 [7] T.M. Islam and M. Kaykobad, Worst-case analysis of generalized heapsort algorithm revisited, Int.
     J. Comput. Math. 83 (2006), pp. 59–67.
 [8] J. Katajainen, The Ultimate Heapsort, in CATS, 1998, pp. 87–96.
 [9] M. Kaykobad, M.M. Islam, M.E. Amyeen, and M.M. Murshed, 3 is a more promising algorithmic
     parameter than 2, Comput. Math. Appl. 36 (1998), pp. 19–24.
[10] C. McDiarmid and B.A. Reed, Building heaps fast, J. Algorithms 10 (1989), pp. 352–365.
[11] A. Paulik, Worst-case analysis of a generalized heapsort algorithm, Inf. Process. Lett. 36 (1990), pp.
     159–165.
[12] S.K.N. Rezaul Alam Chowdhury and M. Kaykobad, A simplified complexity analysis of McDiarmid
     and Reed’s variant of bottom-up heapsort algorithm, Int. J. Comput. Math. 73 (2000), pp. 293–297.
[13] M. Shahjalal and M. Kaykobad, A New Data Structure for Heapsort with Improved Number of
     Comparisons, in WALCOM, 2007, pp. 88–96.
[14] X.D. Wang and Y.J. Wu, An improved heapsort algorithm with n log n − 0.788928n comparisons in
     the worst case, J. Comput. Sci. Technol. 22 (2007), pp. 898–903.
[15] I. Wegener, Bottom-up-heapsort, a new variant of heapsort, beating, on an average, quicksort (if n
     is not very small), Theor. Comput. Sci. 118 (1993), pp. 81–98.

More Related Content

What's hot

C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...guest3f9c6b
 
Introduction to density functional theory
Introduction to density functional theory Introduction to density functional theory
Introduction to density functional theory Sarthak Hajirnis
 
Density Functional Theory
Density Functional TheoryDensity Functional Theory
Density Functional TheoryWesley Chen
 
Problems and solutions statistical physics 1
Problems and solutions   statistical physics 1Problems and solutions   statistical physics 1
Problems and solutions statistical physics 1Alberto de Mesquita
 
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...BRNSS Publication Hub
 
Exact Exchange in Density Functional Theory
Exact Exchange in Density Functional TheoryExact Exchange in Density Functional Theory
Exact Exchange in Density Functional TheoryABDERRAHMANE REGGAD
 
Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Marco Frasca
 

What's hot (16)

C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
C O M P U T E R A P P L I C A T I O N I N C H E M I C A L E N G I N E E R I N...
 
NANO266 - Lecture 7 - QM Modeling of Periodic Structures
NANO266 - Lecture 7 - QM Modeling of Periodic StructuresNANO266 - Lecture 7 - QM Modeling of Periodic Structures
NANO266 - Lecture 7 - QM Modeling of Periodic Structures
 
Introduction to density functional theory
Introduction to density functional theory Introduction to density functional theory
Introduction to density functional theory
 
The monoatomic ideal gas
The monoatomic ideal gasThe monoatomic ideal gas
The monoatomic ideal gas
 
Exercises with DFT+U
Exercises with DFT+UExercises with DFT+U
Exercises with DFT+U
 
Density Functional Theory
Density Functional TheoryDensity Functional Theory
Density Functional Theory
 
A0730103
A0730103A0730103
A0730103
 
Problems and solutions statistical physics 1
Problems and solutions   statistical physics 1Problems and solutions   statistical physics 1
Problems and solutions statistical physics 1
 
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
Bound State Solution of the Klein–Gordon Equation for the Modified Screened C...
 
Peskin chap02
Peskin chap02Peskin chap02
Peskin chap02
 
Physics Assignment Help
Physics Assignment HelpPhysics Assignment Help
Physics Assignment Help
 
Exact Exchange in Density Functional Theory
Exact Exchange in Density Functional TheoryExact Exchange in Density Functional Theory
Exact Exchange in Density Functional Theory
 
E04933745
E04933745E04933745
E04933745
 
Introduction to DFT Part 2
Introduction to DFT Part 2Introduction to DFT Part 2
Introduction to DFT Part 2
 
Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University Talk given at the Workshop in Catania University
Talk given at the Workshop in Catania University
 
Powder
PowderPowder
Powder
 

Viewers also liked

Formació "Iniciació al Google Analytics", març 2011
Formació "Iniciació al Google Analytics", març 2011Formació "Iniciació al Google Analytics", març 2011
Formació "Iniciació al Google Analytics", març 2011Fernando Blanco
 
1:1 Unconference + Tech in our PDHPE Faculty
1:1 Unconference + Tech in our PDHPE Faculty1:1 Unconference + Tech in our PDHPE Faculty
1:1 Unconference + Tech in our PDHPE Facultyleonore13
 
Introduction - SA Print Media
Introduction - SA Print MediaIntroduction - SA Print Media
Introduction - SA Print MediaRohan Shahane
 
Tawatinaw watershed stewards introduction
Tawatinaw watershed stewards introduction Tawatinaw watershed stewards introduction
Tawatinaw watershed stewards introduction tawatinaw
 
BPM - A Practitioners Playbook
BPM -  A Practitioners PlaybookBPM -  A Practitioners Playbook
BPM - A Practitioners PlaybookAniruddha Paul
 

Viewers also liked (7)

Formació "Iniciació al Google Analytics", març 2011
Formació "Iniciació al Google Analytics", març 2011Formació "Iniciació al Google Analytics", març 2011
Formació "Iniciació al Google Analytics", març 2011
 
1:1 Unconference + Tech in our PDHPE Faculty
1:1 Unconference + Tech in our PDHPE Faculty1:1 Unconference + Tech in our PDHPE Faculty
1:1 Unconference + Tech in our PDHPE Faculty
 
Presentación1
Presentación1Presentación1
Presentación1
 
Presentacio
PresentacioPresentacio
Presentacio
 
Introduction - SA Print Media
Introduction - SA Print MediaIntroduction - SA Print Media
Introduction - SA Print Media
 
Tawatinaw watershed stewards introduction
Tawatinaw watershed stewards introduction Tawatinaw watershed stewards introduction
Tawatinaw watershed stewards introduction
 
BPM - A Practitioners Playbook
BPM -  A Practitioners PlaybookBPM -  A Practitioners Playbook
BPM - A Practitioners Playbook
 

Similar to Ijcm

Advanced s and s algorithm.ppt
Advanced s and s algorithm.pptAdvanced s and s algorithm.ppt
Advanced s and s algorithm.pptLegesseSamuel
 
Staircases_in_Fluid_Dynamics_JacobGreenhalgh
Staircases_in_Fluid_Dynamics_JacobGreenhalghStaircases_in_Fluid_Dynamics_JacobGreenhalgh
Staircases_in_Fluid_Dynamics_JacobGreenhalghJacob Greenhalgh
 
Lego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawingsLego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawingsMathieu Dutour Sikiric
 
7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930ea7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930eaAlvaro
 
Chapter 8 advanced sorting and hashing for print
Chapter 8 advanced sorting and hashing for printChapter 8 advanced sorting and hashing for print
Chapter 8 advanced sorting and hashing for printAbdii Rashid
 
Sienna 4 divideandconquer
Sienna 4 divideandconquerSienna 4 divideandconquer
Sienna 4 divideandconquerchidabdu
 
Heap Hand note
Heap Hand noteHeap Hand note
Heap Hand noteAbdur Rouf
 
Bin packing problem two approximation
Bin packing problem two approximationBin packing problem two approximation
Bin packing problem two approximationijfcstjournal
 
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMSBIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMSijfcstjournal
 
M269 Data Structures And Computability.docx
M269 Data Structures And Computability.docxM269 Data Structures And Computability.docx
M269 Data Structures And Computability.docxstirlingvwriters
 
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...endokayle
 
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHMBIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHMijcsa
 
Ee693 sept2014quiz1
Ee693 sept2014quiz1Ee693 sept2014quiz1
Ee693 sept2014quiz1Gopi Saiteja
 
Master of Computer Application (MCA) – Semester 4 MC0080
Master of Computer Application (MCA) – Semester 4  MC0080Master of Computer Application (MCA) – Semester 4  MC0080
Master of Computer Application (MCA) – Semester 4 MC0080Aravind NC
 
Probabilistic group theory, combinatorics, and computing
Probabilistic group theory, combinatorics, and computingProbabilistic group theory, combinatorics, and computing
Probabilistic group theory, combinatorics, and computingSpringer
 
one main advantage of bubble sort as compared to others
one main advantage of bubble sort as compared to othersone main advantage of bubble sort as compared to others
one main advantage of bubble sort as compared to othersAjay Chimmani
 

Similar to Ijcm (20)

Advanced s and s algorithm.ppt
Advanced s and s algorithm.pptAdvanced s and s algorithm.ppt
Advanced s and s algorithm.ppt
 
Staircases_in_Fluid_Dynamics_JacobGreenhalgh
Staircases_in_Fluid_Dynamics_JacobGreenhalghStaircases_in_Fluid_Dynamics_JacobGreenhalgh
Staircases_in_Fluid_Dynamics_JacobGreenhalgh
 
Lego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawingsLego like spheres and tori, enumeration and drawings
Lego like spheres and tori, enumeration and drawings
 
7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930ea7f44bdd880a385b7c1338293ea4183f930ea
7f44bdd880a385b7c1338293ea4183f930ea
 
Sortsearch
SortsearchSortsearch
Sortsearch
 
Chapter 8 advanced sorting and hashing for print
Chapter 8 advanced sorting and hashing for printChapter 8 advanced sorting and hashing for print
Chapter 8 advanced sorting and hashing for print
 
Sienna 4 divideandconquer
Sienna 4 divideandconquerSienna 4 divideandconquer
Sienna 4 divideandconquer
 
Heap Hand note
Heap Hand noteHeap Hand note
Heap Hand note
 
Bin packing problem two approximation
Bin packing problem two approximationBin packing problem two approximation
Bin packing problem two approximation
 
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMSBIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
BIN PACKING PROBLEM: TWO APPROXIMATION ALGORITHMS
 
Algorithm Assignment Help
Algorithm Assignment HelpAlgorithm Assignment Help
Algorithm Assignment Help
 
M269 Data Structures And Computability.docx
M269 Data Structures And Computability.docxM269 Data Structures And Computability.docx
M269 Data Structures And Computability.docx
 
Bin packing
Bin packingBin packing
Bin packing
 
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
Digital Systems Design Using Verilog 1st edition by Roth John Lee solution ma...
 
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHMBIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
BIN PACKING PROBLEM: A LINEAR CONSTANTSPACE  -APPROXIMATION ALGORITHM
 
Ee693 sept2014quiz1
Ee693 sept2014quiz1Ee693 sept2014quiz1
Ee693 sept2014quiz1
 
Master of Computer Application (MCA) – Semester 4 MC0080
Master of Computer Application (MCA) – Semester 4  MC0080Master of Computer Application (MCA) – Semester 4  MC0080
Master of Computer Application (MCA) – Semester 4 MC0080
 
Heapsort
HeapsortHeapsort
Heapsort
 
Probabilistic group theory, combinatorics, and computing
Probabilistic group theory, combinatorics, and computingProbabilistic group theory, combinatorics, and computing
Probabilistic group theory, combinatorics, and computing
 
one main advantage of bubble sort as compared to others
one main advantage of bubble sort as compared to othersone main advantage of bubble sort as compared to others
one main advantage of bubble sort as compared to others
 

Ijcm

  • 1. International Journal of Computer Mathematics Vol. 00, No. 00, Month 200x, 1–12 RESEARCH ARTICLE An in-place heapsort algorithm requiring n log n + n log∗ n − 0.546871n comparisons Md. Mahbubul Hasana† , Md. Shahjalalb‡ and M. Kaykobada§ a CSE Department, Bangladesh University of Engineering and Technology b Therap (BD) Ltd., Banani, Dhaka 1213 (Received 00 Month 200x; in final form 00 Month 200x) In this paper we present an interesting modification to the heap structure which yields a better comparison based in-place heapsort algorithm. The number of comparisons is shown to be bounded by n log n + n log∗ n − 0.546871n which is 0.853129n + n log∗ n away from the optimal theoretical bound of n log n − 1.44n. Keywords: Algorithms, Data structure, Heapsort. 1. Introduction The heapsort algorithm is one of the best sorting algorithms and was introduced by Williams in 1964. It achieves both worst case and average case time complex- ity of O(n log n). Lower bound theory asserts that any comparison based sorting algorithm will require log (n!) comparisons which is approximately n log n − 1.44n. There are many promising variants of heapsort algorithm such as MDR- Heapsort [10, 12], Generalized Heapsort [7, 11], Weak Heapsort [4], Bottom Up Heapsort [15], Ultimate Heapsort [8], Heapsort in 3-heaps [9] etc. Carlsson’s variant of heapsort [1] does not need extra space and requires n log n + n log log n + 1.82n comparisons. Dutton showed that sorting using Weak Heap requires comparisons bounded by n log n + 0.086013n, but it uses n extra bits [4]. In [5], Edelkamp and Stiegeler modified Dutton’s weak heap and designed several algorithms of heapsort requiring n log n − 0.9n comparisons but using an array of size n to store indices. Xiao Dong Wang and Ying Jie Wu [14] gave another algorithm bounding the number of comparisons to n log n−0.788928n, but using an additional array of size n to store indices. The Bottom Up Heapsort [15] requires 1.5n log n comparisons in the worst case. The Ultimate Heapsort [8] is another variant of heapsort algorithm which requires fewer than n log n + 32n + O(log n) comparisons to sort n elements in-place. In [6], Gonnet and Munro gave an al- gorithm to construct a heap of n elements which takes 1.625n comparisons in the worst case. In the same paper they gave another algorithm, subsequently im- proved by Carlsson [2], for extraction of the maximum element requiring around log n + log ∗ n + O(1) comparisons, where log ∗ denotes the iterated logarithm. Al- gorithm presented by McDiarmid and Reed [10] requires 1.5212n comparisons in the average case for heap construction. But both of these algorithms require extra † Email: shanto86@yahoo.com ‡ Email: shahjalal@msn.com § Email: kaykobad@cse.buet.ac.bd ISSN: 0020-7160 print/ISSN 1029-0265 online c 200x Taylor & Francis DOI: 10.1080/0020716YYxxxxxxxx http://www.informaworld.com
  • 2. 2 Md. Mahbubul Hasan, Md. Shahjalal and M. Kaykobad space. Carlsson and Chen gave two algorithms for heap construction in [3]. First one achieves 1.625n + O(log2 n) comparisons in the worst case and 1.5642n + O(log 2 n) on average but requires extra space. Second one is in-place algorithm which re- quires 1.528n + O(log 2 n) comparisons on average and 2n comparisons in the worst case. In this paper we introduce a new data structure to reduce number of comparisons by pairing elements to be sorted that does not use any extra space. Carlsson’s variant of heapsort has been applied on this data structure. This has restricted the number of comparisons to n log n + n log log n + 0.40821n. This data structure can be applied to other heap related algorithms as well. For example, in this paper we have shown how memory requirement of Dutton’s Weak Heapsort algorithm can be reduced using this data structure. Moreover, even algorithms of Gonnet and Munro [6] or Carlsson [2] when applied to this data structure yields better performance than that of the corresponding algorithms. It restricts the number of comparisons to n log n + n log∗ n − 0.546871n. While Ultimate Heapsort [8] is asymptotically faster than the proposed algorithm, ours will outperform it since the linear term in Ultimate Heapsort (32n) is more than n log∗ n for all practical purposes (even for large values of n). A preliminary version of this paper appeared in WALCOM 2007 [13]. In Section 2 we present the modified data structure for heapsort. Section 3 con- tains analysis followed by the theoretical comparison between Carlsson’s variant and the proposed algorithm in Section 4. In Section 5 we show how Gonnet, Munro or Carlsson’s deletion algorithm yields better performance on the proposed data structure, followed by generalization of the data structure in Section 6. In Sec- tion 7 we apply our pairing trick in Weak Heap. The practical performance of the proposed algorithm is presented in Section 8. 2. A modified data structure for heapsort Independent pairwise comparisons reduce uncertainty by the most. This fact en- courages creation of heaps with group of elements ordered in each node. In the following we present our data structure with two elements in each node. However, in a later section we analyze performance of a more generalized data structure to find the best number of elements to be stored in a node. 2.1 Preprocessing For n given elements we will make n/2 pairs. If there are odd number of elements a dummy element can be suitably added. For simplicity of our discussion from now on we will call the first element of a pair as the Primary element, and the second one as the Secondary element. For each pair we compare the elements and arrange them in non-increasing order. That means after rearranging, the Primary element will be greater than or equal to the Secondary element of the pair. Figure 1 shows some possible pairings. 2.2 Heap Construction Phase Now we construct a (max)heap according to the Primary elements of the pairs. The heap is constructed in bottom up fashion. That is, we start from the last pair of the heap and we try to push it down. To push a pair, we find the path of elder sons and then we do binary search on the path. The implementation is similar to
  • 3. International Journal of Computer Mathematics 3 a) 10 3 10 3 b) 2 8 8 2 Figure 1. In a) Primary element is greater than Secondary element. So we do not swap. In b) Primary element is smaller than Secondary element. So we swap the elements. that of the Carlsson’s variant [1]. Figure 2 shows a sample heap. 12 3 9 9 10 9 8 2 7 6 8 4 Figure 2. A (max)heap constructed based on the Primary element of the pairs. 2.3 Sorting Phase Construction phase yields a heap where each of the Primary elements is greater than or equal to the Primary element of its children pairs. As the Primary elements are always greater than or equal to the corresponding Secondary elements, the Primary element of the root will be the largest element of the heap. In each iteration of the sorting phase we extract the Primary element of the root and adjust the heap, decreasing number of elements in the heap by one. Suppose 1 is the root and 2 is the last pair (with possible empty secondary element). Let 1P and 1S be Primary and Secondary elements of the root, and 2P , 2S are defined similarly. If 1 and 2 are the same pair then we swap Primary and Secondary elements(1P and 1S). If they are not the same pair then we remove the Primary element of the root(1P ) (which is the largest element in the heap) and place it at temp (a temporary variable to store an element). Then there can be four possible cases: Case 1. 2S is empty (if there is an odd number of elements in the heap) and 1S ≥ 2P: we place 1S at 1P , 2P at 1S and temp at 2P . (Figure 3) Case 2. 2S is empty (if there is an odd number of elements in the heap) and 1S < 2P: we place 2P at 1P , 1S remains at the same place and temp at 2P . (Figure 4) Case 3. 2S is not empty (if there is an even number of elements in the heap) and 1S ≥ 2P: we place 1S at 1P , 2P at 1S, 2S at 2P and temp at 2S. (Figure 5) Case 4. 2S is not empty (if there is an even number of elements in the heap) and 1S < 2P: we place 2P at 1P , 1S remains at the same place,
  • 4. 4 Md. Mahbubul Hasan, Md. Shahjalal and M. Kaykobad a b temp 8 5 5 3 d c 6 1 3 6 1 8 Figure 3. Case 1. 2S is empty and 1S ≥ 2P . So a) 1P → temp, b) 1S → 1P , c) 2P → 1S and d) temp → 2P . a temp 8 3 5 3 c b 6 1 5 6 1 8 Figure 4. Case 2. 2S is empty and 1S < 2P . So a) 1P → temp, b) 2P → 1P and c) temp → 2P . a b temp 8 5 5 3 e c 6 1 3 2 6 1 2 8 d Figure 5. Case 3. 2S is not empty and 1S ≥ 2P . So a) 1P → temp, b) 1S → 1P , c) 2P → 1S d) 2S → 2P and d) temp → 2S. 2S at 2P and temp at 2S. (Figure 6) All of these cases require a single comparison. After adjusting the order of the pair we fix the heap by pushing down the root node in the same way as we did in the heap construction phase. We also remove the last element and mark it empty.
  • 5. International Journal of Computer Mathematics 5 a temp 8 3 5 3 d b 6 1 5 2 6 1 2 8 c Figure 6. Case 4. 2S is not empty and 1S < 2P . So a) 1P → temp, b) 2P → 1P , c) 2S → 2P and d) temp → 2S. 3. Analysis For simplicity of analysis, let us assume that heap in our new data structure is full. That is, h n/2 = 2i i=0 = 2h+1 − 1 n ∴ h = log +1 −1 2 However, had it not been a full heap the formula would have been h = ⌈log n + 1 ⌉ − 1 2 3.1 Preprocessing We require n/2 comparisons to pair the n elements with a possible left over. 3.2 Heap Construction Phase At every level i (0 ≤ i < h) we have 2i pairs. To find the path of elder sons we need h − i comparisons and for inserting the pair into the path we need binary search on a path of length h − i. So the number of comparisons required for heap construction is,
  • 6. 6 Md. Mahbubul Hasan, Md. Shahjalal and M. Kaykobad h−1 2i ((h − i) + ⌊log(h − i)⌋ + 1) i=0 h = 2h−i (i + ⌊log i⌋ + 1) i=1 h h h = 2h−i i + 2h−i ⌊log i⌋ + 2h−i i=1 i=1 i=1 h ⌊log i⌋ = 2h+1 − 2 − h + 2h + 2h − 1 2i i=1 ≤ 2h+1 − 2 − h + 2h ∗ 0.632843 + 2h − 1 = 0.90821n − log n + o(1) Here, h ⌊log i⌋ 2i i=1 ∞ ⌊log i⌋ < 2i i=1 50 ∞ ⌊log i⌋ ⌊log i⌋ < + 2i 2i i=1 i=51 50 ∞ ⌊log i⌋ ⌊log 51⌋ ⌊log i⌋ < + + 2i 251 i=51 2i i=1 < 0.632843 The integral and summand are calculated using www.wolframalpha.com. Note that, if there are x elements then we require ⌊log x⌋ + 1 comparisons for binary search in the worst case. 3.3 Sorting Phase In sorting phase, we will have i (1 ≤ i ≤ h) length path of elder sons 2i+1 times, because we have 2i+1 numbers at the ith level. For every such path we require i comparisons for determining the path of elder sons and ⌊log i⌋ + 1 comparisons for inserting a pair into the path. So for pushing down, we require
  • 7. International Journal of Computer Mathematics 7 h 2i+1 (i + ⌊log i⌋ + 1) i=1 h h = 2i+1 (i + 1) + 2i+1 ⌊log i⌋ i=1 i=1 h ≤ 2h+2 h + ⌊log h⌋ 2i+1 i=1 = 2h+2 h + ⌊log h⌋ 2h+2 − 4 = 2 (n/2 + 1) h + (n − 2) ⌊log log (n/2 + 1)⌋ ≤ (n + 2) (log (n/2 + 1) − 1) + (n − 2) ⌊log log n⌋ = (n + 2) log (n/2 + 1) − n − 2 + (n − 2) ⌊log log n⌋ = (n + 2) log n − 2n + (n − 2) ⌊log log n⌋ + o(1) We also require n − 2 extra comparisons for adjustment. So the total number of comparisons is, n/2 + 0.90821n − log n + n − 2 + (n + 2) log n − 2n + (n − 2)⌊log log n⌋ + o(1) = n log n + n log log n + 0.40821n + o(1) 4. Theoretical comparison between Carlsson’s variant and the proposed algorithm Table 1 shows results of comparison between the two algorithms. We have analyzed the worst cases here. Table 1. Theoretical comparison between Carlsson’s algorithm and the proposed algorithm Number of elements CH1 NH2 CS3 NS4 DH = CH - NH DS = CS - NS Total difference(DH + DS) 7 8 9 20 17 -1 3 2 8 13 10 25 22 3 3 6 9 13 10 30 27 3 3 6 10 15 11 35 32 4 3 7 14 21 15 55 52 6 3 9 15 21 20 60 58 1 2 3 16 28 21 67 64 7 3 10 100 178 137 761 746 41 15 56 1000 1809 1401 11713 11446 408 267 675 10000 18159 14076 153357 153094 4083 263 4346 100000 181635 140814 1903137 1868412 40821 34725 75546 1000000 1816414 1408203 22885636 22819844 408211 65792 474003 1 Comparison required in Carlsson’s Heap Construction in the worst case 2 Comparison required in the proposed Heap Construction in the worst case 3 Comparison required in Carlsson’s Sorting in the worst case 4 Comparison required in the proposed sorting algorithm in the worst case In section 3, we considered our heap to be full and the number of comparisons
In Section 3 we considered our heap to be full, and the number of comparisons required for the sorting phase of our proposed algorithm was

\[
n - 2 + \sum_{i=1}^{h} 2^{i+1} \bigl( i + \lfloor \log i \rfloor + 1 \bigr).
\]

For the experimental data, however, we considered arbitrary heaps. Let the height of the heap be h; then we have a full heap of height h − 1 together with t = n − 2(2^h − 1) extra elements in the last level. To the full heap of height h − 1 we apply the formula of Section 3.3. The last level contributes t paths of elder sons of length h, each needing h comparisons to determine the path and ⌊log h⌋ + 1 comparisons to insert a pair into it. So the number of comparisons for an n-element heap is

\[
n - 2 + \sum_{i=1}^{h-1} 2^{i+1} \bigl( i + \lfloor \log i \rfloor + 1 \bigr)
+ \bigl( n - 2(2^{h} - 1) \bigr) \bigl( h + \lfloor \log h \rfloor + 1 \bigr).
\]

Similarly, the number of comparisons for the sorting phase of Carlsson's algorithm is

\[
\sum_{i=1}^{h-1} 2^{i} \bigl( i + \lfloor \log i \rfloor + 1 \bigr)
+ \bigl( n - (2^{h} - 1) \bigr) \bigl( h + \lfloor \log h \rfloor + 1 \bigr).
\]

For instance, for n = 7 (so h = 2 in both formulas) the first gives 5 + 8 + 4 = 17 and the second gives 4 + 16 = 20, matching the NS and CS columns of Table 1.

We computed the number of comparisons required for the heap construction phase of Carlsson's variant and of the proposed algorithm by Algorithm 1 and Algorithm 2, respectively (with the convention ⌊log 0⌋ = −1, so that the passive term contributes nothing when i = 1).

Algorithm 1 The number of comparisons required for the heap construction phase of Carlsson's variant
Require: n ≥ 1
 1: h ← ⌊log n⌋
 2: CH ← 0
 3: active ← n − 2^h + 1
 4: for i = 1 to h do
 5:     current ← 2^(h−i)
 6:     active ← ⌈active/2⌉
 7:     passive ← current − active
 8:     CH ← CH + active · (i + ⌊log i⌋ + 1)
 9:     CH ← CH + passive · ((i − 1) + ⌊log (i − 1)⌋ + 1)
10: end for

Algorithm 2 The number of comparisons required for the heap construction phase of the proposed algorithm
Require: n ≥ 1
 1: h ← ⌊log ⌈n/2⌉⌋
 2: NH ← ⌊n/2⌋
 3: active ← ⌈n/2⌉ − 2^h + 1
 4: for i = 1 to h do
 5:     current ← 2^(h−i)
 6:     active ← ⌈active/2⌉
 7:     passive ← current − active
 8:     NH ← NH + active · (i + ⌊log i⌋ + 1)
 9:     NH ← NH + passive · ((i − 1) + ⌊log (i − 1)⌋ + 1)
10: end for
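The following self-contained C transcription of Algorithms 1 and 2 (our code, using the ⌊log 0⌋ = −1 convention noted above) reproduces the CH and NH columns of Table 1:

    #include <stdio.h>

    /* floor(log2 x) by integer halving; flog2(0) = -1, so the passive
     * term vanishes at i = 1.                                           */
    static long flog2(long x)
    {
        long r = -1;
        while (x > 0) { x >>= 1; r++; }
        return r;
    }

    /* Algorithm 1: worst-case construction comparisons, Carlsson's variant */
    static long CH(long n)
    {
        long h = flog2(n), c = 0, active = n - (1L << h) + 1;
        for (long i = 1; i <= h; i++) {
            long current = 1L << (h - i);
            active = (active + 1) / 2;              /* ceil(active/2)    */
            long passive = current - active;
            c += active  * (i + flog2(i) + 1);
            c += passive * ((i - 1) + flog2(i - 1) + 1);
        }
        return c;
    }

    /* Algorithm 2: worst-case construction comparisons, proposed algorithm */
    static long NH(long n)
    {
        long m = (n + 1) / 2;                       /* ceil(n/2) pairs   */
        long h = flog2(m), c = n / 2, active = m - (1L << h) + 1;
        for (long i = 1; i <= h; i++) {
            long current = 1L << (h - i);
            active = (active + 1) / 2;
            long passive = current - active;
            c += active  * (i + flog2(i) + 1);
            c += passive * ((i - 1) + flog2(i - 1) + 1);
        }
        return c;
    }

    int main(void)
    {
        long ns[] = { 7, 8, 9, 10, 14, 15, 16, 100, 1000, 10000 };
        for (int k = 0; k < 10; k++)                /* compare to Table 1 */
            printf("%6ld %8ld %8ld\n", ns[k], CH(ns[k]), NH(ns[k]));
        return 0;
    }

For example, it prints CH = 8, NH = 9 for n = 7 and CH = 13, NH = 10 for n = 8, exactly as tabulated.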
5. Further improvement

Using the algorithms of Gonnet and Munro [6] and of Carlsson [2], we can further improve the heap construction and sorting phases: to replace the maximum element of a heap, log n + log∗ n = h + log∗ h + 1 comparisons are necessary and sufficient, where h = log n. So the number of comparisons required for construction in our proposed data structure becomes

\[
\frac{n}{2} + \sum_{i=1}^{h} 2^{h-i} \bigl( i + \log^{*} i + 1 \bigr)
= \frac{n}{2} + \sum_{i=1}^{h} 2^{h-i} (i + 1) + \sum_{i=1}^{h} 2^{h-i} \log^{*} i
\]
\[
= \frac{n}{2} + \bigl( 3 \cdot 2^{h} - h \bigr) + 2^{h} \sum_{i=1}^{h} \frac{\log^{*} i}{2^{i}} + o(1)
\le \frac{n}{2} + \bigl( 3 \cdot 2^{h} - h \bigr) + 0.812515 \cdot 2^{h} + o(1)
\]
\[
= \frac{n}{2} + 3.812515 \cdot 2^{h} - h + o(1)
= 1.453129\,n - \log n + o(1).
\]

Here

\[
\sum_{i=1}^{h} \frac{\log^{*} i}{2^{i}}
< \sum_{i=1}^{\infty} \frac{\log^{*} i}{2^{i}}
< \sum_{i=1}^{50} \frac{\log^{*} i}{2^{i}}
  + \frac{\log^{*} 51}{2^{51}}
  + \sum_{i=51}^{\infty} \frac{\lfloor \log i \rfloor}{2^{i}}
< 0.812515,
\]

where the tail (bounded by an integral) was calculated using www.wolframalpha.com and the partial sum with a simple C program.
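In the spirit of that simple C program, the sketch below (our code; truncating both series at i = 60, beyond which the terms are negligible, is an assumption we make) reproduces the constants used in Sections 3.2 and 5:

    #include <stdio.h>
    #include <math.h>

    /* floor(log2 x), with flog2(0) = -1 */
    static int flog2(long x) { int r = -1; while (x > 0) { x >>= 1; r++; } return r; }

    /* iterated logarithm: how often log2 must be applied to fall to <= 1 */
    static int log_star(double x) { int r = 0; while (x > 1.0) { x = log2(x); r++; } return r; }

    int main(void)
    {
        double s1 = 0.0, s2 = 0.0, pow2 = 1.0;
        for (int i = 1; i <= 60; i++) {   /* tail beyond 60 is < 1e-16 */
            pow2 *= 2.0;                  /* pow2 = 2^i                */
            s1 += flog2(i) / pow2;
            s2 += log_star(i) / pow2;
        }
        printf("%.6f %.6f\n", s1, s2);    /* 0.632843 0.812515         */
        return 0;
    }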
Similarly, the number of comparisons required for sorting is

\[
n - 2 + \sum_{i=1}^{h} 2^{i+1} \bigl( i + \log^{*} i + 1 \bigr)
= n - 2 + \sum_{i=1}^{h} 2^{i+1} (i + 1) + \sum_{i=1}^{h} 2^{i+1} \log^{*} i
\]
\[
\le n - 2 + 2^{h+2} h + \log^{*} h \bigl( 2^{h+2} - 4 \bigr)
= n - 2 + (n+2)(\log n - 2) + (\log^{*} n - 1)(n - 2)
\]
\[
= (n+2)\log n + (n-2)\log^{*} n - 2n + o(1)
= n\log n + n\log^{*} n - 2n + O(\log n).
\]

So the total number of comparisons is

\[
n\log n + n\log^{*} n - 0.546871\,n + O(\log n).
\]

6. Generalization of the data structure

Let us now generalize the data structure and keep c elements per node. There will then be n/c nodes, and if the height of the heap is h then 2^{h+1} − 1 = n/c. We sort the elements at each node and construct the heap as described above. Sorting the elements within a node by the traditional in-place heapsort algorithm requires 2c log c comparisons, so creating the heap requires

\[
2n\log c + \sum_{i=1}^{h} 2^{h-i} \bigl( i + \log^{*} i + 1 \bigr)
\]

comparisons. In the sorting phase, once only c elements are left in the heap no further comparisons are needed, so we must adjust the root n − c times. For an adjustment we take the largest element of the last node and insert it into the root node, which then holds c − 1 elements. Since inserting an element into a sorted list of i elements requires ⌊log i⌋ + 1 comparisons, the required number of comparisons is

\[
(n - c) \bigl( \lfloor \log (c-1) \rfloor + 1 \bigr)
+ \sum_{i=1}^{h} c\,2^{i} \bigl( i + \log^{*} i + 1 \bigr).
\]

Asymptotic analysis shows that the optimal value of c is O(log n). However, experimental results do not always support this theoretical finding; one reason is that the cost function contains integer-valued (floor and ceiling) functions that are not easy to analyze by differential calculus. A numerical probe such as the sketch below is therefore a practical alternative.
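Such a probe might look as follows (a rough sketch of ours only: it transcribes the two cost formulas above, treats the heap of n/c nodes as full, and brute-forces c over powers of two; the function names and the search range are assumptions):

    #include <stdio.h>
    #include <math.h>

    static long flog2(long x) { long r = -1; while (x > 0) { x >>= 1; r++; } return r; }
    static long log_star(double x) { long r = 0; while (x > 1.0) { x = log2(x); r++; } return r; }

    /* Transcribed cost for node capacity c: node sorting + construction
     * + root re-insertions + push-downs, with h from 2^{h+1} - 1 = n/c. */
    static double gen_cost(double n, long c)
    {
        long h = flog2((long)(n / c) + 1) - 1;
        double cost = 2.0 * n * log2((double)c);       /* sort each node */
        for (long i = 1; i <= h; i++)                  /* construction   */
            cost += (double)(1L << (h - i)) * (i + log_star((double)i) + 1);
        cost += (n - c) * (flog2(c - 1) + 1);          /* root refills   */
        for (long i = 1; i <= h; i++)                  /* push-downs     */
            cost += (double)c * (double)(1L << i) * (i + log_star((double)i) + 1);
        return cost;
    }

    int main(void)
    {
        double n = 1000000.0;
        long best = 2;
        for (long c = 4; c <= 1024; c *= 2)            /* powers of two  */
            if (gen_cost(n, c) < gen_cost(n, best)) best = c;
        printf("best c (among powers of two) = %ld\n", best);
        return 0;
    }

Because the transcribed formula ignores floors and non-full levels, its output is only indicative, which is consistent with the mismatch between theory and experiment noted above.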
7. Pairing trick in Weak Heapsort algorithm

As before, we pair up the numbers and construct a Weak Heap [4] according to the Primary elements of these pairs. Each pair acts as a node of the Weak Heap, and each node carries an extra bit, just like a node of a normal Weak Heap. We keep an extra variable, say temp, which is initially set to empty. After finding the pair with the largest Primary element in the heap (we call this node the head) we check temp:

- If temp is empty, the Primary element of the head is the next largest number, and the Secondary element of the head is assigned to temp.
- If temp is not empty and temp is larger than the Primary element of the head, the next two largest numbers are the number in temp and the Primary element of the head; the Secondary element is assigned to temp.
- If temp is not empty and temp is smaller than the Primary element of the head, the next largest number is the Primary element of the head, and we make a new pair by combining the number in temp with the Secondary element of the head.

Each of these scenarios requires at most two extra comparisons.
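To make the three scenarios concrete, here is a minimal C sketch (our illustration only; out() stands for emitting the next element of the sorted output, and re-establishing the weak-heap order on the Primaries after each step is elided):

    #include <stdio.h>

    typedef struct { int primary, secondary; } Pair;   /* one heap node  */

    static int temp_val, temp_empty = 1;

    static void out(int x) { printf("%d ", x); }       /* sorted output  */

    static void extract_step(Pair *head)    /* head: largest Primary     */
    {
        if (temp_empty) {                      /* scenario 1             */
            out(head->primary);                /* next largest           */
            temp_val = head->secondary;        /* park the Secondary     */
            temp_empty = 0;                    /* head leaves the heap   */
        } else if (temp_val > head->primary) { /* scenario 2: 1 compare  */
            out(temp_val);                     /* next two largest:      */
            out(head->primary);                /*   temp, then Primary   */
            temp_val = head->secondary;        /* head leaves the heap   */
        } else {                               /* scenario 3: 2 compares */
            out(head->primary);                /* next largest           */
            if (temp_val >= head->secondary)   /* re-pair temp with the  */
                head->primary = temp_val;      /* Secondary, larger one  */
            else {                             /* becoming the Primary   */
                head->primary = head->secondary;
                head->secondary = temp_val;
            }
            temp_empty = 1;                    /* head keeps the new pair */
        }
        /* afterwards, restore the weak-heap order on the Primaries      */
    }

    int main(void)                             /* toy run of all 3 cases */
    {
        Pair p = { 8, 5 }, q = { 3, 2 }, r = { 7, 1 };
        extract_step(&p);                      /* prints 8, temp = 5     */
        extract_step(&q);                      /* prints 5 3, temp = 2   */
        extract_step(&r);                      /* prints 7, pair (2, 1)  */
        printf("\n");
        return 0;
    }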
So the total number of comparisons is

\[
\frac{n}{2} + \frac{n}{2} - 1 + 2\left( \frac{n}{2}(k - 1) - 2^{k-1} - \frac{n}{2} + 2 \right) + n
= n - 1 + nk - n - 2^{k} - n + 4 + n
= nk - 2^{k} + 3,
\]

where k = ⌈log n⌉. This is the same expression as for the number of comparisons of the Weak Heapsort algorithm [4], but here we require only n/2 extra bits to sort n numbers in (n − 1) log n + 0.086013n comparisons. We can also apply this trick to the Weak Heapsort variant of Edelkamp and Stiegeler [5]; it gives a sorting algorithm with n log n − 0.9n comparisons, requiring n/2 bits and an array of size n to store indices.

8. Experimental results

Table 2 presents the practical performance of the three algorithms. The algorithms were run several times on the same random sets of floating-point data, and the average time is reported in Table 2. In the implementation we used bitwise operations instead of ordinary arithmetic operations where possible, and we avoided recomputing common subexpressions by storing them in temporary variables where applicable. These two optimizations give better runtimes.

Table 2. Experimental results (runtime in milliseconds)

    Number of elements   Proposed algorithm   Carlsson's variant   Dutton's Weak Heapsort
                 10000                    0                    0                        0
                 50000                   31                   31                       78
                100000                   47                   47                      156
                500000                  313                  375                      969
               1000000                  580                  907                     2234
               5000000                 3340                 5601                    15554
              10000000                 7078                12385                    35297
              30000000                22765                49938                   138937

9. Conclusion

We have presented a new approach to reducing the number of comparisons for in-place heapsort. The main idea of the paper is to store a group of elements in each node of a heap. This reduces the height of the tree, and hence slightly reduces the cost of each deletemax operation, despite some additional comparisons needed to keep the group of elements in every node (except possibly the last) sorted. The overall saving appears in the lower-order (more precisely, the linear) term of the comparison count for heapsort. However, the optimal number of elements to be stored in a node of our proposed data structure is still inconclusive and requires further investigation.

Acknowledgement: The authors profusely thank the anonymous referees for suggesting significant improvements and for pointing out errors and inaccuracies in an earlier version of the manuscript.

References

[1] S. Carlsson, A variant of heapsort with almost optimal number of comparisons, Inf. Process. Lett. 24 (1987), pp. 247–250.
[2] S. Carlsson, An optimal algorithm for deleting the root of a heap, Inf. Process. Lett. 37 (1991), pp. 117–120.
[3] S. Carlsson and J. Chen, Heap Construction: Optimal in Both Worst and Average Cases?, in ISAAC, Lecture Notes in Computer Science, vol. 1004, Springer, 1995, pp. 254–263.
[4] R.D. Dutton, Weak-heap sort, BIT 33 (1993), pp. 372–381.
[5] S. Edelkamp and P. Stiegeler, Implementing heapsort with n log n − 0.9n and quicksort with n log n + 0.2n comparisons, ACM Journal of Experimental Algorithmics 7 (2002), p. 5.
[6] G.H. Gonnet and J.I. Munro, Heaps on heaps, SIAM J. Comput. 15 (1986), pp. 964–971.
[7] T.M. Islam and M. Kaykobad, Worst-case analysis of generalized heapsort algorithm revisited, Int. J. Comput. Math. 83 (2006), pp. 59–67.
[8] J. Katajainen, The Ultimate Heapsort, in CATS, 1998, pp. 87–96.
[9] M. Kaykobad, M.M. Islam, M.E. Amyeen, and M.M. Murshed, 3 is a more promising algorithmic parameter than 2, Comput. Math. Appl. 36 (1998), pp. 19–24.
[10] C. McDiarmid and B.A. Reed, Building heaps fast, J. Algorithms 10 (1989), pp. 352–365.
[11] A. Paulik, Worst-case analysis of a generalized heapsort algorithm, Inf. Process. Lett. 36 (1990), pp. 159–165.
[12] R.A. Chowdhury, S.K. Nath, and M. Kaykobad, A simplified complexity analysis of McDiarmid and Reed's variant of bottom-up heapsort algorithm, Int. J. Comput. Math. 73 (2000), pp. 293–297.
[13] M. Shahjalal and M. Kaykobad, A New Data Structure for Heapsort with Improved Number of Comparisons, in WALCOM, 2007, pp. 88–96.
[14] X.D. Wang and Y.J. Wu, An improved heapsort algorithm with n log n − 0.788928n comparisons in the worst case, J. Comput. Sci. Technol. 22 (2007), pp. 898–903.
[15] I. Wegener, Bottom-up-heapsort, a new variant of heapsort, beating, on an average, quicksort (if n is not very small), Theor. Comput. Sci. 118 (1993), pp. 81–98.