Quicksort with Equal Keys

1. Quicksort with Equal Keys Sebastian Wild wild@cs.uni-kl.de based on joint work with Martin Aumüller, Martin Dietzfelbinger, Conrado Martínez, and Markus Nebel Data Structures and Advanced Models of Computation on Big Data Dagstuhl Seminar 16 101 Sebastian Wild Quicksort with Equal Keys 2016-03-07 1 / 12

2. Motivation Practical Dimension: solve togetherness problem e.g. SQL’s GROUP BY clause preprocessing to remove dups Theoretical Dimension: many equals lower entropy We should be able to sort faster! Academic Dimension: Open Problem of Sedgewick & Bentley: Is Quicksort with median-of-k entropy-optimal for k → ∞? Personal Dimension: This is my tenth talk on Quicksort ... Equal Keys on my wishlist since 2012 Why Equal Keys? Sebastian Wild Quicksort with Equal Keys 2016-03-07 2 / 12

7. Previous Work Rather little is known! Sedgewick 1977: Quicksort on Equal Keys Sedgewick & Bentley 2002: Quicksort is Optimal (Talk at Knuthfest) A bit more on BSTs: Burge 1976: An Analysis of BSTs Formed from Sequences of Nondistinct Keys Kemp 1996: BSTs constructed from nondistinct keys with/without speciﬁed probabilities Archibald & Clément 2006: Average depth in a BST with repeated keys This is basically all literature on analysis of Quicksort with equal keys! Sebastian Wild Quicksort with Equal Keys 2016-03-07 3 / 12

8. Previous Work Rather little is known! Sedgewick 1977: Quicksort on Equal Keys Sedgewick & Bentley 2002: Quicksort is Optimal (Talk at Knuthfest) A bit more on BSTs: Burge 1976: An Analysis of BSTs Formed from Sequences of Nondistinct Keys Kemp 1996: BSTs constructed from nondistinct keys with/without speciﬁed probabilities Archibald & Clément 2006: Average depth in a BST with repeated keys This is basically all literature on analysis of Quicksort with equal keys! SIAM J. COMPUT. Vol. 6, No. 2, June 1977 QUICKSORT WITH EQUAL KEYS* ROBERT SEDGEWICK? Abstract. This paper considers the problem of implementing and analyzing a Quicksort program when equal keys are likely to be present in the file to be sorted. Upper and lower bounds are derived on the average number of comparisons needed by any Quicksort programwhen equal keys are present. It is shown that, of the three strategies which have been suggested for dealing with equal keys, the method of always stopping the scanning pointers on keys equal to the partitioning element performs best. Key words, analysis of algorithms, equal keys, Quicksort, sorting Introduction. The Quicksort algorithm, which was introduced by C. A. R. Hoare in 1960 [6], [7], has gained wide acceptance as the most efficient general-purpose sorting method suitable for use on computers. The algorithm has a rich history: many modifications have been suggested to improve its perfor- mance, and exact formulas have been derived describing the time and spaceSebastian Wild Quicksort with Equal Keys 2016-03-07 3 / 12

9. Previous Work Rather little is known! Sedgewick 1977: Quicksort on Equal Keys Sedgewick & Bentley 2002: Quicksort is Optimal (Talk at Knuthfest) A bit more on BSTs: Burge 1976: An Analysis of BSTs Formed from Sequences of Nondistinct Keys Kemp 1996: BSTs constructed from nondistinct keys with/without speciﬁed probabilities Archibald & Clément 2006: Average depth in a BST with repeated keys This is basically all literature on analysis of Quicksort with equal keys! SIAM J. COMPUT. Vol. 6, No. 2, June 1977 QUICKSORT WITH EQUAL KEYS* ROBERT SEDGEWICK? Abstract. This paper considers the problem of implementing and analyzing a Quicksort program when equal keys are likely to be present in the file to be sorted. Upper and lower bounds are derived on the average number of comparisons needed by any Quicksort programwhen equal keys are present. It is shown that, of the three strategies which have been suggested for dealing with equal keys, the method of always stopping the scanning pointers on keys equal to the partitioning element performs best. Key words, analysis of algorithms, equal keys, Quicksort, sorting Introduction. The Quicksort algorithm, which was introduced by C. A. R. Hoare in 1960 [6], [7], has gained wide acceptance as the most efficient general-purpose sorting method suitable for use on computers. The algorithm has a rich history: many modifications have been suggested to improve its perfor- mance, and exact formulas have been derived describing the time and space Now is the time For all good men To come to the aid Of their party QUICKSORT IS OPTIMAL Robert Sedgewick Jon Bentley Sebastian Wild Quicksort with Equal Keys 2016-03-07 3 / 12

10. Previous Work Rather little is known! Sedgewick 1977: Quicksort on Equal Keys Sedgewick & Bentley 2002: Quicksort is Optimal (Talk at Knuthfest) A bit more on BSTs: Burge 1976: An Analysis of BSTs Formed from Sequences of Nondistinct Keys Kemp 1996: BSTs constructed from nondistinct keys with/without speciﬁed probabilities Archibald & Clément 2006: Average depth in a BST with repeated keys This is basically all literature on analysis of Quicksort with equal keys! SIAM J. COMPUT. Vol. 6, No. 2, June 1977 QUICKSORT WITH EQUAL KEYS* ROBERT SEDGEWICK? Abstract. This paper considers the problem of implementing and analyzing a Quicksort program when equal keys are likely to be present in the file to be sorted. Upper and lower bounds are derived on the average number of comparisons needed by any Quicksort programwhen equal keys are present. It is shown that, of the three strategies which have been suggested for dealing with equal keys, the method of always stopping the scanning pointers on keys equal to the partitioning element performs best. Key words, analysis of algorithms, equal keys, Quicksort, sorting Introduction. The Quicksort algorithm, which was introduced by C. A. R. Hoare in 1960 [6], [7], has gained wide acceptance as the most efficient general-purpose sorting method suitable for use on computers. The algorithm has a rich history: many modifications have been suggested to improve its perfor- mance, and exact formulas have been derived describing the time and space Now is the time For all good men To come to the aid Of their party QUICKSORT IS OPTIMAL Robert Sedgewick Jon Bentley An Analysis of Binary Search Trees Formed from Sequences of Nondistinct Keys WILLIAM H. BURGE IBM Thomas J. Watson Research Center, Yorktown Heights, New York ABSTRACT. The expected depth of each key in the set of binary search trees formed from all sequences composed from a multmet {pl • 1, p2 ' 2, p3 • 3, ..., p~ • n} is obtained, and hence the expected weight of such trees. The expected number of left-to-right local minima and the expected number of cycles in sequences composed from a multiset are then deduced from these results. KEY WORDS AND PHRASES : binary search trees, multiset CR CATEGORIES: 5.3 Introduction The expected depth of the number r in the set of binary search trees formed from allSebastian Wild Quicksort with Equal Keys 2016-03-07 3 / 12

11. Previous Work Rather little is known! Sedgewick 1977: Quicksort on Equal Keys Sedgewick & Bentley 2002: Quicksort is Optimal (Talk at Knuthfest) A bit more on BSTs: Burge 1976: An Analysis of BSTs Formed from Sequences of Nondistinct Keys Kemp 1996: BSTs constructed from nondistinct keys with/without speciﬁed probabilities Archibald & Clément 2006: Average depth in a BST with repeated keys This is basically all literature on analysis of Quicksort with equal keys! SIAM J. COMPUT. Vol. 6, No. 2, June 1977 QUICKSORT WITH EQUAL KEYS* ROBERT SEDGEWICK? Abstract. This paper considers the problem of implementing and analyzing a Quicksort program when equal keys are likely to be present in the file to be sorted. Upper and lower bounds are derived on the average number of comparisons needed by any Quicksort programwhen equal keys are present. It is shown that, of the three strategies which have been suggested for dealing with equal keys, the method of always stopping the scanning pointers on keys equal to the partitioning element performs best. Key words, analysis of algorithms, equal keys, Quicksort, sorting Introduction. The Quicksort algorithm, which was introduced by C. A. R. Hoare in 1960 [6], [7], has gained wide acceptance as the most efficient general-purpose sorting method suitable for use on computers. The algorithm has a rich history: many modifications have been suggested to improve its perfor- mance, and exact formulas have been derived describing the time and space Now is the time For all good men To come to the aid Of their party QUICKSORT IS OPTIMAL Robert Sedgewick Jon Bentley An Analysis of Binary Search Trees Formed from Sequences of Nondistinct Keys WILLIAM H. BURGE IBM Thomas J. Watson Research Center, Yorktown Heights, New York ABSTRACT. The expected depth of each key in the set of binary search trees formed from all sequences composed from a multmet {pl • 1, p2 ' 2, p3 • 3, ..., p~ • n} is obtained, and hence the expected weight of such trees. The expected number of left-to-right local minima and the expected number of cycles in sequences composed from a multiset are then deduced from these results. KEY WORDS AND PHRASES : binary search trees, multiset CR CATEGORIES: 5.3 Introduction The expected depth of the number r in the set of binary search trees formed from all Theoretical Computer Science ELSEVIER Theoretical Computer Science 156 (1996) 39%70 Binary search trees constructed from nondistinct keys with/without specified probabilities Rainer Kemp Johann Wo@ng Goethe-Universitiit, Fachbereich Informarik, D-60054 Fran!+rt am Main, German) Received May 1994; revised November 1994 Communicated by H. Prodinger Sebastian Wild Quicksort with Equal Keys 2016-03-07 3 / 12

12. Previous Work Rather little is known! Sedgewick 1977: Quicksort on Equal Keys Sedgewick & Bentley 2002: Quicksort is Optimal (Talk at Knuthfest) A bit more on BSTs: Burge 1976: An Analysis of BSTs Formed from Sequences of Nondistinct Keys Kemp 1996: BSTs constructed from nondistinct keys with/without specified probabilities Archibald & Clément 2006: Average depth in a BST with repeated keys This is basically all literature on analysis of Quicksort with equal keys! SIAM J. COMPUT. Vol. 6, No. 2, June 1977 QUICKSORT WITH EQUAL KEYS* ROBERT SEDGEWICK? Abstract. This paper considers the problem of implementing and analyzing a Quicksort program when equal keys are likely to be present in the file to be sorted. Upper and lower bounds are derived on the average number of comparisons needed by any Quicksort programwhen equal keys are present. It is shown that, of the three strategies which have been suggested for dealing with equal keys, the method of always stopping the scanning pointers on keys equal to the partitioning element performs best. Key words, analysis of algorithms, equal keys, Quicksort, sorting Introduction. The Quicksort algorithm, which was introduced by C. A. R. Hoare in 1960 [6], [7], has gained wide acceptance as the most efficient general-purpose sorting method suitable for use on computers. The algorithm has a rich history: many modifications have been suggested to improve its perfor- mance, and exact formulas have been derived describing the time and space Now is the time For all good men To come to the aid Of their party QUICKSORT IS OPTIMAL Robert Sedgewick Jon Bentley An Analysis of Binary Search Trees Formed from Sequences of Nondistinct Keys WILLIAM H. BURGE IBM Thomas J. Watson Research Center, Yorktown Heights, New York ABSTRACT. The expected depth of each key in the set of binary search trees formed from all sequences composed from a multmet {pl • 1, p2 ' 2, p3 • 3, ..., p~ • n} is obtained, and hence the expected weight of such trees. The expected number of left-to-right local minima and the expected number of cycles in sequences composed from a multiset are then deduced from these results. KEY WORDS AND PHRASES : binary search trees, multiset CR CATEGORIES: 5.3 Introduction The expected depth of the number r in the set of binary search trees formed from all Theoretical Computer Science ELSEVIER Theoretical Computer Science 156 (1996) 39%70 Binary search trees constructed from nondistinct keys with/without specified probabilities Rainer Kemp Johann Wo@ng Goethe-Universitiit, Fachbereich Informarik, D-60054 Fran!+rt am Main, German) Received May 1994; revised November 1994 Communicated by H. Prodinger Fourth Colloquium on Mathematics and Computer Science DMTCS proc. AG, 2006, 309–320 Average depth in a binary search tree with repeated keys Margaret Archibald1 and Julien Clément2,3 1 School of Mathematics, University of the Witwatersrand, P. O. Wits, 2050 Johannesburg, South Africa, email: marchibald@maths.wits.ac.za. 2 CNRS UMR 8049, Institut Gaspard-Monge, Laboratoire d’informatique, Université de Marne-la-Vallée, France 3 CNRS UMR 6072, GREYC Laboratoire d’informatique, Université de Caen, France, email: Julien.Clement@info.unicaen.fr Random sequences from alphabet {1, . . . , r} are examined where repeated letters are allowed. Binary search trees are formed from these, and the average left-going depth of the first 1 is found. Next, the right-going depth of the first r is examined, and finally a merge (or ‘shuffle’) operator is used to obtain the average depth of an arbitrary node, which canSebastian Wild Quicksort with Equal Keys 2016-03-07 3 / 12

13. Previous Work Rather little is known! Sedgewick 1977: Quicksort on Equal Keys Sedgewick & Bentley 2002: Quicksort is Optimal (Talk at Knuthfest) A bit more on BSTs: Burge 1976: An Analysis of BSTs Formed from Sequences of Nondistinct Keys Kemp 1996: BSTs constructed from nondistinct keys with/without specified probabilities Archibald & Clément 2006: Average depth in a BST with repeated keys This is basically all literature on analysis of Quicksort with equal keys! SIAM J. COMPUT. Vol. 6, No. 2, June 1977 QUICKSORT WITH EQUAL KEYS* ROBERT SEDGEWICK? Abstract. This paper considers the problem of implementing and analyzing a Quicksort program when equal keys are likely to be present in the file to be sorted. Upper and lower bounds are derived on the average number of comparisons needed by any Quicksort programwhen equal keys are present. It is shown that, of the three strategies which have been suggested for dealing with equal keys, the method of always stopping the scanning pointers on keys equal to the partitioning element performs best. Key words, analysis of algorithms, equal keys, Quicksort, sorting Introduction. The Quicksort algorithm, which was introduced by C. A. R. Hoare in 1960 [6], [7], has gained wide acceptance as the most efficient general-purpose sorting method suitable for use on computers. The algorithm has a rich history: many modifications have been suggested to improve its perfor- mance, and exact formulas have been derived describing the time and space Now is the time For all good men To come to the aid Of their party QUICKSORT IS OPTIMAL Robert Sedgewick Jon Bentley An Analysis of Binary Search Trees Formed from Sequences of Nondistinct Keys WILLIAM H. BURGE IBM Thomas J. Watson Research Center, Yorktown Heights, New York ABSTRACT. The expected depth of each key in the set of binary search trees formed from all sequences composed from a multmet {pl • 1, p2 ' 2, p3 • 3, ..., p~ • n} is obtained, and hence the expected weight of such trees. The expected number of left-to-right local minima and the expected number of cycles in sequences composed from a multiset are then deduced from these results. KEY WORDS AND PHRASES : binary search trees, multiset CR CATEGORIES: 5.3 Introduction The expected depth of the number r in the set of binary search trees formed from all Theoretical Computer Science ELSEVIER Theoretical Computer Science 156 (1996) 39%70 Binary search trees constructed from nondistinct keys with/without specified probabilities Rainer Kemp Johann Wo@ng Goethe-Universitiit, Fachbereich Informarik, D-60054 Fran!+rt am Main, German) Received May 1994; revised November 1994 Communicated by H. Prodinger Fourth Colloquium on Mathematics and Computer Science DMTCS proc. AG, 2006, 309–320 Average depth in a binary search tree with repeated keys Margaret Archibald1 and Julien Clément2,3 1 School of Mathematics, University of the Witwatersrand, P. O. Wits, 2050 Johannesburg, South Africa, email: marchibald@maths.wits.ac.za. 2 CNRS UMR 8049, Institut Gaspard-Monge, Laboratoire d’informatique, Université de Marne-la-Vallée, France 3 CNRS UMR 6072, GREYC Laboratoire d’informatique, Université de Caen, France, email: Julien.Clement@info.unicaen.fr Random sequences from alphabet {1, . . . , r} are examined where repeated letters are allowed. Binary search trees are formed from these, and the average left-going depth of the first 1 is found. Next, the right-going depth of the first r is examined, and finally a merge (or ‘shuffle’) operator is used to obtain the average depth of an arbitrary node, which canSebastian Wild Quicksort with Equal Keys 2016-03-07 3 / 12

14. Previous Work Rather little is known! Sedgewick 1977: Quicksort on Equal Keys Sedgewick & Bentley 2002: Quicksort is Optimal (Talk at Knuthfest) A bit more on BSTs: Burge 1976: An Analysis of BSTs Formed from Sequences of Nondistinct Keys Kemp 1996: BSTs constructed from nondistinct keys with/without specified probabilities Archibald & Clément 2006: Average depth in a BST with repeated keys This is basically all literature on analysis of Quicksort with equal keys! SIAM J. COMPUT. Vol. 6, No. 2, June 1977 QUICKSORT WITH EQUAL KEYS* ROBERT SEDGEWICK? Abstract. This paper considers the problem of implementing and analyzing a Quicksort program when equal keys are likely to be present in the file to be sorted. Upper and lower bounds are derived on the average number of comparisons needed by any Quicksort programwhen equal keys are present. It is shown that, of the three strategies which have been suggested for dealing with equal keys, the method of always stopping the scanning pointers on keys equal to the partitioning element performs best. Key words, analysis of algorithms, equal keys, Quicksort, sorting Introduction. The Quicksort algorithm, which was introduced by C. A. R. Hoare in 1960 [6], [7], has gained wide acceptance as the most efficient general-purpose sorting method suitable for use on computers. The algorithm has a rich history: many modifications have been suggested to improve its perfor- mance, and exact formulas have been derived describing the time and space Now is the time For all good men To come to the aid Of their party QUICKSORT IS OPTIMAL Robert Sedgewick Jon Bentley An Analysis of Binary Search Trees Formed from Sequences of Nondistinct Keys WILLIAM H. BURGE IBM Thomas J. Watson Research Center, Yorktown Heights, New York ABSTRACT. The expected depth of each key in the set of binary search trees formed from all sequences composed from a multmet {pl • 1, p2 ' 2, p3 • 3, ..., p~ • n} is obtained, and hence the expected weight of such trees. The expected number of left-to-right local minima and the expected number of cycles in sequences composed from a multiset are then deduced from these results. KEY WORDS AND PHRASES : binary search trees, multiset CR CATEGORIES: 5.3 Introduction The expected depth of the number r in the set of binary search trees formed from all Theoretical Computer Science ELSEVIER Theoretical Computer Science 156 (1996) 39%70 Binary search trees constructed from nondistinct keys with/without specified probabilities Rainer Kemp Johann Wo@ng Goethe-Universitiit, Fachbereich Informarik, D-60054 Fran!+rt am Main, German) Received May 1994; revised November 1994 Communicated by H. Prodinger Fourth Colloquium on Mathematics and Computer Science DMTCS proc. AG, 2006, 309–320 Average depth in a binary search tree with repeated keys Margaret Archibald1 and Julien Clément2,3 1 School of Mathematics, University of the Witwatersrand, P. O. Wits, 2050 Johannesburg, South Africa, email: marchibald@maths.wits.ac.za. 2 CNRS UMR 8049, Institut Gaspard-Monge, Laboratoire d’informatique, Université de Marne-la-Vallée, France 3 CNRS UMR 6072, GREYC Laboratoire d’informatique, Université de Caen, France, email: Julien.Clement@info.unicaen.fr Random sequences from alphabet {1, . . . , r} are examined where repeated letters are allowed. Binary search trees are formed from these, and the average left-going depth of the first 1 is found. Next, the right-going depth of the first r is examined, and finally a merge (or ‘shuffle’) operator is used to obtain the average depth of an arbitrary node, which can All these works only consider basic Quicksort: No sampling to choose pivots. No multi-way partitioning. Sebastian Wild Quicksort with Equal Keys 2016-03-07 3 / 12

15. Input Models Random-Permutation Model no duplicates 1 1, . . . , n, randomly permuted Parameters: n Sebastian Wild Quicksort with Equal Keys 2016-03-07 4 / 12

16. Input Models Random-Permutation Model no duplicates 1 1, . . . , n, randomly permuted Parameters: n Random u-ary Word Model 1 random choice from {1, . . . , u}n Parameters: n, u My claim: u-ary words most natural model ... in particular in i.i.d. interpretation! All previous works use combinatorial detour Stochastic point of view provides shortcut! Sebastian Wild Quicksort with Equal Keys 2016-03-07 4 / 12

17. Input Models Random-Permutation Model no duplicates 1 1, . . . , n, randomly permuted 2 n elem. i.i.d. uniform in (0, 1) Parameters: n Random u-ary Word Model 1 random choice from {1, . . . , u}n 2 n elem. i.i.d. uniform in {1, . . . , u} Parameters: n, u My claim: u-ary words most natural model ... in particular in i.i.d. interpretation! All previous works use combinatorial detour Stochastic point of view provides shortcut! Sebastian Wild Quicksort with Equal Keys 2016-03-07 4 / 12

18. Input Models Random-Permutation Model no duplicates 1 1, . . . , n, randomly permuted 2 n elem. i.i.d. uniform in (0, 1) Parameters: n Random u-ary Word Model 1 random choice from {1, . . . , u}n 2 n elem. i.i.d. uniform in {1, . . . , u} Parameters: n, u cont. disc. My claim: u-ary words most natural model ... in particular in i.i.d. interpretation! All previous works use combinatorial detour Stochastic point of view provides shortcut! Sebastian Wild Quicksort with Equal Keys 2016-03-07 4 / 12

19. Input Models Random-Permutation Model no duplicates 1 1, . . . , n, randomly permuted 2 n elem. i.i.d. uniform in (0, 1) Parameters: n Random u-ary Word Model 1 random choice from {1, . . . , u}n 2 n elem. i.i.d. uniform in {1, . . . , u} Parameters: n, u Multiset-Permutation Model randomly permuted multiset xi copies of i, i = 1, . . . , u Parameters: u; x1, . . . , xuset multiset cont. disc. My claim: u-ary words most natural model ... in particular in i.i.d. interpretation! All previous works use combinatorial detour Stochastic point of view provides shortcut! Sebastian Wild Quicksort with Equal Keys 2016-03-07 4 / 12

20. Input Models Random-Permutation Model no duplicates 1 1, . . . , n, randomly permuted 2 n elem. i.i.d. uniform in (0, 1) Parameters: n Random u-ary Word Model 1 random choice from {1, . . . , u}n 2 n elem. i.i.d. uniform in {1, . . . , u} Parameters: n, u Multiset-Permutation Model randomly permuted multiset xi copies of i, i = 1, . . . , u Parameters: u; x1, . . . , xuset multiset aver age cont. disc. My claim: u-ary words most natural model ... in particular in i.i.d. interpretation! All previous works use combinatorial detour Stochastic point of view provides shortcut! Sebastian Wild Quicksort with Equal Keys 2016-03-07 4 / 12

21. Input Models Random-Permutation Model no duplicates 1 1, . . . , n, randomly permuted 2 n elem. i.i.d. uniform in (0, 1) Parameters: n Random u-ary Word Model 1 random choice from {1, . . . , u}n 2 n elem. i.i.d. uniform in {1, . . . , u} Parameters: n, u Multiset-Permutation Model randomly permuted multiset xi copies of i, i = 1, . . . , u Parameters: u; x1, . . . , xuset multiset aver age cont. disc. hard to analyze My claim: u-ary words most natural model ... in particular in i.i.d. interpretation! All previous works use combinatorial detour Stochastic point of view provides shortcut! Sebastian Wild Quicksort with Equal Keys 2016-03-07 4 / 12

24. Setup Assumptions: 1 Input: U1, . . . , Un, i.i.d. uniform in {1, . . . , u}. 2 fat-pivot partitioning < P P P P > P recursive call recursive callall duplicates of pivots removed subproblems again random u-ary words, (with smaller u and n) 3 Cost: # ternary comparisons (extensions possible) Generalized Quicksort: s-way partitioning: divide into s segments pivot sampling: parameter t = (t1, . . . , ts) median-of-(2t + 1) t = (t, t) asymmetric sampling allowed Example: s = 3 and t = (1, 2, 3) P1 P2 t1 t2 t3 2nd and 5th from sample of 8. Sebastian Wild Quicksort with Equal Keys 2016-03-07 5 / 12

31. Quicksort & Search Trees Classic Fact: Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

34. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

35. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

36. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

37. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 5 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

38. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

39. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

40. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

41. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

42. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

43. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

44. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

45. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

46. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

47. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

48. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

49. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

50. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 3 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

51. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 5 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

52. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 5 4 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

53. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 5 1 4 Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

56. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 5 1 2 Equivalence holds also with duplicates. Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

57. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 5 1 Equivalence holds also with duplicates. Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

58. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 5 1 Equivalence holds also with duplicates. This was only basic Quicksort ... how about pivot sampling? Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

59. Quicksort & Search Trees Classic Fact: (without duplicates) Recursion Tree of Quicksort = Naturally grown BST from input Comparisons in Quicksort = Comparisons to built BST = Comparisons to search input in ﬁnal BST How about inputs with duplicates? Quicksort (Fat-Pivot) Binary Search Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 3 3 3 2 5 54 44 1 3 3 32 2 5 5 1 3 33 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 5 1 Equivalence holds also with duplicates. This was only basic Quicksort ... how about pivot sampling? Sebastian Wild Quicksort with Equal Keys 2016-03-07 6 / 12

60. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

61. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) 4 2 1 3 3 5 4 4 3 5 2 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

64. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

65. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 Insertionsort Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

68. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 44 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

69. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 44 Insertionsort Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

70. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

71. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

72. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 4 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

73. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 4 2 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

74. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 4 2 1 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

75. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

76. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 4 2 1 3 3 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

78. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 2 1 43 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

79. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 3 2 1 4 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

81. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 3 2 1 4 5 4 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

82. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 3 2 1 4 5 4 4 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

83. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 3 4 5 4 42 1 3 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

84. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 3 2 1 4 5 4 4 5 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

85. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 3 2 1 4 4 4 5 5 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

87. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 3 2 1 4 5 5 Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

88. Fringe-Balanced Trees Fringe-Balanced Search Trees: Leaves have buﬀer for elements, to emulate pivot sampling Median-of-5 Quicksort t = (2, 2) Fringe-Balanced Tree 4 2 1 3 3 5 4 4 3 5 2 2 1 2 4 5 4 4 533 3 5 54 442 1 2 5 5 4 2 1 3 3 5 4 4 3 5 2 3 4 5 5 2 1 2 Correspondence extends to Pivot Sampling (any scheme, not only median) s-way Partitioning s-ary search trees Analyze search trees instead of Quicksort. Sebastian Wild Quicksort with Equal Keys 2016-03-07 7 / 12

93. Tree-Growing and Searching Quicksort costs = costs to search input U = (U1, . . . , Un) in final tree T. T fixed search cost depends only on profile X = (X1, . . . , Xu) but: T also depends on U (Recall: T is built from U!) direct analysis no simpler than for Quicksort 4 5 5 5 2 3 5 4 2 2 2 5 5 5 4 2 5 1 5 4 2 3 4 2 2 1 2 4 2 2 5 5 2 3 4 2 3 4 1 3 2 2 1 4 4 4 1 2 3 3 Observation: T becomes stationary after 1 insert of each value! with leaf-buffer size w Split input into tree-growing part and searching part: 1 We built T until it is stationary, ignoring costs. 2 Determine costs of searching remaining elements. profile XS of search part independent of T Sebastian Wild Quicksort with Equal Keys 2016-03-07 8 / 12

110. Tree-Growing and Searching Quicksort costs = costs to search input U = (U1, . . . , Un) in final tree T. T fixed search cost depends only on profile X = (X1, . . . , Xu) but: T also depends on U (Recall: T is built from U!) direct analysis no simpler than for Quicksort 4 5 5 5 2 3 5 4 2 2 2 5 5 5 4 2 5 1 5 4 2 3 4 2 2 1 2 4 2 2 5 5 2 3 4 2 3 4 1 3 2 2 1 4 4 4 1 2 3 3 (w = 1) Observation: T becomes stationary after w inserts of each value! with leaf-buffer size w Split input into tree-growing part and searching part: 1 We built T until it is stationary, ignoring costs. 2 Determine costs of searching remaining elements. profile XS of search part independent of T Sebastian Wild Quicksort with Equal Keys 2016-03-07 8 / 12

111. Tree-Growing and Searching Quicksort costs = costs to search input U = (U1, . . . , Un) in final tree T. T fixed search cost depends only on profile X = (X1, . . . , Xu) but: T also depends on U (Recall: T is built from U!) direct analysis no simpler than for Quicksort 4 5 5 5 2 3 5 4 2 2 2 5 5 5 4 2 5 1 5 4 2 3 4 2 2 1 2 4 2 2 5 5 2 3 4 2 3 4 1 3 2 2 1 4 4 4 1 2 3 3 (w = 1) Observation: T becomes stationary after w inserts of each value! with leaf-buffer size w Split input into tree-growing part and searching part: 1 We built T until it is stationary, ignoring costs. 2 Determine costs of searching remaining elements. profile XS of search part independent of T Sebastian Wild Quicksort with Equal Keys 2016-03-07 8 / 12

112. Tree-Growing and Searching Quicksort costs = costs to search input U = (U1, . . . , Un) in final tree T. T fixed search cost depends only on profile X = (X1, . . . , Xu) but: T also depends on U (Recall: T is built from U!) direct analysis no simpler than for Quicksort 4 5 5 5 2 3 5 4 2 2 2 5 5 5 4 2 5 1 5 4 2 3 4 2 2 1 2 4 2 2 5 5 2 3 4 2 3 4 1 3 2 2 1 4 4 4 1 2 3 3 (w = 1) tree-growing part T Observation: T becomes stationary after w inserts of each value! with leaf-buffer size w Split input into tree-growing part and searching part: 1 We built T until it is stationary, ignoring costs. 2 Determine costs of searching remaining elements. profile XS of search part independent of T Sebastian Wild Quicksort with Equal Keys 2016-03-07 8 / 12

113. Tree-Growing and Searching Quicksort costs = costs to search input U = (U1, . . . , Un) in final tree T. T fixed search cost depends only on profile X = (X1, . . . , Xu) but: T also depends on U (Recall: T is built from U!) direct analysis no simpler than for Quicksort 4 5 5 5 2 3 5 4 2 2 2 5 5 5 4 2 5 1 5 4 2 3 4 2 2 1 2 4 2 2 5 5 2 3 4 2 3 4 1 3 2 2 1 4 4 4 1 2 3 3 (w = 1) tree-growing part T searching part XS Observation: T becomes stationary after w inserts of each value! with leaf-buffer size w Split input into tree-growing part and searching part: 1 We built T until it is stationary, ignoring costs. 2 Determine costs of searching remaining elements. profile XS of search part independent of T Sebastian Wild Quicksort with Equal Keys 2016-03-07 8 / 12

114. Tree-Growing and Searching Quicksort costs = costs to search input U = (U1, . . . , Un) in final tree T. T fixed search cost depends only on profile X = (X1, . . . , Xu) but: T also depends on U (Recall: T is built from U!) direct analysis no simpler than for Quicksort 4 5 5 5 2 3 5 4 2 2 2 5 5 5 4 2 5 1 5 4 2 3 4 2 2 1 2 4 2 2 5 5 2 3 4 2 3 4 1 3 2 2 1 4 4 4 1 2 3 3 (w = 1) tree-growing part T searching part XS Observation: T becomes stationary after w inserts of each value! with leaf-buffer size w Split input into tree-growing part and searching part: 1 We built T until it is stationary, ignoring costs. 2 Determine costs of searching remaining elements. profile XS of search part independent of T Sebastian Wild Quicksort with Equal Keys 2016-03-07 8 / 12

115. Bounding Tree-Growing Parts We ignore tree-growing in analysis have to make that part short. Allow first nT = n2/3 elements to build T. 4 5 5 5 2 3 5 4 2 2 2 5 5 5 4 2 5 1 5 4 2 3 4 2 2 1 2 4 2 2 5 5 2 3 4 2 3 4 1 3 2 2 1 4 4 4 1 2 3 3 tree-growing part T searching part XS nT Input is degenerate if a value occurs < w times in first nT elements. T not complete after nT inserts ... Have to make this happen rarely. By Chernoff bounds, input is non-degenerate w.h.p. if u = O(n1/4 ) building T costs nT · u = O(n1−ε ) Expected Quicksort cost E[Cn,u] = αu · n ± O(n1−ε ) αu = expected search cost in random T error term: covers tree-growing, sorting samples, degenerate inputs Sebastian Wild Quicksort with Equal Keys 2016-03-07 9 / 12

126. Search Costs Recall: T built from inserting i.i.d. uniform {1, . . . , u}, until saturation. αu = 1 u u v=1 cost of searching v in T (u · αu = path length of T) 1 Easy case: t = (0, 0) ordinary BST, w = 1 no duplicates in tree T is BST under random permutation model Allen & Munro 1978: Self-Organizing Binary Search Trees 2 General t = (t1, . . . , ts): duplicates inside leaves (during construction) 3 3 for s 3: duplicates in same internal node possible = random permutation But: can set up a (messy) recurrence for αu asymptotic solution by Continuous Master Theorem Roura 2001: Improved Master Theorems for Divide-and-Conquer Recurrences Sebastian Wild Quicksort with Equal Keys 2016-03-07 10 / 12

131. Search Costs Recall: T built from inserting i.i.d. uniform {1, . . . , u}, until saturation. αu = 1 u u v=1 cost of searching v in T (u · αu = path length of T) 1 Easy case: t = (0, 0) ordinary BST, w = 1 no duplicates in tree T is BST under random permutation model Allen & Munro 1978: Self-Organizing Binary Search Trees αu = 2 1 + 1 u Hu − 3 = 2 ln(u) ± O(1) 2 General t = (t1, . . . , ts): duplicates inside leaves (during construction) 3 3 for s 3: duplicates in same internal node possible = random permutation But: can set up a (messy) recurrence for αu asymptotic solution by Continuous Master Theorem Roura 2001: Improved Master Theorems for Divide-and-Conquer Recurrences Sebastian Wild Quicksort with Equal Keys 2016-03-07 10 / 12

134. Search Costs Recall: T built from inserting i.i.d. uniform {1, . . . , u}, until saturation. αu = 1 u u v=1 cost of searching v in T (u · αu = path length of T) 1 Easy case: t = (0, 0) ordinary BST, w = 1 no duplicates in tree T is BST under random permutation model Allen & Munro 1978: Self-Organizing Binary Search Trees αu = 2 1 + 1 u Hu − 3 = 2 ln(u) ± O(1) 2 General t = (t1, . . . , ts): duplicates inside leaves (during construction) 3 3 1 2 4 for s 3: duplicates in same internal node possible = random permutation But: can set up a (messy) recurrence for αu asymptotic solution by Continuous Master Theorem Roura 2001: Improved Master Theorems for Divide-and-Conquer Recurrences Sebastian Wild Quicksort with Equal Keys 2016-03-07 10 / 12

138. Result Search costs in T: αu = ¯a H ln(u) ± O(1) u → ∞ H = s r=1 tr + 1 k + 1 Hk+1 − Htr+1 k = sample size ¯a = avg. search costs inside node 1 2 3 4 (¯a = 1 for binary trees) Quicksort costs: E[Cn,u] = αu · n ± O(n1−ε ) u = O(n1/4) Time to put the pieces together! Quicksort Costs u-ary words: E[Cn,u] = ¯a H n ln(u) ± O(n) u = ω(1), u = O(n1/4 ) Permutations: E[Cn] = ¯a H n ln(n) ± O(n) Sebastian Wild Quicksort with Equal Keys 2016-03-07 11 / 12

143. Result Search costs in T: αu = ¯a H ln(u) ± O(1) u → ∞ H = s r=1 tr + 1 k + 1 Hk+1 − Htr+1 k = sample size ¯a = avg. search costs inside node 1 2 3 4 1 2 3 4 3 1 2 4 (¯a = 1 for binary trees) Quicksort costs: E[Cn,u] = αu · n ± O(n1−ε ) u = O(n1/4) Time to put the pieces together! Quicksort Costs u-ary words: E[Cn,u] = ¯a H n ln(u) ± O(n) u = ω(1), u = O(n1/4 ) Permutations: E[Cn] = ¯a H n ln(n) ± O(n) Sebastian Wild Quicksort with Equal Keys 2016-03-07 11 / 12

144. Result Search costs in T: αu = ¯a H ln(u) ± O(1) u → ∞ H = s r=1 tr + 1 k + 1 Hk+1 − Htr+1 k = sample size ¯a = avg. search costs inside node 1 2 3 4 1 2 3 4 3 1 2 4 ¯a = 2.4 (¯a = 1 for binary trees) Quicksort costs: E[Cn,u] = αu · n ± O(n1−ε ) u = O(n1/4) Time to put the pieces together! Quicksort Costs u-ary words: E[Cn,u] = ¯a H n ln(u) ± O(n) u = ω(1), u = O(n1/4 ) Permutations: E[Cn] = ¯a H n ln(n) ± O(n) Sebastian Wild Quicksort with Equal Keys 2016-03-07 11 / 12

151. Result Search costs in T: αu = ¯a H ln(u) ± O(1) u → ∞ H = s r=1 tr + 1 k + 1 Hk+1 − Htr+1 k = sample size ¯a = avg. search costs inside node 1 2 3 4 1 2 3 4 3 1 2 4 ¯a = 2.4 (¯a = 1 for binary trees) Quicksort costs: E[Cn,u] = αu · n ± O(n1−ε ) u = O(n1/4) Time to put the pieces together! Quicksort Costs u-ary words: E[Cn,u] = ¯a H n ln(u) ± O(n) u = ω(1), u = O(n1/4 ) Permutations: E[Cn] = ¯a H n ln(n) ± O(n) Results deceptively similar! (derivation isn’t) Same speedup from sampling and multi-way partitioning. Sebastian Wild Quicksort with Equal Keys 2016-03-07 11 / 12

152. Result Search costs in T: αu = ¯a H ln(u) ± O(1) u → ∞ H = s r=1 tr + 1 k + 1 Hk+1 − Htr+1 k = sample size ¯a = avg. search costs inside node 1 2 3 4 1 2 3 4 3 1 2 4 ¯a = 2.4 (¯a = 1 for binary trees) Quicksort costs: E[Cn,u] = αu · n ± O(n1−ε ) u = O(n1/4) Time to put the pieces together! Quicksort Costs u-ary words: E[Cn,u] = ¯a H n ln(u) ± O(n) u = ω(1), u = O(n1/4 ) Permutations: E[Cn] = ¯a H n ln(n) ± O(n) Results deceptively similar! (derivation isn’t) Same speedup from sampling and multi-way partitioning. Sebastian Wild Quicksort with Equal Keys 2016-03-07 11 / 12

153. Conclusion Findings First analysis of generalized Quicksort (sampling & multi-way) on equal keys. Limitations: restricted range for universe size: u = ω(1) and u = O(n1/3−ε ) currently only stateless cost measures Partial Answer to conjecture of Sedgewick & Bentley: Median-of-k Quicksort approaches lower bound for k → ∞ ... for uniform distribution. Finally cleared the item from my to-do list Future Work other discrete distributions expected-proﬁle model better asymptotic approximation for αu Sebastian Wild Quicksort with Equal Keys 2016-03-07 12 / 12

Quicksort with Equal Keys

Recommended

Recommended

More Related Content

Similar to Quicksort with Equal Keys

Similar to Quicksort with Equal Keys (20)

More from Sebastian Wild

More from Sebastian Wild (7)

Recently uploaded

Recently uploaded (20)

Quicksort with Equal Keys