Successfully reported this slideshow.
Upcoming SlideShare
×

# Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

1,464 views

Published on

I gave this talk at the 24th International Meeting on Probabilistic, Combinatorial and Asymptotic Methods for the Analysis of Algorithms (AofA 2013) on Menorca (Spain).

A paper covering the analyses of this talk (and some more!) has been submitted.
Also, in the talk, I refer to the previous speaker at the conference, my advisor Markus Nebel - corresponding results can be found in an earlier talk of mine:
slideshare.net/sebawild/average-case-analysis-of-java-7s-dual-pivot-quicksort

Check my website for preprints of papers and my other talks:
wwwagak.cs.uni-kl.de/sebastian-wild.html

Published in: Technology, News & Politics
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

### Quickselect Under Yaroslavskiy's Dual Pivoting Algorithm

1. 1. Quickselect under Yaroslavskiy’s Dual-Pivoting Algorithm Sebastian Wild Markus E. Nebel Hosam Mahmoud wild@cs.uni-kl.de Computer Science Department 28 May 2013 AofA 2013 Menorca Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 1 / 17
2. 2. Selection Consider the following problem The Selection Problem Given an unsorted array A of n data elements, ﬁnd the m-th smallest element. Applications estimate distribution of data collection in statistics world rankings in sports Algorithms trivial solution: sort(A); return A[m]; O(n log n) linear worst case possible: Median-of-Medians (Blum et al.) but rather slow in practice (compared to sorting!) Hoare 1961: Quicksort idea usable for selection Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 2 / 17
3. 3. From Quicksort to Quickselect Sort whole array by Quicksort Markus Nebel just showed: Yaroslavskiy’s partitioning can speed up Quicksort Does success of dual partitioning in sorting carry over to selection? Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
4. 4. From Quicksort to Quickselect Can prune many branches Markus Nebel just showed: Yaroslavskiy’s partitioning can speed up Quicksort Does success of dual partitioning in sorting carry over to selection? Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
5. 5. From Quicksort to Quickselect Can prune many branches Dual Pivot Quicksort Markus Nebel just showed: Yaroslavskiy’s partitioning can speed up Quicksort Does success of dual partitioning in sorting carry over to selection? Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
6. 6. From Quicksort to Quickselect Can prune many branches Prune even more branches Markus Nebel just showed: Yaroslavskiy’s partitioning can speed up Quicksort Does success of dual partitioning in sorting carry over to selection? Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
7. 7. From Quicksort to Quickselect Can prune many branches Prune even more branches Markus Nebel just showed: Yaroslavskiy’s partitioning can speed up Quicksort Does success of dual partitioning in sorting carry over to selection? Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 3 / 17
8. 8. Notation Analyse (random) number of data comparisons C (m) n to select from random permutation P < Q: (random) ranks of the two pivots subarray sizes determined by P and Q: left P rightQmiddle J1 := P − 1 J2 := Q − P − 1 J3 := n − Q selection continues recursively in left subarray, if m < P middle subarray, if P < m < Q right subarray, if Q < m or terminates if m = P or m = Q corresponding events E1 := {m < P} E2 := {P < m < Q} E3 := {Q < m} Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17
9. 9. Notation Analyse (random) number of data comparisons C (m) n to select from random permutation P < Q: (random) ranks of the two pivots subarray sizes determined by P and Q: left P rightQmiddle J1 := P − 1 J2 := Q − P − 1 J3 := n − Q selection continues recursively in left subarray, if m < P middle subarray, if P < m < Q right subarray, if Q < m or terminates if m = P or m = Q corresponding events E1 := {m < P} E2 := {P < m < Q} E3 := {Q < m} Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17
10. 10. Notation Analyse (random) number of data comparisons C (m) n to select from random permutation P < Q: (random) ranks of the two pivots subarray sizes determined by P and Q: left P rightQmiddle J1 := P − 1 J2 := Q − P − 1 J3 := n − Q selection continues recursively in left subarray, if m < P middle subarray, if P < m < Q right subarray, if Q < m or terminates if m = P or m = Q corresponding events E1 := {m < P} E2 := {P < m < Q} E3 := {Q < m} Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 4 / 17
11. 11. Quickselect Recurrence Yaroslavskiy’s partitioning preserves randomness can set up recurrence for the distribution of C (m) n : C (m) n D = Tn +    C (m) J1 if m lies in left subarray C (m−P) J2 if m lies in middle subarray C (m−Q) J3 if m lies in right subarray = Tn + 1E1 · C (m) J1 + 1E2 · C (m−P) J2 + 1E3 · C (m−Q) J3 Tn is the (random) number of comparisons during partitioning Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 5 / 17
12. 12. Getting Rid of m Complication: second parameter m in recurrence C (m) n D = Tn + 1E1 · C (m) J1 + 1E2 · C (m−P) J2 + 1E3 · C (m−Q) J3 consider simpler quantities 1 Random Rank: m = Mn D = Uniform {1, . . . , n} abbreviate ¯Cn := C (Mn) n ← focus in talk ¯Cn D = Tn+ 1E1 ¯CJ1 + 1E2 ¯CJ2 + 1E3 ¯CJ3 2 Extreme Rank: m = 1 abbreviate ˆCn := C (1) n ˆCn D = Tn + ˆCP−1 explicit solution for expectations possible (generating functions) But: Requires tedious computations More elegant shortcut to asymptotics of E[ ¯Cn] and E[ ˆCn]? Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17
13. 13. Getting Rid of m Complication: second parameter m in recurrence C (m) n D = Tn + 1E1 · C (m) J1 + 1E2 · C (m−P) J2 + 1E3 · C (m−Q) J3 consider simpler quantities 1 Random Rank: m = Mn D = Uniform {1, . . . , n} abbreviate ¯Cn := C (Mn) n ← focus in talk ¯Cn D = Tn+ 1E1 ¯CJ1 + 1E2 ¯CJ2 + 1E3 ¯CJ3 2 Extreme Rank: m = 1 abbreviate ˆCn := C (1) n ˆCn D = Tn + ˆCP−1 explicit solution for expectations possible (generating functions) But: Requires tedious computations More elegant shortcut to asymptotics of E[ ¯Cn] and E[ ˆCn]? Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17
14. 14. Getting Rid of m Complication: second parameter m in recurrence C (m) n D = Tn + 1E1 · C (m) J1 + 1E2 · C (m−P) J2 + 1E3 · C (m−Q) J3 consider simpler quantities 1 Random Rank: m = Mn D = Uniform {1, . . . , n} abbreviate ¯Cn := C (Mn) n ← focus in talk ¯Cn D = Tn+ 1E1 ¯CJ1 + 1E2 ¯CJ2 + 1E3 ¯CJ3 2 Extreme Rank: m = 1 abbreviate ˆCn := C (1) n ˆCn D = Tn + ˆCP−1 explicit solution for expectations possible (generating functions) But: Requires tedious computations More elegant shortcut to asymptotics of E[ ¯Cn] and E[ ˆCn]? Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 6 / 17
15. 15. Convergence Idea: Computation easy if we can drop index n i. e. when recurrence “in equilibrium” For that, we need stochastic convergence of ¯Cn 0 100 200 300 400 500 600 99.9% massn = 10 n = 20 n = 30 n = 40 ¯Cn But: E[ ¯Cn] grows with n high mass region moves with n ¯Cn does not converge Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 7 / 17
16. 16. Convergence Idea: Computation easy if we can drop index n i. e. when recurrence “in equilibrium” Try normalized variables ¯C∗ n := ¯Cn n instead: 0 2 4 6 8 10 12 14 16 18 99.9% massn = 10 n = 20 n = 30 n = 40 ¯Cn/n E[ ¯C∗ n] seems constant range of high mass looks bounded ¯C∗ n might converge Works because of property of Quickselect: E[ ¯Cn] = O Var( ¯Cn) Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 7 / 17
17. 17. The Contraction Method Rewrite recurrence: (Write as sum) ¯Cn D = Tn + 1E1 ¯CJ1 + 1E2 ¯CJ2 + 1E3 ¯CJ3 Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
18. 18. The Contraction Method Rewrite recurrence: (Divide by n) ¯Cn D = Tn + 3 i=1 1Ei ¯CJi Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
19. 19. The Contraction Method Rewrite recurrence: (Rearrange) ¯Cn n D = Tn n + 3 i=1 1Ei ¯CJi n · Ji Ji Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
20. 20. The Contraction Method Rewrite recurrence: (use normalized variables) ¯Cn n D = Tn n + 3 i=1 1Ei Ji n · ¯CJi Ji Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
21. 21. The Contraction Method Recurrence for ¯C∗ n ¯C∗ n D = T∗ n + 3 i=1 1Ei Ji n · ¯C∗ Ji with T∗ n := Tn n Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
22. 22. The Contraction Method Recurrence for ¯C∗ n ¯C∗ n D = T∗ n + 3 i=1 1Ei Ji n · ¯C∗ Ji ↓ ↓ T∗ A∗ i Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
23. 23. The Contraction Method Recurrence for ¯C∗ n ¯C∗ n D = T∗ n + 3 i=1 1Ei Ji n · ¯C∗ Ji ↓ ↓ 3 i=1 E |A∗ i | < 1 T∗ A∗ i Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
24. 24. The Contraction Method Recurrence for ¯C∗ n ¯C∗ n D = T∗ n + 3 i=1 1Ei Ji n · ¯C∗ Ji ↓ ↓ ↓ ↓ 3 i=1 E |A∗ i | < 1 ¯C∗ D = T∗ + 3 i=1 A∗ i · ¯C∗ Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
25. 25. The Contraction Method “Equilibrium” equation for ¯C∗ ¯C∗ D = T∗ + 3 i=1 A∗ i · ¯C∗ Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
26. 26. The Contraction Method “Equilibrium” equation for ¯C∗ (Take expectation on both sides) ¯C∗ D = T∗ + 3 i=1 A∗ i · ¯C∗ Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
27. 27. The Contraction Method “Equilibrium” equation for ¯C∗ (and solve for E ¯C∗) E ¯C∗ = E T∗+ 3 i=1 E A∗ i · E ¯C∗ Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
28. 28. The Contraction Method “Equilibrium” equation for ¯C∗ (and solve for E ¯C∗) E ¯C∗ = E T∗ 1 − 3 i=1 E A∗ i Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
29. 29. The Contraction Method “Equilibrium” equation for ¯C∗ (and solve for E ¯C∗) E ¯C∗ = E T∗ 1 − 3 i=1 E A∗ i Contraction Method: Convergence of coefﬁcients + contraction condition implies convergence to ﬁxpoint solution. have to show: (all convergences in L1-norm) 1 T∗ n → T∗ 2 1Ei Ji n → A∗ i . . . upcoming 3 3 i=1 E |A∗ i | < 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 8 / 17
30. 30. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
31. 31. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
32. 32. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
33. 33. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively 0 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
34. 34. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively 0 1 S1 S2 S3 induces pivot values: U1 = S1, U2 = S1 + S2 0 0.5 1 0 0.5 1 0 0.5 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
35. 35. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively 0 1 S1 S2 S3 U1 U2 induces pivot values: U1 = S1, U2 = S1 + S2 0 0.5 1 0 0.5 1 0 0.5 1 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
36. 36. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively 0 1 S1 S2 S3 U1 U2 J1 J2 J3 S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large) n − 2 elements to draw conditional on S Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
37. 37. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively 0 1 S1 S2 S3 U1 U2 J1 J2 J3 S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large) n − 2 elements to draw conditional on S Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
38. 38. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively 0 1 S1 S2 S3 U1 U2 J1 J2 J3 S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large) n − 2 elements to draw conditional on S Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
39. 39. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively 0 1 S1 S2 S3 U1 U2 J1 J2 J3 S1 = Pr(small) , S2 = Pr(medium) , S3 = Pr(large) n − 2 elements to draw conditional on S Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
40. 40. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively 0 1U1U1 U2 U2 Continue of sub-interval with smaller n Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
41. 41. Probabilistic Model Need “convenient” probabilistic model: same probability space for ﬁnite n and limit case Natural model: input elements U1, . . . , Un i. i. d. Uniform(0, 1) 0 1U1 U2 ﬁrst two = pivots values Our equivalent model: 1 Draw spacings S1, S2, S3 on (0, 1) with S D = Dirichlet(1, 1, 1) 2 Draw subarray sizes J1, J2, J3 with J D = Multinomial(n − 2; S1, S2,S3) 3 Continue recursively 0 1U1U1 U2 U2 Continue of sub-interval with smaller n Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 9 / 17
42. 42. Distribution of Tn — Repetition 1 T∗ n → T∗ Invariant: < p → > qg ← p ◦ q k → k’s range K G Tn = number of comparisons in ﬁrst partitioning step ∼ n one for every element + J2 second for all medium elements + l @ K second for large elements at Ck + s@ G second for small elements at Cg Markus Nebel just showed: l @ K D = Hypergeometric(n − 2; J3 , J1 + J2 ) conditional on J s@ G D = Hypergeometric(n − 2; J1 , J3 ) Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17
43. 43. Distribution of Tn — Repetition 1 T∗ n → T∗ Invariant: < p → > qg ← p ◦ q k → k’s range K G Tn = number of comparisons in ﬁrst partitioning step ∼ n one for every element + J2 second for all medium elements + l @ K second for large elements at Ck + s@ G second for small elements at Cg Markus Nebel just showed: l @ K D = Hypergeometric(n − 2; J3 , J1 + J2 ) conditional on J s@ G D = Hypergeometric(n − 2; J1 , J3 ) Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17
44. 44. Distribution of Tn — Repetition 1 T∗ n → T∗ Invariant: < p → > qg ← p ◦ q k → k’s range K G Tn = number of comparisons in ﬁrst partitioning step ∼ n one for every element + J2 second for all medium elements + l @ K second for large elements at Ck + s@ G second for small elements at Cg Markus Nebel just showed: l @ K D = Hypergeometric(n − 2; J3 , J1 + J2 ) conditional on J s@ G D = Hypergeometric(n − 2; J1 , J3 ) Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 10 / 17
45. 45. Convergence of T∗ n 1 T∗ n → T∗ T∗ n = Tn n = 1 + J2 n + l @ K n + s@ G n J2 n → S2 by law of large numbers, since E[Ji | S] = n · Si Multinomial and Hypergeometric concentrated around mean l @ K n → E l @ K n S by Chernoff bound arguments = E E l @ K | J n S = E J3 ( J1 + J2 ) n(n − 2) S ∼ S3 · (S1 + S2) s@ G n → S1 · S3 similar T∗ n → T∗ = 1 + S2 + S3(2S1 + S2) Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17
46. 46. Convergence of T∗ n 1 T∗ n → T∗ T∗ n = Tn n = 1 + J2 n + l @ K n + s@ G n J2 n → S2 by law of large numbers, since E[Ji | S] = n · Si Multinomial and Hypergeometric concentrated around mean l @ K n → E l @ K n S by Chernoff bound arguments = E E l @ K | J n S = E J3 ( J1 + J2 ) n(n − 2) S ∼ S3 · (S1 + S2) s@ G n → S1 · S3 similar T∗ n → T∗ = 1 + S2 + S3(2S1 + S2) Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17
47. 47. Convergence of T∗ n 1 T∗ n → T∗ T∗ n = Tn n = 1 + J2 n + l @ K n + s@ G n J2 n → S2 by law of large numbers, since E[Ji | S] = n · Si Multinomial and Hypergeometric concentrated around mean l @ K n → E l @ K n S by Chernoff bound arguments = E E l @ K | J n S = E J3 ( J1 + J2 ) n(n − 2) S ∼ S3 · (S1 + S2) s@ G n → S1 · S3 similar T∗ n → T∗ = 1 + S2 + S3(2S1 + S2) Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 11 / 17
48. 48. Convergence of Coefﬁcients & Contraction Condition 2 1Ei Ji n → A∗ i standard computation: 1E1 J1 n → 1F1 · S1 with F1 = {V < S1} i = 2, 3 similar 3 3 i=1 E |A∗ i | < 1 E 1F1 S1 = 1 6 by symmetry: 3 i=1 E |A∗ i | = 1 2 < 1 We (ﬁnally) proved convergence ¯C∗ n → ¯C∗. Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17
49. 49. Convergence of Coefﬁcients & Contraction Condition 2 1Ei Ji n → A∗ i standard computation: 1E1 J1 n → 1F1 · S1 with F1 = {V < S1} i = 2, 3 similar 3 3 i=1 E |A∗ i | < 1 E 1F1 S1 = 1 6 by symmetry: 3 i=1 E |A∗ i | = 1 2 < 1 We (ﬁnally) proved convergence ¯C∗ n → ¯C∗. Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17
50. 50. Convergence of Coefﬁcients & Contraction Condition 2 1Ei Ji n → A∗ i standard computation: 1E1 J1 n → 1F1 · S1 with F1 = {V < S1} i = 2, 3 similar 3 3 i=1 E |A∗ i | < 1 E 1F1 S1 = 1 6 by symmetry: 3 i=1 E |A∗ i | = 1 2 < 1 We (ﬁnally) proved convergence ¯C∗ n → ¯C∗. Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 12 / 17
51. 51. Results — Expectations Computing expectations and plugging in . . . E T∗ = 19 12 Recall: E ¯C∗ = E T∗ 1 − 3 i=1 E A∗ i3 i=1 E A∗ i = 1 2 E ¯C∗ = 19 6 = 3.16 L1 convergence E ¯Cn ∼ 3.16n similarly: E ˆCn ∼ 2.375n Comparisons Swaps 0 1 2 3 +5.55% +100% random rank Classic Quickselect Yaroslavskiy Select Comparisons Swaps 0 1 2 +18.8% +125% extreme rank Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
52. 52. Results — Expectations Computing expectations and plugging in . . . E T∗ = 19 12 Recall: E ¯C∗ = E T∗ 1 − 3 i=1 E A∗ i3 i=1 E A∗ i = 1 2 E ¯C∗ = 19 6 = 3.16 L1 convergence E ¯Cn ∼ 3.16n similarly: E ˆCn ∼ 2.375n Comparisons Swaps 0 1 2 3 +5.55% +100% random rank Classic Quickselect Yaroslavskiy Select Comparisons Swaps 0 1 2 +18.8% +125% extreme rank Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
53. 53. Results — Expectations Computing expectations and plugging in . . . E T∗ = 19 12 Recall: E ¯C∗ = E T∗ 1 − 3 i=1 E A∗ i3 i=1 E A∗ i = 1 2 E ¯C∗ = 19 6 = 3.16 L1 convergence E ¯Cn ∼ 3.16n similarly: E ˆCn ∼ 2.375n Comparisons Swaps 0 1 2 3 +5.55% +100% random rank Classic Quickselect Yaroslavskiy Select Comparisons Swaps 0 1 2 +18.8% +125% extreme rank Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
54. 54. Results — Expectations Computing expectations and plugging in . . . E T∗ = 19 12 Recall: E ¯C∗ = E T∗ 1 − 3 i=1 E A∗ i3 i=1 E A∗ i = 1 2 E ¯C∗ = 19 6 = 3.16 L1 convergence E ¯Cn ∼ 3.16n similarly: E ˆCn ∼ 2.375n Comparisons Swaps 0 1 2 3 +5.55% +100% random rank Classic Quickselect Yaroslavskiy Select Comparisons Swaps 0 1 2 +18.8% +125% extreme rank Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 13 / 17
55. 55. Pivot Sampling successful optimization of Quicksort/-select: median of three Here: generalized pivot selection scheme 1 Choose random sample of k elements 2 Sort the sample 3 Select pivots s. t. in sorted sample t1 smaller , t2 between and t3 larger than pivots Example with k = 8 and t = (3, 2, 1): U1 U2 t1 t2 t3 Our stochastic model nicely generalizes to pivot sampling: only the distribution of spacings S changes! S D = Dirichlet(t1 + 1, t2 + 1, t3 + 1) density: f(s) = s t1 1 s t2 2 s t3 3 B contraction conditions remain valid Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 14 / 17
56. 56. Pivot Sampling successful optimization of Quicksort/-select: median of three Here: generalized pivot selection scheme 1 Choose random sample of k elements 2 Sort the sample 3 Select pivots s. t. in sorted sample t1 smaller , t2 between and t3 larger than pivots Example with k = 8 and t = (3, 2, 1): U1 U2 t1 t2 t3 Our stochastic model nicely generalizes to pivot sampling: only the distribution of spacings S changes! S D = Dirichlet(t1 + 1, t2 + 1, t3 + 1) density: f(s) = s t1 1 s t2 2 s t3 3 B contraction conditions remain valid Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 14 / 17
57. 57. Results — Pivot Sampling General result: ¯Cn ∼ 1 + t2+1 k+1 + (2(t1+1)+(t2+1))(t3+1) (k+1)2 1 − 3 i=1 (ti+1)2 (k+1)2 · n inconvenient for interpretation . . . Limiting case for large sample: ti k → αi as k → ∞ = take exact quantiles as pivots ¯Cn ∼ 1 + α2 + (2α1 + α2)α3 1 − 3 i=1α2 i · n 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 α1 α2 optimal choice is not symmetric classic Quickselect with median of 2t + 1: 2.5n for t = 1 and 2.3n for t = 2 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
58. 58. Results — Pivot Sampling General result: ¯Cn ∼ 1 + t2+1 k+1 + (2(t1+1)+(t2+1))(t3+1) (k+1)2 1 − 3 i=1 (ti+1)2 (k+1)2 · n inconvenient for interpretation . . . Limiting case for large sample: ti k → αi as k → ∞ = take exact quantiles as pivots ¯Cn ∼ 1 + α2 + (2α1 + α2)α3 1 − 3 i=1α2 i · n 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 α1 α2 optimal choice is not symmetric classic Quickselect with median of 2t + 1: 2.5n for t = 1 and 2.3n for t = 2 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
59. 59. Results — Pivot Sampling General result: ¯Cn ∼ 1 + t2+1 k+1 + (2(t1+1)+(t2+1))(t3+1) (k+1)2 1 − 3 i=1 (ti+1)2 (k+1)2 · n inconvenient for interpretation . . . Limiting case for large sample: ti k → αi as k → ∞ = take exact quantiles as pivots ¯Cn ∼ 1 + α2 + (2α1 + α2)α3 1 − 3 i=1α2 i · n 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 α1 α2 optimal choice is not symmetric classic Quickselect with median of 2t + 1: 2.5n for t = 1 and 2.3n for t = 2 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
60. 60. Results — Pivot Sampling General result: ¯Cn ∼ 1 + t2+1 k+1 + (2(t1+1)+(t2+1))(t3+1) (k+1)2 1 − 3 i=1 (ti+1)2 (k+1)2 · n inconvenient for interpretation . . . Limiting case for large sample: ti k → αi as k → ∞ = take exact quantiles as pivots ¯Cn ∼ 1 + α2 + (2α1 + α2)α3 1 − 3 i=1α2 i · n 0 0.2 0.4 0.6 0.8 1 0 0.2 0.4 0.6 0.8 1 2.4654 2.5 α1 α2 optimal choice is not symmetric classic Quickselect with median of 2t + 1: 2.5n for t = 1 and 2.3n for t = 2 Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 15 / 17
61. 61. Conclusion Practitioners’ Lesson: Yaroslavskiy’s dual-pivoting not helpful for selection Analyzers’ Lesson: expected costs of Quickselect derivable by contraction method pivot sampling included naturally Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 16 / 17
62. 62. References & Further Reading M. Blum, R. W. Floyd, V. Pratt, R. L. Rivest and R. E. Tarjan. Time bounds for selection. J. of Comp. and System Sc., 7(4):448–461, 1973. C. A. R. Hoare. Algorithm 65: Find. Communications of the ACM, 4(7):321–322, July 1961. H.-K. Hwang and T.-H. Tsai. Quickselect and Dickman function. Combinatorics, Probability and Computing, 11, 2000. Hosam M. Mahmoud. Sorting: A distribution theory. John Wiley & Sons, Hoboken, NJ, USA, 2000. H. M. Mahmoud, R. Modarres and R. T. Smythe. Analysis of quickselect : an algorithm for order statistics. Informatique théorique et appl., 29(4):255–276, 1995. C. Martínez, D. Panario and A. Viola. Adaptive sampling strategies for quickselects. ACM Transactions on Algorithms, 6(3):1–45, June 2010. C. Martínez and S. Roura. Optimal Sampling Strategies in Quicksort and Quickselect. SIAM Journal on Computing, 31(3):683, 2001. R. Neininger. On a multivariate contraction method for random recursive structures with applications to Quicksort. Random Structures & Alg., 19(3-4):498–524, 2001. R. Neininger and L. Rüschendorf. A General Limit Theorem for Recursive Algorithms and Combinatorial Structures. The Annals of Applied Probability, 14(1):378–418, 2004. A. Panholzer and H. Prodinger. A generating functions approach for the analysis of grand averages for multiple QUICKSELECT. Random Structures & Alg., 13(3-4):189–209, 1998. U. Rösler and L. Rüschendorf. The contraction method for recursive algorithms. Algorithmica, 29(1):3–33, 2001. S. Wild. Java 7’s Dual Pivot Quicksort. Master thesis, University of Kaiserslautern, 2012. S. Wild, M. E. Nebel and H. Mahmoud. Analysis of Quickselect under Yaroslavskiy’s Dual-Pivoting Algorithm. Algorithmica, in preparation. S. Wild, M. E. Nebel and R. Neininger. Average Case and Distributional Analysis of Java 7’s Dual Pivot Quicksort. ACM Transactions on Algorithms, submitted. V. Yaroslavskiy. Dual-Pivot Quicksort. 2009. Sebastian Wild Java 7’s Dual Pivot Quicksort 2013/05/28 17 / 17