Successfully reported this slideshow.
Upcoming SlideShare
×

# Engineering Java 7's Dual Pivot Quicksort Using MaLiJAn

2,586 views

Published on

I gave this talk at the 2013 Meeting On Algorithm Engineering and Experiments (ALENEX) meeting.

Find my other talks and the corresponding papers on my web page:
http://wwwagak.cs.uni-kl.de/sebastian-wild.html

Published in: Technology
• Full Name
Comment goes here.

Are you sure you want to Yes No
• Be the first to comment

### Engineering Java 7's Dual Pivot Quicksort Using MaLiJAn

1. 1. Engineering Java 7’s Dual Pivot Quicksort Using MaLiJAnSebastian Wild Markus E. Nebel Raphael Reitzig Ulrich Laube [wild, nebel, r_reitzi, laube] @cs.uni-kl.de Computer Science Department University of Kaiserslautern January 7, 2013 Meeting on Algorithm Engineering & Experiments 2013 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 1 / 23
2. 2. Background Since Java 7: new dual pivot Quicksort in JRE library Basic algorithm by Vladimir Yaroslavskiy Optimizations by Jon Bentley, Joshua Bloch and others (see java.core-libs.devel mailing list) Motivated by experience with classic Quicksort Validated by running time benchmarkIn this talk: Can we exploit special properties of dual pivot Quicksort? Can we get more insight than running time measurements?. . . stay tuned Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 2 / 23
3. 3. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) p q 3 5 1 8 4 7 2 9 6 Select two elements as pivots. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
4. 4. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) p q 3 5 1 8 4 7 2 9 6 Only value relative to pivot counts. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
5. 5. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k 3 5 1 8 4 7 2 9 6 A[k] is medium go on Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
6. 6. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k 3 5 1 8 4 7 2 9 6 A[k] is small Swap to left Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
7. 7. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k 3 5 1 8 4 7 2 9 6 Swap small element to left end. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
8. 8. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k 3 1 5 8 4 7 2 9 6 Swap small element to left end. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
9. 9. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k 3 1 5 8 4 7 2 9 6 A[k] is large Find swap partner. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
10. 10. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k g 3 1 5 8 4 7 2 9 6 A[k] is large Find swap partner: g skips over large elements. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
11. 11. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k g 3 1 5 8 4 7 2 9 6 A[k] is large Swap Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
12. 12. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k g 3 1 5 2 4 7 8 9 6 A[k] is large Swap Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
13. 13. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k g 3 1 5 2 4 7 8 9 6 A[k] is old A[g], small Swap to left Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
14. 14. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k g 3 1 2 5 4 7 8 9 6 A[k] is old A[g], small Swap to left Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
15. 15. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k g 3 1 2 5 4 7 8 9 6 A[k] is medium go on Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
16. 16. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) k g 3 1 2 5 4 7 8 9 6 A[k] is large Find swap partner. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
17. 17. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) g k 3 1 2 5 4 7 8 9 6 A[k] is large Find swap partner: g skips over large elements. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
18. 18. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) g k 3 1 2 5 4 7 8 9 6 g and k have crossed! Swap pivots in place Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
19. 19. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) g k 2 1 3 5 4 6 8 9 7 g and k have crossed! Swap pivots in place Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
20. 20. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) 2 1 3 5 4 6 8 9 7 Partitioning done! Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
21. 21. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) 2 1 3 5 4 6 8 9 7 Recursively sort three sublists. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
22. 22. Java 7’s Dual Pivot Quicksort – Example Yaroslavskiy’s Dual Pivot Quicksort (used in Oracle’s Java 7 Arrays.sort(int[])) 1 2 3 4 5 6 7 8 9 Done. Invariant: <p p ◦ q k ? g >q → → ← Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 3 / 23
23. 23. Control Flow Graph of Partitioning Loop 1 bc: 3 no k g yes 2 bc: 7 t := A[k]; 7 bc: 2 yes t<p g := g − 1; yes no 3 bc: 12 4 bc: 3 yes 5 bc: 5 yes 6 bc: 3 A[k] := A[ ]; A[ ] := t; t q A[g] > q k<g := + 1; no no no 8 bc: 5 A[g] < p yes no 9 bc: 14 10 bc: 6 A[k] := A[ ]; A[k] := A[g] A[ ] := A[g] := + 1; 11 bc: 5 12 bc: 2 A[g] := t; k := k + 1 g := g − 1; Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 4 / 23
24. 24. Control Flow Graph of Partitioning Loop 1 bc: 3 no k g Cycle 1 yes 2 bc: 7 7 bc: 2 A[k]: small t := A[k]; yes g := g − 1; t<p yes A[g]: — no 3 bc: 12 4 bc: 3 yes 5 bc: 5 yes 6 bc: 3 A[k] := A[ ]; A[ ] := t; := + 1; t q A[g] > q k<g ∆(g − k): 1 no no no 8 bc: 5 A[g] < p Bytecode yes no Instructions: 24 9 bc: 14 10 bc: 6 A[k] := A[ ]; A[k] := A[g] A[ ] := A[g] := + 1; 11 bc: 5 12 bc: 2 A[g] := t; k := k + 1 g := g − 1; Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 4 / 23
25. 25. Control Flow Graph of Partitioning Loop 1 bc: 3 no k g Cycle 2 yes 2 bc: 7 7 bc: 2 A[k]: medium t := A[k]; yes g := g − 1; t<p yes A[g]: — no 3 bc: 12 4 bc: 3 yes 5 bc: 5 yes 6 bc: 3 A[k] := A[ ]; A[ ] := t; := + 1; t q A[g] > q k<g ∆(g − k): 1 no no no 8 bc: 5 A[g] < p Bytecode yes no Instructions: 15 9 bc: 14 10 bc: 6 A[k] := A[ ]; A[k] := A[g] A[ ] := A[g] := + 1; 11 bc: 5 12 bc: 2 A[g] := t; k := k + 1 g := g − 1; Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 4 / 23
26. 26. Control Flow Graph of Partitioning Loop 1 bc: 3 no k g Cycle 3 yes 2 bc: 7 7 bc: 2 A[k]: large t := A[k]; yes g := g − 1; t<p yes A[g]: large no 3 bc: 12 4 bc: 3 yes 5 bc: 5 yes 6 bc: 3 A[k] := A[ ]; A[ ] := t; := + 1; t q A[g] > q k<g ∆(g − k): 1 no no no 8 bc: 5 A[g] < p Bytecode yes no Instructions: 10 9 bc: 14 10 bc: 6 A[k] := A[ ]; A[k] := A[g] A[ ] := A[g] := + 1; 11 bc: 5 12 bc: 2 A[g] := t; k := k + 1 g := g − 1; Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 4 / 23
27. 27. Control Flow Graph of Partitioning Loop 1 bc: 3 no k g Cycle 4 yes 2 bc: 7 7 bc: 2 A[k]: large t := A[k]; yes g := g − 1; t<p yes A[g]: small no 3 bc: 12 4 bc: 3 yes 5 bc: 5 yes 6 bc: 3 A[k] := A[ ]; A[ ] := t; := + 1; t q A[g] > q k<g ∆(g − k): 2 no no no 8 bc: 5 A[g] < p Bytecode yes no Instructions: 44 9 bc: 14 10 bc: 6 A[k] := A[ ]; A[k] := A[g] A[ ] := A[g] := + 1; 11 bc: 5 12 bc: 2 A[g] := t; k := k + 1 g := g − 1; Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 4 / 23
28. 28. Control Flow Graph of Partitioning Loop 1 bc: 3 no k g Cycle 5 yes 2 bc: 7 7 bc: 2 A[k]: large t := A[k]; yes g := g − 1; t<p yes A[g]: medium no 3 bc: 12 4 bc: 3 yes 5 bc: 5 yes 6 bc: 3 A[k] := A[ ]; A[ ] := t; := + 1; t q A[g] > q k<g ∆(g − k): 2 no no no 8 bc: 5 A[g] < p Bytecode yes no Instructions: 36 9 bc: 14 10 bc: 6 A[k] := A[ ]; A[k] := A[g] A[ ] := A[g] := + 1; 11 bc: 5 12 bc: 2 A[g] := t; k := k + 1 g := g − 1; Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 4 / 23
29. 29. Asymmetry 1 bc: 3 no k g 2 yes bc: 7 Algorithm is asymmetric: t := A[k]; 7 bc: 2 yes t<p no g := g − 1; yes Cycles have different cost 3 bc: 12 A[k] := A[ ]; 4 bc: 3 yes 5 bc: 5 yes 6 bc: 3 Would rather execute cheap A[ ] := t; t q A[g] > q k<g := + 1; no no no ones often 8 bc: 5 yes A[g] < p no Cycles chosen by classes 9 bc: 14 A[k] := A[ ]; 10 bc: 6 A[k] := A[g] small , medium or large A[ ] := A[g] := + 1; Probability for classes depends 12 bc: 2 k := k + 1 11 bc: 5 A[g] := t; on pivot values g := g − 1; Maybe we can “inﬂuence pivot values accordingly”? Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 5 / 23
30. 30. Pivot Sampling Well-known optimization for classic Quicksort: median-of-three pivot closer to median of whole list In JRE7 Quicksort implementation: natural extension for 2 pivots: tertiles-of-ﬁve pivots closer to tertiles of whole list 9 other possibilities to pick p and q out of 5 elements: Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 6 / 23
31. 31. Pivot Sampling Well-known optimization for classic Quicksort: median-of-three pivot closer to median of whole list In JRE7 Quicksort implementation: natural extension for 2 pivots: tertiles-of-ﬁve pivots closer to tertiles of whole list 9 other possibilities to pick p and q out of 5 elements: Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 6 / 23
32. 32. Pivot Sampling Well-known optimization for classic Quicksort: median-of-three pivot closer to median of whole list In JRE7 Quicksort implementation: natural extension for 2 pivots: p q tertiles-of-ﬁve pivots closer to tertiles of whole list 9 other possibilities to pick p and q out of 5 elements: Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 6 / 23
33. 33. Pivot Sampling Well-known optimization for classic Quicksort: median-of-three pivot closer to median of whole list In JRE7 Quicksort implementation: natural extension for 2 pivots: p q tertiles-of-ﬁve pivots closer to tertiles of whole list 9 other possibilities to pick p and q out of 5 elements: Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 6 / 23
34. 34. Optimizing Pivot Sampling Which are “good” pivot selection schemes? Is the symmetric choice best possible? Need objective function to optimize Typical approaches to judge efﬁciency: A Count number of basic operations. (Here: number of executed Java Bytecode instructions.) B Measure total running time. Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 7 / 23
35. 35. Optimizing Pivot SamplingRelative performance of pivot sampling compared to tertiles-of-ﬁve: Pivot Selection Scheme A 1 B 2 JRE7 +5.14% +0.80% JRE7(1,3) −1.85% −0.44% +3.34% −0.42% — (stack overﬂow!) +10.6% +2.48% +2.73% +11.3% +3.31% +12.7% +3.29% +16.4% +2.48% +39.0% +5.87% 1 Average number of executed bytecodes on almost sorted lists of length 105 . 2 Average running time on random permutations of length 106 . Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 8 / 23
36. 36. MethodsSebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 9 / 23
37. 37. Model and MethodWhat made JRE7(1,3) faster than JRE7 ? . . . hard to tell from total time/bytecodes. Need a more detailed model of the program.Idea: Decompose along control ﬂow graph! 1 View program as Markov chain over blocks 2 7 Termination via absorbing state 3 4 5 6 Transition i → j has probability p(n) i→j 8 depending on input size n 9 10 Visiting block i incurs constant costs c(i) 12 11 Total cost is sum of block costs Expected costs of program = expected costs of run of Markov chain Latter easy to compute Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 9 / 23
38. 38. Model and MethodWhat made JRE7(1,3) faster than JRE7 ? . . . hard to tell from total time/bytecodes. Need a more detailed model of the program.Idea: Decompose along control ﬂow graph! 1 View program as Markov chain over blocks 2 7 Termination via absorbing state 3 4 5 6 Transition i → j has probability p(n) i→j 8 depending on input size n 9 10 Visiting block i incurs constant costs c(i) 12 11 Total cost is sum of block costs Expected costs of program = expected costs of run of Markov chain Latter easy to compute Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 9 / 23
39. 39. Maximum Likelihood AnalysisHow to determine block costs and transition probabilities?Transition Probabilities Count transitions in executions on sample data 1 Allows arbitrary input distributions! 2 Take relative frequency as estimate for p(n) i→j Extrapolate p(n) to a function pi→j (n) in n i→jBlock Costs We consider two cost measures: 1 A bc(i) = number of Bytecodes instructions in block i. 2 B t(i) = running time of block i All steps are automated in our tool MaLiJAn3 3 http://wwwagak.cs.uni-kl.de/malijan.html Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 10 / 23
40. 40. Block Sampling Running times t(i) in B are typically few nanoseconds direct measurement not possible.Idea: Sampling Based Approach 12 11 12 ns 1 2 3 1 2 4 5 6 7 5 6 7 5 6 7 8 10 1time µssampling 3 2 6 5 5 8 10 In regular intervals, store current basic block (concurrently) We observe only ≈ 1 of all blocks repeat executionRelative frequencies of observed samples approachrelative running time contribution of blocks. Count in separate run how often block i gets executed in total Together, this allows to compute t(i) Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 11 / 23
41. 41. A Decent Word of Caution 1 Determining current block adds a small systematic error. 2 Java Specialty: Just-in-time Compilation Running time heavily inﬂuenced by HotSpot JIT compiler JIT collects proﬁling information at beginning First input determines which optimizations are found . . . more details in the paper Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 12 / 23
42. 42. Input DistributionsWe consider 2 different input distributions: 1 Random Permutations well-studied in literature 2 Almost Sorted Lists Random model by Brodal et al.4 : A[i] chosen i. i. d. uniform in [i − d, i + d] for constant d (here d = 100) 4 G. Brodal, R. Fagerberg, G. Moruz: On the Adaptiveness of Quicksort,J. Exp. Algorithmics 12 (2008), pp. 3.2:1–3.2:20 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 13 / 23
43. 43. ResultsSebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 14 / 23
44. 44. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n JRE7 time -Xcomp B JRE7(1,3) JRE7 time warmup B JRE7(1,3)24 log. plot, normalized by n ln n JRE7, JRE7(1,3)23 model ﬁts data well!22 105 106 107 108 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 14 / 23
45. 45. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n JRE7 time -Xcomp B JRE7(1,3) 19.40 n ln n + 51 n 18.73 n ln n + 62 n 24 JRE7 time warmup B JRE7 JRE7(1,3) JRE7(1,3) n ln n bc24 23 log. plot, normalized by n ln n JRE7, JRE7(1,3)23 model ﬁts data well! 2222 105 106 107 108 105 106 107 108 n Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 14 / 23
46. 46. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n 15.10 n ln n + 68 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n 13.52 n ln n + 85 n JRE7 time -Xcomp B JRE7(1,3) JRE7 time warmup B JRE7(1,3)21 log. plot, normalized by n ln n20 JRE7, JRE7(1,3) model ﬁts data well!1918 105 106 107 108 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 14 / 23
47. 47. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n 15.10 n ln n + 68 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n 13.52 n ln n + 85 n JRE7 time -Xcomp B JRE7(1,3) JRE7 time warmup B JRE7(1,3) asymptotically, JRE7(1,3) executes less Bytecodes! Can we explain, why? Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 14 / 23
48. 48. Cycle Costs · cost(Cycle 5) 1 In #Bytecodes: Cycle 3 cheapest0.5 Cycle 1 most expensive of all cycles 0 bc -Xcomp with warmup Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 15 / 23
49. 49. Asymptotic Cycle Frequencies · n ln n + O(n)0.4 JRE7(1,3) executes Cycle 3 more often0.2 Cycle 1 less often than JRE7 0 JRE7 JRE7(1,3) JRE7 JRE7(1,3) random permutations almost sorted Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 16 / 23
50. 50. Asymptotic Cycle Frequencies · n ln n + O(n)0.4 JRE7(1,3) executes Cycle 3 more often0.2 Cycle 1 less often than JRE7 0 JRE7 JRE7(1,3) JRE7 JRE7(1,3) random permutations almost sorted Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 16 / 23
51. 51. Asymptotic Cycle Frequencies · n ln n + O(n)0.4 JRE7(1,3) executes Cycle 3 more often0.2 Cycle 1 less often than JRE7 0 JRE7 JRE7(1,3) JRE7 JRE7(1,3) JRE7(1,3) random permutations executes cheap Cycle 3 more often almost sorted and expensive Cycle 1 less often than JRE7. Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 Asymptotically, less executed Bytecodes! 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 16 / 23
52. 52. Running Time ResultsHow about running time? HotSpot JIT compiler has two modes -Xcomp JIT compiler without proﬁling information warmup proﬁling JIT with warmup on ﬁxed input trigger JIT compilation Do Block Sampling for both modes Should we expect same block running times? . . . stay tuned Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 17 / 23
53. 53. Cycle Costs · cost(Cycle 5) 10.5 0 bc -Xcomp with warmup Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 18 / 23
54. 54. Cycle Costs · cost(Cycle 5) 1 measures agree qualitatively0.5 but: smaller difference 0 bc tJRE7 tJRE7(1,3) tJRE7 -Xcomp with warmup Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 18 / 23
55. 55. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n 15.10 n ln n + 68 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n 13.52 n ln n + 85 n JRE7 20.10 n ln n + 26 n 11.95 n ln n + 54 n time -Xcomp B JRE7(1,3) 19.95 n ln n + 32 n 11.09 n ln n + 64 n JRE7 time warmup B JRE7(1,3) 1824 17 1622 1520 14 105 106 107 108 105 106 107 108 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 19 / 23
56. 56. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n 15.10 n ln n + 68 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n 13.52 n ln n + 85 n JRE7 20.10 n ln n + 26 n 11.95 n ln n + 54 n time -Xcomp B JRE7(1,3) 19.95 n ln n + 32 n 11.09 n ln n + 64 n JRE7 time warmup B JRE7(1,3) 1824 17 JIT without proﬁling 1622 15 asymptotically, JRE7(1,3) faster!20 14 105 106 107 108 105 106 107 108 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 19 / 23
57. 57. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n 15.10 n ln n + 68 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n 13.52 n ln n + 85 n JRE7 20.10 n ln n + 26 n 11.95 n ln n + 54 n time -Xcomp B JRE7(1,3) 19.95 n ln n + 32 n 11.09 n ln n + 64 n JRE7 10.02 n ln n + 9 n 5.52 n ln n + 13 n time warmup B JRE7(1,3) 11.39 n ln n + 15 n 5.38 n ln n + 19 n 812 610 4 105 106 107 108 105 106 107 108 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 19 / 23
58. 58. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n 15.10 n ln n + 68 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n 13.52 n ln n + 85 n JRE7 20.10 n ln n + 26 n 11.95 n ln n + 54 n time -Xcomp B JRE7(1,3) 19.95 n ln n + 32 n 11.09 n ln n + 64 n JRE7 10.02 n ln n + 9 n 5.52 n ln n + 13 n time warmup B JRE7(1,3) 11.39 n ln n + 15 n 5.38 n ln n + 19 n 812 JIT with proﬁling and warmup 610 asymptotically, JRE7(1,3) slower! 4 105 106 107 108 105 106 107 108 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 19 / 23
59. 59. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n 15.10 n ln n + 68 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n 13.52 n ln n + 85 n JRE7 20.10 n ln n + 26 n 11.95 n ln n + 54 n time -Xcomp B JRE7(1,3) 19.95 n ln n + 32 n 11.09 n ln n + 64 n JRE7 10.02 n ln n + 9 n 5.52 n ln n + 13 n time warmup B JRE7(1,3) 11.39 n ln n + 15 n 5.38 n ln n + 19 n 812 JIT with proﬁling and warmup 610 asymptotically, JRE7(1,3) slower! 4 105 106 107 108 105 106 107 108 What changes with proﬁling enabled? Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 19 / 23
60. 60. Cycle Costs · cost(Cycle 5) 1 measures agree qualitatively0.5 0 bc tJRE7 tJRE7(1,3) tJRE7 -Xcomp with warmup Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 20 / 23
61. 61. Cycle Costs · cost(Cycle 5) 1 measures agree qualitatively0.5 except for JRE7(1,3) with proﬁling JIT! 0 bc tJRE7 tJRE7(1,3) tJRE7 tJRE7(1,3) -Xcomp with warmup Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 20 / 23
62. 62. Cycle Costs · cost(Cycle 5) 1 measures agree qualitatively0.5 except for JRE7(1,3) with proﬁling JIT! 0 bc tJRE7 tJRE7 tJRE7 tJRE7 For -Xcomp (1,3), the code created by proﬁling JIT JRE7(1,3) (1,3) with warmup for Cycle 3 is much slower than for JRE7! Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 That’s the place to focus future research on. 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 20 / 23
63. 63. Cycle Costs · cost(Cycle 5) 1 measures agree qualitatively0.5 except for JRE7(1,3) with proﬁling JIT! 0 bc tJRE7 tJRE7 tJRE7 tJRE7 For -Xcomp (1,3), the code created by proﬁling JIT JRE7(1,3) (1,3) with warmup for Cycle 3 is much slower than for JRE7! Cycle 1 Cycle 2 Cycle 3 Cycle 4 Cycle 5 That’s the place to focus future research on. 1 1 1 1 1 2 7 2 7 2 7 2 7 2 7 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 3 4 5 6 8 8 8 8 8 9 10 9 10 9 10 9 10 9 10 12 11 12 11 12 11 12 11 12 11 Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 20 / 23
64. 64. ConclusionSummary Java 7’s dual pivot Quicksort is highly asymmetric. executes less Bytecodes than . Almost sorted inputs amplify impact of pivot sampling. Oracle’s proﬁling JIT compiler creates different code for JRE7(1,3) , which potentially overcompensates gains. Control ﬂow graph decomposition supported by MaLiJAn makes difference in code efﬁciency directly visible.Open Problems ? What causes different costs for Cycle 3? ? Are the differences idiosyncracies of Java / Oracle’s JRE? ? Performance of JRE7(1,3) on other inputs, especially with equal keys? Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 21 / 23
65. 65. ConclusionSummary Java 7’s dual pivot Quicksort is highly asymmetric. executes less Bytecodes than . Almost sorted inputs amplify impact of pivot sampling. Oracle’s proﬁling JIT compiler creates different code for JRE7(1,3) , which potentially overcompensates gains. Control ﬂow graph decomposition supported by MaLiJAn makes difference in code efﬁciency directly visible.Open Problems ? What causes different costs for Cycle 3? ? Are the differences idiosyncracies of Java / Oracle’s JRE? ? Performance of JRE7(1,3) on other inputs, especially with equal keys? Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 21 / 23
66. 66. Asymptotic Expected Costs Measure Algorithm Random Permutations Almost Sorted Lists JRE7 19.40 n ln n + 51 n 15.10 n ln n + 68 n Bytecodes A JRE7(1,3) 18.73 n ln n + 62 n 13.52 n ln n + 85 n JRE7 20.10 n ln n + 26 n 11.95 n ln n + 54 n time -Xcomp B JRE7(1,3) 19.95 n ln n + 32 n 11.09 n ln n + 64 n JRE7 10.02 n ln n + 9 n 5.52 n ln n + 13 n time warmup B JRE7(1,3) 11.39 n ln n + 15 n 5.38 n ln n + 19 n 812 JIT with proﬁling and warmup 610 asymptotically, JRE7(1,3) slower! 4 105 106 107 108 105 106 107 108 What changes with proﬁling enabled? Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 22 / 23
67. 67. ConclusionSummary Java 7’s dual pivot Quicksort is highly asymmetric. executes less Bytecodes than . Almost sorted inputs amplify impact of pivot sampling. Oracle’s proﬁling JIT compiler creates different code for JRE7(1,3) , which potentially overcompensates gains. Control ﬂow graph decomposition supported by MaLiJAn makes difference in code efﬁciency directly visible.Open Problems ? What causes different costs for Cycle 3? ? Are the differences idiosyncracies of Java / Oracle’s JRE? ? Performance of JRE7(1,3) on other inputs, especially with equal keys? Sebastian Wild Java 7’s Dual Pivot Quicksort 2012/09/11 23 / 23