Performance has always been a major concern in software development, and it should not be taken lightly even now that commodity computers have multicore CPUs and gigabytes of RAM. One of the handiest, simplest tools for performance testing is the microbenchmark. Unfortunately, developing correct Java microbenchmarks is a complex task with many pitfalls along the way. This presentation is about the do's and don'ts of Java microbenchmarking and about what tools are out there to help with this tricky task.
3. Microbenchmark – simple definition
1. Start the clock
2. Run the code
3. Stop the clock
4. Report
4. Better microbenchmark definition
• Small program
• Goal: measure something about a few lines of code
• All other variables should be removed
• Returns some kind of a numeric result
5. Why do I need microbenchmarks?
• Discover something about my code:
  • How fast is it?
  • Calculate throughput – TPS, KB/s
• Measure the result of changing my code:
  • Should I replace a HashMap with a TreeMap?
  • What is the cost of synchronizing a method?
6. Why are you talking about this?
• It’s hard to write a robust microbenchmark
• It’s even harder to do it in Java™
• There are not enough Java microbenchmarking tools
• There are too many flawed microbenchmarks out there
8. A microbenchmark story: the problem
The boss asks you to solve a performance issue in one of the components
Blah, blah …
9. A microbenchmark story: the cause
You find out that the cause is excessive use of Math.sqrt()
10. A microbenchmark story: a solution?
• You decide to develop a state-of-the-art square root approximation
• After developing the square root approximation, you want to benchmark it against the java.lang.Math implementation
11. SQRT approximation microbenchmark
Let’s run this little piece of code in a loop and see what happens …

public static void main(String[] args) {
    long start = System.currentTimeMillis(); // start the clock
    for (double i = 0; i < 10 * 1000 * 1000; i++) {
        mySqrt(i); // little piece of code
    }
    long end = System.currentTimeMillis(); // stop the clock
    long duration = end - start;
    System.out.format("Test duration: %d (ms) %n", duration);
}
14. SQRT microbenchmark: what’s wrong?
All of these happen inside the Java™ HotSpot virtual machine:
• Dynamic optimizations
• Garbage collection
• Dead code elimination
• Classloading
• Dynamic compilation
• On stack replacement
15. The HotSpot: a mixed mode system
1. Code is interpreted
2. Profiling
3. Dynamic compilation
4. Stuff happens
5. Interpreted again or recompiled
16. Dynamic compilation
• Dynamic compilation is unpredictable
• Don’t know when the compiler will run
• Don’t know how long the compiler will run
• Same code may be compiled more than once
• The JVM can switch to compiled code at will
20. What the heck is code hoisting ?
• Hoist = to raise or lift
• Size optimization
• Eliminate duplicated pieces of code in method bodies by hoisting expressions or statements
21. Code hoisting example
Before: a + b is a busy expression. After hoisting the expression a + b, a new local variable t has been introduced.
Optimizing Java for Size: Compiler Techniques for Code Compaction, Samuli Heilala
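A hand-written Java sketch of the same idea (illustrative only, not taken from the cited paper; the variables a, b, flag, and x are assumed locals):

// Before: the busy expression a + b is computed in both branches
int a = 2, b = 3, x;
boolean flag = true;
if (flag) {
    x = (a + b) * 2;
} else {
    x = (a + b) / 2;
}

// After hoisting: a new local variable t holds the expression once
int t = a + b;
if (flag) {
    x = t * 2;
} else {
    x = t / 2;
}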
22. Dynamic optimizations cont.
• Most of the optimizations are performed at runtime
• Profiling data is used by the compiler to improve optimization decisions
• You don’t have access to the dynamically compiled code
23. Example: Very fast square root?
10,000,000 calls to Math.sqrt() ~ 4 ms
public static void main(String[] args) {
    long start = System.nanoTime();
    int result = 0;
    for (int i = 0; i < 10 * 1000 * 1000; i++) {
        result += Math.sqrt(i);
    }
    long duration = (System.nanoTime() - start) / 1000000;
    System.out.format("Test duration: %d (ms) %n", duration);
}
24. Example: not so fast?
Now it takes ~ 2000 ms ?!?
public static void main(String[] args) {
    long start = System.nanoTime();
    int result = 0;
    for (int i = 0; i < 10 * 1000 * 1000; i++) {
        result += Math.sqrt(i);
    }
    System.out.format("Result: %d %n", result); // single line of code added
    long duration = (System.nanoTime() - start) / 1000000;
    System.out.format("Test duration: %d (ms) %n", duration);
}
25. DCE - Dead Code Elimination
• Dead code – code that has no effect on the outcome of the program execution

public static void main(String[] args) {
    long start = System.nanoTime();
    int result = 0;
    for (int i = 0; i < 10 * 1000 * 1000; i++) { // dead code: result is never used
        result += Math.sqrt(i);
    }
    long duration = (System.nanoTime() - start) / 1000000;
    System.out.format("Test duration: %d (ms) %n", duration);
}
26. OSR - On Stack Replacement
• Methods are HOT if they cumulatively execute more than 10,000 loop iterations
• Older JVM versions did not switch to the compiled version until the method exited and was re-entered
• OSR – switch from interpretation to compiled code in the middle of a loop
27. OSR and microbenchmarking
• OSR’d code may be less performant
  • Some optimizations are not performed
• OSR usually happens when you put everything into one long method
• Developers tend to write long main() methods when benchmarking
• Real-life applications are hopefully divided into more fine-grained methods (see the sketch below)
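A minimal sketch of that advice (class and method names are illustrative): extract the measured loop into its own small method, so HotSpot can compile the whole method normally instead of OSR-compiling a loop buried inside main().

public class AvoidOsr {
    // the measured work lives in its own small method
    private static double runIteration(int n) {
        double sum = 0;
        for (int i = 0; i < n; i++) {
            sum += Math.sqrt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        double sink = 0;
        // calling the method many times lets it be compiled as a whole
        for (int round = 0; round < 1000; round++) {
            sink += runIteration(10000);
        }
        System.out.println("sink = " + sink); // keep the result alive
    }
}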
28. Classloading
• Classes are usually loaded only when they are first used
• Class loading takes time
  • I/O
  • Parsing
  • Verification
• May skew your benchmark results
29. Garbage Collection
• The JVM automatically reclaims resources by
  • Garbage collection
  • Object finalization
• Outside of the developer’s control
• Unpredictable
• Should be measured if invoked as a result of the benchmarked code (see the sketch below)
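One way to measure it is via the standard management API (a hedged sketch; sample the value before and after the timed run and take the difference):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sum the cumulative GC time (in milliseconds) across all collectors
static long totalGcTimeMillis() {
    long gcTime = 0;
    for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
        long t = gc.getCollectionTime(); // -1 if unsupported
        if (t > 0) {
            gcTime += t;
        }
    }
    return gcTime;
}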
30. Time measurement
How long is one millisecond?
public static void main(String[] args) throws InterruptedException {
    long start = System.currentTimeMillis();
    Thread.sleep(1);
    final long end = System.currentTimeMillis();
    final long duration = end - start;
    System.out.format("Test duration: %d (ms) %n", duration);
}

Test duration: 16 (ms)
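You can observe this granularity yourself with a small probe (a sketch, not from the slides): spin until the reported millisecond value changes and print the size of the jump.

long t0 = System.currentTimeMillis();
long t1;
// busy-wait until the reported clock value changes
while ((t1 = System.currentTimeMillis()) == t0) { }
System.out.println("Observed timer granularity: " + (t1 - t0) + " ms");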
31. System.currentTimeMillis()
• Resolution varies with platform

Resolution   Platform                   Source
55 ms        Windows 95/98              Java Glossary
10 – 15 ms   Windows NT, 2K, XP, 2003   David Holmes
1 ms         Mac OS X                   Java Glossary
1 ms         Linux – 2.6 kernel         Markus Kobler
32. Wrong target platform
• Choosing the wrong platform for your microbenchmark
  • Benchmarking on Windows when your target platform is Linux
  • Benchmarking a highly threaded application on a single-core machine
  • Benchmarking on a Sun JVM when the target platform is Oracle (BEA) JRockit
33. Caching
• Caching
• Hardware – CPU caching
• Operating System – File system caching
• Database – query caching
34. Caching: CPU L1 and L2 caches
• The farther the accessed data is from the CPU, the higher the access latency
• Size of dataset affects access cost

Array size   Time (us)   Cost (ns)
16K          413451      9.821
8192K        5743812     136.446

Jcachev2 results for Intel® Core™2 Duo T8300, L1 = 32 KB, L2 = 3 MB
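A rough, hand-rolled version of the same kind of experiment (illustrative only, not Jcachev2 itself; the sizes are chosen to fit in L1 and to overflow L2 on the machine above):

int[] sizes = { 16 * 1024 / 4, 8192 * 1024 / 4 }; // 16 KB and 8 MB worth of ints
for (int size : sizes) {
    int[] data = new int[size];
    int sink = 0;
    long start = System.nanoTime();
    for (int i = 0; i < 100 * 1000 * 1000; i++) {
        sink += data[(i * 16) & (size - 1)]; // jump one cache line at a time
    }
    double nsPerAccess = (System.nanoTime() - start) / 100e6;
    System.out.format("size=%d ints: %.3f ns/access (sink=%d)%n", size, nsPerAccess, sink);
}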
38. Warm up your code
• Let the JVM reach a steady-state execution profile before you start benchmarking
• All classes should be loaded before benchmarking
• Usually executing your code for ~10 seconds is enough (see the sketch below)
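A minimal warm-up sketch under those assumptions (Math.sqrt() stands in for the code under test):

double sink = 0;
long warmupEnd = System.nanoTime() + 10L * 1000 * 1000 * 1000; // ~10 seconds
while (System.nanoTime() < warmupEnd) {
    sink += Math.sqrt(12345.0); // the code under test
}
System.out.println("warmup sink = " + sink); // keep the result alive
// ...start the real, timed measurement here...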
39. Warm up your code – cont.
• Detect JIT compilations by using
  • CompilationMXBean.getTotalCompilationTime()
  • -XX:+PrintCompilation
• Measure classloading time
  • Use the ClassLoadingMXBean
40. CompilationMXBean usage
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

long compilationTimeTotal = 0;
CompilationMXBean compBean = ManagementFactory.getCompilationMXBean();
if (compBean.isCompilationTimeMonitoringSupported()) {
    compilationTimeTotal = compBean.getTotalCompilationTime();
}
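The ClassLoadingMXBean mentioned on the previous slide is used the same way (a short sketch):

import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

ClassLoadingMXBean classBean = ManagementFactory.getClassLoadingMXBean();
long loadedBefore = classBean.getTotalLoadedClassCount();
// ...run the benchmark...
long loadedDuring = classBean.getTotalLoadedClassCount() - loadedBefore;
// a non-zero delta means classes were loaded during the measurement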
41. Dynamic optimizations
• Avoid on-stack replacement
  • Don’t put all your benchmark code in one big main() method
• Avoid dead code elimination (see the sketch below)
  • Print the final result
  • Report unreasonable speedups
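Besides printing, a common pattern (shown here as a hedged sketch) is to accumulate results into a sink variable that is observably used after the loop, so the computation cannot be proven dead:

double sink = 0;
for (int i = 0; i < 10 * 1000 * 1000; i++) {
    sink += Math.sqrt(i); // the result feeds the sink...
}
System.out.println("sink = " + sink); // ...and the sink is used, so the loop stays alive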
42. Garbage Collection
• Measure garbage collection time
• Force garbage collection and finalization before benchmarking (see the sketch below)
• Perform enough iterations to reach garbage collection steady state
• Gather GC stats:
  -XX:+PrintGCTimeStamps
  -XX:+PrintGCDetails
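Forcing collection before the timed run is typically done like this (a sketch; note that System.gc() is only a hint to the JVM):

// request collection and finalization of leftover garbage before measuring
System.gc();
System.runFinalization();
System.gc();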
43. Time measurement
• Use System.nanoTime()
  • Microsecond accuracy on modern operating systems and hardware
  • Not worse than currentTimeMillis()
• Notice, Windows users: the call itself executes in microseconds – don’t overuse it!
44. JVM configuration
• Use JVM options similar to your target environment (see the example below):
  • -server or -client JVM
  • Enough heap space (-Xmx)
  • Garbage collection options
  • Thread stack size (-Xss)
  • JIT compiling options
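For example, a benchmark run mimicking a server deployment might be launched like this (a sketch; the flag values and the MyBenchmark class are purely illustrative):

java -server -Xmx1024m -Xss256k -XX:+UseParallelGC MyBenchmark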
45. Other issues
• Use fixed-size data sets
  • Too large data sets can cause L1 cache blowout
• Notice system load
  • Don’t play GTA while benchmarking!
47. Java™ benchmarking tools
• Various specialized benchmarks
  • SPECjAppServer®
  • SPECjvm™
  • CaffeineMark 3.0™
  • SciMark 2.0
• Only a few benchmarking frameworks
48. Japex Micro-Benchmark framework
• Similar in spirit to JUnit
• Measures throughput – work over time
• Transactions Per Second (Default)
• KBs per second
• XML based configuration
• XML/HTML reports
49. Japex: Drivers
• Encapsulates knowledge about a specific algorithm implementation
• Must extend JapexDriverBase

public interface JapexDriver extends Runnable {
    public void initializeDriver();
    public void prepare(TestCase testCase);
    public void warmup(TestCase testCase);
    public void run(TestCase testCase);
    public void finish(TestCase testCase);
    public void terminateDriver();
}
50. Japex: Writing your own driver
public class SqrtNewtonApproxDriver extends JapexDriverBase {
    private long tmp;
    …
    @Override
    public void warmup(TestCase testCase) {
        tmp += sqrt(getNextRandomNumber());
    }
    …
}
54. Japex: pros and cons
• Pros
  • Similar to JUnit
  • Nice HTML reports
• Cons
  • Last stable release was in March 2007
  • HotSpot issues are not handled
  • XML configuration
55. Brent Boyer’s Benchmark framework
• Part of the “Robust Java benchmarking” article by Brent Boyer
• Automates as many aspects as possible:
  • Resource reclamation
  • Class loading
  • Dead code elimination
  • Statistics
56. Benchmark framework example
Benchmark.Params params = new Benchmark.Params(true);
params.setExecutionTimeGoal(0.5);
params.setNumberMeasurements(50);
Runnable task = new Runnable() {
    public void run() {
        sqrt(getNextRandomNumber());
    }
};
Benchmark benchmark = new Benchmark(task, params);
System.out.println(benchmark.toString());
57. Benchmark single line summary
Benchmark output:
first = 25.702 us,
mean = 91.070 ns (CI deltas: -115.591 ps, +171.423 ps)
sd = 1.451 us (CI deltas: -461.523 ns, +676.964 ns)
WARNING: execution times have mild outliers, SD VALUES MAY BE INACCURATE
58. Outlier and serial correlation issues
• Records outlier and serial correlation issues
  • Outliers indicate that a major measurement error happened
    • Large outliers – some other activity started on the computer during measurement
    • Small outliers might hint that DCE occurred
  • Serial correlation indicates that the JVM has not reached its steady-state performance profile
59. Benchmark : pros and cons
• Pros
  • Handles HotSpot-related issues
  • Detailed statistics
• Cons
  • Each run takes a lot of time
  • Not a formal project
  • Lacks documentation
61. Summary 1
• Microbenchmarking is hard when it comes to Java™
• Define what you want to measure and how you want to do it; pick your goals
• Know what you are doing
• Always warm up your code
• Handle DCE, OSR, and GC issues
• Use fixed-size data sets and fixed work
62. Summary 2
• Do not rely solely on microbenchmark results
• Sanity check results
• Use a profiler
• Test your code in real-life scenarios under realistic load (macro-benchmark)