- The document discusses best practices for micro-benchmarking in Java, including using frameworks like JMH that account for JVM warmup and avoid benchmark overhead.
- It explains common pitfalls like dead code elimination and loop unrolling that can incorrectly optimize away the code being measured.
- An example benchmark compares the performance of ArrayList and LinkedList iteration in different Java versions.
2. Agenda
Benchmark definition, types, common problems
Tools needed to measure performance
Code warm-up: what happens before the steady state
Using JMH
Side effects that can affect performance
JVM optimizations (Good or Evil?)
A word about concurrency
Full example, “human factor” included
A fleeting glimpse at the JMH details
4. What is a “Benchmark”?
A benchmark is a program for performance measurement
Requirements:
Dimensions: throughput and latency
Avoid significant overhead
Test what is to be tested
Perform a set of executions and provide stable, reproducible results
Should be easy to run
5. Benchmark types
By scale
• Micro-benchmark (component level)
• Macro-benchmark (system level)
By nature
• Synthetic benchmark (emulates a component’s load)
• Application benchmark (runs a real-world application)
6. We’ll talk about
Synthetic micro-benchmarks
• Mimic a component’s workload separately from the application
• Measure the performance of a small, isolated piece of code
The main concern
• The smaller the component under test, the stronger the impact of
• Benchmark infrastructure overhead
• JVM internal processes
• OS and hardware internals
• … and the phases of the Moon
• Are we sure we aren’t really measuring one of those instead?
7. When a micro-benchmark is needed
Most of the time it is not needed at all
Does algorithm A work faster than B?
(when their analytical estimates are equal)
Does this tiny modification make any difference?
(from the Java, JVM, native-code or hardware point of view)
9. You had one job…

// The naive approach: no warm-up (so it mostly measures interpreted code),
// millisecond granularity (far too coarse for small methods), and an
// effectively empty loop body that the JIT may eliminate entirely.
final int COUNT = 100;
long start = System.currentTimeMillis();
for (int i = 0; i < COUNT; i++) {
    // doStuff(); // placeholder for the code under test
}
long duration = System.currentTimeMillis() - start;
long avg = duration / COUNT; // integer division: sub-millisecond costs round to 0
System.out.println("Average execution time is " + avg + " ms");
10. Pitfall #0
Using a profiler to measure the performance of small methods
(adds significant overhead and measures the execution “as is”)
The “You had one job” approach is enough in real life
(but not for micro-benchmarks, as we have already seen)
Annotations and reflective benchmark invocations
(unless your goal is to measure java.lang.reflect itself)
11. Micro-benchmark frameworks
JMH – takes a lot of internal VM processes into account
and executes benchmarks with minimal infrastructure (Oracle)
Caliper – measures repetitive code, works for Android,
allows posting results online (Google)
Japex – helps reduce infrastructure code, generates nice
HTML reports with JFreeChart plots
JUnitPerf – measures the performance of existing JUnit tests
12. Java time interval measurement
System.currentTimeMillis()
• Value in milliseconds, but granularity depends on the OS
• Represents “wall-clock” time (since the start of the Epoch)
System.nanoTime()
• Value in nanoseconds, since some arbitrary time offset
• Its accuracy is no worse than that of System.currentTimeMillis()
ThreadMXBean.getCurrentThreadCpuTime()
• The actual CPU time spent in the thread (nanoseconds)
• Might be unsupported by your VM
• Might be expensive
• Relevant for a single thread only
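A minimal sketch (not from the slides) showing the three clocks side by side; the class name and workload are made up for illustration:

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ClockComparison {
    public static void main(String[] args) {
        long wallStart = System.currentTimeMillis(); // wall-clock, OS-dependent granularity
        long monoStart = System.nanoTime();          // monotonic, arbitrary origin

        double sink = 0;                             // consume results to keep the loop alive
        for (int i = 1; i <= 1_000_000; i++) sink += Math.log(i);

        System.out.println("wall-clock: " + (System.currentTimeMillis() - wallStart) + " ms");
        System.out.println("monotonic:  " + (System.nanoTime() - monoStart) + " ns");

        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (threads.isCurrentThreadCpuTimeSupported()) { // may be unsupported, as noted above
            System.out.println("thread CPU: " + threads.getCurrentThreadCpuTime() + " ns");
        }
        System.out.println(sink);
    }
}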
14. Code warm-up, class loading
A single warm-up iteration is NOT enough for class loading
(not every branch is taken on the first iteration, so some classes load later)
Sometimes classes are unloaded (it would be a shame if
something messed your results up with a huge peak)
Get help between iterations from
• ClassLoadingMXBean.getTotalLoadedClassCount()
• ClassLoadingMXBean.getUnloadedClassCount()
• -verbose:class
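A sketch of such a check between iterations (the surrounding harness and runIteration are assumptions, not from the slides):

import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

public class ClassLoadCheck {
    public static void main(String[] args) {
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        long loaded = classes.getTotalLoadedClassCount();
        long unloaded = classes.getUnloadedClassCount();

        runIteration(); // placeholder for one benchmark iteration

        // if either counter moved, class (un)loading polluted this iteration
        boolean stable = classes.getTotalLoadedClassCount() == loaded
                      && classes.getUnloadedClassCount() == unloaded;
        System.out.println(stable ? "iteration clean" : "discard this iteration");
    }

    private static void runIteration() { /* the code under test */ }
}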
15. Code warm-up, compilation
Classes are loaded, verified and then compiled
Oracle HotSpot and Azul Zing start running the application in the interpreter
A hot method is compiled after
~10k invocations (server) or ~1.5k (client)
Long methods with loops are likely to be compiled earlier
Check CompilationMXBean.getTotalCompilationTime
Enable compilation logging with
• -XX:+UnlockDiagnosticVMOptions
• -XX:+PrintCompilation
• -XX:+LogCompilation -XX:LogFile=<filename>
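A sketch of watching the JIT settle during warm-up (harness and warmupIteration are assumptions):

import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class WarmupUntilCompiled {
    public static void main(String[] args) {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) return; // not every VM

        long before;
        do {
            before = jit.getTotalCompilationTime(); // ms spent compiling so far
            warmupIteration();                      // placeholder warm-up work
        } while (jit.getTotalCompilationTime() > before); // repeat while the JIT is still busy
        // steady state (probably) reached: start the measured iterations here
    }

    private static void warmupIteration() { /* the code under test */ }
}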
16. Code warm-up, OSR
Normal compilation and OSR (on-stack replacement) result in similar code
…unless the compiler is unable to optimize the given frame
(e.g. the inner loop is compiled before the outer one)
In the real world normal compilation is more likely to happen, so
it’s better to avoid OSR in your benchmark
• Do a set of small warm-up iterations instead of a single big one
• Do not perform warm-up loops in the steady-state testing method
17. Code warm-up, OSR example
Now forget about array range-check elimination

Before:
public static void main(String... args) {
  loop1: if (P1) goto done1
    i = 0;
    loop2: if (P2) goto done2
      A[i++];
    goto loop2; // OSR goes here
    done2:
    goto loop1;
  done1:
}

After:
void OSR_main() {
  A = // from interpreter
  i = // from interpreter
  loop2: if (P2) {
    if (P1) goto done1
    i = 0;
  } else { A[i++]; }
  goto loop2
  done1:
}
18. Reaching the steady-state, summary
Always do a warm-up to reach the steady state
• Use the same data and the same code
• Discard warm-up results
• Avoid OSR
• Don’t run the benchmark in mixed mode (interpreter/compiler)
• Check class loading and compilation
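To make the warm-up shape concrete, a sketch (assumptions, not from the slides): many short invocations of the measured method, so it is compiled via the normal path instead of OSR, and no warm-up loop inside the measured method itself:

public class WarmupShape {
    public static void main(String[] args) {
        // many small invocations: the method crosses the invocation threshold
        // (~10k in server mode) and is compiled through the normal path
        for (int i = 0; i < 20_000; i++) {
            payload();
        }
        long start = System.nanoTime();   // measure only after the warm-up, same code and data
        for (int i = 0; i < 10_000; i++) {
            payload();
        }
        System.out.println((System.nanoTime() - start) / 10_000 + " ns/op");
    }

    private static double sink;           // field store prevents dead-code elimination

    private static void payload() { sink += Math.log(Math.E + sink); }
}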
19. Using JMH
Provides a Maven archetype for quick project setup
Annotate your methods with @GenerateMicroBenchmark
mvn install builds a ready-to-run jar with your
benchmarks and the needed infrastructure
java -jar target/mb.jar <benchmark regex> [options]
performs the warm-up followed by a set of iterations
and prints the results
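A minimal benchmark class in the deck’s JMH vintage (later JMH renames @GenerateMicroBenchmark to @Benchmark; the class name is made up, the jar name follows the slide above):

import org.openjdk.jmh.annotations.GenerateMicroBenchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)            // per-thread benchmark state
public class LogBenchmark {
    private double x = Math.PI; // a field, so the JIT cannot constant-fold it

    @GenerateMicroBenchmark
    public double measureLog() {
        return Math.log(x);     // returned value is consumed by the JMH infrastructure
    }
}

// mvn install
// java -jar target/mb.jar ".*LogBenchmark.*"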
22. Synchronization puzzle, side effect
Biased locking: a VM optimization that leaves an object
logically locked by a given thread even after the thread has
released the lock (cheap reacquisition)
Does not work during VM start-up
(the first ~4 seconds in HotSpot)
Use -XX:BiasedLockingStartupDelay=0
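So a lock-heavy benchmark can be launched with the delay disabled; a sketch of the invocation (jar name and regex placeholder as in the earlier slide):

java -XX:BiasedLockingStartupDelay=0 -jar target/mb.jar <benchmark regex>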
23. JVM optimizations (Good or Evil?)
WARNING: some of the following optimizations
will not work (at least for the given examples)
in Java 6 (jdk1.6.0_26); consider using Java 7 (jdk1.7.0_21)
24. Dead code elimination
The VM eliminates dead branches of code,
even code that is actually executed,
if its result is never used and has no side effects
Always consume all the results of your benchmarked code,
or you’ll get an “over 9000” performance level
Do not accumulate results or store them in class fields that are
never used either
Use them in a non-obvious logical expression instead
25. Dead code elimination, example
Measurement: average nanoseconds / operation, less is better

private double n = 10;

public void stub() { }

public void dead() {
    @SuppressWarnings("unused")
    double r = n * Math.log(n) / 2; // result never used: eliminated
}

public void alive() {
    double r = n * Math.log(n) / 2;
    if (r == n && r == 0)           // non-obvious expression consumes r
        throw new IllegalStateException();
}

stub: 1.017
dead: 1.008 (eliminated: as fast as the empty stub)
alive: 48.514 (the real cost: the result is consumed)
26. Constant folding
If the compiler sees that the result of a calculation will always be
the same, it stores the result in a constant and reuses it
Measurement: average nanoseconds / operation, less is better

private double x = Math.PI;

public void stub() { }

public double wrong() {
    return Math.log(Math.PI); // constant expression: folded at compile time
}

public double measureRight() {
    return Math.log(x);       // reads a field: actually computed every time
}

stub: 1.014
wrong: 1.695 (folded into a constant)
measureRight: 43.435 (the real cost)
27. Loop unrolling
Is there anything bad here?
Measurement: average nanoseconds / operation, less is better

private double[] A = new double[2048];

public double plain() {
    double sum = 0;
    for (int i = 0; i < A.length; i++)
        sum += A[i];
    return sum;
}

public double manualUnroll() {
    double sum = 0;
    for (int i = 0; i < A.length; i += 4)
        sum += A[i] + A[i + 1] + A[i + 2] + A[i + 3];
    return sum;
}

plain: 2773.883
manualUnroll: 816.791
28. Loop unrolling and hoisting
Something bad happens when
the loops of the benchmark infrastructure code are unrolled,
and the calculations we are trying to measure
are hoisted out of the loop
For example, a Caliper-style benchmark looks like

private int reps(int reps) {
    int s = 0;
    for (int i = 0; i < reps; i++)
        s += (x + y);
    return s;
}
29. Loop unrolling and hoisting, example

@GenerateMicroBenchmark
public int measureRight() {
    return (x + y);
}

@GenerateMicroBenchmark
@OperationsPerInvocation(1)
public int measureWrong_1() {
    return reps(1);
}

...

@GenerateMicroBenchmark
@OperationsPerInvocation(N)
public int measureWrong_N() {
    return reps(N);
}
30. Loop unrolling and hoisting, example
Measurement: average nanoseconds / operation, less is better

Method          Result
Right           2.104
Wrong_1         2.055
Wrong_10        0.267
Wrong_100       0.033
Wrong_1000      0.057
Wrong_10000     0.045
Wrong_100000    0.043
31. A word about concurrency
Processes and threads fight for resources
(a single-threaded benchmark is a utopia)
32. Concurrency problems of benchmarks
Benchmark state should be correctly
• Initialized
• Published
• Shared between the right group of threads
A multi-threaded benchmark iteration should be synchronized,
and all threads should start their work at the same time
No need to implement this infrastructure yourself;
just write a correct benchmark using your favorite framework
34. List iteration
Which list implementation is faster in a foreach loop?
ArrayList and LinkedList sequential iteration is linear, O(n)
• ArrayList Iterator.next(): return array[cursor++];
• LinkedList Iterator.next(): return current = current.next;
Let’s check with a list of 1 million Integers
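The arrayList field used in the following slides could be set up as benchmark state; a sketch (annotations in the later JMH names, the class name ListBenchmark taken from a later slide):

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;

@State(Scope.Thread)
public class ListBenchmark {
    List<Integer> arrayList;
    List<Integer> linkedList;

    @Setup // runs once before measurement, so list construction is not measured
    public void setUp() {
        arrayList = new ArrayList<>(1_000_000);
        for (int i = 0; i < 1_000_000; i++) arrayList.add(i);
        linkedList = new LinkedList<>(arrayList);
    }
}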
35. List iteration, foreach vs iterator
Measurement: average milliseconds / operation, less is better

public List<Integer> arrayListForeach() {
    for (Integer i : arrayList) {
    }
    return arrayList;
}

public Iterator<Integer> arrayListIterator() {
    Iterator<Integer> iterator = arrayList.iterator();
    while (iterator.hasNext()) {
        iterator.next();
    }
    return iterator;
}

arrayListForeach: 23.659
arrayListIterator: 22.445
36. List iteration, foreach < iterator, why?
The foreach variant assigns each element to a local variable:
for (Integer i : arrayList)
The iterator variant does not:
iterator.next();
We need to change the iterator variant to
Integer i = iterator.next();
Now it is correct to compare the results, at least according to
the bytecode
37. List iteration, benchmark

@GenerateMicroBenchmark(BenchmarkType.All)
public List<Integer> arrayListForeach() {
    for (Integer i : arrayList) {
    }
    return arrayList;
}

@GenerateMicroBenchmark(BenchmarkType.All)
public Iterator<Integer> arrayListIterator() {
    Iterator<Integer> iterator = arrayList.iterator();
    while (iterator.hasNext()) {
        Integer i = iterator.next();
    }
    return iterator;
}
38. List iteration, benchmark, result
Measurement: average milliseconds / operation, less is better

List impl     Iteration    Java 6    Java 7
ArrayList     foreach      24.792    5.118
ArrayList     iterator     24.769    0.140
LinkedList    foreach      15.236    9.485
LinkedList    iterator     15.255    9.306

Java 6 ArrayList uses AbstractList.Itr, while LinkedList has its
own iterator, so there are fewer abstractions
(in Java 7 ArrayList has its own optimized iterator)
39. List iteration, benchmark, result
Measurement: average milliseconds / operation, less is better

List impl     Iteration    Java 6    Java 7
ArrayList     foreach      24.792    5.118
ArrayList     iterator     24.769    0.140  ← WTF?!
LinkedList    foreach      15.236    9.485
LinkedList    iterator     15.255    9.306
40. List iteration, benchmark, loop-hoisting

ListBenchmark.arrayListIterator()
    Iterator<Integer> iterator = arrayList.iterator();
    while (iterator.hasNext()) {
        iterator.next();   // the returned value is never used
    }
    return iterator;

ArrayList.Itr<E>.next()
    if (modCount != expectedModCount) throw new CME(); // ConcurrentModificationException
    int i = cursor;
    if (i >= size) throw new NoSuchElementException();
    Object[] elementData = ArrayList.this.elementData;
    if (i >= elementData.length) throw new CME();
    cursor = i + 1;
    return (E) elementData[lastRet = i];
41. List iteration, benchmark, BlackHole

@GenerateMicroBenchmark(BenchmarkType.All)
public void arrayListForeach(BlackHole bh) {
    for (Integer i : arrayList) {
        bh.consume(i);
    }
}

@GenerateMicroBenchmark(BenchmarkType.All)
public void arrayListIterator(BlackHole bh) {
    Iterator<Integer> iterator = arrayList.iterator();
    while (iterator.hasNext()) {
        Integer i = iterator.next();
        bh.consume(i);
    }
}
42. List iteration, benchmark, correct result
Measurement: average milliseconds / operation, less is better

List impl     Iteration    Java 6    Java 7    Java 7 + BlackHole
ArrayList     foreach      24.792    5.118     8.550
ArrayList     iterator     24.769    0.140     8.608
LinkedList    foreach      15.236    9.485     11.739
LinkedList    iterator     15.255    9.306     11.763
43. A fleeting glimpse at the JMH details
We already know that JMH
• Uses Maven
• Uses an annotation-driven approach to detect benchmarks
• Provides BlackHole to consume results (and CPU cycles)
44. JMH: Building infrastructure
Finds annotated micro-benchmarks using reflection
Generates plain-Java infrastructure source code
around the calls to the micro-benchmarks
Compile, pack, run, profit
No reflection during benchmark execution
45. JMH: Various metrics
Single execution time
Operations per time unit
Average time per operation
Percentile estimation of time per operation
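These correspond to JMH’s benchmark modes; a sketch using the later annotation names (@Benchmark-era JMH; the deck’s version expressed this differently, e.g. @GenerateMicroBenchmark(BenchmarkType.All)):

import java.util.concurrent.TimeUnit;
import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class ModeExamples {
    private double x = Math.PI; // field read prevents constant folding

    @Benchmark @BenchmarkMode(Mode.SingleShotTime) // single execution time (cold)
    public double singleShot() { return Math.log(x); }

    @Benchmark @BenchmarkMode(Mode.Throughput)     // operations per time unit
    public double throughput() { return Math.log(x); }

    @Benchmark @BenchmarkMode(Mode.AverageTime)    // average time per operation
    public double averageTime() { return Math.log(x); }

    @Benchmark @BenchmarkMode(Mode.SampleTime)     // percentile estimation per operation
    public double sampleTime() { return Math.log(x); }
}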
46. JMH: Concurrency infrastructure
@State declares whether benchmark data is shared across the whole
benchmark, per thread, or within a group of threads
Allows running fixtures (setUp and tearDown) in the scope of
the whole run, an iteration, or a single invocation
@Threads is a simple way to run a concurrent test, provided you
defined a correct @State
@Group threads to assign them a particular role in the
benchmark
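A sketch of these pieces working together, again with the later annotation names (@GroupThreads and Level are assumptions about the exact JMH vintage):

import java.util.concurrent.atomic.AtomicLong;
import org.openjdk.jmh.annotations.*;

@State(Scope.Group)                 // one instance shared within each thread group
public class ReaderWriterBenchmark {
    private AtomicLong counter;

    @Setup(Level.Iteration)         // fixture: runs before every iteration
    public void setUp() { counter = new AtomicLong(); }

    @Benchmark @Group("rw") @GroupThreads(3) // three reader threads per group...
    public long reader() { return counter.get(); }

    @Benchmark @Group("rw") @GroupThreads(1) // ...and one writer
    public long writer() { return counter.incrementAndGet(); }
}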
47. JMH: VM forking
Allows comparing results obtained from several VM instances
• The first test would run on a clean JVM, later ones would not
• VM processes are non-deterministic and may vary from run to run
(compilation order, multi-threading, randomization)
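A sketch of forking, assuming the @Fork annotation (the CLI flag -f does the same; names are illustrative):

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class ForkedBenchmark {
    private double x = Math.E;

    @Benchmark
    @Fork(5)                        // run in 5 fresh JVMs; results are aggregated
    public double measure() { return Math.sqrt(x); }
}

// or from the command line: java -jar target/mb.jar ".*ForkedBenchmark.*" -f 5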
48. JMH: @CompilerControl
Instructs the VM whether to compile a method or not
Instructs the VM whether to inline methods
Inserts breakpoints into the generated code
Prints method assembly
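A sketch of the annotation in use (CompilerControl.Mode.DONT_INLINE and Mode.EXCLUDE exist in JMH; the benchmark bodies are illustrative):

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
public class CompilerControlExamples {
    private double x = Math.PI;

    @Benchmark
    @CompilerControl(CompilerControl.Mode.DONT_INLINE) // keep the call boundary visible
    public double notInlined() { return Math.log(x); }

    @Benchmark
    @CompilerControl(CompilerControl.Mode.EXCLUDE)     // never JIT-compile: stays interpreted
    public double interpretedOnly() { return Math.log(x); }
}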
49. Conclusions
Do not reinvent the wheel if you are not sure how it should work
(consider using an existing one)
Consider the results wrong when you cannot clearly explain them;
do not swallow mysterious behavior