Performance has always been a major concern in software development and should not be taken lightly, even now that commodity computers have multicore CPUs and a few gigabytes of RAM. One of the handiest, simplest tools for performance testing is the microbenchmark. Unfortunately, developing correct Java microbenchmarks is a complex task with many pitfalls along the way. This presentation covers the do's and don'ts of Java microbenchmarking and the tools that are out there to help with this tricky task.
Java Tools and Techniques for Solving Tricky Problems – Will Iverson
Most Java software problems come from the little “broken windows” – a null pointer here or there. Sometimes, however, you find yourself in a nasty section of town, with the heap, stack, and permgen brutally fighting for memory. Threads in nasty knife fights over resources. Sometimes just plain freaky things – how did I wind up with 1.5GB of HashSet allocations?
In this edition of CSI: Seattle Java Edition, we’ll look at the tools available to combat these nasty foes and even see some of them in action – we will blow up a lot of application servers and JVMs in the process, with graphic results.
For more information, refer to the Java EE 7 Performance Tuning and Optimization book, published by Packt Publishing:
http://www.packtpub.com/java-ee-7-performance-tuning-and-optimization/book
There are a number of tools that ship as part of a JDK installation.
Often, you can rely on these alone to successfully analyse issues, without resorting to (often expensive) third-party tools. Better still, because they are part of the JDK, they can be used as early as development and testing!
Taking and analysing memory dumps, capturing stack traces of Java processes running on a particular system, monitoring GC activity, and more – all from the command line, just as you would hope when accessing that well-protected machine in a data centre somewhere far away.
This session will walk through a number of these tools and discuss their purpose and capabilities, followed by demonstrations of the most common usages.
Unleash the power of the tools that you already have, today!
Software Profiling: Java Performance, Profiling and Flamegraphs – Isuru Perera
Guest lecture at University of Colombo School of Computing on 30th May 2018
Covers the following topics:
Software Profiling
Measuring Performance
Java Garbage Collection
Sampling vs Instrumentation
Java Profilers and Java Flight Recorder
Java Just-in-Time (JIT) compilation
Flame Graphs
Linux Profiling
Efficient Memory and Thread Management in Highly Parallel Java Applications – Phillip Koza
This presentation discusses strategies to estimate and control the memory use of multi-threaded Java applications. It includes a quick overview of how the JVM uses memory, followed by techniques to estimate the memory usage of various types of objects during testing. This knowledge is then used as the basis for a runtime scheme to estimate and control the memory use of multiple threads. The final part of the presentation describes how to implement robust handling for unchecked exceptions, especially Out Of Memory (OOM) errors, and how to ensure threads stop properly when unexpected events occur.
Accelerated .NET Memory Dump Analysis training public slides – Dmitry Vostokov
The slides from Software Diagnostics Services .NET memory dump analysis training. The training description: "Covers 22 .NET memory dump analysis patterns plus additional 11 unmanaged patterns. Learn how to analyze CLR 4 .NET application and service crashes and freezes, navigate through memory dump space (managed and unmanaged code) and diagnose corruption, leaks, CPU spikes, blocked threads, deadlocks, wait chains, resource contention, and much more. The training consists of practical step-by-step exercises using Microsoft WinDbg debugger to diagnose patterns in 64-bit and 32-bit process memory dumps. The training uses a unique and innovative pattern-oriented analysis approach to speed up the learning curve. The third edition was fully reworked to use the latest WinDbg version and Windows 10. It also includes 9 optional legacy exercises from the previous editions covering CLR 2 and 4, Windows Vista and Windows 7. Prerequisites: Basic .NET programming and debugging. Audience: Software technical support and escalation engineers, system administrators, DevOps, performance and reliability engineers, software developers and quality assurance engineers."
The Performance Engineer's Guide To HotSpot Just-in-Time Compilation – Monica Beckwith
Adaptive compilation and runtime in the OpenJDK HotSpot VM offer significant performance enhancements for our tools and applications in Java and other JVM languages. Understanding how they work gives developers critical information on HotSpot JIT compilation and runtime techniques, such as vectorization and compressed OOPs, and helps in understanding performance for both client and server applications. We will focus on the internals of OpenJDK 8, the reference implementation for Java SE 8.
Title: Sista: Improving Cog’s JIT performance
Speaker: Clément Béra
Thu, August 21, 9:45am – 10:30am
Video Part1
https://www.youtube.com/watch?v=X4E_FoLysJg
Video Part2
https://www.youtube.com/watch?v=gZOk3qojoVE
Description
Abstract: Although recent improvements to the Cog VM's performance have made it one of the fastest Smalltalk virtual machines available, the overhead compared to optimized C code remains significant. Efficient industrial object-oriented virtual machines, such as the V8 JavaScript engine in Google Chrome and Oracle's Java HotSpot, can reach the performance of optimized C code on many benchmarks thanks to the adaptive optimizations performed by their JIT compilers. The VM becomes cleverer: after executing the same portion of code numerous times, it pauses execution, analyses what the code is doing, and recompiles critical portions into faster code based on the current environment and previous executions.
Bio: Clément Béra and Eliot Miranda have been working together on Cog's JIT performance for the last year. Clément Béra is a young engineer who has been working in the Pharo team for the past two years. Eliot Miranda is a Smalltalk VM expert who, among other things, implemented Cog's JIT and the Spur memory manager for Cog.
Have you ever wondered how to speed up your code in Python? This presentation will show you how to start. I will begin with a guide to locating performance bottlenecks and then give you some tips on how to speed up your code. I would also like to discuss how to avoid premature optimization, as it may be ‘the root of all evil’ (at least according to D. Knuth).
High Performance Erlang - Pitfalls and Solutions – Yinghai Lu
Presented at Erlang Factory 2016, San Francisco, CA.
Erlang is widely used for building concurrent applications. However, when we push the performance of our Erlang-based application to handle millions of concurrent clients, some Erlang scalability issues begin to show and some conventional Erlang programming paradigms no longer hold. We would like to share some of these issues and how we address them. In addition, we share some of our experience on how to profile an Erlang application to identify bottlenecks.
We will take a deep look at some of the basic mechanisms of Erlang and show how they behave under high load and parallelism, which includes message delivery, process management and shared data structures such as maps and ETS tables. We will demonstrate their limitations and propose techniques to alleviate the issues.
We will also share profiling techniques on how to find those bottlenecks in Erlang applications across different levels. We will share techniques for writing highly performant Erlang applications.
Get Lower Latency and Higher Throughput for Java Applications – ScyllaDB
Getting the best performance out of your Java applications can often be a challenge due to the managed environment nature of the Java Virtual Machine and the non-deterministic behaviour that this introduces. Automatic garbage collection (GC) can seriously affect the ability to hit SLAs for the 99th percentile and above.
This session will start by looking at what we mean by speed and how the JVM, whilst extremely powerful, means we don’t always get the performance characteristics we want. We’ll then move on to discuss some critical features and tools that address these issues, i.e. garbage collection, JIT compilers, etc. At the end of the session, attendees will have a clear understanding of the challenges and solutions for low-latency Java.
When Node.js Goes Wrong: Debugging Node in Production
The event-oriented approach underlying Node.js enables significant concurrency using a deceptively simple programming model, which has been an important factor in Node's growing popularity for building large scale web services. But what happens when these programs go sideways? Even in the best cases, when such issues are fatal, developers have historically been left with just a stack trace. Subtler issues, including latency spikes (which are just as bad as correctness bugs in the real-time domain where Node is especially popular) and other buggy behavior often leave even fewer clues to aid understanding. In this talk, we will discuss the issues we encountered in debugging Node.js in production, focusing upon the seemingly intractable challenge of extracting runtime state from the black hole that is a modern JIT'd VM.
We will describe the tools we've developed for examining this state, which operate on running programs (via DTrace), as well as VM core dumps (via a postmortem debugger). Finally, we will describe several nasty bugs we encountered in our own production environment: we were unable to understand these using existing tools, but we successfully root-caused them using these new found abilities to introspect the JavaScript VM.
The Performance Engineer's Guide to Java (HotSpot) Virtual Machine – Monica Beckwith
Monica Beckwith has worked with the Java Virtual Machine for more than a decade, not just optimizing JVM heuristics but also improving Just-in-Time (JIT) code quality for various processor architectures, working with the garbage collectors, and improving garbage collection for server systems.
During this talk, Monica will cover a few JIT and runtime optimizations, dive into HotSpot garbage collection, and provide an overview of the various garbage collectors available in HotSpot.
Instrumenting application code is like flossing your teeth. Developers know they ought to be doing it more often. Code instrumentation is an important practice for establishing baseline performance metrics and identifying bottlenecks. Getting the right metrics is core to understanding how much concurrency your application can handle, determining what latency is normal for the application, and indicating when performance is deviating from those norms.
While most developers acknowledge the value of instrumentation, few actually implement it. If bytecode injection sounds as scary as a root canal, take heart: effective instrumentation doesn't have to be complicated. I've written an open-source instrumentation framework to encourage developers to get the metrics they need to pilot their application safely. We'll examine some strategies for code instrumentation, run some load tests, and make sense of the numbers.
100 bugs in Open Source C/C++ projects – Andrey Karpov
This article demonstrates the capabilities of the static code analysis methodology. Readers are invited to study samples of one hundred errors found in open-source C/C++ projects.
Mingbo Zhang, Rutgers University
Saman Zonouz, Rutgers University
Time-of-check-to-time-of-use (TOCTOU), also known as a “race condition” or “double fetch”, is a long-standing problem. Since memory reads and writes are such common operations, they barely trigger any security mechanisms. We leverage a CPU feature called SMAP (Supervisor Mode Access Prevention) to efficiently monitor events where the kernel accesses user-mode memory. When user pages are accessed by the kernel, our mitigation kicks in and protects them against further modification from other user-mode threads. We also leverage the same CPU feature to find double-fetch errors in kernel modules. A simple hypervisor is used to confine a system-wide CPU feature such as SMAP to a particular process.
At the JavaOne keynote this year, Mark Reinhold talked about how Java 9 is much bigger than Jigsaw. To put that in numbers – 80+ JEPs bigger! Yes, we see more presentations on Jigsaw, since it brings modularity to the once monolithic JDK. But what about those other JEPs? One of those "other" JEPs is JEP 143 – 'Improve Contended Locking'. Monica will apply her performance engineering approach and talk about JEP 143 and Oracle's Studio Analyzer Performance Tool. The crux of the presentation will entail comparing the performance of contended locks in JDK 9 to JDK 8.
The Java Memory Model describes how threads in the Java programming language interact through memory. Together with the description of single-threaded execution of code, the memory model provides the semantics of the Java programming language.
It is crucial for a programmer to know how to write correctly synchronized, race-free programs according to the Java Language Specification.
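To make the stakes concrete, here is a minimal illustration (my sketch, not from the talk): the unsynchronized increment below is a data race under the Java Memory Model, while the synchronized variant establishes the happens-before ordering that makes it correct.

public class Counter {
    private int count = 0; // shared mutable state

    // Racy: two threads can read the same value and both write count + 1,
    // losing an update; the memory model gives no ordering guarantee here.
    public void incrementRacy() { count++; }

    // Correctly synchronized: the lock creates a happens-before edge between
    // one thread's write and the next thread's read.
    public synchronized void incrementSafe() { count++; }

    public synchronized int get() { return count; }
}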
Performs code analysis in C, C++, C++/CLI, C++/CX, and C#. Plugin for Visual Studio 2010-2015. Integrates with SonarQube, QtCreator, CLion, Eclipse CDT, Anjuta DevStudio, and so on. Also available as a standalone utility. Offers direct integration of the analyzer into build automation systems and the BlameNotifier utility (e-mail notification), automatic analysis of modified files, and great scalability. Why do people need code analyzers?
3. Microbenchmark – simple definition
1. Start the clock
2. Run the code
3. Stop the clock
4. Report
4. Better microbenchmark definition
• Small program
• Goal: measure something about a few lines of code
• All other variables should be removed
• Returns some kind of numeric result
5. Why do I need microbenchmarks?
• Discover something about my code:
• How fast is it
• Calculate throughput – TPS, KB/s (see the sketch below)
• Measure the result of changing my code:
• Should I replace a HashMap with a TreeMap?
• What is the cost of synchronizing a method?
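As a concrete illustration of the throughput case, here is a minimal sketch (not from the deck; Math.sqrt stands in for the operation under test):

public static void main(String[] args) {
    int iterations = 10 * 1000 * 1000;
    double result = 0; // accumulate so the work is not dead code
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
        result += Math.sqrt(i); // stand-in for the operation under test
    }
    long elapsedNanos = System.nanoTime() - start;
    double tps = iterations / (elapsedNanos / 1000000000.0);
    System.out.format("Throughput: %.0f ops/s (result=%.0f) %n", tps, result);
}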
6. Why are you talking about this?
• It’s hard to write a robust microbenchmark
• It’s even harder to do it in Java™
• There are not enough Java microbenchmarking tools
• There are too many flawed microbenchmarks out there
8. A microbenchmark story: the problem
The boss asks you to solve a performance issue in one of the components
Blah, blah …
9. A microbenchmark story: the cause
You find out that the cause is excessive use of Math.sqrt()
10. A microbenchmark story: a solution?
• You decide to develop a state-of-the-art square root approximation
• After developing the square root approximation, you want to benchmark it against the java.lang.Math implementation
11. SQRT approximation microbenchmark
Let’s run this little piece of code in a loop and see what happens …

public static void main(String[] args) {
    long start = System.currentTimeMillis(); // start the clock
    for (double i = 0; i < 10 * 1000 * 1000; i++) {
        mySqrt(i); // little piece of code: our sqrt approximation
    }
    long end = System.currentTimeMillis(); // stop the clock
    long duration = end - start;
    System.out.format("Test duration: %d (ms) %n", duration);
}
14. SQRT microbenchmark: what’s wrong?
The Java™ HotSpot virtual machine:
• Dynamic optimizations
• Dynamic compilation
• On stack replacement
• Dead code elimination
• Garbage collection
• Classloading
15. The HotSpot: a mixed mode system
1. Code is interpreted
2. Profiling
3. Dynamic compilation
4. Stuff happens
5. Interpreted again or recompiled
16. Dynamic compilation
• Dynamic compilation is unpredictable:
• Don’t know when the compiler will run
• Don’t know how long the compiler will run
• Same code may be compiled more than once
• The JVM can switch to compiled code at will
20. What the heck is code hoisting?
• Hoist = to raise or lift
• A size optimization
• Eliminates duplicated pieces of code in method bodies by hoisting expressions or statements
21. Code hoisting example
Before: a + b is a busy expression, computed in more than one place. After hoisting the expression a + b, a new local variable t has been introduced to hold its value.
Source: Optimizing Java for Size: Compiler Techniques for Code Compaction, Samuli Heilala
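A minimal Java reconstruction of that before/after (my sketch of the slide's figure):

// Before hoisting: the busy expression a + b is evaluated in both branches.
static int before(int a, int b, boolean flag) {
    return flag ? (a + b) * 2 : (a + b) * 3;
}

// After hoisting: the compiler introduces a local t and evaluates a + b once.
static int after(int a, int b, boolean flag) {
    int t = a + b;
    return flag ? t * 2 : t * 3;
}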
22. Dynamic optimizations cont.
• Most of the optimizations are performed at runtime
• Profiling data is used by the compiler to improve optimization decisions
• You don’t have access to the dynamically compiled code
23. Example: Very fast square root?
10,000,000 calls to Math.sqrt() ~ 4 ms

public static void main(String[] args) {
    long start = System.nanoTime();
    int result = 0;
    for (int i = 0; i < 10 * 1000 * 1000; i++) {
        result += Math.sqrt(i);
    }
    long duration = (System.nanoTime() - start) / 1000000;
    System.out.format("Test duration: %d (ms) %n", duration);
}
24. Example: not so fast?
Now it takes ~ 2000 ms ?!?

public static void main(String[] args) {
    long start = System.nanoTime();
    int result = 0;
    for (int i = 0; i < 10 * 1000 * 1000; i++) {
        result += Math.sqrt(i);
    }
    System.out.format("Result: %d %n", result); // single line of code added
    long duration = (System.nanoTime() - start) / 1000000;
    System.out.format("Test duration: %d (ms) %n", duration);
}
25. DCE - Dead Code Elimination
• Dead code – code that has no effect on the outcome of the program execution

public static void main(String[] args) {
    long start = System.nanoTime();
    int result = 0;
    for (int i = 0; i < 10 * 1000 * 1000; i++) {
        result += Math.sqrt(i); // dead code: result is never used
    }
    long duration = (System.nanoTime() - start) / 1000000;
    System.out.format("Test duration: %d (ms) %n", duration);
}
26. OSR - On Stack Replacement
• Methods are HOT if they cumulatively execute more than 10,000 loop iterations
• Older JVM versions did not switch to the compiled version until the method exited and was re-entered
• OSR – switch from interpretation to compiled code in the middle of a loop
27. OSR and microbenchmarking
• OSR’d code may be less performant
• Some optimizations are not performed
• OSR usually happens when you put everything into one long method (see the sketch below)
• Developers tend to write long main() methods when benchmarking
• Real life applications are hopefully divided into more fine-grained methods
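A sketch of that advice (my example, assuming default HotSpot behaviour): keep the measured loop in its own small method and invoke it repeatedly, so the JIT gets the chance to compile the whole method on entry rather than only OSR-compiling the loop inside one giant main().

public class SqrtBench {
    // The hot loop lives in its own method; repeated invocations let the JIT
    // compile the full method instead of relying solely on OSR.
    static double measureOnce(int iterations) {
        double result = 0;
        for (int i = 0; i < iterations; i++) {
            result += Math.sqrt(i);
        }
        return result;
    }

    public static void main(String[] args) {
        double sink = 0;
        for (int run = 0; run < 20; run++) { // repeated method entries
            sink += measureOnce(1000 * 1000);
        }
        System.out.println("sink = " + sink); // keep the result live (no DCE)
    }
}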
28. Classloading
• Classes are usually loaded only when they are first used
• Class loading takes time:
• I/O
• Parsing
• Verification
• May skew your benchmark results
29. Garbage Collection
• The JVM automatically reclaims resources via:
• Garbage collection
• Object finalization
• Outside of the developer’s control
• Unpredictable
• Should be measured if invoked as a result of the benchmarked code
30. Time measurement
How long is one millisecond?

public static void main(String[] args) throws InterruptedException {
    long start = System.currentTimeMillis();
    Thread.sleep(1);
    final long end = System.currentTimeMillis();
    final long duration = end - start;
    System.out.format("Test duration: %d (ms) %n", duration);
}

Test duration: 16 (ms)
31. System.currentTimeMillis()
• Accuracy varies with platform (see the probe sketch below the table)
Resolution Platform Source
55 ms Windows 95/98 Java Glossary
10 – 15 ms Windows NT, 2K, XP, 2003 David Holmes
1 ms Mac OS X Java Glossary
1 ms Linux – 2.6 kernel Markus Kobler
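You can probe the granularity on your own platform with a small loop (a minimal sketch; it reports the smallest observed step of the clock):

public static void main(String[] args) {
    long last = System.currentTimeMillis();
    long smallestStep = Long.MAX_VALUE;
    for (int ticks = 0; ticks < 100; ) {
        long now = System.currentTimeMillis();
        if (now != last) { // the clock just advanced
            smallestStep = Math.min(smallestStep, now - last);
            last = now;
            ticks++;
        }
    }
    System.out.format("Smallest observed tick: %d (ms) %n", smallestStep);
}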
32. Wrong target platform
• Choosing the wrong platform for your microbenchmark:
• Benchmarking on Windows when your target platform is Linux
• Benchmarking a highly threaded application on a single core machine
• Benchmarking on a Sun JVM when the target platform is Oracle (BEA) JRockit
33. Caching
• Hardware – CPU caching
• Operating system – file system caching
• Database – query caching
34. Caching: CPU L1 and L2 caches
• The farther the accessed data is from the CPU, the higher the access latency
• The size of the data set affects access cost (a minimal probe is sketched below)
Array size Time (us) Cost (ns)
16k 413451 9.821
8192K 5743812 136.446
Jcachev2 results for Intel® Core™2 Duo T8300, L1 = 32 KB, L2 = 3 MB
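A probe in the same spirit (my sketch, not the Jcachev2 tool itself): time strided reads over a small array that fits in L1 and a large one that does not.

public class CacheProbe {
    static long touch(int[] array, int rounds) {
        long sum = 0;
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < array.length; i += 16) { // stride past a cache line
                sum += array[i];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        for (int bytes : new int[] { 16 * 1024, 8 * 1024 * 1024 }) {
            int[] array = new int[bytes / 4]; // 4 bytes per int
            long start = System.nanoTime();
            long sum = touch(array, 1000);
            long micros = (System.nanoTime() - start) / 1000;
            System.out.format("%d bytes: %d (us), sum=%d %n", bytes, micros, sum);
        }
    }
}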
38. Warm up your code
• Let the JVM reach a steady-state execution profile before you start benchmarking
• All classes should be loaded before benchmarking
• Usually executing your code for ~10 seconds should be enough (see the sketch below)
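A sketch of that warm-up phase (the square-root code stands in for whatever you intend to measure):

long warmupEnd = System.nanoTime() + 10L * 1000 * 1000 * 1000; // ~10 seconds
double sink = 0;
while (System.nanoTime() < warmupEnd) {
    sink += Math.sqrt(sink + 1); // the code under test, run but not timed
}
System.out.println("warm-up sink: " + sink); // keep the work live
// ... start the real measurements here ...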
39. Warm up your code – cont.
• Detect JIT compilations by using:
• CompilationMXBean.getTotalCompilationTime()
• -XX:+PrintCompilation
• Measure classloading time
• Use the ClassLoadingMXBean (see the sketch below)
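A sketch of the ClassLoadingMXBean check (note: the bean exposes class-loading counts rather than time, so comparing counts around a run at least tells you whether loading occurred during measurement):

import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;

ClassLoadingMXBean clBean = ManagementFactory.getClassLoadingMXBean();
long before = clBean.getTotalLoadedClassCount();
// ... run the benchmarked code ...
long after = clBean.getTotalLoadedClassCount();
if (after != before) {
    System.out.println((after - before) + " classes loaded during the run");
}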
40. CompilationMXBean usage
import java.lang.management.ManagementFactory;
import java.lang.management.CompilationMXBean;

long compilationTimeTotal;
CompilationMXBean compBean = ManagementFactory.getCompilationMXBean();
if (compBean.isCompilationTimeMonitoringSupported()) {
    compilationTimeTotal = compBean.getTotalCompilationTime();
}
41. Dynamic optimizations
• Avoid on-stack replacement:
• Don’t put all your benchmark code in one big main() method
• Avoid dead code elimination:
• Print the final result
• Report unreasonable speedups
42. Garbage Collection
• Measure garbage collection time
• Force garbage collection and finalization before benchmarking (see the sketch below)
• Perform enough iterations to reach garbage collection steady state
• Gather GC stats:
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
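A sketch combining those suggestions (keep in mind that System.gc() is only a request, not a guarantee, and the MXBean counters may return -1 if unsupported):

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

System.gc();              // request a collection before measuring
System.runFinalization(); // request pending finalizers to run

long collections = 0, gcTimeMs = 0;
for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
    collections += gc.getCollectionCount();
    gcTimeMs += gc.getCollectionTime();
}
System.out.format("GC so far: %d collections, %d (ms) %n", collections, gcTimeMs);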
43. Time measurement
• Use System.nanoTime()
• Microsecond accuracy on modern operating systems and hardware
• Not worse than currentTimeMillis() (take note, Windows users)
• The call itself executes in microseconds – don’t overuse it!
44. JVM configuration
• Use JVM options similar to your target environment:
• -server or -client JVM
• Enough heap space (-Xmx)
• Garbage collection options
• Thread stack size (-Xss)
• JIT compiler options
45. Other issues
• Use fixed size data sets
• Too-large data sets can cause L1 cache blowout
• Watch system load
• Don’t play GTA while benchmarking!
47. Java™ benchmarking tools
• Various specialized benchmarks:
• SPECjAppServer®
• SPECjvm™
• CaffeineMark 3.0™
• SciMark 2.0
• Only a few benchmarking frameworks
48. Japex Micro-Benchmark framework
• Similar in spirit to JUnit
• Measures throughput – work over time:
• Transactions per second (default)
• KBs per second
• XML-based configuration
• XML/HTML reports
49. Japex: Drivers
• Encapsulates knowledge about a specific algorithm implementation
• Must extend JapexDriverBase

public interface JapexDriver extends Runnable {
    public void initializeDriver();
    public void prepare(TestCase testCase);
    public void warmup(TestCase testCase);
    public void run(TestCase testCase);
    public void finish(TestCase testCase);
    public void terminateDriver();
}
50. Japex: Writing your own driver
public class SqrtNewtonApproxDriver extends JapexDriverBase {
    private long tmp;
    …
    @Override
    public void warmup(TestCase testCase) {
        tmp += sqrt(getNextRandomNumber());
    }
    …
}
54. Japex: pros and cons
• Pros:
• Similar to JUnit
• Nice HTML reports
• Cons:
• Last stable release in March 2007
• HotSpot issues are not handled
• XML configuration
55. Brent Boyer’s Benchmark framework
• Part of the “Robust Java benchmarking” article by Brent Boyer
• Automates as many aspects as possible:
• Resource reclamation
• Class loading
• Dead code elimination
• Statistics
56. Benchmark framework example
Benchmark.Params params = new Benchmark.Params(true);
params.setExecutionTimeGoal(0.5);
params.setNumberMeasurements(50);

Runnable task = new Runnable() {
    public void run() {
        sqrt(getNextRandomNumber());
    }
};

Benchmark benchmark = new Benchmark(task, params);
System.out.println(benchmark.toString());
57. Benchmark single line summary
Benchmark output:
first = 25.702 us,
mean = 91.070 ns (CI deltas: -115.591 ps, +171.423 ps)
sd = 1.451 us (CI deltas: -461.523 ns, +676.964 ns)
WARNING: execution times have mild outliers, SD VALUES MAY BE INACCURATE
58. Outlier and serial correlation issues
• Records outlier and serial correlation issues
• Outliers indicate that a major measurement error happened:
• Large outliers – some other activity started on the computer during measurement
• Small outliers might hint that DCE occurred
• Serial correlation indicates that the JVM has not reached its steady-state performance profile
59. Benchmark: pros and cons
• Pros:
• Handles HotSpot-related issues
• Detailed statistics
• Cons:
• Each run takes a lot of time
• Not a formal project
• Lacks documentation
61. Summary 1
• Microbenchmarking is hard when it comes to Java™
• Define what you want to measure and how you want to do it; pick your goals
• Know what you are doing
• Always warm up your code
• Handle DCE, OSR, and GC issues
• Use fixed size data sets and fixed work
62. Summary 2
• Do not rely solely on microbenchmark results
• Sanity-check results
• Use a profiler
• Test your code in real life scenarios under realistic load (macro-benchmarking)