High-Performance Java
Bill La Forge
CTO, Jactor Consulting
http://jactorconsulting.com
High-Performance Java
● When is it needed?
● Development Methodology
● High-Performance Considerations
When is it Needed?
● Most Java code is fast enough for its intended use.
● When optimization is needed, it is usually best done after the code is debugged.
● But when the usefulness of the code is directly linked to its performance, the extra cost of developing high-performance code can be justified.
Development Methodology
● A test-centric approach is needed to identify slow code early in the development cycle.
● Performance testing is needed in both unit testing and system testing.
● For performance-critical code it is sometimes better to duplicate code instead of subclassing. But a small memory footprint may be more important.
● Finding the best-performing compromise requires performance testing.
Algorithms
● There is no best algorithm or best data structure, only the best fit for a specific context.
● Algorithms which fit in high-speed cache may perform better than expected.
● Array-backed data structures shared across threads may work better than linked data structures.
● Critical performance considerations are often opaque, with performance testing the only recourse.
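The array-backed versus linked comparison above can be tried with a rough sketch like the following. The class name `ListTraversal` and its methods are hypothetical, chosen only for this illustration, and the timing is a quick check under assumed conditions rather than a rigorous benchmark.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class ListTraversal {
    // Sums every element; the cost of the traversal depends on the list's memory layout.
    static long sum(List<Integer> list) {
        long total = 0;
        for (int value : list) {
            total += value;
        }
        return total;
    }

    // Returns the elapsed nanoseconds for one traversal (a rough measurement only).
    static long timeSum(List<Integer> list) {
        long start = System.nanoTime();
        long result = sum(list);
        long elapsed = System.nanoTime() - start;
        if (result < 0) {
            throw new IllegalStateException("unexpected negative sum"); // keeps result in use
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        List<Integer> arrayBacked = new ArrayList<>(n);
        List<Integer> linked = new LinkedList<>();
        for (int i = 0; i < n; i++) {
            arrayBacked.add(i);
            linked.add(i);
        }
        // Warm up both paths so the JIT has compiled sum() before timing.
        for (int i = 0; i < 20; i++) {
            sum(arrayBacked);
            sum(linked);
        }
        System.out.println("ArrayList  ns: " + timeSum(arrayBacked));
        System.out.println("LinkedList ns: " + timeSum(linked));
    }
}
```

The numbers vary by machine and JVM, which is the point of the last bullet: only a measurement in the actual context settles the question.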
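JIT
● Method size affects JIT optimization. In HotSpot, for example, a method is inlined at a rarely executed call site only if its bytecode is no larger than -XX:MaxInlineSize (35 bytes by default), and at a hot call site only up to -XX:FreqInlineSize (325 bytes by default on most platforms). The exact limits depend on the JVM and version, so adding a line of code to a method will sometimes result in a dramatic loss of speed.
● Use final classes and final methods where possible. Consider code duplication in place of subclassing for performance-critical code.
● Performance tests should exercise code heavily before doing any timings to ensure that the JIT has compiled the bytecode under test.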
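The warm-up advice can be sketched as a small timing helper. Everything here (`WarmupTiming`, `checksum`, `averageNanos`) is a hypothetical name invented for illustration; a dedicated harness such as JMH is the usual choice for serious measurements.

```java
public class WarmupTiming {
    // A small hypothetical workload for the JIT to compile.
    static int checksum(int[] data) {
        int h = 17;
        for (int v : data) {
            h = 31 * h + v;
        }
        return h;
    }

    // Runs the workload `warmups` times untimed, then returns the average
    // nanoseconds per call over `runs` timed calls (runs must be positive).
    static long averageNanos(int[] data, int warmups, int runs) {
        int sink = 0;
        for (int i = 0; i < warmups; i++) {
            sink += checksum(data);
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            sink += checksum(data);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.print(""); // uses sink so the loops are not dead code
        }
        return elapsed / runs;
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println("no warm-up, ns/run: " + averageNanos(data, 0, 10));
        System.out.println("warmed up,  ns/run: " + averageNanos(data, 20_000, 1_000));
    }
}
```

The warm-up count of 20,000 is an assumption; the number of invocations needed before compilation depends on the JVM and its settings.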
Garbage Collection
● Garbage collection is a common cause of poor performance.
● Minimize object creation within loops.
● Minimize the number of references an object has to other objects.
● Avoid circular structures as much as possible.
● Clear references to objects as soon as possible.
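One common source of object creation inside a loop is accidental boxing. The sketch below uses a hypothetical class `LoopAllocation` to contrast the two forms; the JIT may remove some of these allocations on its own, so the actual difference should be measured.

```java
public class LoopAllocation {
    // The boxed accumulator is unboxed, added to, and re-boxed on each iteration,
    // which can create a new Long object every time around the loop.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // The primitive accumulator creates no objects inside the loop.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1_000_000) == sumPrimitive(1_000_000));
    }
}
```

Both methods return the same value; only the amount of garbage produced along the way differs.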
Multi-Threading
● In general, using a single thread can be orders of magnitude faster than using multiple threads, as passing data between threads is comparatively slow.
● When there is justification for passing data between threads, pass as much as possible each time. For example, use a pipeline where backpressure from the next stage is used to control the amount of data being passed.
● When passing data between threads, flow control is critical for good overall performance.
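A minimal sketch of a two-stage pipeline with batching and backpressure follows, assuming a bounded `ArrayBlockingQueue` as the hand-off between stages. The class `BatchPipeline`, the `run` method, and its parameters are hypothetical names for this illustration, not an implementation taken from the slides.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BatchPipeline {
    // End-of-stream marker, recognized by identity rather than by content.
    private static final int[] POISON = new int[0];

    // Sends the values 0..total-1 to a consumer thread in batches and
    // returns the sum that the consumer computed.
    static long run(int total, int batchSize, int queueCapacity) {
        BlockingQueue<int[]> queue = new ArrayBlockingQueue<>(queueCapacity);
        long[] result = new long[1];

        Thread consumer = new Thread(() -> {
            try {
                long sum = 0;
                for (int[] batch = queue.take(); batch != POISON; batch = queue.take()) {
                    for (int v : batch) {
                        sum += v;
                    }
                }
                result[0] = sum;
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            for (int next = 0; next < total; next += batchSize) {
                int size = Math.min(batchSize, total - next);
                int[] batch = new int[size];
                for (int i = 0; i < size; i++) {
                    batch[i] = next + i;
                }
                // put() blocks while the queue is full: this is the backpressure.
                queue.put(batch);
            }
            queue.put(POISON);
            consumer.join(); // after join(), the consumer's write to result[0] is visible
        } catch (InterruptedException e) {
            consumer.interrupt();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000, 1024, 4));
    }
}
```

Each hand-off moves a whole batch, so the per-transfer cost is spread over many values, and the small queue capacity keeps the producer from running far ahead of the consumer.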
Memory Architecture
● When using a thread pool, remember that the code and the data will need to be loaded into the CPU's local cache, making for a slow start when a thread is allocated a task. And having more CPUs only makes this worse.
● Linked data structures make for frequent cache misses, which is why table-backed structures are often faster.
● Sharing data blocks between threads, especially when more than one thread does the updates, will slow things down, even if the same data within a block is not being shared (a problem known as false sharing).
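The last point can be explored with a sketch in which each thread updates only its own counter, and a stride controls how far apart the counters sit in memory. The class `FalseSharingDemo` and the method `runNanos` are hypothetical, and the stride of 8 longs assumes 64-byte cache blocks, which is common but not universal.

```java
import java.util.concurrent.atomic.AtomicLongArray;

public class FalseSharingDemo {
    // Each thread increments only its own slot. With stride 1 the slots are
    // adjacent in memory; with stride 8 they are at least 64 bytes apart.
    static long runNanos(int threads, int stride, int iterations) {
        AtomicLongArray counters = new AtomicLongArray(threads * stride);
        Thread[] workers = new Thread[threads];
        long start = System.nanoTime();
        for (int t = 0; t < threads; t++) {
            final int slot = t * stride;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < iterations; i++) {
                    counters.incrementAndGet(slot);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        long elapsed = System.nanoTime() - start;
        // No slot is shared, so every counter must equal the iteration count.
        for (int t = 0; t < threads; t++) {
            if (counters.get(t * stride) != iterations) {
                throw new IllegalStateException("unexpected count in slot " + t);
            }
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 5_000_000;
        System.out.println("adjacent slots ns: " + runNanos(4, 1, iterations));
        System.out.println("spaced slots   ns: " + runNanos(4, 8, iterations));
    }
}
```

No data is logically shared in either run, so any timing gap between the two comes from how the hardware moves memory blocks between CPUs. The size of that gap depends on the machine.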