1. Oracle OpenWorld / JavaOne
September 19-23
San Francisco
Eliminating the Pauses in your Java Application
Mark Stoodley and Mike Fulton
2. Why should you care about pauses?
• Better question: why wouldn't you care?
• Reducing pauses would help most systems:
– Telco: Could you repeat that? The line is crackling
– Desktop Systems: I'm clicking but nothing is happening
– Web Servers: Forget it – it's taking too long to load
– Financial: 'Most' trades complete quickly
– Safety Critical: no pauses in an emergency maneuver!
3. Simplified Financial Trading Scenario
[Figure: timeline (Time axis) of a system processing a news feed against the IBM stock price. Labels: "Process News Feed", "News Flash: IBM 1Q Profits beat the street!", "BUY!", "System Delay"; prices $121.13 and $121.47.]
Most events are processed quickly and efficiently. But once in a while, system delays occur as things like garbage collection are performed. A pause at the wrong time means buying IBM stock much later than it should have, resulting in lost profits of $0.34/share.
4. Agenda
• Examples of pauses in Java applications
• Why do pauses happen?
• Fixing common sources of pauses
– JVM
– Java Application
5. Where do pauses come from?
[Figure: sources of pauses placed along a time scale marked 1ns, 1µs, 1ms, 1s]
• CPU (caches, instruction latencies)
• Firmware (system management interrupts)
• JVM (garbage collection, class loading, native code compilation)
• Application (threading, synchronization, varying execution paths)
• OS (caches, daemons, interrupts, kernel pre-emption, RTOS or not)
• Hypervisor (guest interleaving)
6. Where do pauses come from?
[Figure: the same sources of pauses on the 1ns to 1s time scale]
• Not This Talk! CPU (caches, instruction latencies), Firmware (system management interrupts), OS (caches, daemons, interrupts, kernel pre-emption, RTOS or not), Hypervisor (guest interleaving)
• Remaining: JVM (garbage collection, class loading, native code compilation) and Application (threading, synchronization, varying execution paths)
7. Where do pauses come from?
[Figure: the same sources of pauses on the 1ns to 1s time scale]
• This Talk! JVM (garbage collection, class loading, native code compilation) and Application (threading, synchronization, varying execution paths)
14. Audience Participation
• How long a pause is a problem for your Java application?
– 10s?
– 1s?
– 100ms?
– 10ms?
– 1ms?
– < 1ms?
15. Understanding Pauses in the JVM
1. Garbage Collection
2. Class Loading
3. Native Code Compilation
16. JVM Pause #1: Garbage Collection
• GC activity pauses application threads
– consumes resources the application might otherwise use
• Best way to avoid GC pauses: don't generate garbage!
• Many GC implementations and many GC policies
– Many pause all threads at once
– Many pause threads for relatively long periods of time
• There are lots of tuning guides for GC policies
– Topic mostly outside the realm of this talk
– First step: pick appropriate GC policy for your application
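Before tuning or switching policies, it helps to know how much GC activity the application actually sees. The following is a minimal sketch, not from the talk, that uses the standard java.lang.management API to read cumulative collection counts and times. The class name GcActivityProbe and the garbage-heavy loop are illustrative assumptions. Note that the reported time is approximate elapsed collection time, which for concurrent collectors is not the same as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sums the counters exposed by every collector in this JVM.
class GcActivityProbe {

    /** Total number of collections so far; collectors reporting -1 (undefined) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds; -1 values are skipped. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Illustrative workload that produces a lot of short-lived garbage.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] garbage = new byte[1024];
            garbage[0] = (byte) i;
            checksum += garbage[0];
        }

        System.out.println("checksum: " + checksum);
        System.out.println("collections during workload: " + (totalCollections() - countBefore));
        System.out.println("approx. GC time (ms): " + (totalCollectionMillis() - millisBefore));
    }
}
```

Sampling these counters before and after a transaction gives a rough indication of whether a slow transaction coincided with a collection.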
17. Choosing your GC Policy
• How does your application use heap memory?
– Most applications allocate significant garbage
– But what about live data?
– Does the set of live objects change over time?
• Characterizing live data can help select appropriate GC policy
• Concerned about pauses? Then 2 main choices:
– Generational (possibly concurrent)
– Real-time (incremental or concurrent)
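One rough way to characterize live data is to sample heap usage right after requesting a collection, at several points in the application's life. The sketch below is an assumption-laden illustration, not the speakers' method: the class name LiveDataSampler is hypothetical, and MemoryMXBean.gc() is only a request that the JVM may ignore, so the numbers are estimates.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: estimates live heap data by sampling usage after a GC request.
class LiveDataSampler {

    /** Requests a collection, then reports used heap bytes as an estimate of live data. */
    static long approximateLiveBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc(); // effectively System.gc(); a hint, not a guarantee
        return memory.getHeapMemoryUsage().getUsed();
    }

    public static void main(String[] args) {
        long before = approximateLiveBytes();

        // Hold on to some data so it counts as live at the next sample.
        List<byte[]> retained = new ArrayList<>();
        for (int i = 0; i < 32; i++) {
            retained.add(new byte[1024 * 1024]);
        }
        long during = approximateLiveBytes();

        System.out.println("estimated live bytes before: " + before);
        System.out.println("estimated live bytes while holding " + retained.size() + " MiB: " + during);
    }
}
```

If successive samples stay roughly flat, the live set is stable; large swings suggest the live data changes over time.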
18. Generational GC
• Heap divided into (at least) two separate areas:
– Allocate objects into (small) new space / nursery
– Promote objects that survive (fast) nursery collections to (usually
larger) old / tenured space
– If tenured space fills up, do a full collection (longer pause!)
• Most implementations do at least some GC work “concurrently”
with application
– Shorter pauses and not all threads stopped at the same time
19. Generational GC
• Best when live data is significant and doesn't change much
– Most pauses are nursery collections
– If objects die quickly then nursery can be relatively small so pauses are
short
• If live data changes over time
– Slow full collections may happen “too” often
– Tuning can help but full collections are unavoidable
• If objects don't die quickly
– Need a large nursery which extends pauses
20. Real Time (Incremental) GC
• One of two basic approaches (can be used together):
1. Do GC work when application isn't doing work
– Great when there are natural idle pauses in application workload
2. Do GC work with short pauses, controlling how long and how often
– Best for applications seeking high throughput if no idle moments
– Except: performance overhead due to more frequent pauses
• Both are VERY good at minimizing GC pauses
– Even if live data set changes over time
21. Pick GC policy for your application
• Generational GC works well if live data set isn't changing and nursery
can be kept small
• Real-time GC can work well even if live data changes
– Expect some throughput performance loss
22. JVM Pause #2: Class Loading
• JVM is required to load classes dynamically
– Classes are resolved and installed and initialized on first use
– Thread must wait until class load completes
• As well as any threads that try to use class while it's being loaded
– If a static initializer uses another class for the first time, that class must be loaded too...
• Anything that speeds up loading a class will help shorten the pause
– Streamline initialization, use class sharing
• Application can also change when it first references a class
• Explicit class preloading can avoid unexpected class loading
– Maintain a list of classes needed by the application
– When class loading pauses are acceptable, walk the list and force each
class to be loaded
• May make application start-up slower
23. Difficulties using class pre-loading
• Biggest challenge is building and maintaining the list
– Active code base means list may change frequently
• -verbose:class gives list of classes loaded in one run
– But not classes loaded on paths that didn't execute
• -verbose:class output not quite portable among JVMs
• Tools can help create the list (Alphaworks technologies)
– RATCAT
– Java Application Execution Optimizer
• Not all classes can be preloaded (e.g. generated classes)
24. JVM Pause #3: Native Code Compilation
• Most modern JVMs compile methods to native code
– Just-In-Time compilations occur “randomly”
– Compilation thread(s) “borrow” cores from application
• Java particularly suited for Just In Time (JIT) compilation
– Dynamic class loading
– Bytecodes not a particularly efficient program representation for execution
or optimization
• JIT compiler picks methods to compile
– Typically compiled asynchronously on compilation thread(s)
25. Native code compilation “Pauses”
• Compiler “borrows” core(s) to do compilation work
– Happens at unpredictable times, at mercy of OS scheduler
• JIT compilers (and most people optimizing applications)
traditionally target the average case to improve performance
– 80-20 rule / “Make the common case fast”
– Often rare cases get slower to make common cases faster, but improving
average performance
– Over last decade, JIT compilers have generally become increasingly
speculative and aggressively optimize on profile data
26. Options to eliminate JIT “pauses”
1. Pre-compile using java.lang.Compiler.compileClass()
● May not be supported on all JVM implementations
2. Disable JIT compiler using java.lang.Compiler.disable()
3. Use Real-time JVM: RealtimeThreads can be made higher priority
than compilation thread(s)
4. Use Real-time JVM that can generate Ahead-Of-Time (AOT)
compiled code
5. Use Real-time JVM to avoid highly speculative optimizations
27. Preload classes, precompile methods, disable
// Note: java.lang.Compiler is deprecated since Java 9 and removed from recent JDKs
Iterator<String> classNameIt = listOfClassNamesToLoad.iterator();
LinkedList<Class<?>> listOfClassesToCompile = new LinkedList<Class<?>>();
while (classNameIt.hasNext()) {
    String className = classNameIt.next();
    try {
        Class<?> clazz = Class.forName(className); // preload class
        listOfClassesToCompile.add(clazz);
    } catch (Exception e) { ... }
}
// Compile only after every class is loaded, so the JIT knows the class hierarchy
Iterator<Class<?>> classIt = listOfClassesToCompile.iterator();
while (classIt.hasNext()) {
    Class<?> clazz = classIt.next();
    java.lang.Compiler.compileClass(clazz); // may return false!
}
java.lang.Compiler.disable(); // stop the JIT compiler
28. The good and the bad about precompilation
• Application start-up will probably be longer
• Overall performance will probably be lower
– Better to load all classes first, then compile all their methods
• Gives JIT best knowledge about class hierarchy
– You won't get profile-based optimizations (maybe big loss!)
– You won't get adaptive recompilation (in JVMs that recompile very hot
methods)
• But no compilation-related pauses once disable() returns
30. Application Pause #1: Threading
• Java thread scheduling depends a lot on OS
– Java threading model is pretty loose
– Usually time shared, round-robin, non-strict priorities
– OS typically time shares all Java threads (e.g. SCHED_OTHER)
• Might be able to override default scheduling policy
– e.g. Using chrt command in Linux to use SCHED_RR (round-robin at each
priority, higher priority strictly pre-empts)
• chrt -r <priority> java <options>
– Requires JVM to propagate scheduling policy from initial thread to other
threads created inside JVM (not all JVMs do this)
– Beware: make sure all your threads have the same priority or make sure JVM
doesn't employ spin locks (hard: low-level locks may spin)
• Otherwise, you can hit livelock
31. Real-time JVMs give you control over threading
• Use JVM that supports Real Time Specification for Java (RTSJ)
– Introduces new thread type RealtimeThread with strict priority scheduling and
typically uses OS policy SCHED_FIFO
• Access to 40 distinct priority levels, strictly prioritized
• Be warned: FIFO scheduled threads do NOT time-share
– Context switch only when thread blocks / higher prio thread needs core
– Thread.yield() is no-op (!)
• Locks also have Priority Inheritance support, so a low-prio thread gets
boosted if a higher prio thread needs the lock (need RTOS)
– Avoids the evil priority inversion problem
32. Using RealtimeThreads
• Easy if you only use a JVM that supports RTSJ:
    RealtimeThread rtThrd = new RealtimeThread();
• But you may not want to tie yourself to a JVM that implements the RTSJ...
33. Using RealtimeThread
• More generally, hide use of RTSJ in a ThreadFactory:
import java.util.concurrent.ThreadFactory;
import javax.realtime.PriorityParameters;
import javax.realtime.PriorityScheduler;
import javax.realtime.RealtimeThread;

class RealtimeThreadFactory implements ThreadFactory {
    public Thread newThread(Runnable r) {
        // Only the Runnable is supplied; null leaves the other parameters at their defaults
        RealtimeThread rtThr =
            new RealtimeThread(null, null, null, null, null, r);
        PriorityParameters pp =
            (PriorityParameters) rtThr.getSchedulingParameters();
        // Raise the new thread to the highest real-time priority
        PriorityScheduler scheduler = PriorityScheduler.instance();
        pp.setPriority(scheduler.getMaxPriority());
        return rtThr;
    }
}
• Still need RTSJ to compile (and exercise!) this code
34. How to know if RTSJ is available?
• Try to load a specific RTSJ class, e.g.:
static boolean rtsjSupported;
static {
    try {
        Class<?> c = Class.forName("javax.realtime.RealtimeThread");
        rtsjSupported = true;
    }
    catch (Exception e) {
        rtsjSupported = false;
    }
}
• Once the class with this initializer is loaded, you know whether the RTSJ is
supported or not
• RTSJ classes can also be used via reflection (painful!)
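To illustrate the reflection route, here is a hedged sketch of a ThreadFactory that compiles without the RTSJ on the class path. It looks up javax.realtime.RealtimeThread by name and searches for the six-argument constructor ending in a Runnable that the earlier slide uses, falling back to a plain Thread when the class is missing. The class name PortableThreadFactory is hypothetical, and setting the priority (which would need further reflective calls) is left out.

```java
import java.lang.reflect.Constructor;
import java.util.concurrent.ThreadFactory;

// Hypothetical factory: uses RealtimeThread when the RTSJ is present, Thread otherwise.
class PortableThreadFactory implements ThreadFactory {

    // Six-argument RealtimeThread constructor whose last parameter is Runnable,
    // or null when the RTSJ classes cannot be loaded.
    private static final Constructor<?> RT_CONSTRUCTOR = findRealtimeConstructor();

    private static Constructor<?> findRealtimeConstructor() {
        try {
            Class<?> rtClass = Class.forName("javax.realtime.RealtimeThread");
            for (Constructor<?> c : rtClass.getConstructors()) {
                Class<?>[] params = c.getParameterTypes();
                if (params.length == 6 && params[5] == Runnable.class) {
                    return c;
                }
            }
        } catch (ClassNotFoundException | LinkageError e) {
            // RTSJ not available on this JVM; fall through to the plain-thread path.
        }
        return null;
    }

    /** True when a usable RealtimeThread constructor was found. */
    static boolean rtsjSupported() {
        return RT_CONSTRUCTOR != null;
    }

    @Override
    public Thread newThread(Runnable r) {
        if (RT_CONSTRUCTOR != null) {
            try {
                // Same call as new RealtimeThread(null, null, null, null, null, r)
                return (Thread) RT_CONSTRUCTOR.newInstance(null, null, null, null, null, r);
            } catch (ReflectiveOperationException e) {
                // Construction failed; use an ordinary thread instead.
            }
        }
        return new Thread(r);
    }

    public static void main(String[] args) {
        Thread t = new PortableThreadFactory().newThread(() -> System.out.println("running"));
        System.out.println("RTSJ supported: " + rtsjSupported());
        System.out.println("thread class: " + t.getClass().getName());
        t.start();
    }
}
```

The factory can be handed to any java.util.concurrent executor, so the rest of the application never names an RTSJ type directly.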
35. Application Pause #2: Synchronization
• Do threads always arrive at a lock in the same order?
– Not usually
• Do threads acquire lock in the order they arrived?
– No guarantees
• So: contended locks introduce variable pauses
• Another area where Real Time JVM can help
– FIFO scheduling minimizes time holding the lock
– Priority inheritance means low prio thread with lock won't be
scheduled out if a higher prio thread is waiting for lock
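On a standard JVM, one partial measure (not covered in the talk, and not a substitute for priority inheritance) is a fair java.util.concurrent.locks.ReentrantLock, which under contention favors the longest-waiting thread. The sketch below also records the longest time any caller waited to acquire the lock, so variable pauses from contention become visible. The class name TimedFairLock is hypothetical, and fair locks typically cost throughput.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical wrapper: a fair lock that tracks the worst acquisition wait.
class TimedFairLock {
    private final ReentrantLock lock = new ReentrantLock(true); // true = fair ordering
    private final AtomicLong maxWaitNanos = new AtomicLong();

    /** Runs body while holding the lock and records how long acquisition took. */
    <T> T withLock(Supplier<T> body) {
        long start = System.nanoTime();
        lock.lock();
        long waited = System.nanoTime() - start;
        try {
            return body.get();
        } finally {
            lock.unlock();
            // Recorded after unlocking so bookkeeping does not lengthen the hold time.
            maxWaitNanos.accumulateAndGet(waited, Math::max);
        }
    }

    /** Longest observed wait to acquire the lock, in nanoseconds. */
    long maxWaitNanos() {
        return maxWaitNanos.get();
    }

    boolean isFair() {
        return lock.isFair();
    }

    public static void main(String[] args) {
        TimedFairLock guarded = new TimedFairLock();
        int result = guarded.withLock(() -> 6 * 7);
        System.out.println("result: " + result);
        System.out.println("max wait (ns): " + guarded.maxWaitNanos());
    }
}
```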
36. Application Pause #3: Different Execution Paths
• Different execution paths mimic “pauses”
• e.g. Timing for application-level cache: hit versus miss
• Different major execution paths (e.g. transactions) may have
different timings
– Easy to get fooled into thinking you have pauses to worry about
– Best to track statistics on different transactions independently
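Tracking statistics per execution path can be as simple as keying latency measurements by a path name. The following is a minimal sketch under assumed names (PathLatencyStats, the "cache-hit" and "cache-miss" labels, and the sample timings are all illustrative, not measurements from the talk).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical tracker: keeps count, mean and maximum latency per execution path.
class PathLatencyStats {

    private static final class Stats {
        long count;
        long totalNanos;
        long maxNanos;
    }

    private final Map<String, Stats> byPath = new TreeMap<>();

    /** Records one measurement for the named path. Synchronized for simplicity only. */
    synchronized void record(String path, long nanos) {
        Stats s = byPath.computeIfAbsent(path, k -> new Stats());
        s.count++;
        s.totalNanos += nanos;
        s.maxNanos = Math.max(s.maxNanos, nanos);
    }

    /** Mean latency for the path, or 0 if nothing was recorded. */
    synchronized double meanNanos(String path) {
        Stats s = byPath.get(path);
        return (s == null || s.count == 0) ? 0.0 : (double) s.totalNanos / s.count;
    }

    /** Maximum latency for the path, or 0 if nothing was recorded. */
    synchronized long maxNanos(String path) {
        Stats s = byPath.get(path);
        return s == null ? 0 : s.maxNanos;
    }

    public static void main(String[] args) {
        PathLatencyStats stats = new PathLatencyStats();
        // Made-up timings: hits are fast, misses are slow but not "pauses".
        stats.record("cache-hit", 100);
        stats.record("cache-hit", 200);
        stats.record("cache-miss", 50_000);
        System.out.println("hit mean ns:  " + stats.meanNanos("cache-hit"));
        System.out.println("miss max ns:  " + stats.maxNanos("cache-miss"));
    }
}
```

Viewed together, the cache misses would look like outliers in a single distribution; viewed separately, each path has its own stable profile.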
37. Summary
• Pauses suck, nobody likes them
• Many sources of pauses in Java applications
– From CPU up to application
• GC, class loading, compilation cause pauses in the JVM
– Pick the right GC policy/preload classes/precompile methods
– Probably experience some throughput performance impact
• Threading and synchronization are key application issues
– Try to use SCHED_RR policy or Real-time JVMs
• Measure statistics for different major execution paths separately
38. Resources
• More information about the RTSJ
– http://www.rtsj.org
• Some Real-Time Java articles on developerWorks
– http://bit.ly/8mD0l
• Tools on alphaWorks
– http://www.alphaworks.ibm.com/topics/realtimejava
– http://www.alphaworks.ibm.com/tech/javaoptimizer
• Some soft and hard real-time (WebSphere Real Time) JVMs
– http://www.ibm.com/software/webservers/realtime/
– Evaluation copies are available
• User guides for WebSphere Real Time JVMs
– http://publib.boulder.ibm.com/infocenter/realtime/v2r0/index.jsp
41. JVM Pause #2: Class Loading
• JVM is required to load classes dynamically
– But application can change when it first references a class
– Anything that speeds up loading a class will help shorten the pause
• Initialization, length of class path, class sharing
• Explicit class preloading can avoid unexpected class loading
– Maintain a list of classes needed by the application
– When class loading pauses are acceptable, walk the list and force each
class to be loaded
• May make application start-up slower
42. Unexpected class loading?
• Rare code paths can reference new classes
• Exception handlers are typical examples
Iterator<MyClass> cursor = list.iterator();
while (cursor.hasNext()) {
    MyClass o = cursor.next();
    if (o.getID() == 42) {
        NeverBeforeLoadedClass o2 = new NeverBeforeLoadedClass(o);
        // do something with o2
    }
    else {
        // do something with o
    }
}
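The effect can be observed on any JVM through class initialization, which the Java language specification ties to first active use. In this illustrative sketch (all names are hypothetical), the static initializer of RarelyUsed runs only when the rare branch is first taken. Exactly when the class file itself is loaded is up to the JVM, so the sketch shows initialization timing rather than loading timing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: a class is initialized only when a rare path first touches it.
class LazyLoadingDemo {
    static final List<String> events = new ArrayList<>();

    static class RarelyUsed {
        static {
            // Runs once, at the first active use of RarelyUsed.
            events.add("RarelyUsed initialized");
        }
        final int id;
        RarelyUsed(int id) {
            this.id = id;
        }
    }

    static void process(int id) {
        if (id == 42) {
            RarelyUsed r = new RarelyUsed(id); // first use triggers initialization
            events.add("rare path for " + r.id);
        } else {
            events.add("common path for " + id);
        }
    }

    /** Processes two common items and one rare item, returning the recorded events. */
    static List<String> run() {
        events.clear();
        process(1);
        process(2);
        process(42);
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call to run() in a fresh JVM, the "RarelyUsed initialized" event appears only after both common-path events, right before the rare path executes.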
43. Preloading a list of classes
• “Simple” code to pre-load a set of classes:
• But there are still challenges in using this technique...
Iterator<String> classIt = listOfClassNamesToLoad.iterator();
while (classIt.hasNext()) {
    String className = classIt.next();
    try {
        Class<?> clazz = Class.forName(className);
    } catch (Exception e) {
        System.err.println("Could not load: " + className);
        System.err.println(e);
    }
}
44. Practical tip for class pre-loading
• Practical tip: Class.forName expects “.” not “/”
– e.g. java.lang.String not java/lang/String
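When a class list comes from a tool that prints slash-separated names, the conversion can be folded into the preloader. This is a small sketch with hypothetical names (ClassPreloader, toBinaryName, preload); it returns the names that failed to load so the list can be maintained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: normalizes class names and preloads them.
class ClassPreloader {

    /** Converts a name such as java/lang/String to the dotted form Class.forName expects. */
    static String toBinaryName(String name) {
        return name.trim().replace('/', '.');
    }

    /** Tries to load every class in the list; returns the names that could not be loaded. */
    static List<String> preload(List<String> names) {
        List<String> failed = new ArrayList<>();
        for (String raw : names) {
            String binaryName = toBinaryName(raw);
            try {
                Class.forName(binaryName); // loads and initializes the class
            } catch (ClassNotFoundException | LinkageError e) {
                failed.add(binaryName);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> failed = preload(Arrays.asList("java/util/LinkedList", "no/such/Clazz"));
        System.out.println("could not load: " + failed);
    }
}
```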
45. Real-Time JVMs and Compilation
• Real-time JVMs aim for predictable performance
– Provide predictability by eliminating pauses
– Trade off is generally lower throughput performance
– Spectrum of goals: “soft” through “hard”
• Compiler-induced pauses an area of focus for Real Time JVMs
– RTSJ can make application threads higher priority than compiler
– Avoid speculative opts with severe consequences when wrong
– Ahead-Of-Time compilation