JVM Performance Tuning


An opinionated position on JVM performance tuning with practical application. Includes expertise from Jason Goth.

Published in: Software


  1. 1. Discussion Document – Strictly Confidential & Proprietary JVM Performance Tuning August 2016
  2. 2. JVM Performance Tuning Technical Talk: Agenda ● Overview ● Garbage Collection ○ Overview of how it works ○ Practical application ○ Recommendations for production ● JVM Profiling ○ Overview ○ Recommendations for production ○ Profiling demos
  3. 3. Overview
  4. 4. Before we start...
  5. 5. Performance tuning requires looking at a complete picture of the system and addressing the right issues (usually your code) [Diagram labels: Internet; application stack (physical server, hypervisor, operating system, JVM process, your code); database stack (physical server, operating system, RDBMS, your schema/queries, local disk); SAN storage (physical server, OS, disks)] ● Tuning the wrong component can have no impact and can sometimes make things worse! ● This presentation is focused on the JVM; others will follow
  6. 6. There are many implementations of the JVM in existence today, and it is important to be aware of which one you are working with because they may be tuned differently • There are many JVM implementations in existence today (75+) • 2 main implementations to be aware of: – OpenJDK ▪ This is a completely open source JVM choice ▪ This is the default packaged JVM with many Linux distros – HotSpot (as shipped in the Oracle JDK) ▪ Currently owned by Oracle ▪ Based on OpenJDK, but includes other implementations of various pieces (some closed source) – Other commercial JVMs exist: be aware which one you are running ▪ Running “java -version” will tell you
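The same information that “java -version” prints can also be read from inside a running process. The sketch below assumes nothing beyond the standard system properties; the class name `JvmIdentity` is made up for illustration.

```java
/** Prints which JVM implementation the current process is running on. */
final class JvmIdentity {

    /** Builds a one-line description from standard system properties. */
    static String describe() {
        // These property keys are defined by the Java platform itself.
        String vmName = System.getProperty("java.vm.name");
        String vmVendor = System.getProperty("java.vm.vendor");
        String vmVersion = System.getProperty("java.vm.version");
        String javaVersion = System.getProperty("java.version");
        return vmName + " (" + vmVendor + "), VM " + vmVersion + ", Java " + javaVersion;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Logging this line once at startup is a cheap way to confirm that production and the load-test environment run the same JVM.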
  7. 7. Tuning your JVM can have a significant impact on application performance • JVM Tuning - what and why? – Involves using various tools to get more insight on what the JVM is doing “under the hood” – Generally involves various command line arguments to tune behaviors primarily centered around modifying garbage collection – Garbage collection is expensive! • When approaching JVM tuning, remember 2 things: – Tuning your JVM can have a significant impact on application performance – “Premature optimization is the root of all evil” - Donald Knuth
  8. 8. Garbage Collection
  9. 9. Garbage collection is a form of automatic memory management which spares the developer from doing it manually ● Garbage collection “eliminates” the need for a programmer to manage memory themselves in code ● Trivial C++ example of manual memory management: { foo* f = new foo(); /* do interesting things with f */ delete f; /* the object is destroyed here */ } ● In Java, there is no “delete” ● The JVM shields the developer from the complexities of manual memory management ● The memory still needs to be reclaimed to become available for future use ● Garbage collection automatically identifies “dead objects” to reclaim the memory they occupy and make it available again for future use
  10. 10. GC algorithms in HotSpot/OpenJDK operate under the “generational hypothesis” ● “Generational hypothesis”: most objects survive for only a short period of time ○ The majority of objects “die young” ○ GC can typically be performed more efficiently by focusing on collecting younger objects ● The JVM uses the concept of “generations” of objects, based on analysis of how typical object allocation/deallocation occurs over time Common terms when tuning GC: ● Throughput - how much actual work your application is doing (i.e. the share of total time spent not doing GC) ● Pauses - times your application appears unresponsive because GC is running
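As a rough illustration of the throughput definition above, the sketch below divides accumulated GC time by JVM uptime using the standard `java.lang.management` beans. The class and method names are hypothetical, and the figure is only an approximation: for collectors that work concurrently, the reported collection time is not the same thing as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Estimates GC throughput as the share of elapsed time not spent in GC. */
final class GcThroughput {

    /** Returns a value in [0, 1]; 1.0 means no time was attributed to GC. */
    static double throughput(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 1.0;
        }
        double share = 1.0 - (double) gcMillis / elapsedMillis;
        return Math.max(0.0, Math.min(1.0, share));
    }

    /** Sums the accumulated collection time of every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the JVM does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcMillis();
        System.out.printf("uptime=%dms gc=%dms throughput=%.4f%n",
                uptime, gc, throughput(gc, uptime));
    }
}
```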
  11. 11. JVM memory management utilizes “generations” to bucket objects in memory based on their age [Diagram: the heap, bounded by -Xmx (max heap), is split into the young gen (Eden, S0 survivor, S1 survivor) and the old gen (Tenured)] With the Java 8-based memory architecture, objects move through the heap in stages 1. When objects are first created, they are allocated into eden 2. Once eden fills up, a GC event is triggered ○ Dead objects are effectively removed from eden ○ Surviving objects are moved into the survivor spaces 3. When a survivor space fills up, a GC event is triggered ○ Objects alternate between S0/S1 through GC ○ Objects which survive a certain number of GC passes move to tenured 4. When oldgen/tenured fills up, a GC event is triggered
  12. 12. Three main types of GC events can occur to trigger the cleanup of dead objects and graduate live ones [Diagram: heap layout with young gen (Eden, S0, S1) and old gen (Tenured), bounded by -Xmx] Different GC events: ● Minor GC - cleans up younggen ○ Happens when eden or a survivor space fills up ○ Generally fast ● Major GC - cleans up oldgen ○ Generally takes longer (more objects to deal with) ● Full GC - cleans up both younggen and oldgen Notes: ● Minor GCs happen much more frequently - this is how the generational hypothesis is put to use ● All of these GC events are “stop the world” operations at some point during their operation ○ “Stop the world” events cause ALL application threads to stop while the GC code does its work ● Different GC algorithms take different approaches in cleaning these spaces ○ Some algorithms do different parts of this concurrently ● GC tuning revolves around minimizing the occurrence and reducing the duration of these application pauses
  13. 13. Java 8 replaced PermGen with Metaspace ● Metaspace (new in Java 8) is a block of “native memory” where the metadata for loaded Java classes and methods is stored [Diagram: the entire Java process memory allocation contains the heap (young gen: Eden, S0, S1; old gen: Tenured; bounded by -Xmx) plus Metaspace, which sits outside the heap]
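The split between heap and non-heap memory can be observed from inside a running JVM. The sketch below lists the memory pools the JVM reports through the standard `MemoryPoolMXBean` API. The class name is hypothetical, and the pool names printed depend on the JVM and the selected collector, so treat any particular name as an example only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** Lists the memory pools of the running JVM, split into heap and non-heap. */
final class MemoryPools {

    /** Returns one line per pool: kind, name, used bytes, and max bytes. */
    static String report() {
        StringBuilder sb = new StringBuilder();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            // getMax() is -1 when no maximum is defined for the pool.
            sb.append(String.format("%-8s %-30s used=%d max=%d%n",
                    kind, pool.getName(), usage.getUsed(), usage.getMax()));
        }
        return sb.toString();
    }

    /** Counts the pools that the JVM reports as part of the heap. */
    static int heapPoolCount() {
        int count = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```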
  14. 14. GC roots are special to the garbage collector in that they cause other objects to stay alive and reachable During live object identification, the JVM: ● Begins by finding all “GC roots” ● Iterates, starting from GC roots, to all objects that are reachable and “marks them” ● Marking always performs a “stop the world” operation at some point ○ more live objects == more time marking == more time with all threads paused ○ Sometimes more heap memory can be bad - it can take a lot more time to perform GC Example GC roots: ● Local variables (active method) ● Live threads ● Static fields of loaded classes ● Some JNI references
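The marking step described above can be sketched as a plain graph traversal. This is a toy model, not how HotSpot implements marking: the `Node` class is a hypothetical stand-in for a heap object, and the “root” is simply a node we choose to start from.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of the mark phase: everything reachable from a root is live. */
final class ToyMarkPhase {

    /** Hypothetical stand-in for a heap object that references other objects. */
    static final class Node {
        final String name;
        final List<Node> references = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Marks every node reachable from the roots; unmarked nodes are garbage. */
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) { // true only on the first visit
                pending.addAll(current.references);
            }
        }
        return marked;
    }

    /** Builds a tiny object graph and returns the sorted names of live nodes. */
    static String liveNamesInDemo() {
        Node root = new Node("staticField");
        Node a = new Node("a");
        Node b = new Node("b");
        Node orphan = new Node("orphan");
        root.references.add(a);
        a.references.add(b);
        b.references.add(a);      // a cycle does not confuse a tracing collector
        orphan.references.add(a); // pointing at a live object keeps nothing alive

        List<String> names = new ArrayList<>();
        for (Node node : mark(Collections.singletonList(root))) {
            names.add(node.name);
        }
        Collections.sort(names);
        return String.join(",", names);
    }

    public static void main(String[] args) {
        System.out.println(liveNamesInDemo());
    }
}
```

The work done by `mark` grows with the number of live nodes, which is the slide's point: more live objects means more time marking.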
  15. 15. There are four main choices for GC implementations, each with their own strengths and weaknesses
      Serial GC (-XX:+UseSerialGC) ● All threads are stopped and then live objects are marked, copied/made contiguous ● NEVER use this for any multi-core machine; it can be beneficial for single-core boxes ● Ideal use case: single processor machines (think embedded devices)
      Parallel GC (-XX:+UseParallelGC -XX:+UseParallelOldGC) ● All phases are run with multiple threads (a parallelized serial GC) ● All cores do GC work; no CPU cycles are used for GC when not running GC ● Can result in increased latency (stop the world collections) ● Ideal use case: high throughput, don’t mind long pauses
      Concurrent Mark and Sweep, CMS (-XX:+UseConcMarkSweepGC -XX:+UseParNewGC) ● 1/4 of the CPU cores are regularly used to perform background object analysis ● Minimizes pause times, but results in lower throughput ● Can be less predictable on large heaps ● Ideal use case: low pause times matter more than total throughput
      G1 (-XX:+UseG1GC) ● This is the latest in GC algorithms and is regularly being improved with JVM updates ● This divides young and oldgen into over 2000 different regions, allowing for incremental collection ● Most advanced, most rapidly changing implementation ● Will be the default for Java 9 ● Ideal use case: large 6G+ heaps with a desire for predictable GC pauses as well as minimum impact on throughput
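One way to confirm which of these collectors a process actually ended up with is to ask the JVM for its collector beans. The sketch below uses the standard `GarbageCollectorMXBean` API; the class name is hypothetical. The bean names are JVM-specific: on HotSpot builds they are typically pairs such as “PS Scavenge” and “PS MarkSweep” for the parallel collector or “G1 Young Generation” and “G1 Old Generation” for G1, but treat those as illustrative rather than guaranteed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reports the names of the garbage collectors active in this JVM. */
final class ActiveCollectors {

    /** Returns the collector bean names as a comma-separated string. */
    static String names() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(gc.getName());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("Collectors: " + names());
    }
}
```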
  16. 16. Garbage Collection: Practical Application
  17. 17. The high-level approach to tuning garbage collection involves trying different combinations, measuring throughput and pauses, and making educated guesses Practical approach to performing GC tuning at a high level: 1. Pick a GC algorithm which initially makes sense for your application based on known strengths/weaknesses 2. Devise a load test for your application (JMeter, LoadUI, etc.) ○ Must independently measure throughput (i.e., how well did the test perform versus my goals) ○ Must be consistent, repeatable ○ Minimize your variables 3. Enable verbose GC logging (and dump to a separate file - see recommended JVM args later) ○ Allows you to collect the data you need to understand your results 4. Execute load test 5. Analyze load test results, GC logs 6. Make adjustments (1 variable at a time), rinse, repeat If you aren’t sure which GC algorithm is best for your application, you can try multiple! ● Recommend doing this before tuning individual settings for specific GC algorithms ● Attempt load test with GC algorithm-specific default settings first
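The measure-then-adjust loop above depends on capturing GC activity for exactly the duration of a test run. The sketch below shows one way to do that in-process by taking a snapshot of the cumulative GC counters before and after a workload. It is not a substitute for a real load test or for GC logs; `GcDelta` and the `allocateGarbage` workload are hypothetical names for illustration, and the JIT may optimize such a trivial workload so that few or no collections occur.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Measures how much GC activity happened while a workload was running. */
final class GcDelta {

    /** Returns {total collection count, total collection millis} so far. */
    static long[] snapshot() {
        long count = 0;
        long millis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            count += Math.max(0, gc.getCollectionCount()); // -1 means "undefined"
            millis += Math.max(0, gc.getCollectionTime());
        }
        return new long[] {count, millis};
    }

    /** Runs the workload and summarizes elapsed time and GC deltas. */
    static String measure(Runnable workload) {
        long[] before = snapshot();
        long start = System.nanoTime();
        workload.run();
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        long[] after = snapshot();
        return String.format("elapsed=%dms gcCount=%d gcTime=%dms",
                elapsedMillis, after[0] - before[0], after[1] - before[1]);
    }

    /** Hypothetical workload: allocates many short-lived 1 KB arrays. */
    static long allocateGarbage(int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            byte[] chunk = new byte[1024];
            chunk[0] = (byte) i;
            checksum += chunk[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println(measure(() -> allocateGarbage(5_000_000)));
    }
}
```

Running the same workload under each candidate collector flag, changing one variable at a time, gives comparable numbers across runs.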
  18. 18. There are a few useful tools to help analyze garbage collection performance ● GCHisto ○ Allows for visual analysis of verbose GC logs ○ Free, originally a plugin for VisualVM ● GCViewer ○ Allows for visual analysis of verbose GC logs, similar to GCHisto ● JConsole ○ Free, shipped with the JDK ○ Can use a JMX connection to get access to and insight into a running JVM “on the fly” ○ Gives you insight into current memory usage and GC ○ Can be used to view and manipulate specifically exposed data within the JVM on the fly without restarting the process (future talk)
  19. 19. JVM Profiling
  20. 20. A Java process can be inspected to get insight on data related to performance and memory usage Main goals around JVM profiling: ● Analyze code behavior ○ Inspect individual method invocation counts and durations ○ Inspect individual thread states ○ Can be used to help find your performance bottlenecks ● Analyze memory usage ○ Every object in the heap is available for inspection ○ Identify GC roots (or lack thereof) for each object These things can be done in real time by attaching a profiler, or offline by inspecting data files: ○ thread dump file ■ The state of all the threads in the application ○ hprof file ■ Snapshot of everything in the JVM heap with other useful metadata ■ Can be generated via: JConsole, JVisualVM, jmap, etc. ■ hprof files can be generated automatically on OutOfMemoryErrors - *** very useful ***
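Thread states, one of the items listed above, can also be sampled from inside the process with the standard `ThreadMXBean` API. The sketch below counts live threads by state; the class name `ThreadStates` is hypothetical. It is a lightweight complement to a full thread dump, not a replacement for one.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

/** Summarizes the states of all live threads in this JVM. */
final class ThreadStates {

    /** Returns how many threads are currently in each state. */
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        // false, false: skip monitor and synchronizer details to keep this cheap.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            if (info != null) {
                counts.merge(info.getThreadState(), 1, Integer::sum);
            }
        }
        return counts;
    }

    /** Convenience accessor used by the demo: number of RUNNABLE threads. */
    static int runnableCount() {
        return countByState().getOrDefault(Thread.State.RUNNABLE, 0);
    }

    public static void main(String[] args) {
        System.out.println(countByState());
    }
}
```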
  21. 21. Profiling is made available locally and remotely via different mechanisms, and it is important to understand how you get access to this information Your profiling tools are running in different processes than your target JVMs ● They can even be on different machines ● It is important to be aware of these boundaries ● The profiling tools can connect via a socket connection - be aware of firewall rules, etc. JVM monitoring connection options: ● jstatd ○ This is a standalone daemon process which needs to be running on the same machine as the JVM you want to monitor/manage ○ This requires starting jstatd on the machine you want to get access to ○ Monitoring capabilities are limited ○ Can be useful in the case where you don’t have JMX enabled and can’t restart your process ● JMX ○ More powerful access to the JVM ○ Requires starting your target java process with some command line arguments ○ This is enabled by default on Java 6+ locally ■ Be warned - if the target process isn’t running as the same user or JVM as the monitoring tool, they won’t find each other unless you explicitly define ports
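To make the JMX option more concrete, the sketch below reads the same `HeapMemoryUsage` attribute that a JMX client such as JConsole displays, but does so through the in-process platform MBean server, so no ports or command line arguments are involved. A remote tool would go through an `MBeanServerConnection` instead. The class name `JmxHeapRead` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

/** Reads heap usage through JMX, the way a monitoring client would. */
final class JmxHeapRead {

    /** Returns used heap bytes via the platform MBean server, or -1 on failure. */
    static long usedHeapBytes() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // Standard name of the platform memory MBean.
            ObjectName memory = new ObjectName("java.lang:type=Memory");
            CompositeData usage =
                    (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
            return (Long) usage.get("used");
        } catch (Exception e) {
            return -1L;
        }
    }

    public static void main(String[] args) {
        System.out.println("Used heap bytes (via JMX): " + usedHeapBytes());
    }
}
```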
  22. 22. There is a plethora of tools available for getting insight into JVM activities and performance ● VisualVM ○ Free, shipped with the JDK ○ Allows for deep insight into the performance of a given JVM ○ Will demo this today ● YourKit ○ Has many useful features above and beyond VisualVM, but usually not necessary ○ Paid commercial tool with free trial period ● JProfiler ○ Has a graphical representation of call stacks, can be easier to navigate ○ Paid commercial tool It’s demo time!!!
  23. 23. Other notable tools are available for getting insight into and analyzing JVM performance ● Many other command line tools are shipped with the JDK ○ jmap - can be used to force a heap dump from a running JVM ○ jstatd - allows remote access to already running JVMs
  24. 24. JVM Configuration for Production
  25. 25. The following JVM arguments should be utilized in production for most server-side java processes ● -server ○ This causes the JVM’s JIT to be more aggressive ○ This increases boot times, but improves performance ○ Usually set by default based on “physical” system profile, but doesn’t hurt to force it ● -Xms<size> and -Xmx<size> ○ These set the minimum and maximum heap allocation sizes (note that there is no “=” between the flag and the size) ○ If not specified, the JVM will use some defaults based on the amount of memory on the machine - can be dangerous ○ Protip: set them to the same value (i.e. -Xms512m -Xmx512m) ■ Removing the JVM’s ability to resize the heap avoids resizing overhead and can improve performance ■ Also helps ensure you have enough memory to handle all the processes you intend to run ● -XX:MaxMetaspaceSize=<size> ○ This limits the metaspace memory allocation, which is important since a metaspace leak is outside of the heap ○ Without this, “worst case scenario” is that your java process grows unbounded ○ Must monitor load test to get a feel for how much will be needed
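A process can check at startup whether it was launched with a fixed-size heap, as recommended above. The sketch below inspects the JVM's input arguments via the standard `RuntimeMXBean`. The class `HeapFlags` and its helpers are hypothetical, and the size parser only handles the plain-number, k, m, and g forms.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Checks whether -Xms and -Xmx were both given and set to the same size. */
final class HeapFlags {

    /** Parses sizes such as "512m", "2g", "1024k" or "1048576" into bytes. */
    static long parseSize(String text) {
        String s = text.trim().toLowerCase();
        long multiplier = 1L;
        char last = s.charAt(s.length() - 1);
        if (last == 'k') {
            multiplier = 1024L;
        } else if (last == 'm') {
            multiplier = 1024L * 1024;
        } else if (last == 'g') {
            multiplier = 1024L * 1024 * 1024;
        }
        if (multiplier != 1L) {
            s = s.substring(0, s.length() - 1);
        }
        return Long.parseLong(s) * multiplier;
    }

    /** Returns the size of the last argument with this prefix, or -1 if absent. */
    static long flagValue(List<String> jvmArgs, String prefix) {
        long value = -1L;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix) && arg.length() > prefix.length()) {
                value = parseSize(arg.substring(prefix.length()));
            }
        }
        return value;
    }

    /** True when both flags are present and equal. */
    static boolean heapIsFixed(List<String> jvmArgs) {
        long xms = flagValue(jvmArgs, "-Xms");
        long xmx = flagValue(jvmArgs, "-Xmx");
        return xms > 0 && xms == xmx;
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Fixed-size heap: " + heapIsFixed(jvmArgs));
    }
}
```

Launching with `java -Xms512m -Xmx512m HeapFlags` should report a fixed-size heap; launching without either flag should not.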
  26. 26. The following JVM arguments should be utilized in production for most server-side java processes (continued) ● -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=<path where the dump should be written> ○ Forces the generation of an hprof file when an OOME occurs ○ Setting the path makes the JVM write the hprof (which can be big) somewhere you expect ○ Make sure that appropriate space will exist (the hprof can be as large as, or larger than, the -Xmx value) ○ Make sure it won’t cause disk space issues in production ● -XX:+PrintGCDateStamps -verbose:gc -XX:+PrintGCDetails -Xloggc:<file path and name> ○ These arguments enable GC logging which is ingestible via the GC analysis tools ○ Necessary to get insight into JVM GC behavior and performance ○ Be aware that this increases logging ● -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=<X> -XX:GCLogFileSize=<max size> ○ This causes the GC logs to be automatically rotated ○ This will limit the disk space the GC logs occupy (making it easier to have verbose GC logs on in prod) ● -XX:+PrintCommandLineFlags ○ This will show you which GC is being used (as well as other defaulted JVM args)
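Whether the heap-dump options above actually took effect can be verified from inside the process. The sketch below assumes a HotSpot-based JVM, because it relies on the HotSpot-specific `com.sun.management.HotSpotDiagnosticMXBean`; on other JVMs it falls back to "unknown". The class name `ProductionFlagCheck` is hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Reads the effective value of HotSpot VM options at runtime. */
final class ProductionFlagCheck {

    /** Returns the option's current value, or "unknown" if it cannot be read. */
    static String optionValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Covers non-HotSpot JVMs and option names this JVM does not know.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + optionValue("HeapDumpOnOutOfMemoryError"));
        System.out.println("HeapDumpPath = " + optionValue("HeapDumpPath"));
    }
}
```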
  27. 27. Links for further study ● Oracle documentation ● Plumbr garbage collection handbook ● SE Radio podcast (which contains its own useful links) ● G1GC tuning reference