Practical Performance: Understand and improve the performance of your application

This session discusses how you can maximize the performance of your application deployment with tools that are native to your server platform as well as cross-platform Java analysis and monitoring tools. The session begins with systematic steps you can take to locate a performance problem in a complex system and moves on to analysis you can do to understand the root cause of the problem. The picture is completed by consideration of the tools and techniques available to monitor application performance in normal operation so that you can catch performance issues before they build up into serious problems.

Presented at JavaOne 2012

Video available from Parleys.com:
https://www.parleys.com/talk/the-hidden-world-your-java-application-what-its-really-doing

  1. Chris Bailey – IBM Java Service Architect
     4th October 2012
     Practical Performance: Understand and improve the performance of your application
     © 2012 IBM Corporation
  2. Important Disclaimers
     THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY.
     WHILST EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED.
     ALL PERFORMANCE DATA INCLUDED IN THIS PRESENTATION HAVE BEEN GATHERED IN A CONTROLLED ENVIRONMENT. YOUR OWN TEST RESULTS MAY VARY BASED ON HARDWARE, SOFTWARE OR INFRASTRUCTURE DIFFERENCES.
     ALL DATA INCLUDED IN THIS PRESENTATION ARE MEANT TO BE USED ONLY AS A GUIDE.
     IN ADDITION, THE INFORMATION CONTAINED IN THIS PRESENTATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM, WITHOUT NOTICE.
     IBM AND ITS AFFILIATED COMPANIES SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION.
     NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, OR SHALL HAVE THE EFFECT OF:
       - CREATING ANY WARRANTY OR REPRESENTATION FROM IBM, ITS AFFILIATED COMPANIES OR ITS OR THEIR SUPPLIERS AND/OR LICENSORS
  3. Introduction to the speaker
     ■ 12 years' experience developing and deploying Java SDKs
     ■ Recent work focus:
       – Java applications in the cloud
       – Java usability and quality
       – Debugging tools and capabilities
       – Requirements gathering
       – Highly resilient and scalable deployments
     ■ My contact information:
       – baileyc@uk.ibm.com
       – http://www.linkedin.com/in/chrisbaileyibm
       – http://www.slideshare.net/cnbailey/
  4. Goals of the Talk
     ■ Introduce a simple general methodology for performance analysis
     ■ Discuss the common performance bottlenecks
     ■ Show how to analyse a simple application
  5. Agenda
     ■ Approaches to performance
     ■ Layers of the application
     ■ Identifying and resolving performance issues
  6. Approaches to performance
     ■ Outside-in approach
       – Start from where performance can be measured
       – Work along the activity path
       – Ideal for identified performance problems
     ■ Layered approach
       – “Bottom up” or “Top down”
       – Analyze and eliminate layers of the application
       – Simplify the problem as you go
       – Ideal for application health check
     ■ A hybrid of both approaches can often be useful
  7. Performance baseline
     ■ Important to have a repeatable and representative performance test
     ■ Measure baseline performance
       – Internal measurements affect the performance of what you're measuring
       – External measurements have less impact on system performance
     ■ Where possible, measure multiple times
       – Variation will occur between test runs
     ■ Where possible, ensure consistency
       – Not just the load test that's run
       – State of machine and network can have interesting effects
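The advice to measure multiple times can be made concrete by summarising the spread between runs before comparing against a baseline. The following is a minimal sketch, not part of the original talk: the class name `BaselineStats` and the latency figures in `main` are hypothetical, and the statistics used (mean, sample standard deviation, coefficient of variation) are one reasonable choice among several.

```java
/** Summarises run-to-run variation of a repeated benchmark (illustrative sketch). */
final class BaselineStats {

    /** Arithmetic mean of the per-run measurements. */
    static double mean(double[] samples) {
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    /** Sample standard deviation (n - 1 denominator); needs at least two runs. */
    static double stdDev(double[] samples) {
        double m = mean(samples);
        double squares = 0.0;
        for (double s : samples) {
            squares += (s - m) * (s - m);
        }
        return Math.sqrt(squares / (samples.length - 1));
    }

    /** Spread relative to the mean; a change smaller than this is hard to tell from noise. */
    static double coefficientOfVariation(double[] samples) {
        return stdDev(samples) / mean(samples);
    }

    public static void main(String[] args) {
        // Hypothetical average page latencies (microseconds) from five runs of the same load test.
        double[] runs = {10200, 9800, 10050, 9900, 10050};
        System.out.printf("mean=%.1f us, stddev=%.1f us, cv=%.3f%n",
                mean(runs), stdDev(runs), coefficientOfVariation(runs));
    }
}
```

If a tuning change moves the mean by less than the run-to-run variation, more runs (or a more controlled machine and network state) are probably needed before drawing a conclusion.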
  8. Page response performance benchmark: baseline
     [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds), 0 to 20000. X-axis: Page URL (PlantsByWebSphere, servlet_ShoppingServlet, servlet_ShoppingServlet{2}, Shopping{1}, Shopping{4}, Shopping_1_1, Shopping_2_2, Shopping_2_3, Shopping_2_4, Shopping_2_5). Series: Baseline.]
  9. A Layered Approach
     ■ Three layers of a deployment:
       – Infrastructure: Machine hardware and Operating System
       – Java Runtime: Garbage collection
       – Java Application: Java application code
     ■ Each can suffer from resource constraints, typically:
       – Memory
       – CPU
       – Synchronization
       – I/O
  10. Infrastructure
      ■ Typical resource constraints:
        – Memory: insufficient physical memory results in paging/swapping
        – CPU: insufficient CPU time limits throughput of the application
        – I/O: insufficient I/O limits throughput of the application
        – Synchronization: driven by the Java runtime or Java application
      ■ Easy to diagnose
      ■ Easy to resolve (relatively)
      ■ Note that each can also be caused by deficiencies higher up the stack!
  11. Infrastructure: Memory usage
      ■ Infrastructure uses memory for:
        – Backing the process data: OS runtime, Java runtime, Java application
        – Caching of IO: filesystem and network buffers
      ■ Lack of physical memory causes:
        – Reduction and removal of IO caching
        – Paging/swapping of process memory to disk
      ■ Paging/swapping is costly for a Java process
        – Particularly affects Garbage Collection performance
          • Paging usually occurs on Least Recently Used basis
          • All of Java heap is traversed during mark and sweep phases
          • Least Recently Used does not work well for the Java heap
  12. Infrastructure: CPU usage
      ■ Insufficient CPU time availability will reduce performance
      ■ Can occur periodically:
        – Cron jobs running batch applications
        – Database backups
      ■ Or during periods of high load:
        – System becomes CPU bound, limiting performance
  13. Detecting infrastructure issues
      ■ Detect using Operating System level tools
      ■ Memory on Windows:
        – Paging: using “perfmon” with “Process” counter for “Page Faults/sec”
        – File Cache: using “perfmon” with “Memory” counter for “System Cache Resident Bytes”
      ■ CPU on Windows:
        – Per process: using “perfmon” with “Process” counter for “% Processor Time”
        – Per machine: using “perfmon” with “Processor” counter for “% Processor Time”
      ■ IO on Windows:
        – Network: using “perfmon” with “Network Interface” counter for “Output Queue Length”
        – Disk: using “perfmon” with “Physical Disk” counter for “Current Disk Queue Length”
  14. Paging in perfmon
      [Figure]
  15. Resolving infrastructure issues
      ■ Add more physical resources to the process
        – Assign more to the: Machine, Guest OS, LPAR, Zone, etc.
      ■ Reduce the physical resource requirements
        – Reduce the application footprint
        – Reduce the application CPU usage
        – Reduce the IO
  16. Page response performance benchmark: Baseline
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline.]
  17. Page response performance benchmark: Paging removed
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline, No Paging.]
  18. Effect on page performance
        – PlantsByWebSphere 1.7%
        – servlet_ShoppingServlet{2} 22.8%
        – servlet_ShoppingServlet 0.3%
        – Shopping{1} 0.4%
        – Shopping{4} 21.4%
        – Shopping_1_1 17.9%
        – Shopping_2_2 2.8%
        – Shopping_2_3 16.7%
        – Shopping_2_4 24.1%
        – Shopping_2_5 13.8%
      ■ Improvement in page performance of 0.3% to 24.1%
        – Without outliers: 0.4% to 22.8%
      ■ Biggest gains are on most performant pages. Total gain is only ~4%
        – Relatively small effect on overall page performance
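The gap between per-page gains of up to 24.1% and a total gain of only about 4% follows from how the total is weighted: pages that were already fast contribute little absolute time. The sketch below illustrates that arithmetic. It assumes the total is computed from summed latencies, which the slides do not state, and the class name `PageGain` and the latencies in `main` are hypothetical rather than taken from the benchmark.

```java
/** Per-page versus overall latency improvement (illustrative sketch). */
final class PageGain {

    /** Percentage reduction in latency for a single page. */
    static double percentGain(double before, double after) {
        return 100.0 * (before - after) / before;
    }

    /** Percentage reduction in the summed latency across all pages. */
    static double totalPercentGain(double[] before, double[] after) {
        if (before.length != after.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        double sumBefore = 0.0;
        double sumAfter = 0.0;
        for (int i = 0; i < before.length; i++) {
            sumBefore += before[i];
            sumAfter += after[i];
        }
        return percentGain(sumBefore, sumAfter);
    }

    public static void main(String[] args) {
        // Hypothetical latencies in microseconds: one fast page, one slow page.
        double[] before = {100, 10000};
        double[] after = {75, 9900};
        System.out.printf("fast page: %.1f%%, slow page: %.1f%%, total: %.2f%%%n",
                percentGain(before[0], after[0]),
                percentGain(before[1], after[1]),
                totalPercentGain(before, after));
    }
}
```

In this made-up case the fast page improves by a quarter while the total barely moves, which is the same shape as the result reported on the slide.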
  19. Garbage Collection Pause Times
      [Figure]
  20. Garbage Collection Pause Times
      [Figure]
  21. Effect on Garbage Collection Pause Times
      ■ Reduction in:
        – Maximum pause time: 38%
        – Average pause time: 13%
        – Time spent in GC: 11%
      ■ Large effect on GC performance, particularly pause times
  22. Java runtime
      ■ Typical resource constraints:
        – Memory: insufficient Java heap results in OutOfMemoryError or high GC overhead
        – CPU: garbage collection overhead, or driven by Java application
        – Synchronization: driven by Java application
        – IO: driven by Java application
      ■ Easy to diagnose
      ■ Easy to resolve (relatively)
  23. Java runtime memory
      ■ Java runtime uses memory for:
        – Java Heap(s), Java Virtual Machine (JVM), “Native” heap, OS and C-language runtime
      [Diagram: process address space from 0x0 (0 GB) through 0x40000000, 0x80000000 (2 GB) and 0xC0000000 to 0xFFFFFFFF (4 GB). Regions labelled: OS and C-Runtime, JVM, Java Heap(s), with -Xmx marking the Java heap size.]
      ■ Java heap(s) are managed using Garbage Collection
      ■ Other memory usage can be indirectly driven by application usage and garbage collection
        – e.g. Java Threads
      [Diagram labels: Java Heap, Native Heap]
  24. Java runtime problems
      ■ Insufficient Java heap memory leads to:
        – OutOfMemoryError due to Java heap exhaustion
        – Garbage Collection running excessively, increasing CPU and affecting performance
      ■ Insufficient non-Java (“native”) heap leads to:
        – OutOfMemoryError due to process address space exhaustion
        – Driver for Java heap garbage collection (DirectByteBuffer cleaners)
  25. Detecting Java runtime problems
      ■ Log and trace analysis:
        – “Native” heap: OS level logs (ps, svmon, perfmon)
        – Java heap: -verbose:gc output
        – Post processed using: IBM Monitoring and Diagnostic Tools for Java - Garbage Collection and Memory Visualizer (GCMV)
      ■ Live monitoring:
        – “Native” heap: IBM Monitoring and Diagnostic Tools for Java - Health Center
        – Java heap: IBM Monitoring and Diagnostic Tools for Java - Health Center, VisualVM, Mission Control
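Alongside verbose GC logs and the tools listed above, the standard `java.lang.management` API exposes accumulated garbage collection time from inside the process. The sketch below is not from the talk: the class name `GcOverhead` is hypothetical, and the figure it reports is the approximate elapsed collection time each collector publishes, which for concurrent collectors is not the same as pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reports the share of JVM uptime attributed to garbage collection (illustrative sketch). */
final class GcOverhead {

    /** Sums accumulated collection time in milliseconds; collectors reporting -1 (undefined) are skipped. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime();
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    /** Percentage of uptime spent in GC; returns 0 when uptime is not positive. */
    static double gcTimePercent(long gcMillis, long uptimeMillis) {
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        long gc = totalGcTimeMillis();
        System.out.printf("GC time: %d ms of %d ms uptime (%.2f%%)%n",
                gc, uptime, gcTimePercent(gc, uptime));
    }
}
```

Sampling this value periodically and charting the difference between samples gives a rough live view of GC overhead; for pause-time analysis the verbose GC output remains the more detailed source.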
  26. Too Frequent Garbage Collection
      [Figure]
  27. Garbage collection performance
      [Chart: Memory against Time, showing Heap Size and Heap Occupancy. Annotations: Too Frequent Garbage Collection, Long Garbage Collection Cycles.]
  28. Resolving Java runtime problems
      ■ Add more resources to the Java runtime
        – Java heap: Increase Java heap size
        – Native heap: Move to 64-bit or reduce Java heap size
      ■ Reduce the memory requirements
        – Reduce the Java application footprint
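Before and after changing the Java heap size, it can help to confirm what maximum the JVM is actually running with and how full the heap is. The following sketch uses the standard `MemoryMXBean`; the class name `HeapOccupancy` is hypothetical, and a single reading like this is only a snapshot, not a substitute for the occupancy-over-time view that GC logs provide.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Prints current Java heap occupancy against the configured maximum (illustrative sketch). */
final class HeapOccupancy {

    /** Used heap as a percentage of the maximum; returns -1 when the maximum is undefined. */
    static double occupancyPercent(long usedBytes, long maxBytes) {
        if (maxBytes <= 0) {
            return -1.0; // MemoryUsage.getMax() is -1 when no maximum is defined
        }
        return 100.0 * usedBytes / maxBytes;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.printf("heap used=%d bytes, max=%d bytes, occupancy=%.1f%%%n",
                heap.getUsed(), heap.getMax(),
                occupancyPercent(heap.getUsed(), heap.getMax()));
    }
}
```

Running it with different `-Xmx` values is a quick way to check that a heap size change has taken effect in the deployed process.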
  29. Increased Java heap size
      [Figure]
  30. Effect on Garbage Collection Pause Times
      ■ Reduction in:
        – Time spent in GC: 59%
      ■ However, this is only 4.84% of total time
  31. Page response performance benchmark: Baseline
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline.]
  32. Page response performance benchmark: Paging removed
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline, No Paging.]
  33. Page response performance benchmark: Java Heap size increased
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline, No Paging, Bigger Heap.]
  34. Page performance improvements
        – PlantsByWebSphere 0.4%
        – servlet_ShoppingServlet 0.0%
        – servlet_ShoppingServlet{2} 33.2%
        – Shopping{1} +0.6% (a 0.6% slowdown)
        – Shopping{4} 4.5%
        – Shopping_1_1 2.9%
        – Shopping_2_2 0.1%
        – Shopping_2_3 2.1%
        – Shopping_2_4 14.2%
        – Shopping_2_5 8.2%
      ■ Improvement in page performance of -0.6% to 33.2%
        – Without outliers: 0.0% to 14.2%
      ■ Total gain is only ~4%
        – Relatively small effect on overall page performance
  35. Java application
      ■ Typical resource constraints:
        – Memory: insufficient caching affects application throughput and responsiveness
        – CPU: insufficient threading limits scalability
        – Synchronization: synchronized resources limit scalability and throughput of the application
        – I/O: blocking on I/O limits throughput and responsiveness
      ■ Hard to diagnose
      ■ Can be expensive (or impossible!) to resolve
  36. Java application CPU usage
      ■ High CPU usage by Java methods highlights areas of potential optimization
        – Code is being invoked more than it needs to be
          • Easily done with event driven models
        – An algorithm is not the most efficient
          • Easily done if performance is not the focus at development time
      ■ Fixing CPU bound applications requires knowledge of what code is being run
        – Identify methods which are suitable for optimisation
          • Optimising methods which the application doesn’t spend time in is a waste of your time
        – Identify methods where more time is being spent than you expect
          • “Why is so much time being spent in this trivial method?”
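Finding out what code is being run is usually done with a profiler such as the ones named in this talk. To show the underlying idea only, here is a deliberately crude sampling sketch built on `Thread.getStackTrace()`. It is an assumption-laden illustration, not how Health Center or any other tool is implemented: the class name `StackSampler` is hypothetical, it watches a single thread, and it records only the top stack frame.

```java
import java.util.HashMap;
import java.util.Map;

/** Counts which method is on top of a thread's stack at each sample (illustrative sketch). */
final class StackSampler {

    /** Takes up to {@code samples} stack samples of {@code target}, {@code intervalMillis} apart. */
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples; i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) { // empty if the thread has not started running or has ended
                String method = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(method, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop sampling
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical workload: a thread that spins until interrupted.
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                Math.sqrt(System.nanoTime());
            }
        });
        worker.setDaemon(true);
        worker.start();
        Map<String, Integer> hits = sample(worker, 50, 10);
        worker.interrupt();
        hits.forEach((method, count) -> System.out.println(count + "  " + method));
    }
}
```

Methods with the most hits are where the sampled thread spent most of its time, which is the same question the slide poses: optimise where time is actually spent, and investigate trivial methods that show up unexpectedly often.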
  37. Java application synchronization
      ■ Throughput does not increase linearly with load
      ■ At limit of throughput the CPU is still low
        – Inability to scale
        – Not all CPU can be utilized
        – Limit on throughput and responsiveness
      ■ Bottleneck where threads need to synchronize with each other for application correctness
        – Caused by large numbers of threads requiring synchronized resource at the same time
        – Caused by long hold time by thread that owns resource
        – Or a mixture of both
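One of the two causes named above, long hold time, can often be reduced by moving work that touches no shared state out of the critical section. The sketch below contrasts the two shapes. The class `HitCounter` and its `expensive` method are hypothetical stand-ins, not code from the PlantsByWebSphere application.

```java
/** Contrasts a long and a short lock hold time for the same update (illustrative sketch). */
final class HitCounter {
    private final Object lock = new Object();
    private long total;

    /** Long hold time: the expensive work runs while the lock is held. */
    void addSlow(int n) {
        synchronized (lock) {
            total += expensive(n);
        }
    }

    /** Short hold time: only the update of shared state is inside the critical section. */
    void addFast(int n) {
        long value = expensive(n); // computed without holding the lock
        synchronized (lock) {
            total += value;
        }
    }

    long total() {
        synchronized (lock) {
            return total;
        }
    }

    /** Stand-in for per-request work that touches no shared state: sum of squares 1..n. */
    static long expensive(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }
}
```

Both methods produce the same total, but under load `addFast` lets other threads do their expensive work in parallel and queue only for the brief update, which is what allows throughput to keep rising with available CPU.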
  38. Health Center: application method CPU usage
      [Figure]
  39. Health Center: application synchronization
      [Figure]
  40. ShoppingServlet.deliberateSlowMethod()

          private void deliberateSlowMethod() {
              // ----------------------------------------------------------------
              // User clicked on the Tulips, let's tiptoe through a
              // slow method
              // ----------------------------------------------------------------
              System.out.println("==> STARTING SLOW METHOD");
              long timestamp = System.currentTimeMillis();
              // SLOWTIME is a delay in milliseconds defined elsewhere in ShoppingServlet
              long target = timestamp + SLOWTIME;
              System.out.println("timestamp=" + timestamp);
              System.out.println("resume at=" + target);
              // Busy-wait: poll the clock until the target time is reached
              while (timestamp < target) {
                  timestamp = System.currentTimeMillis();
              }
              System.out.println("==> ENDING SLOW METHOD");
          }
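A busy-wait like the one in deliberateSlowMethod() stands out in a CPU profile because it consumes a full CPU for the whole delay. The sketch below shows that effect using the per-thread CPU timer from the standard `ThreadMXBean`. The class `CpuTimer` and its `spin` and `idle` helpers are hypothetical, and the comparison with `Thread.sleep` is only there to show the contrast; the slides do not say how the method was actually changed.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

/** Measures CPU time consumed by a task on the calling thread (illustrative sketch). */
final class CpuTimer {

    /** CPU nanoseconds used by the current thread while running the task; -1 if unavailable. */
    static long cpuNanos(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        if (!threads.isCurrentThreadCpuTimeSupported() || !threads.isThreadCpuTimeEnabled()) {
            return -1;
        }
        long start = threads.getCurrentThreadCpuTime();
        task.run();
        return threads.getCurrentThreadCpuTime() - start;
    }

    /** Busy-wait in the style of deliberateSlowMethod(): polls the clock until the deadline. */
    static void spin(long millis) {
        long target = System.currentTimeMillis() + millis;
        while (System.currentTimeMillis() < target) {
            // burn CPU
        }
    }

    /** Waits a similar wall-clock time without occupying a CPU. */
    static void idle(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("spin CPU ns: " + cpuNanos(() -> spin(200)));
        System.out.println("idle CPU ns: " + cpuNanos(() -> idle(200)));
    }
}
```

Both tasks take about the same elapsed time, but only the spinning one should report CPU time close to that elapsed time, which is why method-level CPU profiling points straight at this kind of code.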
  41. Page response performance benchmark: Baseline
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline.]
  42. Page response performance benchmark: Paging removed
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline, No Paging.]
  43. Page response performance benchmark: Java Heap size increased
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline, No Paging, Bigger Heap.]
  44. Page response performance benchmark: deliberateSlowMethod() changed
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline, No Paging, Bigger Heap, App Fix.]
  45. Page performance improvements
      ■ servlet_ShoppingServlet 80.8% (5x improvement)
      ■ Shopping{1} 94.3% (20x improvement)
      ■ Improvement in page performance of roughly 80% and 95%
        – 5x and 20x improvement for affected pages
      ■ Total gain of 48%!
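The "5x" and "20x" figures are the speedup factors implied by the rounded latency reductions: a reduction of p percent leaves (100 - p) percent of the original time, so the speedup is 100 / (100 - p). The sketch below works that out; the class name `Speedup` is hypothetical. With the unrounded figures the factors come to about 5.2x for 80.8% and about 17.5x for 94.3%, and 20x corresponds to the rounded 95%.

```java
/** Converts a percentage latency reduction into a speedup factor (illustrative sketch). */
final class Speedup {

    /** Speedup factor for a latency reduction of {@code percent} (must be below 100). */
    static double speedupFactor(double percent) {
        if (percent >= 100.0) {
            throw new IllegalArgumentException("reduction must be below 100%");
        }
        return 100.0 / (100.0 - percent);
    }

    public static void main(String[] args) {
        // Reductions reported on the slide, plus the rounded values behind "5x" and "20x".
        for (double p : new double[] {80.8, 94.3, 80.0, 95.0}) {
            System.out.printf("%.1f%% reduction -> %.1fx%n", p, speedupFactor(p));
        }
    }
}
```

The relationship is strongly non-linear near 100%, which is why the difference between 94.3% and 95% shifts the factor by more than two.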
  46. Health Center: application method CPU usage
      [Figure]
  47. Health Center: application synchronization
      [Figure]
  48. Summary
      ■ Importance of:
        – Repeatable benchmark
        – Incremental measurements as changes are made
        – Repeated testing and verification
      ■ Tools are available to help you see what's going on:
        – Garbage Collection and Memory Visualizer (all vendors)
        – Health Center (IBM only)
        – Other profilers (e.g. YourKit) (all vendors)
        – Memory Analyzer (all vendors)
  49. Page response performance benchmark: Summary
      [Chart: Page Performance, PlantsByWebSphere. Y-axis: Avg Latency (microseconds). X-axis: Page URL. Series: Baseline, No Paging, Bigger Heap, App Fix.]
  50. Summary
      ■ Infrastructure resources affect performance
        – Paging and Garbage Collection much less than you might expect
        – However, beware of CPU “starvation” from other processes!
      ■ Vast majority of performance gains are in the application!
  51. References
      ■ Get Products and Technologies:
        – IBM Monitoring and Diagnostic Tools for Java:
          • https://www.ibm.com/developerworks/java/jdk/tools/
      ■ Learn:
        – Health Center InfoCenter:
          • http://publib.boulder.ibm.com/infocenter/hctool/v1r0/index.jsp
      ■ Discuss:
        – IBM on Troubleshooting Java Applications Blog:
          • https://www.ibm.com/developerworks/mydeveloperworks/blogs/troubleshootingjava/
        – Health Center Forum:
          • http://www.ibm.com/developerworks/forums/forum.jspa?forumID=1461
        – IBM Java Runtimes and SDKs Forum:
          • http://www.ibm.com/developerworks/forums/forum.jspa?forumID=367&start=0
  52. Copyright and Trademarks
      © IBM Corporation 2012. All Rights Reserved.
      IBM, the IBM logo, and ibm.com are trademarks or registered trademarks of International Business Machines Corp., and registered in many jurisdictions worldwide.
      Other product and service names might be trademarks of IBM or other companies.
      A current list of IBM trademarks is available on the Web – see the IBM “Copyright and trademark information” page at URL: www.ibm.com/legal/copytrade.shtml
