JVM Performance Tuning

    JVM Performance Tuning - Presentation Transcript

    1. JavaStudy Network Daehyub Cho JVM [Java Virtual Machine] Performance Tuning
    2. Agenda
      • Basic concept of JVM tuning
      • Hotspot compiler
      • Threading model
      • Memory model
    3. Basic Concept of JVM Tuning
    4. Basics of performance tuning
      • Decide what performance level is “good enough”
      • Test & measure
        • Scenario based
        • Stress tool (LoadRunner)
        • Profiling tool (JProbe, etc.)
      • Profile the application to find bottlenecks
      • Tune
        • Application *
        • Middleware [WAS]
        • OS
        • JVM
      • Return to step 2 (test & measure) [feedback]
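The measure-and-compare step of the loop above can be sketched in a few lines of Java. This is only an illustration, not a replacement for a stress or profiling tool: the class name `ScenarioTimer`, the sample scenario, and the 1 ms goal are all assumptions made for the example.

```java
import java.util.Arrays;

/** Times a scenario repeatedly so a tuning change can be compared against a baseline. */
class ScenarioTimer {

    /** Runs the task warmup + runs times; returns the median of the measured runs in nanoseconds. */
    static long medianNanos(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run(); // give the JIT a chance to compile hot code before measuring
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            samples[i] = System.nanoTime() - start;
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    /** True when the measured median meets the "good enough" goal. */
    static boolean goodEnough(long medianNanos, long goalNanos) {
        return medianNanos <= goalNanos;
    }

    public static void main(String[] args) {
        // Hypothetical scenario: build a 10,000-character string.
        Runnable scenario = () -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 10_000; i++) {
                sb.append('x');
            }
            if (sb.length() != 10_000) {
                throw new IllegalStateException("unexpected length");
            }
        };
        long median = medianNanos(scenario, 1_000, 101);
        long goal = 1_000_000L; // assumed goal: 1 ms per run
        System.out.println("median ns = " + median + ", good enough = " + goodEnough(median, goal));
    }
}
```

Running the same harness before and after each parameter change gives the feedback that the last bullet asks for.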
    5. JVM Tuning
      • Improves performance by about 10~20%
      • Find the appropriate parameters for your application
        • Hotspot compile option
        • Thread model option *
        • GC and memory related option **
      • Changing parameters is a very dangerous action
        • Needs more testing and feedback
        • Ref: spec.org
    6. Hotspot Compiler
    7. JVM Layout
      • Hotspot from JDK 1.3
      • JVM
        • VM
          • Runtime
          • GC
          • Interpreter
          • Threading & Locking
          • …
        • Hotspot Compiler
          • Client Compiler
          • Server Compiler
    8. Hotspot compiler
      • JIT (Just-In-Time Compiler)
        • Compiles byte code to native code
        • Compiles by fixed optimization rules (no “thinking”)
        • At execution/installation
      • Hotspot
        • Compiles byte code to native code
        • “Thinks” to find where optimization can take place
        • Adaptive optimization at runtime
    9. Hotspot Detection
      • Hotspot detection
      • Method Inlining
      • Dynamic Deoptimization
    10. Hotspot Detection and Method Inlining
      • Literal constants are folded
      • String concatenation is sometimes folded
      • Constant fields are inlined
      int foo = 9 * 10;                    →  int foo = 90;
      String foo = "Hello " + (9 * 10);    →  String foo = "Hello 90";
      public class A { public static final int VALUE = 99; }
      public class B { static int VALUE2 = A.VALUE; }
        →  public class B { static int VALUE2 = 99; }   (after compiling class B)
    11. Hotspot detection / Method Inlining
      • Dead code branches are eliminated
      public class A {
          static final boolean DEBUG = false;
          public void methodA() {
              if (DEBUG)
                  System.out.println("DEBUG MODE");
              System.out.println("Say Hello");
          } // method A
      } // class A
      ↓
      public class A {
          static final boolean DEBUG = false;
          public void methodA() {
              System.out.println("Say Hello");
          } // method A
      } // class A
    12. Hotspot Client compiler
      • Java option: -client
      • Focused on simple & fast start-up
      • Three-phase compiler
        • HIR (High Level Intermediate Representation)
        • LIR (Low Level Intermediate Representation)
        • Machine code
      • It focuses on local code quality and does very few global optimizations, since those are often the most expensive in terms of compile time
      • It can inline any function that has no exception handlers or synchronization, and it also supports deoptimization for debugging and inlining
    13. Hotspot Server compiler
      • Java Option : -server
      • Focused on optimization
      • SSA (Static Single Assignment)-based IR
    14. Hotspot compiler Option
      • Hotspot compile option
        • -XX:MaxInlineSize=<size>
          • Integer specifying maximum number of bytecode instructions in a method which gets inlined.
        • -XX:FreqInlineSize=<size>
          • Integer specifying maximum number of bytecode instructions in a frequently executed method which gets inlined.
        • -Xint
          • Interpreter only (no JIT compilation)
        • -XX:+PrintCompilation
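To see these options in action, a small program with one tiny method called from a hot loop is enough. The sketch below is an assumption-laden illustration: the class name `HotLoop` and the workload are made up, and the exact text printed by `-XX:+PrintCompilation` differs between JVM versions.

```java
/** A tiny method called from a hot loop: the kind of code the compile options above act on. */
class HotLoop {

    // Very small body, so it is a natural inlining candidate.
    static int square(int x) {
        return x * x;
    }

    /** Calls square() 'iterations' times and returns the sum of the results. */
    static long run(int iterations) {
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += square(i % 100);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000));
    }
}
```

Running it as `java -XX:+PrintCompilation HotLoop` should list methods as they are compiled, while `java -Xint HotLoop` runs the same code in the interpreter only, which makes the two easy to compare.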
    15. Threading Model
    16. Threading Model
      • Thread Model
        • Java is a multithreaded programming language
        • Native thread model from JDK 1.2
          • Thread mapping (M:N and 1:1)
          • Thread synchronization
      [Diagram: Java Application → Java Thread (JVM) → Operating System; responsibilities shown: thread handling, thread scheduling, lock management (synchronization)]
    17. Solaris M:N Thread Model
      [Diagram layers: Java Application, Java Thread (JVM), Solaris Thread, LWP, Kernel Thread (Solaris OS kernel)]
    18. Solaris M:N Thread Model
      • Solaris M:N Thread Model
        • Thread based synchronization
        • LWP based synchronization
      JDK version  LWP based sync               Thread based sync
      JDK1.2       Default                      N/A
      JDK1.3       -XX:+UseLWPSynchronization   Default
      JDK1.4       Default                      -XX:-UseLWPSynchronization
    19. Solaris 1:1 Thread Model
      [Diagram layers: Java Application, Java Thread (JVM), Solaris Thread, LWP, Kernel Thread (Solaris OS kernel)]
    20. Solaris 1:1 Thread Model
      • Solaris 1:1 Thread Model
        • Bound thread
        • Alternate Libthread
      JDK version  Alternate Libthread*                    Bound Thread
      JDK1.2       export LD_LIBRARY_PATH=/usr/lib/lwp     N/A
      JDK1.3       export LD_LIBRARY_PATH=/usr/lib/lwp     -XX:+UseBoundThreads
      JDK1.4       export LD_LIBRARY_PATH=/usr/lib/lwp     -XX:+UseBoundThreads
      ※ In Solaris 9, the alternate libthread is the default; do not add /usr/lib/lwp to LD_LIBRARY_PATH
    21. JVM Performance Test on Solaris < Solaris 8 with JVM 1.3 > (see the graph on the next slide)
      Architecture  CPUs  Threads   Model                 % diff in throughput (against Standard model)
      Sparc         30    400/2000  Standard              ---
      Sparc         30    400/2000  LWP Synchronization   215%/800%
      Sparc         30    400/2000  Bound Threads         -10%/-80%
      Sparc         30    400/2000  Alternate One-to-one  275%/900%
      Sparc         4     400/2000  Standard              ---
      Sparc         4     400/2000  LWP Synchronization   30%/60%
      Sparc         4     400/2000  Bound Threads         -5%/-45%
      Sparc         4     400/2000  Alternate One-to-one  30%/50%
      Sparc         2     400/2000  Standard              ---
      Sparc         2     400/2000  LWP Synchronization   0%/25%
      Sparc         2     400/2000  Bound Threads         -30%/-40%
      Sparc         2     400/2000  Alternate One-to-one  -10%/0%
      Intel         4     400/2000  Standard              ---
      Intel         4     400/2000  LWP Synchronization   25%/60%
      Intel         4     400/2000  Bound Threads         0%/-10%
      Intel         4     400/2000  Alternate One-to-one  20%/60%
      Intel         2     400/2000  Standard              ---
      Intel         2     400/2000  LWP Synchronization   15%/45%
      Intel         2     400/2000  Bound Threads         -10%/-15%
      Intel         2     400/2000  Alternate One-to-one  15%/35%
    22. JVM Performance Test on Solaris
      • Performance Test Result Graph
    23. Memory Model
    24. Memory Tuning
      • Garbage Collection
      • JVM Memory Layout
      • Garbage Collection Model
      • Server VM and Client VM
      • Garbage Collection Measurement & Analysis
      • Tuning Garbage Collection
    25. Generational Garbage Collection
    26. JVM Memory Layout
      • New/Young – Recently created object
      • Old – Long lived object
      • Perm – JVM classes and methods
      [Diagram: total heap size = New/Young (Eden, SS1, SS2) + Old, used by the application; Perm is used by the JVM]
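On Java 5 and later, the layout described above can be inspected from inside a running JVM through the `java.lang.management` API. The sketch below is a minimal illustration (the class name `MemoryPools` is an assumption); the pool names it prints depend on the JVM version and the collector in use, so they will not necessarily read "Eden", "Survivor", "Old" and "Perm".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

/** Lists the memory pools (generations) of the running JVM. */
class MemoryPools {

    /** Returns one line per memory pool: name, type (heap or non-heap), used and max bytes. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            if (usage == null) {
                continue;
            }
            // getMax() is -1 when no maximum is defined for the pool
            lines.add(String.format("%-30s %-16s used=%d max=%d",
                    pool.getName(), pool.getType(), usage.getUsed(), usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```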
    27. Garbage Collection
      • Garbage Collection
        • Collects unused Java objects
        • Cleans up memory
        • Minor GC
          • Collects memory in the New/Young generation
        • Major GC (Full GC)
          • Collects memory in the Old generation
    28. Minor GC
      • Minor Collection
        • New/Young Generation
        • Copy and Scavenge
        • Very Fast
    29. Minor GC
      [Diagram, 1st minor GC: live objects in Eden are copied to the survivor area; legend: new object, garbage, live object]
    30. Minor GC
      [Diagram, 2nd minor GC; legend: new object, garbage, live object]
    31. Minor GC
      [Diagram, 3rd minor GC: objects are moved to the old space when they become tenured; legend: new object, garbage, live object]
    32. Major GC
      • Major Collection
        • Old Generation
        • Mark and compact
        • Slow
          • 1st: goes through the entire heap, marking unreachable objects
          • 2nd: unreachable objects are removed and the remaining objects are compacted
    33. Major GC
      [Diagram (Eden, SS1, SS2, Old): mark the objects to be removed, then compact]
    34. Server option versus Client option
      • -XX:NewRatio=2 (1.3), -Xmn128m (1.4), -XX:NewSize=<size>, -XX:MaxNewSize=<size>
    35. GC Tuning Parameter
      • Memory Tuning Parameter
        • Perm Size : -XX:MaxPermSize=64m
        • Total Heap Size : -ms512m -mx512m (also written -Xms512m -Xmx512m)
        • New Size
          • -XX:NewRatio=2 → Old/New size ratio
          • -XX:NewSize=128m
          • -Xmn128m (JDK 1.4)
        • Survivor Size : -XX:SurvivorRatio=64 (eden/survivor)
        • Heap Ratio
          • -XX:MaxHeapFreeRatio=70
          • -XX:MinHeapFreeRatio=40
        • Survivor Ratio
          • -XX:TargetSurvivorRatio=50
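How the ratio options above divide a heap can be worked out with simple arithmetic. The sketch below assumes the usual reading of the two options: NewRatio is old size divided by young size, and SurvivorRatio is Eden size divided by the size of one survivor space, with two survivor spaces in the young generation. The class name `GenerationSizes` is an assumption, and a real JVM rounds and aligns these sizes, so the results are estimates only.

```java
/** Estimates generation sizes from heap size, NewRatio and SurvivorRatio. */
class GenerationSizes {

    /** Young generation size: heap is split as young : old = 1 : newRatio. */
    static long young(long heap, int newRatio) {
        return heap / (newRatio + 1);
    }

    /** Old generation size: whatever the young generation does not take. */
    static long old(long heap, int newRatio) {
        return heap - young(heap, newRatio);
    }

    /** Size of ONE survivor space: young = eden + 2 survivors, eden = survivorRatio * survivor. */
    static long survivor(long young, int survivorRatio) {
        return young / (survivorRatio + 2);
    }

    /** Eden size: young generation minus the two survivor spaces. */
    static long eden(long young, int survivorRatio) {
        return young - 2 * survivor(young, survivorRatio);
    }

    public static void main(String[] args) {
        long heapKb = 512L * 1024; // -ms512m -mx512m from the slide, in KB
        long youngKb = young(heapKb, 2); // -XX:NewRatio=2
        System.out.println("young    = " + youngKb + " KB");
        System.out.println("old      = " + old(heapKb, 2) + " KB");
        System.out.println("survivor = " + survivor(youngKb, 64) + " KB each"); // -XX:SurvivorRatio=64
        System.out.println("eden     = " + eden(youngKb, 64) + " KB");
    }
}
```

With these assumptions, a 512 MB heap and NewRatio=2 give a young generation of roughly one third of the heap, and SurvivorRatio=64 leaves each survivor space at about 1/66 of the young generation.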
    36. Support for -XX Option
      • Options that begin with -X are nonstandard (not guaranteed to be supported on all VM implementations), and are subject to change without notice in subsequent releases of the Java 2 SDK.
      • Because the -XX options have specific system requirements for correct operation and may require privileged access to system configuration parameters, they are not recommended for casual use. These options are also subject to change without notice.
    37. Garbage Collection Model
      • New type of GC
        • Default Collector
        • Parallel GC for young generation - JDK 1.4
        • Concurrent GC for old generation - JDK 1.4
        • Incremental Low Pause Collector (Train GC)
    38. Parallel GC
      • Parallel GC
        • Improve performance of GC
        • For young generation (Minor GC)
        • Requires more than 4 CPUs and 256 MB of physical memory
      [Diagram: application threads and GC threads over time, default GC vs. parallel GC in the young generation]
    39. Parallel GC
      • Two Parallel Collectors
        • Low-pause : -XX:+UseParNewGC
          • Near real-time or pause dependent application
          • Works with
            • Mark and compact collector
            • Concurrent old area collector
        • Throughput : -XX:+UseParallelGC
          • Enterprise or throughput oriented application
          • Works only with the mark and compact collector
    40. Parallel GC
      • Throughput Collector
        • -XX:+UseParallelGC
        • -XX:ParallelGCThreads=<desired number>
        • -XX:+UseAdaptiveSizePolicy
          • Adaptive resizing of the young generation
    41. Parallel GC
      • Throughput Collector
        • AggressiveHeap
          • Enabled by -XX:+AggressiveHeap
          • Inspects machine resources and attempts to set various parameters to be optimal for long-running, memory-intensive jobs
            • Useful on machines with more than 4 CPUs and more than 256 MB of memory
            • Useful in server applications
            • Do not use with -ms and -mx
          • Example (HP Itanium, JDK 1.4.2): java -XX:+ServerApp -XX:+AggressiveHeap -Xmn3400m spec.jbb.JBBmain -propfile Test1
    42. Concurrent GC
      • Concurrent GC
        • Reduce pause time to collect Old Generation
        • For old generation (Full GC)
        • Enabled by -XX:+UseConcMarkSweepGC
      [Diagram: application threads and GC threads over time, default GC vs. concurrent GC in the old generation]
    43. Incremental GC
      • Incremental GC
        • Enabled by -Xincgc (from JDK 1.3)
        • Collects the old generation whenever the young generation is collected
        • Reduces pause time for old generation collection
        • Disadvantage
          • Young generation GC occurs more frequently
          • More resources are needed
          • Do not use with -XX:+UseParallelGC and -XX:+UseParNewGC
    44. Incremental GC
      • Incremental GC
      [Diagram, young and old generation: with the default GC, a full GC runs after many minor GCs; with incremental GC, the old generation is collected during each minor GC]
    45. Incremental GC
      • Incremental GC
        • -client -XX:+PrintGCDetails -Xincgc -ms32m -mx32m
      [GC [DefNew: 540K->35K(576K), 0.0053557 secs][Train: 3495K->3493K(32128K), 0.0043531 secs] 4036K->3529K(32704K), 0.0099856 secs]
      [GC [DefNew: 547K->64K(576K), 0.0048216 secs][Train: 3529K->3540K(32128K), 0.0058683 secs] 4041K->3604K(32704K), 0.0109779 secs]
      [GC [DefNew: 575K->64K(576K), 0.0164904 secs] 4116K->3670K(32704K), 0.0169019 secs]
      [GC [DefNew: 576K->64K(576K), 0.0057541 secs][Train: 3671K->3651K(32128K), 0.0051286 secs] 4182K->3715K(32704K), 0.0113042 secs]
      [GC [DefNew: 575K->56K(576K), 0.0114559 secs] 4227K->3745K(32704K), 0.0191390 secs]
      [Full GC [Train MSC: 3689K->3280K(32128K), 0.0909523 secs] 4038K->3378K(32704K), 0.0910213 secs]
      [GC [DefNew: 502K->64K(576K), 0.0173220 secs][Train: 3329K->3329K(32128K), 0.0066279 secs] 3782K->3393K(32704K), 0.0325125 secs]
      Legend: DefNew = young generation GC; Train = old generation GC done during minor GC time; [GC ...] = minor GC; [Full GC ...] = full GC. Sun JVM 1.4.1 on Windows.
    46. GC Model Summary
      • Best pause: Concurrent GC
      • Best throughput: Parallel GC
      • Better pause: Incremental GC (Train)
      • Better throughput: Mark-compact
    47. Garbage Collection Measurement
      • -verbosegc (all platforms)
      • -XX:+PrintGCDetails (JDK 1.4)
      • -Xverbosegc (HP)
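Besides the log options above, Java 5 and later JVMs expose cumulative GC counts and times through `java.lang.management`. The sketch below is a minimal illustration of reading them; the class name `GcStats` and the throwaway allocation loop are assumptions, and collector names vary with the JVM and the selected GC model.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Reads cumulative GC statistics from the running JVM. */
class GcStats {

    /** Returns one line per collector with its collection count and total time in milliseconds. */
    static List<String> snapshot() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are -1 when the collector does not report them.
            lines.add(gc.getName() + ": collections=" + gc.getCollectionCount()
                    + ", total time ms=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Create some short-lived garbage so that young generation collections are likely.
        long sink = 0;
        for (int i = 0; i < 1_000_000; i++) {
            byte[] junk = new byte[1024];
            sink += junk.length;
        }
        System.out.println("allocated bytes: " + sink);
        snapshot().forEach(System.out::println);
    }
}
```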
    48. Garbage Collection Measurement
      • -verbosegc
      [GC 40549K->20909K(64768K), 0.0484179 secs]
      [GC 41197K->21405K(64768K), 0.0411095 secs]
      [GC 41693K->22995K(64768K), 0.0846190 secs]
      [GC 43283K->23672K(64768K), 0.0492838 secs]
      [Full GC 43960K->1749K(64768K), 0.1452965 secs]
      [GC 22037K->2810K(64768K), 0.0310949 secs]
      [GC 23098K->3657K(64768K), 0.0469624 secs]
      [GC 23945K->4847K(64768K), 0.0580108 secs]
      Format: [GC <heap size before GC>-><heap size after GC>(<total heap size>), <GC time> secs]; entries marked "Full GC" are full collections
    49. GC Log analysis using AWK script
      • Awk script
      gc.awk:
      BEGIN {
          printf("Minor\tMajor\tAlive\tFreed\n");
      }
      {
          # Minor GC entry: "[GC 40549K->20909K(64768K), 0.0484179 secs]"
          if (substr($0, 1, 4) == "[GC ") {
              split($0, array, " ");
              printf("%s\t0.0\t", array[3])
              split(array[2], barray, "K")
              before = barray[1]
              after = substr(barray[2], 3)
              reclaim = before - after
              printf("%s\t%s\n", after, reclaim)
          }
          # Full GC entry: "[Full GC 43960K->1749K(64768K), 0.1452965 secs]"
          if (substr($0, 1, 9) == "[Full GC ") {
              split($0, array, " ");
              printf("0.0\t%s\t", array[4])
              split(array[3], barray, "K")
              before = barray[1]
              after = substr(barray[2], 3)
              reclaim = before - after
              printf("%s\t%s\n", after, reclaim)
          }
          next;
      }
      ※ Usage: % awk -f gc.awk gc.log
      Output for gc.log:
      Minor       Major       Alive       Freed
      0.0484179   0.0         20909       19640
      0.0411095   0.0         21405       19792
      0.0846190   0.0         22995       18698
      0.0492838   0.0         23672       19611
      0.0         0.1452965   1749        42211
      0.0310949   0.0         2810        19227
      0.0469624   0.0         3657        19441
      0.0580108   0.0         4847        19098
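The same extraction can be sketched in Java for environments where awk is not available. This is an illustrative equivalent, not part of the original slides: the class name `GcLogParser` is an assumption, and the pattern only covers the plain `-verbosegc` line format shown on the previous slide.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Turns -verbosegc lines into "Minor Major Alive Freed" columns, like the awk script. */
class GcLogParser {

    // Matches e.g. "[GC 40549K->20909K(64768K), 0.0484179 secs]" and the "[Full GC ...]" variant.
    private static final Pattern LINE = Pattern.compile(
            "\\[(Full GC|GC) (\\d+)K->(\\d+)K\\((\\d+)K\\), ([0-9.]+) secs\\]");

    /** Returns tab-separated minor time, major time, alive KB, freed KB; null if the line does not match. */
    static String summarize(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        boolean full = m.group(1).equals("Full GC");
        long before = Long.parseLong(m.group(2));
        long after = Long.parseLong(m.group(3));
        String secs = m.group(5);
        String minor = full ? "0.0" : secs;
        String major = full ? secs : "0.0";
        return minor + "\t" + major + "\t" + after + "\t" + (before - after);
    }

    public static void main(String[] args) {
        String[] sample = {
            "[GC 40549K->20909K(64768K), 0.0484179 secs]",
            "[Full GC 43960K->1749K(64768K), 0.1452965 secs]"
        };
        System.out.println("Minor\tMajor\tAlive\tFreed");
        for (String line : sample) {
            System.out.println(summarize(line));
        }
    }
}
```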
    50. GC Log analysis using AWK script < GC Time >
    51. GC Log analysis using HPJtune ※ http://www.hp.com/products1/unix/java/java2/hpjtune/index.html
    52. GC Log analysis using AWK script < GC Amount >
    53. Garbage Collection Tuning
      • GC Tuning
        • Find Most Important factor
          • Low pause? Or High performance?
          • Select appropriate GC model (New Model has risk!!)
        • Select “server” or “client”
        • Find appropriate Heap size by reviewing GC log
        • Find ratio of young and old generation
    54. Garbage Collection Tuning
      • GC Tuning
        • Full GC → most important factor in GC tuning
          • How frequent? How long?
          • Short and frequent → decrease old space
          • Long and occasional → increase old space
          • Short and occasional → decrease throughput → by load balancing
        • Fix heap size
          • Set “ms” and “mx” to the same value
          • Removes shrinking and growing overhead
        • Don’t
          • Don’t make the heap bigger than physical memory (causes swapping)
          • Don’t make new generation bigger than half the heap
    55. Jmeter / Threads Histogram
    56. Jmeter /Threads Group Histogram
    57. Example
    58. Example
      [Chart: old area usage before tuning. Time axis: 2004-01-08 7:14 PM; 2004-01-09 around 8 AM; 2004-01-09 around 7 PM; Friday business hours; 2004-01-10 around 10 AM; 2004-01-10 around 6 PM. Peak time: 52000~56000 sec, about one hour from 9 o'clock]
    59. Example
      [Chart: GC time before tuning]
      At peak time, old generation GC takes 4~8 sec, which can cause the application to hang.
    60. Example
      [Chart: GC time after AP tuning. Time axis from the 12th 03:38 AM to the 17th 10:58 PM; marked regions: weekend, and Mon, Tue, Thu, Fri office hours]
    61. Example
      [Chart: same time axis, from the 12th 03:38 AM to the 17th 10:58 PM; marked regions: weekend, and Mon, Tue, Thu, Fri office hours]
    62. Summary
    63. JVM Tuning Summary
      • Determine JVM performance goal
      • Gather statistics on your application
      • Select hotspot compiler
      • Tuning heap
      • Check threading model
      • Feedback
    64. More Tips
    65. Thread dump
      • Thread dump
        • Enabled by
          • Unix “kill -3 [JAVA PID]”
          • Windows “Ctrl+Break”
        • Snapshot of the Java application's threads
        • Can be used to diagnose hang-ups and slowdowns
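On Java 5 and later, a comparable snapshot can also be taken from inside the application with `Thread.getAllStackTraces()`. The sketch below is a simplified illustration, not the JVM's own dump: the class name `ThreadDumper` is an assumption, and the output omits details such as monitor ownership that `kill -3` prints.

```java
import java.util.Map;

/** Builds a simple in-process thread dump. */
class ThreadDumper {

    /** Returns the name, priority, state and stack of every live thread as text. */
    static String dump() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append('"')
               .append(t.isDaemon() ? " daemon" : "")
               .append(" prio=").append(t.getPriority())
               .append(" state=").append(t.getState())
               .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Taking several dumps a few seconds apart and comparing them helps tell a thread that is stuck from one that is merely busy.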
    66. Thread dump example
      • Thread dump when slowdown in WAS
      "ExecuteThread: '232' for queue: 'default'" daemon prio=5 tid=0x573ca630 nid=0xd2c waiting for monitor entry [0x5cebf000..0x5cebfdb8]
          at java.util.Hashtable.get(Hashtable.java:314)
          at java.util.ListResourceBundle.handleGetObject(ListResourceBundle.java:122)
          at java.util.ResourceBundle.getObject(ResourceBundle.java:371)
          at java.util.ResourceBundle.getObject(ResourceBundle.java:374)
          at java.text.DateFormatSymbols.initializeData(DateFormatSymbols.java:483)
          at java.text.DateFormatSymbols.<init>(DateFormatSymbols.java:99)
          at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:275)
          at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:264)
          at XXX.uv.com.cm.CmDateTimeUtil.getCurrentTime(CmDateTimeUtil.java:88)
          at XXX.uv.com.util.CmLog.setFileLog(CmLog.java:171)
          at XXX.uv.com.jsp.EjbJspBase.service(EjbJspBase.java:371)
          at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:265)
          at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:200)
          at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:2546)
          at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2260)
          at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:139)
          at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:120)
      "ExecuteThread: '231' for queue: 'default'" daemon prio=5 tid=0x573f9a60 nid=0x13a8 waiting for monitor entry [0x5ce7f000..0x5ce7fdb8]
          at java.util.Hashtable.get(Hashtable.java:314)
          at java.text.DecimalFormatSymbols.initialize(DecimalFormatSymbols.java:333)
          at java.text.DecimalFormatSymbols.<init>(DecimalFormatSymbols.java:55)
          at java.text.NumberFormat.getInstance(NumberFormat.java:565)
          at java.text.NumberFormat.getInstance(NumberFormat.java:324)
          at java.text.SimpleDateFormat.initialize(SimpleDateFormat.java:327)
          at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:276)
          at java.text.SimpleDateFormat.<init>(SimpleDateFormat.java:264)
          at XXX.uv.com.cm.CmDateTimeUtil.getCurrentTime(CmDateTimeUtil.java:88)
          at XXX.uv.com.cm.CmDateTimeUtil.getCurrentTime(CmDateTimeUtil.java:67)
          at XXX.uv.com.datastu.DateTime.setCurrentTime(DateTime.java:190)
          at XXX.uv.com.jsp.EjbJspBase.service(EjbJspBase.java:239)
          at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:265)
          at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:200)
          at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:2546)
          at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2260)
          at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:139)
          at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:120)
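Both threads in this dump are waiting for a monitor inside `Hashtable.get` while constructing a new `SimpleDateFormat` from `CmDateTimeUtil.getCurrentTime`. The slides do not say how the problem was fixed; one possible mitigation, sketched here under that caveat, is to stop constructing a formatter on every call and keep one per thread instead. The class name `CachedDateFormat` and the date pattern are hypothetical, and `ThreadLocal.withInitial` requires Java 8 or later.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

/** Reuses one SimpleDateFormat per thread instead of constructing one on every call. */
class CachedDateFormat {

    // SimpleDateFormat is not thread-safe, so each thread gets its own instance.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
            ThreadLocal.withInitial(() -> new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"));

    /** Formats the given date with the calling thread's cached formatter. */
    static String format(Date date) {
        return FORMAT.get().format(date);
    }

    /** Formats the current time. */
    static String currentTime() {
        return format(new Date());
    }

    public static void main(String[] args) {
        System.out.println(currentTime());
    }
}
```

Each thread pays the construction cost once, so the contended lookup seen in the dump would be reached far less often; whether that removes the slowdown would still need to be confirmed by another dump under load.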
      • Profiling CPU usage/HP UX
        • HP UX : Glance + Thread Dump
        • In HP Glance, press “G” for thread monitoring
      • Profiling CPU usage/HP UX
      "Application Manager Thread" prio=8 tid=0x002a6c00 nid=62 lwp_id=15999 waiting on monitor [0x64bce000..0x64bce4b8]
          at java.lang.Thread.sleep(Native Method)
          at weblogic.management.mbeans.custom.ApplicationManager$ApplicationPoller.run(ApplicationManager.java:1137)
      Glance thread monitoring: CPU load of thread 15999 is 17.7%
      Java thread dump: thread 15999 is working on weblogic.management.mbeans.custom.ApplicationManager (ApplicationManager.java:1137)
      • Other tools
        • Profile with Java option
        • Analyze using HP Jmeter
        • JProbe
        • Stress Test
          • LoadRunner
          • MS Stress (Free)
      • Related URL
        • Java Thread http://java.sun.com/docs/hotspot/threads/threads.htm
        • Java Performance http://java.sun.com/docs/hotspot/PerformanceFAQ.html
        • Java Thread http://www.javaworld.com/javaworld/jw-09-1998/jw-09-threads.html
        • Pick up performance with generational gc http://www.javaworld.com/javaworld/jw-01-2002/jw-0111-hotspotgc.html
        • JVM 1.4 GC Tuning http://java.sun.com/docs/hotspot/gc1.4.2/index.html
        • HP Jmeter,Jtune,Jconfig http://www.hp.com/products1/unix/java/developers/index.html
        • SPECjvm98
        • SPECjAppServer2001/2002
    67. Thank you
    Java Virtual Machine Tuning guide.
    It describes JVM more
