Java Performance Tuning
Published in: Technology


Transcript

  • 1. Java Performance Tuning Atthakorn Chanthong
  • 2. What is software tuning?
  • 3. User Experience: "The software has poor response time. I need it to run faster."
  • 4. Software tuning means making an application run faster
  • 5. Many people think Java applications are slow. Why?
  • 6. There are two major reasons
  • 7. The first is the Bottleneck
  • 8. The Bottleneck: increased memory use; lots of casts; automatic memory management by the garbage collector; all objects are allocated on the heap; Java applications are not native
  • 9. The second is bad coding practice
  • 10. How to make it run faster?
  • 11. The bottleneck is unavoidable
  • 12. But the developer can adopt good coding practices
  • 13. A good design
  • 14. A good coding practice
  • 15. Java applications normally run fast enough
  • 16. So the tuning game comes into play
  • 17. Knowing the strategy
  • 18. Tuning Strategy: (1) Identify the main causes. (2) Choose the quickest and easiest one to fix. (3) Fix it, then repeat for the other root causes.
  • 19. Inside the strategy
  • 20. Tuning Strategy (flowchart): Profile and measure. Prioritize the problems. Identify the location of the bottleneck. Think of a hypothesis. Create a test scenario. Alter the code. Test and compare the results before and after the alteration. Still bad, because the result isn't good enough? Try again. Yes, it's better, but it needs to be faster? Repeat.
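The "test and compare before/after alteration" step can be sketched with a very small timing helper. This is only an illustration, not the deck's own tooling: the class name `TimingSketch` and the method `bestOfNanos` are hypothetical, and a hand-rolled timer like this is sensitive to JIT warm-up and dead-code elimination, so treat its numbers as rough.

```java
// A minimal before/after timing sketch (hypothetical helper, not from the slides).
class TimingSketch {

    // Runs the task a few times to warm up, then returns the best of `runs` timings.
    static long bestOfNanos(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        // "Before": repeated String concatenation in a loop.
        long before = bestOfNanos(() -> {
            String s = "";
            for (int i = 0; i < 2000; i++) {
                s += i;
            }
        }, 5, 20);

        // "After": the same work with a StringBuilder.
        long after = bestOfNanos(() -> {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < 2000; i++) {
                sb.append(i);
            }
            sb.toString();
        }, 5, 20);

        System.out.println("before: " + before + " ns, after: " + after + " ns");
    }
}
```

Keeping the test scenario fixed and changing only one thing between the two runs is what makes the comparison meaningful.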
  • 21. How to measure the software performance?
  • 22. We use the profiler
  • 23. A profiler is a programming tool that tracks the performance of another computer program
  • 24. The two common uses of a profiler when analyzing a software problem: profiling application performance and monitoring application memory usage
  • 25. How do we get a profiler?
  • 26. Don’t pay for it!
  • 27. Open-source profilers are all around
  • 28. Some interesting open-source profilers
  • 29. Open-source profiler: JConsole http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html
  • 30. Open-source profiler: Eclipse TPTP http://www.eclipse.org/tptp/index.php
  • 31. Open-source profiler: NetBeans built-in profiler http://profiler.netbeans.org/
  • 32. Open-source profiler: VisualVM, pulled out of NetBeans to act as a standalone profiler https://visualvm.dev.java.net/
  • 33. And much more …
  • 34. Open-source profilers: JRat, Cougaar, DrMem, InfraRED, Profiler4j, Jmeasurement, TIJMP
  • 35. We love open source
  • 36. Train your brain with good coding practice
  • 37. 1st Rule: Avoid Object Creation
  • 38. Object creation causes problems. Why?
  • 39. Avoid Object Creation: Lots of objects in memory means the GC does lots of work. The program slows down when the GC runs.
  • 40. Avoid Object Creation: Creating objects costs the application time and CPU effort
  • 41. Reuse objects where possible
  • 42. Pool Management: Most container objects (e.g. Vector) can be reused rather than created and thrown away
  • 43. Pool Management (diagram): a VectorPoolManager holds the pooled Vectors V1 to V5; getVector() hands one out and returnVector() puts it back
  • 44. Pool Management
      Without a pool (a new Vector on every iteration):
        public void doSome() {
          for (int i = 0; i < 10; i++) {
            Vector v = new Vector();
            … do vector manipulation stuff
          }
        }
      With a pool:
        public static VectorPoolManager vectorPoolManager = new VectorPoolManager(25);
        public void doSome() {
          for (int i = 0; i < 10; i++) {
            Vector v = vectorPoolManager.getVector();
            … do vector manipulation stuff
            vectorPoolManager.returnVector(v);
          }
        }
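The slides do not show the inside of the pool manager, so here is one way the getVector()/returnVector() pair could work. It is a sketch under assumptions: the class name `VectorPoolSketch`, the `created` counter, and the use of an `ArrayDeque` as the free list are all illustrative choices, and the class is not thread-safe.

```java
import java.util.ArrayDeque;
import java.util.Vector;

// A minimal object-pool sketch (hypothetical, not the deck's VectorPoolManager).
class VectorPoolSketch {
    private final ArrayDeque<Vector<Object>> free = new ArrayDeque<>();
    int created = 0; // how many Vectors were actually allocated

    // Hands out a pooled Vector, allocating only when the pool is empty.
    Vector<Object> getVector() {
        Vector<Object> v = free.poll();
        if (v == null) {
            v = new Vector<>();
            created++;
        }
        return v;
    }

    // Clears the Vector and puts it back so the next caller can reuse it.
    void returnVector(Vector<Object> v) {
        v.clear();
        free.push(v);
    }

    public static void main(String[] args) {
        VectorPoolSketch pool = new VectorPoolSketch();
        for (int i = 0; i < 10; i++) {
            Vector<Object> v = pool.getVector();
            v.add(i); // stand-in for "vector manipulation stuff"
            pool.returnVector(v);
        }
        System.out.println("Vectors created: " + pool.created); // prints "Vectors created: 1"
    }
}
```

The trade-off is that callers must remember to return what they borrow; a forgotten returnVector() silently turns the pool back into plain allocation.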
  • 45. Canonicalizing Objects: Replace multiple objects with a single object or just a few
  • 46. Canonicalizing Objects (Singleton Pattern)
        public class VectorPoolManager {
          // Not final: the single instance is created lazily in getVector()
          private static VectorPoolManager poolManager;
          private Vector[] pool;
          private VectorPoolManager(int size) { .... }
          public static Vector getVector() {
            if (poolManager == null)
              poolManager = new VectorPoolManager(20);
            ...
            return poolManager.pool[poolManager.pool.length - 1];
          }
        }
  • 47. Canonicalizing Objects
      4 objects in memory:
        Boolean b1 = new Boolean(true);
        Boolean b2 = new Boolean(false);
        Boolean b3 = new Boolean(false);
        Boolean b4 = new Boolean(false);
      2 objects in memory:
        Boolean b1 = Boolean.TRUE;
        Boolean b2 = Boolean.FALSE;
        Boolean b3 = Boolean.FALSE;
        Boolean b4 = Boolean.FALSE;
  • 48. Canonicalizing Objects
      No cache:
        String string = "55";
        Integer theInt = new Integer(string);
      Object cached:
        String string = "55";
        Integer theInt = Integer.valueOf(string);
  • 49. Canonicalizing Objects: Caching inside Integer.valueOf(…)
        private static class IntegerCache {
          private IntegerCache() {}
          static final Integer cache[] = new Integer[-(-128) + 127 + 1];
          static {
            for (int i = 0; i < cache.length; i++)
              cache[i] = new Integer(i - 128);
          }
        }
        public static Integer valueOf(int i) {
          final int offset = 128;
          if (i >= -128 && i <= 127) { // must cache
            return IntegerCache.cache[i + offset];
          }
          return new Integer(i);
        }
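To see canonicalization from the caller's side, the following sketch checks whether two lookups return the very same object. The class and method names are illustrative. It relies only on the documented behaviour that `Integer.valueOf(int)` caches values from -128 to 127 and that `Boolean.valueOf` returns the shared `Boolean.TRUE` and `Boolean.FALSE` constants; whether larger integers are cached depends on the JVM, so no result is claimed for them.

```java
// A small identity check for canonical objects (illustrative names).
class CanonicalSketch {

    // True when two valueOf calls hand back the same cached Integer instance.
    static boolean sameInteger(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    // True when valueOf returns one of the two shared Boolean constants.
    static boolean sameBoolean(boolean value) {
        return Boolean.valueOf(value) == (value ? Boolean.TRUE : Boolean.FALSE);
    }

    public static void main(String[] args) {
        System.out.println(sameInteger(55));   // true: 55 lies in the cached range
        System.out.println(sameBoolean(true)); // true: Boolean.TRUE is shared
    }
}
```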
  • 50. Keyword ‘final’: Use the final modifier on a variable so that its reference cannot be reassigned (the object it points to can still be modified)
  • 51. Keyword ‘final’
      Without final:
        public void doSome(Dimension width, Dimension height) {
          // Reassignment is allowed
          width = new Dimension(5,5);
          ...
        }
      With final:
        public void doSome(final Dimension width, final Dimension height) {
          // Reassignment is not allowed: this line is a compile-time error
          width = new Dimension(5,5);
          ...
        }
  • 52. Auto-Boxing/Unboxing: Use auto-boxing only when it is needed, not everywhere
  • 53. Auto-Boxing/Unboxing
      Boxed counter:
        Integer i = 0;
        // Counting to 100M
        while (i < 100000000) {
          i++;
        }
        Takes 2313 ms
      Primitive counter:
        int p = 0;
        // Counting to 100M
        while (p < 100000000) {
          p++;
        }
        Takes 125 ms
      Why does the boxed version take 2313/125 ≈ 18.5 times longer?
  • 54. Auto-Boxing/Unboxing: An object may be created every time a primitive is wrapped by boxing
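The cost of boxing in a hot loop can be made visible with a pair of methods that compute the same sum, one through a boxed `Long` accumulator and one through a primitive `long`. The names below are illustrative, and the printed timings are rough single-shot measurements that will vary by machine and JVM.

```java
// Same arithmetic, boxed versus primitive accumulator (illustrative sketch).
class BoxingSketch {

    // Each += unboxes the total, adds, and boxes the result again.
    static long sumBoxed(long n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // No wrapper objects are involved here.
    static long sumPrimitive(long n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        long n = 10_000_000L;

        long start = System.nanoTime();
        long boxed = sumBoxed(n);
        long boxedMs = (System.nanoTime() - start) / 1_000_000;

        start = System.nanoTime();
        long primitive = sumPrimitive(n);
        long primitiveMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println("same result: " + (boxed == primitive));
        System.out.println("boxed: " + boxedMs + " ms, primitive: " + primitiveMs + " ms");
    }
}
```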
  • 55. 2nd Rule: Know String Better
  • 56. String is the most frequently used object in an application
  • 57. Overlook String, and the software may perform poorly
  • 58. Compile-Time String Initialization: Use the string concatenation (+) operator on literals so that the String is built at compile time
  • 59. Compile-Time Initialization
      Literals joined with +:
        for (int i = 0; i < loop; i++) { // Looping 10M rounds
          String x = "Hello" + "," + " " + "World";
        }
        Takes 16 ms
      Wrapped in new String(…):
        for (int i = 0; i < loop; i++) { // Looping 10M rounds
          String x = new String("Hello" + "," + " " + "World");
        }
        Takes 672 ms
  • 60. Runtime String Initialization: Use StringBuffer/StringBuilder to create Strings at runtime
  • 61. Runtime String Initialization
      With +=:
        String name = "Smith";
        for (int i = 0; i < loop; i++) { // Looping 1M rounds
          String x = "Hello";
          x += ",";
          x += " Mr.";
          x += name;
        }
        Takes 10298 ms
      With StringBuffer:
        String name = "Smith";
        for (int i = 0; i < loop; i++) { // Looping 1M rounds
          String x = (new StringBuffer()).append("Hello")
              .append(",").append(" Mr.")
              .append(name).toString();
        }
        Takes 6187 ms
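As a follow-up to the runtime case, here is a small sketch that combines this rule with the first one: a single `StringBuilder` is reused across iterations instead of being created each time. The `greet` helper and the exact greeting text are illustrative assumptions, not taken from the slides.

```java
// Building Strings at runtime with one reusable StringBuilder (illustrative sketch).
class StringBuilderSketch {

    // Resets the builder, then assembles the greeting in place.
    static String greet(StringBuilder sb, String name) {
        sb.setLength(0); // reuse the existing buffer instead of allocating a new one
        return sb.append("Hello").append(", Mr. ").append(name).toString();
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(32);
        String last = null;
        for (int i = 0; i < 1_000_000; i++) {
            last = greet(sb, "Smith");
        }
        System.out.println(last); // prints "Hello, Mr. Smith"
    }
}
```

StringBuilder is the unsynchronized counterpart of StringBuffer, so it is the usual choice when the builder is confined to one thread.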
  • 62. String Comparison: Use the appropriate method to compare Strings
  • 63. To Test Whether a String Is Empty
      With equals(""):
        for (int i = 0; i < loop; i++) { // 10M loops
          if (a != null && a.equals("")) { }
        }
        Takes 125 ms
      With length():
        for (int i = 0; i < loop; i++) { // 10M loops
          if (a != null && a.length() == 0) { }
        }
        Takes 31 ms
  • 64. If two strings have the same length
      With equalsIgnoreCase():
        String a = "abc";
        String b = "cdf";
        for (int i = 0; i < loop; i++) {
          if (a.equalsIgnoreCase(b)) { }
        }
        Takes 750 ms
      With equals():
        String a = "abc";
        String b = "cdf";
        for (int i = 0; i < loop; i++) {
          if (a.equals(b)) { }
        }
        Takes 125 ms
  • 65. If two strings have different lengths
      With equalsIgnoreCase():
        String a = "abc";
        String b = "cdfg";
        for (int i = 0; i < loop; i++) {
          if (a.equalsIgnoreCase(b)) { }
        }
        Takes 780 ms
      With equals():
        String a = "abc";
        String b = "cdfg";
        for (int i = 0; i < loop; i++) {
          if (a.equals(b)) { }
        }
        Takes 858 ms
  • 66. When the lengths differ, String.equalsIgnoreCase() takes only 2 steps: it checks for identity and then checks whether the Strings are the same size
  • 67. Intern String: Compare Strings by identity
  • 68. Intern String: Normally, a String can be created in two ways
  • 69. Intern String
      By new String(…):
        String s = new String("This is a string literal.");
      By String literal:
        String s = "This is a string literal.";
  • 70. Intern String: For Strings created by new String(…), the JVM always allocates a new memory address for each new String, even if the values are the same
  • 71. Intern String
        String a = new String("This is a string literal.");
        String b = new String("This is a string literal.");
      a and b each point to their own copy of "This is a string literal." at different memory addresses
  • 72. Intern String: Strings created from literals are stored in the string pool. Creating the same literal twice shares a single instance.
  • 73. Intern String
        String a = "This is a string literal.";
        String b = "This is a string literal.";
      a and b point to the same memory address, which holds "This is a string literal."
  • 74. Intern String: We can point two String variables to the same address if they have the same value, by using the String.intern() method
  • 75. Intern String
        String a = new String("This is a string literal.").intern();
        String b = new String("This is a string literal.").intern();
      a and b point to the same memory address, which holds "This is a string literal."
  • 76. Intern String: The idea is that interned Strings can be compared by identity
  • 77. Intern String: What does “compare by identity” mean?
  • 78. Intern String
        if (a == b)        // identity comparison (by reference)
        if (a.equals(b))   // value comparison
  • 79. Intern String: Identity comparison is fast because it only compares references
  • 80. Intern String: Traditionally, Strings must be compared with equals() to avoid false negatives
  • 81. Intern String: But with interned Strings, if two Strings have different values they also have different addresses, and if they have the same value they also have the same address
  • 82. Intern String: So for interned Strings, (a == b) is equivalent to (a.equals(b))
  • 83. Intern String: These String variables all point to the same address with the same value
        String a = "abc";
        String b = "abc";
        String c = new String("abc").intern();
  • 84. Intern String
      With equals():
        for (int i = 0; i < loop; i++) {
          if (a.equals(b)) { }
        }
        Takes 312 ms
      With ==:
        for (int i = 0; i < loop; i++) {
          if (a == b) { }
        }
        Takes 32 ms
  • 85. Intern String: Wow, interning is good. Unfortunately, it makes code hard to understand, so use it carefully.
  • 86. Intern String: String.intern() comes with overhead because there is a caching step. Intern Strings only if you plan to compare them two or more times.
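The difference between identity and value comparison can be checked directly. This sketch uses only standard `String` behaviour; the class and method names are illustrative.

```java
// Identity versus value comparison for literal, new, and interned Strings.
class InternSketch {

    // True only when both variables refer to the very same object.
    static boolean sameObject(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "abc";
        String fresh = new String("abc");  // always a new object
        String interned = fresh.intern();  // the pooled instance for "abc"

        System.out.println(sameObject(literal, fresh));    // false
        System.out.println(literal.equals(fresh));         // true
        System.out.println(sameObject(literal, interned)); // true
    }
}
```

The `==` shortcut is only safe when every String involved is known to be interned; a single non-interned String brings back the false negative that equals() avoids.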
  • 87. char Array Instead of String: For optimal performance, avoid doing character-level work through the String object itself
  • 88. char Array
      With String.charAt():
        String x = "abcdefghijklmn";
        for (int i = 0; i < loop; i++) {
          if (x.charAt(5) == 'x') { }
        }
        Takes 281 ms
      With a char array:
        String x = "abcdefghijklmn";
        char y[] = x.toCharArray();
        for (int i = 0; i < loop; i++) {
          if ((20 < y.length && 20 >= 0) && y[20] == 'x') { }
        }
        Takes 156 ms
  • 89. 3rd Rule: Exceptions and Casts
  • 90. Prevent exceptions from being thrown where possible. Exceptions are really expensive to execute.
  • 91. Avoid Exceptions
      Catching the exception:
        Object obj = null;
        for (int i = 0; i < loop; i++) {
          try {
            obj.hashCode();
          } catch (Exception e) {}
        }
        Takes 18563 ms
      Checking first:
        Object obj = null;
        for (int i = 0; i < loop; i++) {
          if (obj != null) {
            obj.hashCode();
          }
        }
        Takes 16 ms
  • 92. Cast Less: We can reduce runtime cost by casting once and reusing the result when the cast object is used several times
  • 93. Cast Less
      Casting on every use:
        Integer io = new Integer(0);
        Object obj = (Object) io;
        for (int i = 0; i < loop; i++) {
          if (obj instanceof Integer) {
            byte x = ((Integer) obj).byteValue();
            double d = ((Integer) obj).doubleValue();
            float f = ((Integer) obj).floatValue();
          }
        }
        Takes 31 ms
      Casting once:
        for (int i = 0; i < loop; i++) {
          if (obj instanceof Integer) {
            Integer icast = (Integer) obj;
            byte x = icast.byteValue();
            double d = icast.doubleValue();
            float f = icast.floatValue();
          }
        }
        Takes 16 ms
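The "check first" idea can be wrapped in two interchangeable methods, which makes it easy to confirm that they agree on results while differing in cost. This is a sketch with illustrative names; the fallback value 0 for a null argument is an assumption made for the example.

```java
// Two ways to cope with a possibly-null object (illustrative sketch).
class NullCheckSketch {

    // Cheap path: test for null before calling the method.
    static int hashWithCheck(Object obj) {
        return obj != null ? obj.hashCode() : 0;
    }

    // Expensive path: let the call fail and recover from the exception.
    static int hashWithCatch(Object obj) {
        try {
            return obj.hashCode();
        } catch (NullPointerException e) {
            return 0;
        }
    }

    public static void main(String[] args) {
        System.out.println(hashWithCheck(null) == hashWithCatch(null));   // true
        System.out.println(hashWithCheck("abc") == hashWithCatch("abc")); // true
    }
}
```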
  • 94. 4th Rule: The Rhythm of Motion
  • 95. Loop Optimization: There are several ways to make a loop faster
  • 96. Don’t terminate loops with method calls
  • 97. Eliminate Method Call
      Length read in the loop condition:
        byte x[] = new byte[loop];
        for (int i = 0; i < x.length; i++) {
          for (int j = 0; j < x.length; j++) { }
        }
        Takes 109 ms
      Length cached in a local variable:
        byte x[] = new byte[loop];
        int length = x.length;
        for (int i = 0; i < length; i++) {
          for (int j = 0; j < length; j++) { }
        }
        Takes 62 ms
  • 98. Method calls generate some overhead in the object-oriented paradigm
  • 99. Use int as the loop counter
  • 100. Iterate over a loop with int
      int counters:
        for (int i = 0; i < length; i++) {
          for (int j = 0; j < length; j++) { }
        }
        Takes 62 ms
      short counters:
        for (short i = 0; i < length; i++) {
          for (short j = 0; j < length; j++) { }
        }
        Takes 125 ms
  • 101. The VM is optimized for int loop counters, not byte, short, or char
  • 102. Use System.arraycopy(…) to copy arrays instead of looping over them
  • 103. System.arraycopy(…)
      Copying in a loop:
        for (int i = 0; i < length; i++) {
          x[i] = y[i];
        }
        Takes 62 ms
      Copying with System.arraycopy (source first, then destination):
        System.arraycopy(y, 0, x, 0, length);
        Takes 16 ms
  • 104. System.arraycopy() is a native method, so it is efficient to use
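Because the argument order of `System.arraycopy` is easy to get backwards (source, source position, destination, destination position, length), a side-by-side sketch may help. The class and method names are illustrative.

```java
import java.util.Arrays;

// Loop copy versus System.arraycopy, producing identical results (illustrative sketch).
class ArrayCopySketch {

    // Element-by-element copy.
    static int[] copyWithLoop(int[] src) {
        int[] dest = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dest[i] = src[i];
        }
        return dest;
    }

    // Bulk copy: arraycopy(src, srcPos, dest, destPos, length).
    static int[] copyWithArraycopy(int[] src) {
        int[] dest = new int[src.length];
        System.arraycopy(src, 0, dest, 0, src.length);
        return dest;
    }

    public static void main(String[] args) {
        int[] data = {3, 1, 4, 1, 5, 9};
        System.out.println(Arrays.equals(copyWithLoop(data), copyWithArraycopy(data))); // true
    }
}
```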
  • 105. Terminate loops by comparing against a constant, not a method result or variable
  • 106. Terminate Loop by Primitive
      Counting up to countArr.length:
        for (int i = 0; i < countArr.length; i++) {
          for (int j = 0; j < countArr.length; j++) { }
        }
        Takes 424 ms
      Counting down to 0:
        for (int i = countArr.length - 1; i >= 0; i--) {
          for (int j = countArr.length - 1; j >= 0; j--) { }
        }
        Takes 298 ms
  • 107. Comparing against a constant is more efficient than comparing against a method result or variable
  • 108. On average, switch and if-else take about the same time when the cases occur at random
  • 109. Switch vs. If-else
      If-else:
        for (int i = 0; i < loop; i++) {
          if (i%10 == 0) {
          } else if (i%10 == 1) {
          ...
          } else if (i%10 == 8) {
          } else if (i%10 == 9) {
          }
        }
        Takes 2623 ms
      Switch:
        for (int i = 0; i < loop; i++) {
          switch (i%10) {
            case 0: break;
            case 1: break;
            ...
            case 7: break;
            case 8: break;
            default: break;
          }
        }
        Takes 2608 ms
  • 110. Switch is quite fast when the case falls in the middle, but slower than if-else when it falls at the beginning or in the default case (tested against a contiguous range of case values, e.g. 1, 2, 3, 4, …)
  • 111. Recursive Algorithm: A recursive function is easy to read, but each recursive call has a cost
  • 112. Tail Recursion: A recursive function in which the recursive call is the last operation, so each call is a reduction of the original call that carries the partial result forward
  • 113. Recursive vs. Tail-Recursive
      Recursive:
        public static long factorial1(int n) {
          if (n < 2) return 1L;
          else return n*factorial1(n-1);
        }
        Takes 172 ms
      Tail-recursive:
        public static long factorial1a(int n) {
          if (n < 2) return 1L;
          else return factorial1b(n, 1L);
        }
        public static long factorial1b(int n, long result) {
          if (n == 2) return 2L*result;
          else return factorial1b(n-1, result*n);
        }
        Takes 125 ms
  • 114. Dynamic-Cached Recursion: Cache results to gain more speed
  • 115. Dynamic-Cached Recursive
      Recursive:
        public static long factorial1(int n) {
          if (n < 2) return 1L;
          else return n*factorial1(n-1);
        }
        Takes 172 ms
      Dynamic-cached:
        public static final int CACHE_SIZE = 15;
        public static final long[] factorial3Cache = new long[CACHE_SIZE];
        public static long factorial3(int n) {
          if (n < 2) return 1L;
          else if (n < CACHE_SIZE) {
            if (factorial3Cache[n] == 0)
              factorial3Cache[n] = n*factorial3(n-1);
            return factorial3Cache[n];
          }
          else return n*factorial3(n-1);
        }
        Takes 94 ms
  • 116. Recursion Summary: Dynamic-cached recursion is better than tail recursion, which is better than plain recursion
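A tail-recursive function carries its partial result in a parameter, which is why it maps directly onto a loop. The sketch below puts the three shapes side by side; the names are illustrative, and the loop version is an addition for comparison rather than something the slides measure. Java compilers are not required to turn tail calls into loops, so writing the loop by hand is the only way to be sure the call overhead is gone.

```java
// Plain recursion, tail recursion with an accumulator, and the equivalent loop.
class FactorialSketch {

    // The multiplication happens after the recursive call returns.
    static long factorialRecursive(int n) {
        return n < 2 ? 1L : n * factorialRecursive(n - 1);
    }

    // The recursive call is the last operation; acc carries the partial product.
    static long factorialTail(int n, long acc) {
        return n < 2 ? acc : factorialTail(n - 1, acc * n);
    }

    // The same accumulator logic written as a loop, with no call per step.
    static long factorialLoop(int n) {
        long acc = 1L;
        while (n >= 2) {
            acc *= n;
            n--;
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(factorialRecursive(10)); // 3628800
        System.out.println(factorialTail(10, 1L));  // 3628800
        System.out.println(factorialLoop(10));      // 3628800
    }
}
```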
  • 117. 5th Rule: Use the Appropriate Collection
  • 118. Access: ArrayList vs. LinkedList
  • 119. Random Access (the slides omit the code that populates the lists)
      ArrayList:
        ArrayList al = new ArrayList();
        for (int i = 0; i < loop; i++) {
          al.get(i);
        }
        Takes 281 ms
      LinkedList:
        LinkedList ll = new LinkedList();
        for (int i = 0; i < loop; i++) {
          ll.get(i);
        }
        Takes 5828 ms
  • 120. Sequential Access
      ArrayList:
        ArrayList al = new ArrayList();
        for (Iterator i = al.iterator(); i.hasNext();) {
          i.next();
        }
        Takes 1375 ms
      LinkedList:
        LinkedList ll = new LinkedList();
        for (Iterator i = ll.iterator(); i.hasNext();) {
          i.next();
        }
        Takes 1047 ms
  • 121. ArrayList is good for random access; LinkedList is good for sequential access
  • 122. Random vs. Sequential Access
      ArrayList, random access:
        ArrayList al = new ArrayList();
        for (int i = 0; i < loop; i++) {
          al.get(i);
        }
        Takes 281 ms
      LinkedList, sequential access:
        LinkedList ll = new LinkedList();
        for (Iterator i = ll.iterator(); i.hasNext();) {
          i.next();
        }
        Takes 1047 ms
  • 123. Random access (ArrayList.get) is faster than sequential access (Iterator)
  • 124. Insertion: ArrayList vs. LinkedList
  • 125. Insertion at index zero
      ArrayList:
        ArrayList al = new ArrayList();
        for (int i = 0; i < loop; i++) {
          al.add(0, Integer.valueOf(i));
        }
        Takes 328 ms
      LinkedList:
        LinkedList ll = new LinkedList();
        for (int i = 0; i < loop; i++) {
          ll.add(0, Integer.valueOf(i));
        }
        Takes 109 ms
  • 126. LinkedList handles insertion at index zero better than ArrayList
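The access tests above only make sense on populated lists, so here is a sketch that fills a list first and then reads it back both ways. The helper names and the list size are illustrative assumptions, and the printed timings are single-shot and machine-dependent.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Index-based versus iterator-based traversal of a populated list (illustrative sketch).
class ListAccessSketch {

    // Appends 0..n-1 to the given list and returns it.
    static List<Integer> fill(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    // Random access: get(i) walks the chain on a LinkedList, jumps directly on an ArrayList.
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    // Sequential access: the for-each loop uses the list's Iterator.
    static long sumByIterator(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 20_000;
        List<Integer> arrayList = fill(new ArrayList<>(), n);
        List<Integer> linkedList = fill(new LinkedList<>(), n);

        long start = System.nanoTime();
        sumByIndex(arrayList);
        System.out.println("ArrayList get(i):  " + (System.nanoTime() - start) / 1000 + " us");

        start = System.nanoTime();
        sumByIndex(linkedList);
        System.out.println("LinkedList get(i): " + (System.nanoTime() - start) / 1000 + " us");

        start = System.nanoTime();
        sumByIterator(linkedList);
        System.out.println("LinkedList iterator: " + (System.nanoTime() - start) / 1000 + " us");
    }
}
```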
  • 127. Vector is like ArrayList, but it is the synchronized version
  • 128. Access and Insertion: Vector vs. ArrayList
  • 129. Random Access
      ArrayList:
        ArrayList al = new ArrayList();
        for (int i = 0; i < loop; i++) {
          al.get(i);
        }
        Takes 281 ms
      Vector:
        Vector vt = new Vector();
        for (int i = 0; i < loop; i++) {
          vt.get(i);
        }
        Takes 422 ms
  • 130. Sequential Access
      ArrayList:
        ArrayList al = new ArrayList();
        for (Iterator i = al.iterator(); i.hasNext();) {
          i.next();
        }
        Takes 1375 ms
      Vector:
        Vector vt = new Vector();
        for (Iterator i = vt.iterator(); i.hasNext();) {
          i.next();
        }
        Takes 1890 ms
  • 131. Insertion
      ArrayList:
        ArrayList al = new ArrayList();
        for (int i = 0; i < loop; i++) {
          al.add(0, Integer.valueOf(i));
        }
        Takes 328 ms
      Vector:
        Vector vt = new Vector();
        for (int i = 0; i < loop; i++) {
          vt.add(0, Integer.valueOf(i));
        }
        Takes 360 ms
  • 132. Vector is slower than ArrayList in every test. Use Vector only if synchronization is needed.
  • 133. Summary (times in ms)
        Type        Random (get)  Sequential (Iterator)  Insertion
        ArrayList   281           1375                   328
        LinkedList  5828          1047                   109
        Vector      422           1890                   360
  • 134. Addition and Access: Hashtable vs. HashMap
  • 135. Addition
      Hashtable:
        Hashtable ht = new Hashtable();
        for (int i = 0; i < loop; i++) {
          ht.put(Integer.valueOf(i), Integer.valueOf(i));
        }
        Takes 453 ms
      HashMap:
        HashMap hm = new HashMap();
        for (int i = 0; i < loop; i++) {
          hm.put(Integer.valueOf(i), Integer.valueOf(i));
        }
        Takes 328 ms
  • 136. Access
      Hashtable:
        Hashtable ht = new Hashtable();
        for (int i = 0; i < loop; i++) {
          ht.get(Integer.valueOf(i));
        }
        Takes 94 ms
      HashMap:
        HashMap hm = new HashMap();
        for (int i = 0; i < loop; i++) {
          hm.get(Integer.valueOf(i));
        }
        Takes 47 ms
  • 137. Hashtable is synchronized, so it is slower than HashMap
  • 138. Q&A
  • 139. References: O'Reilly, Java Performance Tuning, 2nd edition; http://www.javaperformancetuning.com; http://www.glenmccl.com/jperf/
  • 140. Future Topics: I/O, logging, and console output; sorting; threading; tweaking the JVM and GC strategy
  • 141. The End