
The bytecode mumbo-jumbo

Presentation about bytecode and what is going on at the JVM at 360|AnDev in Denver

Published in: Mobile

  1. 1. @rrafols the bytecode mumbo-jumbo #perfmatters
  2. 2. @rrafols Disclaimer: This presentation contains bytecode Content is my own experimentation and might differ on other environments
  3. 3. @rrafols Competitions Work 2015 Local winner 2016+ Ambassador Runner up Demoscene Speaker Author Entrepreneurship
  4. 4. @rrafols Our friend the java compiler.
  5. 5. @rrafols *.java → [javac] → *.class
  6. 6. @rrafols Or, for example, on Android
  7. 7. @rrafols *.java → [javac] → *.class *.class → [dx] → dex file
  8. 8. @rrafols *.java → [javac] → *.class *.class → [dx] → dex dex → [dexopt] → opt. dex dex → [dex2oat] → native
  9. 9. @rrafols but change is coming! Jack & Jill
  11. 11. @rrafols *.java → [jack] → dex file
  12. 12. @rrafols Let’s focus on javac Javac vs other compilers
  13. 13. @rrafols Compilers Produces optimized code for the target platform
  14. 14. @rrafols javac Does not produce optimized code*
  15. 15. @rrafols javac Does not know on which architecture the code will be executed
  16. 16. @rrafols Source: Oracle
  17. 17. @rrafols For this reason Java bytecode & operations are stack based
  18. 18. @rrafols Easy to interpret But not the most performant solution
  19. 19. @rrafols Quick example Stack based integer addition
  20. 20. @rrafols j = j + i
  21. 21. @rrafols Java bytecode
  22. 22. @rrafols iload_3 iload_2 iadd istore_2
  23. 23. @rrafols Register based approach
  24. 24. @rrafols add r01, r02, r01 or add eax, ebx
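To make the stack-based addition above concrete, here is a minimal sketch of a toy interpreter that executes the same four operations. The class name `StackAddSketch`, the textual instruction format, and the slot layout (slot 2 = j, slot 3 = i) are assumptions made for illustration only; a real JVM does not work on strings like this.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical toy interpreter for illustration; not how a real JVM is implemented. */
final class StackAddSketch {

    /** Runs a tiny program made of iload/iadd/istore against the given local variable slots. */
    static void run(String[] program, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String op : program) {
            String[] parts = op.split(" ");
            switch (parts[0]) {
                case "iload":   // push a local variable onto the operand stack
                    stack.push(locals[Integer.parseInt(parts[1])]);
                    break;
                case "iadd":    // pop two ints, push their sum
                    stack.push(stack.pop() + stack.pop());
                    break;
                case "istore":  // pop the top of the stack into a local variable
                    locals[Integer.parseInt(parts[1])] = stack.pop();
                    break;
                default:
                    throw new IllegalArgumentException("unknown op: " + op);
            }
        }
    }

    public static void main(String[] args) {
        int[] locals = {0, 0, 5, 7};   // assumed layout: slot 2 = j, slot 3 = i
        run(new String[] {"iload 3", "iload 2", "iadd", "istore 2"}, locals);
        System.out.println(locals[2]); // prints 12
    }
}
```

Every operand travels through the stack, which is why four instructions are needed where a register machine can use a single `add`.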
  25. 25. @rrafols Let’s make things interesting… j = j + i + k + w + h * 2 + p * p;
  26. 26. @rrafols Java bytecode
  27. 27. @rrafols j = j + i + k + w + h * 2 + p * p;
      Local vars: 1: i, 2: j, 3: k, 4: w, 5: h, 6: p
       0: iload_2
       1: iload_1
       2: iadd
       3: iload_3
       4: iadd
       5: iload 4
       7: iadd
       8: iload 5
      10: iconst_2
      11: imul
      12: iadd
      13: iload 6
      15: iload 6
      17: imul
      18: iadd
      19: istore_2
  34. 34. @rrafols Register based approach
  35. 35. @rrafols j = j + i + k + w + h * 2 + p * p;
      Registers: r01: i, r02: j, r03: k, r04: w, r05: h, r06: p
      add r01, r02, r01
      add r01, r03, r01
      add r01, r04, r01
      mul r07, r05, #2
      add r01, r07, r01
      mul r08, r06, r06
      add r02, r08, r01
  36. 36. @rrafols Java VM (JVM) Only the JVM knows the architecture it is running on. In this case, for example, we used up to 8 registers
  37. 37. @rrafols Java VM (JVM) All optimizations are left to be done by the JVM
  38. 38. @rrafols Maybe it takes this concept a bit too far...
  39. 39. @rrafols Imagine this simple C code
      #include <stdio.h>
      int main() {
          int a = 10;
          int b = 1 + 2 + 3 + 4 + 5 + 6 + a;
          printf("%d\n", b);
      }
  40. 40. @rrafols GCC compiler (same C code as slide 39)
      …
      movl $31, %esi
      call _printf
      …
      * Using gcc & -O2 compiler option
  41. 41. @rrafols javac
      public static void main(String args[]) {
          int a = 10;
          int b = 1 + 2 + 3 + 4 + 5 + 6 + a;
          System.out.println(b);
      }
       0: bipush 10
       2: istore_1
       3: bipush 21
       5: iload_1
       6: iadd
       7: istore_2
      ...
  42. 42. @rrafols Let's do a small change
      #include <stdio.h>
      int main() {
          int a = 10;
          int b = 1 + 2 + 3 + 4 + 5 + a + 6;
          printf("%d\n", b);
      }
  43. 43. @rrafols GCC compiler (same C code as slide 42)
      …
      movl $31, %esi
      call _printf
      …
      * Using gcc & -O2 compiler option
  44. 44. @rrafols javac
      public static void main(String args[]) {
          int a = 10;
          int b = 1 + 2 + 3 + 4 + 5 + a + 6;
          System.out.println(b);
      }
       0: bipush 10
       2: istore_1
       3: bipush 15
       5: iload_1
       6: iadd
       7: bipush 6
       9: iadd
      10: istore_2
  45. 45. @rrafols Let's do another quick change..
      public static void main(String args[]) {
          int a = 10;
          int b = a + 1 + 2 + 3 + 4 + 5 + 6;
          System.out.println(b);
      }
  46. 46. @rrafols javac (same Java code as slide 45)
       0: bipush 10
       2: istore_1
       3: iload_1
       4: iconst_1
       5: iadd
       6: iconst_2
       7: iadd
       8: iconst_3
       9: iadd
      10: iconst_4
      11: iadd
      12: iconst_5
      13: iadd
      14: bipush 6
      16: iadd
      17: istore_2
  47. 47. @rrafols On Android there is jack to the rescue...
  48. 48. @rrafols jack
      public static void main(String args[]) {
          int a = 10;
          int b = a + 1 + 2 + 3 + 4 + 5 + 6;
          System.out.println(b);
      }
      ...
      0: const/16 v0, #int 31
      2: sget-object v1, Ljava/lang/System;
      4: invoke-virtual {v1, v0}
      7: return-void
      ...
  49. 49. @rrafols And on Java there is the JIT compiler to the rescue
  50. 50. @rrafols JIT assembly output
      public static void main(String args[]) {
          int a = 10;
          int b = a + 1 + 2 + 3 + 4 + 5 + 6;
          System.out.println(b);
      }
      Assembly:
      ...
      0x00000001104b2bff: mov eax, 0x0001f
      ...
      Bytecode:
       0: bipush 10
       2: istore_1
       3: iload_1
       4: iconst_1
       5: iadd
       6: iconst_2
       7: iadd
       8: iconst_3
       9: iadd
      10: iconst_4
      11: iadd
      12: iconst_5
      13: iadd
      14: bipush 6
      16: iadd
      17: istore_2
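The three javac listings above differ only in where `a` sits in the expression. Because `+` groups left to right, javac can fold only the literals that form a constant expression on their own. The following is a minimal sketch of source-level ways to keep the expression foldable; the class and method names are hypothetical, and the `final` local is an addition not shown in the slides.

```java
/** Hypothetical illustration of how expression shape affects javac constant folding. */
final class ConstantFoldingSketch {

    // Leading literals form a constant expression: javac should emit 21 + a.
    static int literalsFirst(int a) {
        return 1 + 2 + 3 + 4 + 5 + 6 + a;
    }

    // Evaluated left to right as ((a + 1) + 2) ...: no literal-only subexpression to fold.
    static int variableFirst(int a) {
        return a + 1 + 2 + 3 + 4 + 5 + 6;
    }

    // The parenthesised group is a constant expression again, so it can be folded to 21.
    static int grouped(int a) {
        return a + (1 + 2 + 3 + 4 + 5 + 6);
    }

    // A final local initialised with a constant is a constant variable,
    // so the whole expression becomes a compile-time constant.
    static int allConstant() {
        final int a = 10;
        return a + 1 + 2 + 3 + 4 + 5 + 6;
    }

    public static void main(String[] args) {
        System.out.println(literalsFirst(10)); // prints 31
        System.out.println(variableFirst(10)); // prints 31
        System.out.println(grouped(10));       // prints 31
        System.out.println(allConstant());     // prints 31
    }
}
```

All four methods return the same value; only the bytecode javac produces differs, which `javap -c` can confirm on a given JDK.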
  51. 51. @rrafols Language additions Things to consider
  52. 52. @rrafols Autoboxing Transparent to the developer but compiler adds some 'extra' code
  53. 53. @rrafols Autoboxing
      long total = 0;
      for(int i = 0; i < N; i++) {
          total += i;
      }
       0: lconst_0
       1: lstore_1
       2: iconst_0
       3: istore_3
       4: iload_3
       5: ldc #8 // N
       7: if_icmpge 21
      10: lload_1
      11: iload_3
      12: i2l
      13: ladd
      14: lstore_1
      15: iinc 3, 1
      18: goto 4
  60. 60. @rrafols Autoboxing
      Long total = 0L;
      for(Integer i = 0; i < N; i++) {
          total += i;
      }
  61. 61. @rrafols Autoboxing (bytecode of the loop on slide 60)
      00: lconst_0
      01: invokestatic #7 // java/lang/Long.valueOf:(J)Ljava/la
      04: astore_1
      05: iconst_0
      06: invokestatic #8 // java/lang/Integer.valueOf:(I)Ljava/
      09: astore_2
      10: aload_2
      11: invokevirtual #9 // java/lang/Integer.intValue:()I
      14: sipush N
      17: if_icmpge 54
      20: aload_1
      21: invokevirtual #10 // java/lang/Long.longValue:()J
      24: aload_2
      25: invokevirtual #9 // java/lang/Integer.intValue:()I
      28: i2l
      29: ladd
      30: invokestatic #7 // java/lang/Long.valueOf:(J)Ljava/la
      33: astore_1
      34: aload_2
      35: astore_3
      36: aload_2
      37: invokevirtual #9 // java/lang/Integer.intValue:()I
      40: iconst_1
      41: iadd
      42: invokestatic #8 // java/lang/Integer.valueOf:(I)Ljava/
      45: dup
      46: astore_2
      47: astore 4
      49: aload_3
      50: pop
      51: goto 10
  69. 69. @rrafols Autoboxing This is what that code is actually doing:
      Long total = Long.valueOf(0);
      for(Integer i = Integer.valueOf(0); i.intValue() < N; i = Integer.valueOf(i.intValue() + 1)) {
          total = Long.valueOf(total.longValue() + (long)i.intValue());
      }
  70. 70. @rrafols Autoboxing Object creation: each Long.valueOf(...) and Integer.valueOf(...) call in the code on slide 69 may create a new object
  71. 71. @rrafols Autoboxing What about Jack?
  72. 72. @rrafols Autoboxing Jack does not help in this situation
  73. 73. @rrafols Autoboxing What about the JIT compiler?
  74. 74. @rrafols Autoboxing Let's run that loop N times (on my desktop computer) N = 10.000.000.000
  75. 75. @rrafols Autoboxing
  76. 76. @rrafols Autoboxing Let’s try it on Android Dalvik VM & ART
  77. 77. @rrafols Autoboxing
  78. 78. @rrafols Language Additions Use them wisely!
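As a minimal sketch of the two autoboxing variants discussed above, the following puts the primitive loop and the boxed loop side by side. The class and method names are hypothetical, and `n` stands in for the slides' `N`. It only shows that both forms compute the same result; it makes no claim about timings, which depend on the runtime.

```java
/** Hypothetical side-by-side of the primitive and the boxed loop. */
final class BoxingSketch {

    // Primitive version: the loop body is a plain long addition.
    static long sumPrimitive(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Boxed version: the compiler inserts valueOf/intValue/longValue calls,
    // so each iteration may allocate new Integer and Long objects.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (Integer i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumPrimitive(1000)); // prints 499500
        System.out.println(sumBoxed(1000));     // prints 499500
    }
}
```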
  79. 79. @rrafols Sorting No bytecode mumbo-jumbo here
  80. 80. @rrafols Let's sort some numbers… Arrays.sort(...)
  81. 81. @rrafols Difference between sorting primitive types & objects
  82. 82. @rrafols Using int & Integer
  83. 83. @rrafols Sorting objects is a stable sort Default java algorithm: TimSort adaptation
  84. 84. @rrafols Sorting primitives does not need to be a stable sort Default java algorithm: Dual-Pivot quicksort
  85. 85. @rrafols Sorting Use primitive types as much as possible
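The two `Arrays.sort` paths mentioned above can be exercised with a small sketch like the following. The class and method names are hypothetical; the copies are made only so the inputs stay untouched.

```java
import java.util.Arrays;

/** Hypothetical example calling both Arrays.sort overloads. */
final class SortSketch {

    // Arrays.sort(int[]) sorts primitives in place.
    static int[] sortedPrimitives(int[] values) {
        int[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Arrays.sort(Object[]) sorts by natural ordering and is stable.
    static Integer[] sortedBoxed(Integer[] values) {
        Integer[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortedPrimitives(new int[] {5, 3, 8, 1, 2})));
        System.out.println(Arrays.toString(sortedBoxed(new Integer[] {5, 3, 8, 1, 2})));
        // both lines print [1, 2, 3, 5, 8]
    }
}
```

The results are identical; the difference lies in the algorithm used and in the boxing overhead of `Integer[]`.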
  86. 86. @rrafols Loops What is going on behind the scenes
  87. 87. @rrafols Loops - List
      ArrayList<Integer> list = new …
      static long loopStandardList() {
          long result = 0;
          for(int i = 0; i < list.size(); i++) {
              result += list.get(i);
          }
          return result;
      }
  88. 88. @rrafols Loops - List (bytecode of the loop on slide 87)
      07: lload_0
      08: getstatic list
      11: iload_2
      12: invokevirtual java/util/ArrayList.get
      15: checkcast java/lang/Integer
      18: invokevirtual java/lang/Integer.intValue
      21: i2l
      22: ladd
      23: lstore_0
      24: iinc 2, 1
      27: iload_2
      28: getstatic list
      31: invokevirtual java/util/ArrayList.size
      34: if_icmplt 7
  92. 92. @rrafols Loops - foreach
      ArrayList<Integer> list = new …
      static long loopForeachList() {
          long result = 0;
          for(int v : list) {
              result += v;
          }
          return result;
      }
  93. 93. @rrafols Loops - foreach (bytecode of the loop on slide 92)
      12: aload_3
      13: invokeinterface java/util/Iterator.next
      18: checkcast java/lang/Integer
      21: invokevirtual java/lang/Integer.intValue
      24: istore_2
      25: lload_0
      26: iload_2
      27: i2l
      28: ladd
      29: lstore_0
      30: aload_3
      31: invokeinterface java/util/Iterator.hasNext
      36: ifne 12
  96. 96. @rrafols Loops - Array
      static int[] array = new ...
      static long loopStandardArray() {
          long result = 0;
          for(int i = 0; i < array.length; i++) {
              result += array[i];
          }
          return result;
      }
  97. 97. @rrafols Loops - Array (bytecode of the loop on slide 96)
      07: lload_0
      08: getstatic array
      11: iload_2
      12: iaload
      13: i2l
      14: ladd
      15: lstore_0
      16: iinc 2, 1
      19: iload_2
      20: getstatic array
      23: arraylength
      24: if_icmplt 7
  100. 100. @rrafols Loops - size cached
       static int[] array = new ...
       static long loopStandardArray () {
           long result = 0;
           int length = array.length;
           for(int i = 0; i < length; i++) {
               result += array[i];
           }
           return result;
       }
  101. 101. @rrafols Loops - size cached (bytecode of the loop on slide 100)
       12: lload_0
       13: getstatic array
       16: iload_3
       17: iaload
       18: i2l
       19: ladd
       20: lstore_0
       21: iinc 3, 1
       24: iload_3
       25: iload_2
       26: if_icmplt 12
  103. 103. @rrafols Loops - backwards
       static int[] array = new ...
       static long loopStandardArray () {
           long result = 0;
           for(int i = array.length - 1; i >= 0; i--) {
               result += array[i];
           }
           return result;
       }
  104. 104. @rrafols Loops - backwards (bytecode of the loop on slide 103)
       12: lload_0
       13: getstatic array
       16: iload_2
       17: iaload
       18: i2l
       19: ladd
       20: lstore_0
       21: iinc 2, -1
       24: iload_2
       25: ifge 12
  108. 108. @rrafols Loops – foreach II
       static long loopForeachArray(int[] array) {
           long result = 0;
           for(int v : array) {
               result += v;
           }
           return result;
       }
  109. 109. @rrafols Loops – foreach II (slides 109 to 117 step through the same listing)
       00: lconst_0
       01: lstore_1
       02: aload_0
       03: dup
       04: astore 6
       06: arraylength
       07: istore 5
       09: iconst_0
       10: istore 4
       12: goto 29
       15: aload 6
       17: iload 4
       19: iaload
       20: istore_3
       21: lload_1
       22: iload_3
       23: i2l
       24: ladd
       25: lstore_1
       26: iinc 4, 1
       29: iload 4
       31: iload 5
       33: if_icmplt 15
       Local variable slots, step by step (slot 0 holds array throughout; slots 2 and 7 stay "-"):
       after 01: lstore_1    slot 1 = 0 (result)
       after 04: astore 6    slot 6 = array
       after 07: istore 5    slot 5 = array.length
       after 10: istore 4    slot 4 = 0 (loop index)
       after 20: istore_3    slot 3 = array[index]
       after 25: lstore_1    slot 1 = 0 + array[index]
       after 26: iinc 4, 1   slot 4 = 0 (loop index) + 1
  120. 120. @rrafols Loops Use arrays instead of lists
  121. 121. @rrafols Loops When using lists, avoid foreach or iterator constructions if performance is a requirement
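The loop variants walked through above can be put next to each other in one small sketch. The class name, the method names, and the `toIntArray` helper are hypothetical; the comments summarise what the bytecode listings in the slides show for each form.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical side-by-side of the loop styles discussed in the slides. */
final class LoopSketch {

    // Indexed list loop: ArrayList.get, checkcast and Integer.intValue on every step.
    static long sumIndexed(ArrayList<Integer> list) {
        long result = 0;
        for (int i = 0; i < list.size(); i++) {
            result += list.get(i);
        }
        return result;
    }

    // foreach over a list: Iterator.hasNext/next calls plus unboxing on every step.
    static long sumForeach(List<Integer> list) {
        long result = 0;
        for (int v : list) {
            result += v;
        }
        return result;
    }

    // Primitive array with the length cached in a local: no method calls in the loop body.
    static long sumArray(int[] array) {
        long result = 0;
        int length = array.length;
        for (int i = 0; i < length; i++) {
            result += array[i];
        }
        return result;
    }

    // Helper (an assumption, not from the slides): unbox a list once into an int[].
    static int[] toIntArray(List<Integer> list) {
        int[] out = new int[list.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = list.get(i);
        }
        return out;
    }

    public static void main(String[] args) {
        ArrayList<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
        System.out.println(sumIndexed(list));           // prints 15
        System.out.println(sumForeach(list));           // prints 15
        System.out.println(sumArray(toIntArray(list))); // prints 15
    }
}
```

If the data is traversed many times, converting once to `int[]` keeps the per-iteration work to a plain array load.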
  122. 122. @rrafols Manual bytecode optimization Worth it?
  123. 123. @rrafols Manual bytecode optimization
       foreach loop:
        0: lconst_0
        1: lstore_1
        2: aload_0
        3: dup
        4: astore 6
        6: arraylength
        7: istore 5
        9: iconst_0
       10: istore 4
       12: goto 29
       15: aload 6
       17: iload 4
       19: iaload
       20: istore_3
       21: lload_1
       22: iload_3
       23: i2l
       24: ladd
       25: lstore_1
       26: iinc 4, 1
       29: iload 4
       31: iload 5
       33: if_icmplt 15
       Manual bytecode optimization:
        0: aload_0
        1: arraylength
        2: istore_3
        3: iconst_0
        4: istore_1
        5: lconst_0
        6: goto 17
        9: aload_0
       10: iload_1
       11: iaload
       12: i2l
       13: ladd
       14: iinc 1, 1
       17: iload_1
       18: iload_3
       19: if_icmplt 9
  124. 124. @rrafols Manual bytecode optimization (slides 124 to 130 step through the same listing)
        0: aload_0
        1: arraylength
        2: istore_3
        3: iconst_0
        4: istore_1
        5: lconst_0
        6: goto 17
        9: aload_0
       10: iload_1
       11: iaload
       12: i2l
       13: ladd
       14: iinc 1, 1
       17: iload_1
       18: iload_3
       19: if_icmplt 9
       Local variables and stack, step by step (local 0 holds array throughout; locals 2, 4 and 5 stay "-"):
       start                 stack: empty
       after 2: istore_3     local 3 = array.length
       after 4: istore_1     local 1 = 0 (index)
       after 5: lconst_0     stack: 0 (result)
       after 11: iaload      stack: 0 (result), array[index]
       after 13: ladd        stack: 0 + array[index]
       after 14: iinc 1, 1   local 1 = 0 (index) + 1
  132. 132. @rrafols Worth it? Only in very specific, rare, unique, special, peculiar cases. Too much effort involved.
  133. 133. @rrafols Calling a method Is there an overhead?
  134. 134. @rrafols Overhead of calling a method for(int i = 0; i < N; i++) { setVal(getVal() + 1); } for(int i = 0; i < N; i++) { val = val + 1; } vs
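The two loops compared above can be written out as a runnable sketch. The class `CallOverheadSketch` and its members are hypothetical stand-ins for the slide's `getVal`, `setVal` and `val`. The sketch only shows that both forms are functionally equivalent; whether the accessor calls cost anything measurable depends on the runtime and its JIT.

```java
/** Hypothetical holder used to compare accessor calls with direct field access. */
final class CallOverheadSketch {
    private int val;

    int getVal() { return val; }
    void setVal(int v) { val = v; }

    // Increments through the getter and setter: two method invocations per iteration in the bytecode.
    static int viaAccessors(int n) {
        CallOverheadSketch s = new CallOverheadSketch();
        for (int i = 0; i < n; i++) {
            s.setVal(s.getVal() + 1);
        }
        return s.getVal();
    }

    // Increments the field directly: a field read, an add and a field write per iteration.
    static int viaField(int n) {
        CallOverheadSketch s = new CallOverheadSketch();
        for (int i = 0; i < n; i++) {
            s.val = s.val + 1;
        }
        return s.val;
    }

    public static void main(String[] args) {
        System.out.println(viaAccessors(1000)); // prints 1000
        System.out.println(viaField(1000));     // prints 1000
    }
}
```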
  136. 136. @rrafols String concatenation The evil + sign
  137. 137. @rrafols String concatenation
       String str = "";
       for(int i = 0; i < N; i++) {
           str += OTHER_STR;
       }
  138. 138. @rrafols String concatenation (bytecode of the loop on slide 137)
        0: ldc String
        2: astore_1
        3: iconst_0
        4: istore_2
        5: iload_2
        6: sipush N
        9: if_icmpge 40
       12: new class java/lang/StringBuilder
       15: dup
       16: invokespecial java/lang/StringBuilder."<init>"
       19: aload_1
       20: invokevirtual java/lang/StringBuilder.append
       23: aload_0
       24: getfield OTHER_STR
       27: invokevirtual java/lang/StringBuilder.append
       30: invokevirtual java/lang/StringBuilder.toString
       33: astore_1
       34: iinc 2, 1
       37: goto 5
  144. 144. @rrafols String str = ""; for(int i = 0; i < N; i++) { StringBuilder sb = new StringBuilder(); sb.append(str); sb.append(OTHER_STR); str = sb.toString(); } String concatenation
  145. 145. @rrafols Object creation: String str = ""; for(int i = 0; i < N; i++) { StringBuilder sb = new StringBuilder(); sb.append(str); sb.append(OTHER_STR); str = sb.toString(); } String concatenation
  146. 146. @rrafols String concatenation alternatives
  147. 147. @rrafols String.concat() • Concat cost is O(N) + O(M) • Concat returns a new String Object. String str = ""; for(int i = 0; i < N; i++) { str = str.concat(OTHER_STR); }
  148. 148. @rrafols String.concat() Object creation: String str = ""; for(int i = 0; i < N; i++) { str = str.concat(OTHER_STR); }
  149. 149. @rrafols StringBuilder • StringBuilder.append cost is O(M) [M being the length of the appended String] StringBuilder sb = new StringBuilder(); for(int i = 0; i < N; i++) { sb.append(OTHER_STR); } str = sb.toString();
  150. 150. @rrafols StringBuilder sb = new StringBuilder(); for(int i = 0; i < N; i++) { sb.append(OTHER_STR); } str = sb.toString(); 0: ldc String 2: astore_1 3: new java/lang/StringBuilder 6: dup 7: invokespecial java/lang/StringBuilder."<init>" 10: astore_2 11: iconst_0 12: istore_3 13: iload_3 14: sipush N 17: if_icmpge 35 20: aload_2 21: aload_0 22: getfield OTHER_STR 25: invokevirtual java/lang/StringBuilder.append 28: pop 29: iinc 3, 1 32: goto 13
  151. 151. @rrafols 0: ldc String 2: astore_1 3: new java/lang/StringBuilder 6: dup 7: invokespecial java/lang/StringBuilder."<init>" 10: astore_2 11: iconst_0 12: istore_3 13: iload_3 14: sipush N 17: if_icmpge 35 20: aload_2 21: aload_0 22: getfield OTHER_STR 25: invokevirtual java/lang/StringBuilder.append 28: pop 29: iinc 3, 1 32: goto 13 StringBuilder sb = new StringBuilder(); for(int i = 0; i < N; i++) { sb.append(OTHER_STR); } str = sb.toString(); StringBuilder
  152. 152. @rrafols Object creation: StringBuilder sb = new StringBuilder(); for(int i = 0; i < N; i++) { sb.append(OTHER_STR); } str = sb.toString(); StringBuilder
  153. 153. @rrafols String concatenation Use StringBuilder (properly) as much as possible. StringBuffer is the thread-safe implementation.
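The three approaches from the preceding slides can be collected in one class to check that they produce the same string and to disassemble them with `javap -c`. This is a sketch under assumptions: the class name `ConcatVariants`, the method names, and the value of `OTHER_STR` are made up for illustration. The bytecode on the slides matches the `StringBuilder` expansion that javac emits for `+=` on older targets. Newer javac targets may compile `+=` to a different sequence, so the disassembly you see may not match the slides.

```java
// Illustrative sketch; names and the OTHER_STR value are assumptions.
class ConcatVariants {
    static final String OTHER_STR = "ab";

    // The "+" variant: per the slides, each iteration allocates a new
    // StringBuilder and a new String.
    static String withPlus(int n) {
        String str = "";
        for (int i = 0; i < n; i++) {
            str += OTHER_STR;
        }
        return str;
    }

    // String.concat(): each iteration allocates a new String.
    static String withConcat(int n) {
        String str = "";
        for (int i = 0; i < n; i++) {
            str = str.concat(OTHER_STR);
        }
        return str;
    }

    // A single StringBuilder reused across the loop, converted once at the end.
    static String withBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append(OTHER_STR);
        }
        return sb.toString();
    }
}
```

All three return the same value; only the number of intermediate objects differs.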
  154. 154. @rrafols Strings in case statements
  155. 155. @rrafols public void taskStateMachine(String status) { switch(status) { case "PENDING": System.out.println("Status pending"); break; case "EXECUTING": System.out.println("Status executing"); break; } }
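A switch on a String is typically compiled in two steps: a first switch on `status.hashCode()` whose cases confirm the match with `equals()` and record an index, followed by a second switch on that index. The sketch below approximates that shape in plain Java. It is an illustration, not javac's exact output: the class name `StringSwitchSketch` is hypothetical, the first step uses an if/else chain because `"PENDING".hashCode()` is not a compile-time constant (javac embeds the literal hash values in a `lookupswitch`), and the method returns the message instead of printing it so the result is easy to check.

```java
// Illustrative sketch of how a switch on strings is lowered; names are assumptions.
class StringSwitchSketch {
    static String taskStateMachine(String status) {
        // Step 1: narrow down by hash code, then confirm with equals(),
        // because different strings can share the same hash code.
        int index = -1;
        int hash = status.hashCode(); // throws NullPointerException for null, like a real switch
        if (hash == "PENDING".hashCode()) {
            if (status.equals("PENDING")) {
                index = 0;
            }
        } else if (hash == "EXECUTING".hashCode()) {
            if (status.equals("EXECUTING")) {
                index = 1;
            }
        }

        // Step 2: an ordinary integer switch over the recorded index.
        switch (index) {
            case 0:
                return "Status pending";
            case 1:
                return "Status executing";
            default:
                return null; // no case matched
        }
    }
}
```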
  156. 156. @rrafols
  157. 157. @rrafols
  158. 158. @rrafols Tooling
  159. 159. @rrafols Tooling - Disassembler Java: • javap -c <classfile> Android: • dexdump -d <dexfile>
  160. 160. @rrafols Tooling - Assembler Krakatau https://github.com/Storyyeller/Krakatau
  161. 161. @rrafols Tooling – Disassembler - ART adb pull /data/dalvik-cache/arm/data@app@<package>-1@base.apk@classes.dex gobjdump -D <file>
  162. 162. @rrafols Tooling – Disassembler - ART adb shell oatdump --oat-file=/data/dalvik-cache/arm/data@app@<package>-1@base.apk@classes.dex
  163. 163. @rrafols Tooling – PrintAssembly - JIT -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:CompileCommand=print,com.raimon.test.Test::method Under the Hood of the JVM: From Bytecode Through the JIT to Assembly http://alblue.bandlem.com/2016/09/javaone-hotspot.html
  164. 164. @rrafols Always measure example - yuv2rgb optimization
  165. 165. @rrafols Source: Wikipedia
  166. 166. @rrafols
  167. 167. @rrafols Slightly optimized version precalc tables, fixed point operations, 2 pixels per loop…
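To make "precalc tables, fixed point operations" concrete, here is a minimal sketch of a single-pixel conversion in both styles. It is not the code from the talk: the class name `Yuv2RgbSketch`, the table names, the 10-bit fixed-point shift, and the coefficients (the commonly quoted full-range values 1.402, 0.344, 0.714 and 1.772) are all assumptions, and the "2 pixels per loop" part is left out. Inputs are assumed to be in the range 0..255.

```java
// Illustrative sketch; coefficients, shift and names are assumptions, not the talk's code.
class Yuv2RgbSketch {
    static final int SHIFT = 10; // fixed-point precision: values are scaled by 1024

    // Precalculated, pre-scaled contributions of U and V to each channel.
    static final int[] R_FROM_V = new int[256];
    static final int[] G_FROM_U = new int[256];
    static final int[] G_FROM_V = new int[256];
    static final int[] B_FROM_U = new int[256];

    static {
        for (int i = 0; i < 256; i++) {
            int c = i - 128; // chroma samples are centered on 128
            R_FROM_V[i] = (int) Math.round(1.402 * c * (1 << SHIFT));
            G_FROM_U[i] = (int) Math.round(0.344 * c * (1 << SHIFT));
            G_FROM_V[i] = (int) Math.round(0.714 * c * (1 << SHIFT));
            B_FROM_U[i] = (int) Math.round(1.772 * c * (1 << SHIFT));
        }
    }

    static int clamp(int x) {
        return x < 0 ? 0 : (x > 255 ? 255 : x);
    }

    static int pack(int r, int g, int b) {
        return 0xFF000000 | (r << 16) | (g << 8) | b; // opaque ARGB
    }

    // Straightforward floating-point version.
    static int toRgbFloat(int y, int u, int v) {
        float cb = u - 128;
        float cr = v - 128;
        int r = (int) (y + 1.402f * cr);
        int g = (int) (y - 0.344f * cb - 0.714f * cr);
        int b = (int) (y + 1.772f * cb);
        return pack(clamp(r), clamp(g), clamp(b));
    }

    // Table lookups plus integer shifts; no floating-point math per pixel.
    static int toRgbFixed(int y, int u, int v) {
        int r = y + (R_FROM_V[v] >> SHIFT);
        int g = y - ((G_FROM_U[u] + G_FROM_V[v]) >> SHIFT);
        int b = y + (B_FROM_U[u] >> SHIFT);
        return pack(clamp(r), clamp(g), clamp(b));
    }
}
```

The two versions can differ by a small rounding error on some inputs, which is the usual trade-off of fixed-point arithmetic.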
  168. 168. @rrafols
  169. 169. @rrafols
  170. 170. @rrafols Let's compare: Normal, minified, minified with optimizations & jack. Minified = obfuscated using ProGuard
  171. 171. @rrafols [Bar chart: non-optimized vs optimized, for Normal, Minified, Minified & optimized, and Jack builds; axis from 0 to 20000]
  172. 172. @rrafols Performance measurements Avoid doing multiple tests in one run. JIT might be evil!
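One way to follow this advice is to measure a single workload per process, after a few warm-up runs so that a JIT has had a chance to compile the hot code. The sketch below is a hedged illustration, not the measurement code behind the chart: the class name `OneTestPerRun`, the helper `bestOfNanos`, the `sum` workload, and the run counts are all assumptions. A dedicated harness such as JMH handles these concerns more rigorously.

```java
// Illustrative sketch; names, workload and run counts are assumptions.
class OneTestPerRun {
    // Written to by the workload so its result is not discarded as dead code.
    static volatile long sink;

    // Hypothetical workload: sum an int array into a long.
    static long sum(int[] array) {
        long result = 0;
        for (int i = 0; i < array.length; i++) {
            result += array[i];
        }
        return result;
    }

    // Runs the task a few times untimed, then returns the best timed run in nanoseconds.
    static long bestOfNanos(Runnable task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            task.run();
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.run();
            long elapsed = System.nanoTime() - start;
            best = Math.min(best, elapsed);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        // Only one workload is measured in this process; start a new
        // process to measure a different variant.
        long best = bestOfNanos(() -> sink = sum(data), 20, 10);
        System.out.println("best run: " + best / 1_000 + " us");
    }
}
```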
  173. 173. @rrafols Thank you! http://blog.rafols.org @rrafols https://es.linkedin.com/in/raimonrafols
