Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
@rrafols
The bytecode
hocus-pocus
#perfmatters
@rrafols
Disclaimer
This presentation contains bytecode
Content is my own experimentation and
might differ on other enviro...
@rrafols
Competitions
Work
2015 Local winner
2016 Ambassador
Runner up
Demoscene
Speaker
Author Entrerpreneurship
About me
@rrafols
Our friend the java compiler
@rrafols
*.java → [javac] → *.class
@rrafols
Or, for example, on Android
@rrafols
*.java → [javac] → *.class
*.class → [dx] → dex file
@rrafols
*.java → [javac] → *.class
*.class → [dx] → dex
dex → [dexopt] → optimized dex
dex → [dex2oat] → optimized native
@rrafols
but change is coming!
Jack & Jill
@rrafols
*.java → [jack] → dex file
@rrafols
Let’s focus on javac
Javac vs other compilers
@rrafols
Compilers
Produces optimized code for
the target platform
@rrafols
Javac
Does not produce optimized code*
@rrafols
Javac
Does not know on which
architecture the code will be
executed
@rrafols
Source: Oracle
@rrafols
For this reason
Java bytecode & operations are
stack based
@rrafols
Easy to interpret
But not the most optimal solution
(regarding performance)
@rrafols
Quick example
Stack based integer addition
@rrafols
j = j + i
@rrafols
Java bytecode
@rrafols
iload_3
iload_2
iadd
istore_2
@rrafols
Register based approach
@rrafols
add r01, r02, r01
or
add eax, ebx
@rrafols
Let’s make things interesting…
j = j + i + k + w + h * 2 + p * p;
@rrafols
Java bytecode
@rrafols
0: iload_2
1: iload_1
2: iadd
3: iload_3
4: iadd
5: iload 4
7: iadd
8: iload 5
10: iconst_2
11: imul
12: iadd
13:...
@rrafols
0: iload_2
1: iload_1
2: iadd
3: iload_3
4: iadd
5: iload 4
7: iadd
8: iload 5
10: iconst_2
11: imul
12: iadd
13:...
@rrafols
0: iload_2
1: iload_1
2: iadd
3: iload_3
4: iadd
5: iload 4
7: iadd
8: iload 5
10: iconst_2
11: imul
12: iadd
13:...
@rrafols
0: iload_2
1: iload_1
2: iadd
3: iload_3
4: iadd
5: iload 4
7: iadd
8: iload 5
10: iconst_2
11: imul
12: iadd
13:...
@rrafols
0: iload_2
1: iload_1
2: iadd
3: iload_3
4: iadd
5: iload 4
7: iadd
8: iload 5
10: iconst_2
11: imul
12: iadd
13:...
@rrafols
0: iload_2
1: iload_1
2: iadd
3: iload_3
4: iadd
5: iload 4
7: iadd
8: iload 5
10: iconst_2
11: imul
12: iadd
13:...
@rrafols
0: iload_2
1: iload_1
2: iadd
3: iload_3
4: iadd
5: iload 4
7: iadd
8: iload 5
10: iconst_2
11: imul
12: iadd
13:...
@rrafols
Register based approach
@rrafols
add r01, r02, r01
add r01, r03, r01
add r01, r04, r01
mul r07, r05, #2
add r01, r07, r01
mul r08, r06, r06
add r0...
@rrafols
Java VM (JVM)
Only the JVM knows on which
architecture is running. In this
case, for example, we used up to
8 reg...
@rrafols
Java VM (JVM)
All optimizations are left to be
done by the JVM
@rrafols
Maybe takes this concept a bit too
far...
@rrafols
Imagine this simple C code
#include <stdio.h>
int main() {
int a = 10;
int b = 1 + 2 + 3 + 4 + 5 + 6 + a;
printf(...
@rrafols
GCC compiler
#include <stdio.h>
int main() {
int a = 10;
int b = 1 + 2 + 3 + 4 + 5 + 6 + a;
printf("%dn", b);
}
…...
@rrafols
javac
public static void main(String args[]) {
int a = 10;
int b = 1 + 2 + 3 + 4 + 5 + 6 + a;
System.out.println(...
@rrafols
Let's do a small change
#include <stdio.h>
int main() {
int a = 10;
int b = 1 + 2 + 3 + 4 + 5 + a + 6;
printf("%d...
@rrafols
GCC compiler
#include <stdio.h>
int main() {
int a = 10;
int b = 1 + 2 + 3 + 4 + 5 + a + 6;
printf("%dn", b);
}
…...
@rrafols
javac
public static void main(String args[]) {
int a = 10;
int b = 1 + 2 + 3 + 4 + 5 + a + 6;
System.out.println(...
@rrafols
Let's do another quick change..
public static void main(String args[]) {
int a = 10;
int b = a + 1 + 2 + 3 + 4 + ...
@rrafols
javac
public static void main(String args[]) {
int a = 10;
int b = a + 1 + 2 + 3 + 4 + 5 + 6;
System.out.println(...
@rrafols
On Android there is jack to the
rescue...
@rrafols
jack
public static void main(String args[]) {
int a = 10;
int b = a + 1 + 2 + 3 + 4 + 5 + 6;
System.out.println(b...
@rrafols
And on Java there is the JIT
compiler to the rescue
@rrafols
JIT assembly output
public static void main(String args[]) {
int a = 10;
int b = a + 1 + 2 + 3 + 4 + 5 + 6;
Syste...
@rrafols
Compilers
Produces optimised code for
target platform
@rrafols
Language additions
Thinks to consider
@rrafols
Autoboxing
Transparent to the developer but
compiler adds some 'extra' code
@rrafols
Autoboxing
long total = 0;
for(int i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: lstore_1
2: iconst_0
3: isto...
@rrafols
Autoboxing
long total = 0;
for(int i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: lstore_1
2: iconst_0
3: isto...
@rrafols
Autoboxing
long total = 0;
for(int i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: lstore_1
2: iconst_0
3: isto...
@rrafols
Autoboxing
long total = 0;
for(int i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: lstore_1
2: iconst_0
3: isto...
@rrafols
Autoboxing
long total = 0;
for(int i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: lstore_1
2: iconst_0
3: isto...
@rrafols
Autoboxing
long total = 0;
for(int i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: lstore_1
2: iconst_0
3: isto...
@rrafols
Autoboxing
long total = 0;
for(int i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: lstore_1
2: iconst_0
3: isto...
@rrafols
Autoboxing
Long total = 0;
for(Integer i = 0; i < N; i++) {
total += i;
}
@rrafols
Autoboxing
Long total = 0;
for(Integer i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: invokestatic #7 // Metho...
@rrafols
Autoboxing
Long total = 0;
for(Integer i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: invokestatic #7 // Metho...
@rrafols
Autoboxing
Long total = 0;
for(Integer i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: invokestatic #7 // Metho...
@rrafols
Autoboxing
Long total = 0;
for(Integer i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: invokestatic #7 // Metho...
@rrafols
Autoboxing
Long total = 0;
for(Integer i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: invokestatic #7 // Metho...
@rrafols
Autoboxing
Long total = 0;
for(Integer i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: invokestatic #7 // Metho...
@rrafols
Autoboxing
Long total = 0;
for(Integer i = 0; i < N; i++) {
total += i;
}
0: lconst_0
1: invokestatic #7 // Metho...
@rrafols
Autoboxing
Long total = 0;
for(Integer i = 0; i < N; i++) {
total += i;
}
// ?
0: lconst_0
1: invokestatic #7 // ...
@rrafols
Autoboxing
This is what that code is actually doing:
Long total = Long.valueOf(0);
for(Integer i = Integer.valueO...
@rrafols
Autoboxing
Object creation
Long total = Long.valueOf(0);
for(Integer i = Integer.valueOf(0);
i.intValue() < N;
i ...
@rrafols
Autoboxing
What about Jack?
@rrafols
Autoboxing
Jack does not help in this situation
@rrafols
Autoboxing
What about the JIT compiler?
@rrafols
Autoboxing
Let's run that loop
10.000.000.000 times
(on my desktop computer)
@rrafols
Autoboxing
@rrafols
Autoboxing
Let's run that loop 100.000.000
Times on two Nexus 5
KitKat (Dalvik VM) & Lollipop (ART)
@rrafols
Autoboxing
@rrafols
Language Additions
Use them wisely!
@rrafols
Sorting
No bytecode mumbo-jumbo here
@rrafols
Let's sort some numbers…
Arrays.sort(...)
@rrafols
Difference between sorting
primitive types & objects
@rrafols
Using int & Integer
@rrafols
Sorting objects is a stable sort
Default java algorithm: TimSort
adaptation
@rrafols
Sorting primitives does not require
to be stable sort
Default java algorithm:
Dual-Pivot quicksort
@rrafols
Sorting
Use primitive types as much as
possible
@rrafols
Loops
@rrafols
Loops
What is going on behind the
scenes
@rrafols
Loops - List
ArrayList<Integer> list = new …
static long loopStandardList() {
long result = 0;
for(int i = 0; i <...
@rrafols
Loops - List
ArrayList<Integer> list = new …
static long loopStandardList() {
long result = 0;
for(int i = 0; i <...
@rrafols
Loops - List
ArrayList<Integer> list = new …
static long loopStandardList() {
long result = 0;
for(int i = 0; i <...
@rrafols
Loops - List
ArrayList<Integer> list = new …
static long loopStandardList() {
long result = 0;
for(int i = 0; i <...
@rrafols
Loops - List
ArrayList<Integer> list = new …
static long loopStandardList() {
long result = 0;
for(int i = 0; i <...
@rrafols
Loops - foreach
ArrayList<Integer> list = new …
static long loopForeachList() {
long result = 0;
for(int v : list...
@rrafols
Loops - foreach
ArrayList<Integer> list = new …
static long loopForeachList() {
long result = 0;
for(int v : list...
@rrafols
Loops - foreach
ArrayList<Integer> list = new …
static long loopForeachList() {
long result = 0;
for(int v : list...
@rrafols
Loops - foreach
ArrayList<Integer> list = new …
static long loopForeachList() {
long result = 0;
for(int v : list...
@rrafols
Loops - Array
static int[] array = new ...
static long loopStandardArray() {
long result = 0;
for(int i = 0; i < ...
@rrafols
Loops - Array
static int[] array = new ...
static long loopStandardArray() {
long result = 0;
for(int i = 0; i < ...
@rrafols
Loops - Array
static int[] array = new ...
static long loopStandardArray() {
long result = 0;
for(int i = 0; i < ...
@rrafols
Loops - Array
static int[] array = new ...
static long loopStandardArray() {
long result = 0;
for(int i = 0; i < ...
@rrafols
Loops - size cached
static int[] array = new ...
static long loopStandardArray () {
long result = 0;
int length =...
@rrafols
Loops - size cached
static int[] array = new ...
static long loopStandardArray () {
long result = 0;
int length =...
@rrafols
Loops - size cached
static int[] array = new ...
static long loopStandardArray () {
long result = 0;
int length =...
@rrafols
Loops - backwards
static int[] array = new ...
static long loopStandardArray () {
long result = 0;
for(int i = ar...
@rrafols
Loops - backwards
static int[] array = new ...
static long loopStandardArray () {
long result = 0;
for(int i = ar...
@rrafols
Loops - backwards
static int[] array = new ...
static long loopStandardArray () {
long result = 0;
for(int i = ar...
@rrafols
@rrafols
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
Loops – foreach II
static long loopForeachArray(int[] array) {
long result = 0;
for(int v : array) {
result += v;...
@rrafols
@rrafols
@rrafols
Loops
Use arrays instead of lists
@rrafols
Loops
When using lists, avoid foreach or
iterator constructions if
performance is a requirement
@rrafols
Manual bytecode optimization
Worth it?
@rrafols
Loops – foreach II 0: lconst_0
1: lstore_1
2: aload_0
3: dup
4: astore 6
6: arraylength
7: istore 5
9: iconst_0
1...
@rrafols
Manual bytecode optimization
0: aload_0
1: arraylength
2: istore_3
3: iconst_0
4: istore_1
5: lconst_0
6: goto 17...
@rrafols
Manual bytecode optimization
0: aload_0
1: arraylength
2: istore_3
3: iconst_0
4: istore_1
5: lconst_0
6: goto 17...
@rrafols
Manual bytecode optimization
0: aload_0
1: arraylength
2: istore_3
3: iconst_0
4: istore_1
5: lconst_0
6: goto 17...
@rrafols
Manual bytecode optimization
0: aload_0
1: arraylength
2: istore_3
3: iconst_0
4: istore_1
5: lconst_0
6: goto 17...
@rrafols
Manual bytecode optimization
0: aload_0
1: arraylength
2: istore_3
3: iconst_0
4: istore_1
5: lconst_0
6: goto 17...
@rrafols
Manual bytecode optimization
0: aload_0
1: arraylength
2: istore_3
3: iconst_0
4: istore_1
5: lconst_0
6: goto 17...
@rrafols
Manual bytecode optimization
0: aload_0
1: arraylength
2: istore_3
3: iconst_0
4: istore_1
5: lconst_0
6: goto 17...
@rrafols *reduced data set
@rrafols
Worth it?
Only in very specific, unique,
peculiar cases.
Too much effort involved.
@rrafols
Calling a method
Is there an overhead?
@rrafols
Overhead of calling a method
for(int i = 0; i < N; i++) {
setVal(getVal() + 1);
}
for(int i = 0; i < N; i++) {
va...
@rrafols
@rrafols
String concatenation
The evil + sign
@rrafols
String concatenation
String str = "";
for(int i = 0; i < N; i++) {
str += OTHER_STR;
}
@rrafols
String concatenation
String str = "";
for(int i = 0; i < N; i++) {
str += OTHER_STR;
}
0: ldc #13 // String
2: as...
@rrafols
String concatenation
String str = "";
for(int i = 0; i < N; i++) {
str += OTHER_STR;
}
0: ldc #13 // String
2: as...
@rrafols
String concatenation
String str = "";
for(int i = 0; i < N; i++) {
str += OTHER_STR;
}
0: ldc #13 // String
2: as...
@rrafols
String concatenation
String str = "";
for(int i = 0; i < N; i++) {
str += OTHER_STR;
}
0: ldc #13 // String
2: as...
@rrafols
String concatenation
String str = "";
for(int i = 0; i < N; i++) {
str += OTHER_STR;
}
0: ldc #13 // String
2: as...
@rrafols
String concatenation
String str = "";
for(int i = 0; i < N; i++) {
str += OTHER_STR;
}
0: ldc #13 // String
2: as...
@rrafols
String concatenation - equivalent
String str = "";
for(int i = 0; i < N; i++) {
StringBuilder sb = new StringBuil...
@rrafols
String concatenation
Object creation:
String str = "";
for(int i = 0; i < N; i++) {
StringBuilder sb = new String...
@rrafols
String concatenation
alternatives
@rrafols
String.concat()
• Concat cost is O(N) + O(M)
• Concat returns a new String Object.
String str = "";
for(int i = 0...
@rrafols
String.concat()
Object creation:
String str = "";
for(int i = 0; i < N; i++) {
str = str.concat(OTHER_STR);
}
@rrafols
StringBuilder
• StringBuilder.append cost is O(M) [M being the length of
appended String]
StringBuilder sb = new ...
@rrafols
StringBuilder
StringBuilder sb = new StringBuilder()
for(int i = 0; i < N; i++) {
sb.append(OTHER_STR);
}
str = s...
@rrafols
StringBuilder
StringBuilder sb = new StringBuilder()
for(int i = 0; i < N; i++) {
sb.append(OTHER_STR);
}
str = s...
@rrafols
StringBuilder
Object creation:
StringBuilder sb = new StringBuilder();
for(int i = 0; i < N; i++) {
sb.append(OTH...
@rrafols
String concatenation
Use StringBuilder (properly) as
much as possible. StringBuffer is
the thread safe implementa...
@rrafols
Strings in case statements
@rrafols
public void taskStateMachine(String status) {
switch(status) {
case "PENDING":
System.out.println("Status pending...
@rrafols
@rrafols
@rrafols
Tooling
@rrafols
Tooling - Disassembler
Java
• javap -c <classfile>
Android:
•Dexdump -d <dexfile>
@rrafols
Tooling - Assembler
Krakatau
https://github.com/Storyyeller/Krakatau
@rrafols
Tooling – Disassembler - ART
adb pull /data/dalvik-
cache/arm/data@app@<package>-1@base
apk@classes.dex
gobjdump ...
@rrafols
Tooling – Disassembler - ART
adb shell oatdump --oat-file=/data/dalvik-
cache/arm/data@app@<package>-1@base.
apk@...
@rrafols
Tooling – PrintAssembly - JIT
-XX:+UnlockDiagnosticVMOptions
-XX:+PrintAssembly
-XX:CompileCommand=print,com.raim...
@rrafols
Always measure
example - yuv2rgb optimization
@rrafols
Source: Wikipedia
@rrafols
@rrafols
Slightly optimized version
precalc tables, fixed point
operations, 2 pixels per loop…
@rrafols
@rrafols
@rrafols
Lets compare:
Normal, minified, minified with
optimizations & jack
Minified = obfuscated using Proguard
@rrafols
Normal Minified Minified & optimized Jack
0
2000
4000
6000
8000
10000
12000
14000
16000
18000
20000
non-optimized...
@rrafols
Performance measurements
Avoid doing multiple tests in one run
JIT might be evil!
@rrafols
Thank you!
http://blog.rafols.org
@rrafols
https://es.linkedin.com/in/raimonrafols
Upcoming SlideShare
Loading in …5
×

The bytecode hocus pocus - JavaOne 2016

1,155 views

Published on

JavaOne 2016 presentation about java bytecode, java compiler and what is going on under the hood.

Published in: Engineering
  • Be the first to comment

  • Be the first to like this

The bytecode hocus pocus - JavaOne 2016

  1. 1. @rrafols The bytecode hocus-pocus #perfmatters
  2. 2. @rrafols Disclaimer This presentation contains bytecode Content is my own experimentation and might differ on other environments
  3. 3. @rrafols Competitions Work 2015 Local winner 2016 Ambassador Runner up Demoscene Speaker Author Entrerpreneurship About me
  4. 4. @rrafols Our friend the java compiler
  5. 5. @rrafols *.java → [javac] → *.class
  6. 6. @rrafols Or, for example, on Android
  7. 7. @rrafols *.java → [javac] → *.class *.class → [dx] → dex file
  8. 8. @rrafols *.java → [javac] → *.class *.class → [dx] → dex dex → [dexopt] → optimized dex dex → [dex2oat] → optimized native
  9. 9. @rrafols but change is coming! Jack & Jill
  10. 10. @rrafols *.java → [jack] → dex file
  11. 11. @rrafols Let’s focus on javac Javac vs other compilers
  12. 12. @rrafols Compilers Produces optimized code for the target platform
  13. 13. @rrafols Javac Does not produce optimized code*
  14. 14. @rrafols Javac Does not know on which architecture the code will be executed
  15. 15. @rrafols Source: Oracle
  16. 16. @rrafols For this reason Java bytecode & operations are stack based
  17. 17. @rrafols Easy to interpret But not the most optimal solution (regarding performance)
  18. 18. @rrafols Quick example Stack based integer addition
  19. 19. @rrafols j = j + i
  20. 20. @rrafols Java bytecode
  21. 21. @rrafols iload_3 iload_2 iadd istore_2
  22. 22. @rrafols Register based approach
  23. 23. @rrafols add r01, r02, r01 or add eax, ebx
  24. 24. @rrafols Let’s make things interesting… j = j + i + k + w + h * 2 + p * p;
  25. 25. @rrafols Java bytecode
  26. 26. @rrafols 0: iload_2 1: iload_1 2: iadd 3: iload_3 4: iadd 5: iload 4 7: iadd 8: iload 5 10: iconst_2 11: imul 12: iadd 13: iload 6 15: iload 6 17: imul 18: iadd 19: istore_2 j = j + i + k + w + h * 2 + p * p; Local vars 1: i 2: j 3: k 4: w 5: h 6: p
  27. 27. @rrafols 0: iload_2 1: iload_1 2: iadd 3: iload_3 4: iadd 5: iload 4 7: iadd 8: iload 5 10: iconst_2 11: imul 12: iadd 13: iload 6 15: iload 6 17: imul 18: iadd 19: istore_2 j = j + i + k + w + h * 2 + p * p; Local vars 1: i 2: j 3: k 4: w 5: h 6: p
  28. 28. @rrafols 0: iload_2 1: iload_1 2: iadd 3: iload_3 4: iadd 5: iload 4 7: iadd 8: iload 5 10: iconst_2 11: imul 12: iadd 13: iload 6 15: iload 6 17: imul 18: iadd 19: istore_2 j = j + i + k + w + h * 2 + p * p; Local vars 1: i 2: j 3: k 4: w 5: h 6: p
  29. 29. @rrafols 0: iload_2 1: iload_1 2: iadd 3: iload_3 4: iadd 5: iload 4 7: iadd 8: iload 5 10: iconst_2 11: imul 12: iadd 13: iload 6 15: iload 6 17: imul 18: iadd 19: istore_2 j = j + i + k + w + h * 2 + p * p; Local vars 1: i 2: j 3: k 4: w 5: h 6: p
  30. 30. @rrafols 0: iload_2 1: iload_1 2: iadd 3: iload_3 4: iadd 5: iload 4 7: iadd 8: iload 5 10: iconst_2 11: imul 12: iadd 13: iload 6 15: iload 6 17: imul 18: iadd 19: istore_2 j = j + i + k + w + h * 2 + p * p; Local vars 1: i 2: j 3: k 4: w 5: h 6: p
  31. 31. @rrafols 0: iload_2 1: iload_1 2: iadd 3: iload_3 4: iadd 5: iload 4 7: iadd 8: iload 5 10: iconst_2 11: imul 12: iadd 13: iload 6 15: iload 6 17: imul 18: iadd 19: istore_2 j = j + i + k + w + h * 2 + p * p; Local vars 1: i 2: j 3: k 4: w 5: h 6: p
  32. 32. @rrafols 0: iload_2 1: iload_1 2: iadd 3: iload_3 4: iadd 5: iload 4 7: iadd 8: iload 5 10: iconst_2 11: imul 12: iadd 13: iload 6 15: iload 6 17: imul 18: iadd 19: istore_2 j = j + i + k + w + h * 2 + p * p; Local vars 1: i 2: j 3: k 4: w 5: h 6: p
  33. 33. @rrafols Register based approach
  34. 34. @rrafols add r01, r02, r01 add r01, r03, r01 add r01, r04, r01 mul r07, r05, #2 add r01, r07, r01 mul r08, r06, r06 add r02, r08, r01 j = j + i + k + w + h * 2 + p * p; r01: i r02: j r03: k r04: w r05: h r06: p
  35. 35. @rrafols Java VM (JVM) Only the JVM knows on which architecture is running. In this case, for example, we used up to 8 registers
  36. 36. @rrafols Java VM (JVM) All optimizations are left to be done by the JVM
  37. 37. @rrafols Maybe takes this concept a bit too far...
  38. 38. @rrafols Imagine this simple C code #include <stdio.h> int main() { int a = 10; int b = 1 + 2 + 3 + 4 + 5 + 6 + a; printf("%dn", b); }
  39. 39. @rrafols GCC compiler #include <stdio.h> int main() { int a = 10; int b = 1 + 2 + 3 + 4 + 5 + 6 + a; printf("%dn", b); } … movl $31, %esi call _printf … * Using gcc & -O2 compiler option
  40. 40. @rrafols javac public static void main(String args[]) { int a = 10; int b = 1 + 2 + 3 + 4 + 5 + 6 + a; System.out.println(b); } 0: bipush 10 2: istore_1 3: bipush 21 5: iload_1 6: iadd 7: istore_2 ...
  41. 41. @rrafols Let's do a small change #include <stdio.h> int main() { int a = 10; int b = 1 + 2 + 3 + 4 + 5 + a + 6; printf("%dn", b); }
  42. 42. @rrafols GCC compiler #include <stdio.h> int main() { int a = 10; int b = 1 + 2 + 3 + 4 + 5 + a + 6; printf("%dn", b); } … movl $31, %esi call _printf … * Using gcc & -O2 compiler option
  43. 43. @rrafols javac public static void main(String args[]) { int a = 10; int b = 1 + 2 + 3 + 4 + 5 + a + 6; System.out.println(b); } 0: bipush 10 2: istore_1 3: bipush 15 5: iload_1 6: iadd 7: bipush 6 9: iadd 10: istore_2 ...
  44. 44. @rrafols Let's do another quick change.. public static void main(String args[]) { int a = 10; int b = a + 1 + 2 + 3 + 4 + 5 + 6; System.out.println(b); }
  45. 45. @rrafols javac public static void main(String args[]) { int a = 10; int b = a + 1 + 2 + 3 + 4 + 5 + 6; System.out.println(b); } 0: bipush 10 2: istore_1 3: iload_1 4: iconst_1 5: iadd 6: iconst_2 7: iadd 8: iconst_3 9: iadd 10: iconst_4 11: iadd 12: iconst_5 13: iadd 14: bipush 6 16: iadd 17: istore_2
  46. 46. @rrafols On Android there is jack to the rescue...
  47. 47. @rrafols jack public static void main(String args[]) { int a = 10; int b = a + 1 + 2 + 3 + 4 + 5 + 6; System.out.println(b); } ... 0: const/16 v0, #int 31 2: sget-object v1, Ljava/lang/System; 4: invoke-virtual {v1, v0} 7: return-void ...
  48. 48. @rrafols And on Java there is the JIT compiler to the rescue
  49. 49. @rrafols JIT assembly output public static void main(String args[]) { int a = 10; int b = a + 1 + 2 + 3 + 4 + 5 + 6; System.out.println(b); } ... 0x00000001104b2bff: mov eax, 0x000000000000001f ... 0: bipush 10 2: istore_1 3: iload_1 4: iconst_1 5: iadd 6: iconst_2 7: iadd 8: iconst_3 9: iadd 10: iconst_4 11: iadd 12: iconst_5 13: iadd 14: bipush 6 16: iadd 17: istore_2
  50. 50. @rrafols Compilers Produces optimised code for target platform
  51. 51. @rrafols Language additions Thinks to consider
  52. 52. @rrafols Autoboxing Transparent to the developer but compiler adds some 'extra' code
  53. 53. @rrafols Autoboxing long total = 0; for(int i = 0; i < N; i++) { total += i; } 0: lconst_0 1: lstore_1 2: iconst_0 3: istore_3 4: iload_3 5: ldc #8 // N 7: if_icmpge 21 10: lload_1 11: iload_3 12: i2l 13: ladd 14: lstore_1 15: iinc 3, 1 18: goto 4
  54. 54. @rrafols Autoboxing long total = 0; for(int i = 0; i < N; i++) { total += i; } 0: lconst_0 1: lstore_1 2: iconst_0 3: istore_3 4: iload_3 5: ldc #8 // N 7: if_icmpge 21 10: lload_1 11: iload_3 12: i2l 13: ladd 14: lstore_1 15: iinc 3, 1 18: goto 4
  55. 55. @rrafols Autoboxing long total = 0; for(int i = 0; i < N; i++) { total += i; } 0: lconst_0 1: lstore_1 2: iconst_0 3: istore_3 4: iload_3 5: ldc #8 // N 7: if_icmpge 21 10: lload_1 11: iload_3 12: i2l 13: ladd 14: lstore_1 15: iinc 3, 1 18: goto 4
  56. 56. @rrafols Autoboxing long total = 0; for(int i = 0; i < N; i++) { total += i; } 0: lconst_0 1: lstore_1 2: iconst_0 3: istore_3 4: iload_3 5: ldc #8 // N 7: if_icmpge 21 10: lload_1 11: iload_3 12: i2l 13: ladd 14: lstore_1 15: iinc 3, 1 18: goto 4 21:
  57. 57. @rrafols Autoboxing long total = 0; for(int i = 0; i < N; i++) { total += i; } 0: lconst_0 1: lstore_1 2: iconst_0 3: istore_3 4: iload_3 5: ldc #8 // N 7: if_icmpge 21 10: lload_1 11: iload_3 12: i2l 13: ladd 14: lstore_1 15: iinc 3, 1 18: goto 4
  58. 58. @rrafols Autoboxing long total = 0; for(int i = 0; i < N; i++) { total += i; } 0: lconst_0 1: lstore_1 2: iconst_0 3: istore_3 4: iload_3 5: ldc #8 // N 7: if_icmpge 21 10: lload_1 11: iload_3 12: i2l 13: ladd 14: lstore_1 15: iinc 3, 1 18: goto 4
  59. 59. @rrafols Autoboxing long total = 0; for(int i = 0; i < N; i++) { total += i; } 0: lconst_0 1: lstore_1 2: iconst_0 3: istore_3 4: iload_3 5: ldc #8 // N 7: if_icmpge 21 10: lload_1 11: iload_3 12: i2l 13: ladd 14: lstore_1 15: iinc 3, 1 18: goto 4
  60. 60. @rrafols Autoboxing Long total = 0; for(Integer i = 0; i < N; i++) { total += i; }
  61. 61. @rrafols Autoboxing Long total = 0; for(Integer i = 0; i < N; i++) { total += i; } 0: lconst_0 1: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava/ 4: astore_1 5: iconst_0 6: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Ljav 9: astore_2 10: aload_2 11: invokevirtual #9 // Method java/lang/Integer.intValue:()I 14: sipush N 17: if_icmpge 54 20: aload_1 21: invokevirtual #10 // Method java/lang/Long.longValue:()J 24: aload_2 25: invokevirtual #9 // Method java/lang/Integer.intValue:()I 28: i2l 29: ladd 30: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava 33: astore_1 34: aload_2 35: astore_3 36: aload_2 37: invokevirtual #9 // Method java/lang/Integer.intValue:()I 40: iconst_1 41: iadd 42: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Lja 45: dup 46: astore_2 47: astore 4 49: aload_3 50: pop 51: goto 10
  62. 62. @rrafols Autoboxing Long total = 0; for(Integer i = 0; i < N; i++) { total += i; } 0: lconst_0 1: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava/ 4: astore_1 5: iconst_0 6: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Ljav 9: astore_2 10: aload_2 11: invokevirtual #9 // Method java/lang/Integer.intValue:()I 14: sipush N 17: if_icmpge 54 20: aload_1 21: invokevirtual #10 // Method java/lang/Long.longValue:()J 24: aload_2 25: invokevirtual #9 // Method java/lang/Integer.intValue:()I 28: i2l 29: ladd 30: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava 33: astore_1 34: aload_2 35: astore_3 36: aload_2 37: invokevirtual #9 // Method java/lang/Integer.intValue:()I 40: iconst_1 41: iadd 42: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Lja 45: dup 46: astore_2 47: astore 4 49: aload_3 50: pop 51: goto 10
  63. 63. @rrafols Autoboxing Long total = 0; for(Integer i = 0; i < N; i++) { total += i; } 0: lconst_0 1: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava/ 4: astore_1 5: iconst_0 6: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Ljav 9: astore_2 10: aload_2 11: invokevirtual #9 // Method java/lang/Integer.intValue:()I 14: sipush N 17: if_icmpge 54 20: aload_1 21: invokevirtual #10 // Method java/lang/Long.longValue:()J 24: aload_2 25: invokevirtual #9 // Method java/lang/Integer.intValue:()I 28: i2l 29: ladd 30: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava 33: astore_1 34: aload_2 35: astore_3 36: aload_2 37: invokevirtual #9 // Method java/lang/Integer.intValue:()I 40: iconst_1 41: iadd 42: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Lja 45: dup 46: astore_2 47: astore 4 49: aload_3 50: pop 51: goto 10
  64. 64. @rrafols Autoboxing Long total = 0; for(Integer i = 0; i < N; i++) { total += i; } 0: lconst_0 1: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava/ 4: astore_1 5: iconst_0 6: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Ljav 9: astore_2 10: aload_2 11: invokevirtual #9 // Method java/lang/Integer.intValue:()I 14: sipush N 17: if_icmpge 54 20: aload_1 21: invokevirtual #10 // Method java/lang/Long.longValue:()J 24: aload_2 25: invokevirtual #9 // Method java/lang/Integer.intValue:()I 28: i2l 29: ladd 30: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava 33: astore_1 34: aload_2 35: astore_3 36: aload_2 37: invokevirtual #9 // Method java/lang/Integer.intValue:()I 40: iconst_1 41: iadd 42: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Lja 45: dup 46: astore_2 47: astore 4 49: aload_3 50: pop 51: goto 10
  65. 65. @rrafols Autoboxing Long total = 0; for(Integer i = 0; i < N; i++) { total += i; } 0: lconst_0 1: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava/ 4: astore_1 5: iconst_0 6: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Ljav 9: astore_2 10: aload_2 11: invokevirtual #9 // Method java/lang/Integer.intValue:()I 14: sipush N 17: if_icmpge 54 20: aload_1 21: invokevirtual #10 // Method java/lang/Long.longValue:()J 24: aload_2 25: invokevirtual #9 // Method java/lang/Integer.intValue:()I 28: i2l 29: ladd 30: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava 33: astore_1 34: aload_2 35: astore_3 36: aload_2 37: invokevirtual #9 // Method java/lang/Integer.intValue:()I 40: iconst_1 41: iadd 42: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Lja 45: dup 46: astore_2 47: astore 4 49: aload_3 50: pop 51: goto 10
  66. 66. @rrafols Autoboxing Long total = 0; for(Integer i = 0; i < N; i++) { total += i; } 0: lconst_0 1: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava/ 4: astore_1 5: iconst_0 6: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Ljav 9: astore_2 10: aload_2 11: invokevirtual #9 // Method java/lang/Integer.intValue:()I 14: sipush N 17: if_icmpge 54 20: aload_1 21: invokevirtual #10 // Method java/lang/Long.longValue:()J 24: aload_2 25: invokevirtual #9 // Method java/lang/Integer.intValue:()I 28: i2l 29: ladd 30: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava 33: astore_1 34: aload_2 35: astore_3 36: aload_2 37: invokevirtual #9 // Method java/lang/Integer.intValue:()I 40: iconst_1 41: iadd 42: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Lja 45: dup 46: astore_2 47: astore 4 49: aload_3 50: pop 51: goto 10
  67. 67. @rrafols Autoboxing Long total = 0; for(Integer i = 0; i < N; i++) { total += i; } 0: lconst_0 1: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava/ 4: astore_1 5: iconst_0 6: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Ljav 9: astore_2 10: aload_2 11: invokevirtual #9 // Method java/lang/Integer.intValue:()I 14: sipush N 17: if_icmpge 54 20: aload_1 21: invokevirtual #10 // Method java/lang/Long.longValue:()J 24: aload_2 25: invokevirtual #9 // Method java/lang/Integer.intValue:()I 28: i2l 29: ladd 30: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava 33: astore_1 34: aload_2 35: astore_3 36: aload_2 37: invokevirtual #9 // Method java/lang/Integer.intValue:()I 40: iconst_1 41: iadd 42: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Lja 45: dup 46: astore_2 47: astore 4 49: aload_3 50: pop 51: goto 10
  68. 68. @rrafols Autoboxing Long total = 0; for(Integer i = 0; i < N; i++) { total += i; } // ? 0: lconst_0 1: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava/ 4: astore_1 5: iconst_0 6: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Ljav 9: astore_2 10: aload_2 11: invokevirtual #9 // Method java/lang/Integer.intValue:()I 14: sipush N 17: if_icmpge 54 20: aload_1 21: invokevirtual #10 // Method java/lang/Long.longValue:()J 24: aload_2 25: invokevirtual #9 // Method java/lang/Integer.intValue:()I 28: i2l 29: ladd 30: invokestatic #7 // Method java/lang/Long.valueOf:(J)Ljava 33: astore_1 34: aload_2 35: astore_3 36: aload_2 37: invokevirtual #9 // Method java/lang/Integer.intValue:()I 40: iconst_1 41: iadd 42: invokestatic #8 // Method java/lang/Integer.valueOf:(I)Lja 45: dup 46: astore_2 47: astore 4 49: aload_3 50: pop 51: goto 10
  69. 69. @rrafols Autoboxing This is what that code is actually doing: Long total = Long.valueOf(0); for(Integer i = Integer.valueOf(0); i.intValue() < N; i = Integer.valueOf(i.intValue() + 1)) { total = Long.valueOf(total.longValue() + (long)i.intValue()) }
  70. 70. @rrafols Autoboxing Object creation Long total = Long.valueOf(0); for(Integer i = Integer.valueOf(0); i.intValue() < N; i = Integer.valueOf(i.intValue() + 1)) { total = Long.valueOf(total.longValue() + (long)i.intValue()) }
  71. 71. @rrafols Autoboxing What about Jack?
  72. 72. @rrafols Autoboxing Jack does not help in this situation
  73. 73. @rrafols Autoboxing What about the JIT compiler?
  74. 74. @rrafols Autoboxing Let's run that loop 10.000.000.000 times (on my desktop computer)
  75. 75. @rrafols Autoboxing
  76. 76. @rrafols Autoboxing Let's run that loop 100.000.000 Times on two Nexus 5 KitKat (Dalvik VM) & Lollipop (ART)
  77. 77. @rrafols Autoboxing
  78. 78. @rrafols Language Additions Use them wisely!
  79. 79. @rrafols Sorting No bytecode mumbo-jumbo here
  80. 80. @rrafols Let's sort some numbers… Arrays.sort(...)
  81. 81. @rrafols Difference between sorting primitive types & objects
  82. 82. @rrafols Using int & Integer
  83. 83. @rrafols Sorting objects is a stable sort Default java algorithm: TimSort adaptation
  84. 84. @rrafols Sorting primitives does not require to be stable sort Default java algorithm: Dual-Pivot quicksort
  85. 85. @rrafols Sorting Use primitive types as much as possible
  86. 86. @rrafols Loops
  87. 87. @rrafols Loops What is going on behind the scenes
  88. 88. @rrafols Loops - List ArrayList<Integer> list = new … static long loopStandardList() { long result = 0; for(int i = 0; i < list.size(); i++) { result += list.get(i); } return result; }
  89. 89. @rrafols Loops - List ArrayList<Integer> list = new … static long loopStandardList() { long result = 0; for(int i = 0; i < list.size(); i++) { result += list.get(i); } return result; } 7: lload_0 8: getstatic #26 // Field list:Ljava/util/ArrayList; 11: iload_2 12: invokevirtual #54 // Method java/util/ArrayList.get:(I) 15: checkcast #38 // class java/lang/Integer 18: invokevirtual #58 // Method java/lang/Integer.intValue:()I 21: i2l 22: ladd 23: lstore_0 24: iinc 2, 1 27: iload_2 28: getstatic #26 // Field list:Ljava/util/ArrayList; 31: invokevirtual #61 // Method java/util/ArrayList.size:()I 34: if_icmplt 7
  90. 90. @rrafols Loops - List ArrayList<Integer> list = new … static long loopStandardList() { long result = 0; for(int i = 0; i < list.size(); i++) { result += list.get(i); } return result; } 7: lload_0 8: getstatic #26 // Field list:Ljava/util/ArrayList; 11: iload_2 12: invokevirtual #54 // Method java/util/ArrayList.get:(I) 15: checkcast #38 // class java/lang/Integer 18: invokevirtual #58 // Method java/lang/Integer.intValue:()I 21: i2l 22: ladd 23: lstore_0 24: iinc 2, 1 27: iload_2 28: getstatic #26 // Field list:Ljava/util/ArrayList; 31: invokevirtual #61 // Method java/util/ArrayList.size:()I 34: if_icmplt 7
  91. 91. @rrafols Loops - List ArrayList<Integer> list = new … static long loopStandardList() { long result = 0; for(int i = 0; i < list.size(); i++) { result += list.get(i); } return result; } 7: lload_0 8: getstatic #26 // Field list:Ljava/util/ArrayList; 11: iload_2 12: invokevirtual #54 // Method java/util/ArrayList.get:(I) 15: checkcast #38 // class java/lang/Integer 18: invokevirtual #58 // Method java/lang/Integer.intValue:()I 21: i2l 22: ladd 23: lstore_0 24: iinc 2, 1 27: iload_2 28: getstatic #26 // Field list:Ljava/util/ArrayList; 31: invokevirtual #61 // Method java/util/ArrayList.size:()I 34: if_icmplt 7
  92. 92. @rrafols Loops - List ArrayList<Integer> list = new … static long loopStandardList() { long result = 0; for(int i = 0; i < list.size(); i++) { result += list.get(i); } return result; } 7: lload_0 8: getstatic #26 // Field list:Ljava/util/ArrayList; 11: iload_2 12: invokevirtual #54 // Method java/util/ArrayList.get:(I) 15: checkcast #38 // class java/lang/Integer 18: invokevirtual #58 // Method java/lang/Integer.intValue:()I 21: i2l 22: ladd 23: lstore_0 24: iinc 2, 1 27: iload_2 28: getstatic #26 // Field list:Ljava/util/ArrayList; 31: invokevirtual #61 // Method java/util/ArrayList.size:()I 34: if_icmplt 7
  93. 93. @rrafols Loops - foreach ArrayList<Integer> list = new … static long loopForeachList() { long result = 0; for(int v : list) { result += v; } return result; }
  94. 94. @rrafols Loops - foreach ArrayList<Integer> list = new … static long loopForeachList() { long result = 0; for(int v : list) { result += v; } return result; } 12: aload_3 13: invokeinterface #70, 1 // java/util/Iterator.next:() 18: checkcast #38 // class java/lang/Integer 21: invokevirtual #58 // Method java/lang/Integer.intValue: 24: istore_2 25: lload_0 26: iload_2 27: i2l 28: ladd 29: lstore_0 30: aload_3 31: invokeinterface #76, 1 // java/util/Iterator.hasNext:()Z 36: ifne 12
  95. 95. @rrafols Loops - foreach ArrayList<Integer> list = new … static long loopForeachList() { long result = 0; for(int v : list) { result += v; } return result; } 12: aload_3 13: invokeinterface #70, 1 // java/util/Iterator.next:() 18: checkcast #38 // class java/lang/Integer 21: invokevirtual #58 // Method java/lang/Integer.intValue:()I 24: istore_2 25: lload_0 26: iload_2 27: i2l 28: ladd 29: lstore_0 30: aload_3 31: invokeinterface #76, 1 // java/util/Iterator.hasNext:()Z 36: ifne 12
  96. 96. @rrafols Loops - foreach ArrayList<Integer> list = new … static long loopForeachList() { long result = 0; for(int v : list) { result += v; } return result; } 12: aload_3 13: invokeinterface #70, 1 // java/util/Iterator.next:() 18: checkcast #38 // class java/lang/Integer 21: invokevirtual #58 // Method java/lang/Integer.intValue:()I 24: istore_2 25: lload_0 26: iload_2 27: i2l 28: ladd 29: lstore_0 30: aload_3 31: invokeinterface #76, 1 // java/util/Iterator.hasNext:()Z 36: ifne 12
  97. 97. @rrafols Loops - Array static int[] array = new ... static long loopStandardArray() { long result = 0; for(int i = 0; i < array.length; i++) { result += array[i]; } return result; }
  98. 98. @rrafols Loops - Array static int[] array = new ... static long loopStandardArray() { long result = 0; for(int i = 0; i < array.length; i++) { result += array[i]; } return result; } 7: lload_0 8: getstatic #28 // Field array:[I 11: iload_2 12: iaload 13: i2l 14: ladd 15: lstore_0 16: iinc 2, 1 19: iload_2 20: getstatic #28 // Field array:[I 23: arraylength 24: if_icmplt 7
  99. 99. @rrafols Loops - Array static int[] array = new ... static long loopStandardArray() { long result = 0; for(int i = 0; i < array.length; i++) { result += array[i]; } return result; } 7: lload_0 8: getstatic #28 // Field array:[I 11: iload_2 12: iaload 13: i2l 14: ladd 15: lstore_0 16: iinc 2, 1 19: iload_2 20: getstatic #28 // Field array:[I 23: arraylength 24: if_icmplt 7
  100. 100. @rrafols Loops - Array static int[] array = new ... static long loopStandardArray() { long result = 0; for(int i = 0; i < array.length; i++) { result += array[i]; } return result; } 7: lload_0 8: getstatic #28 // Field array:[I 11: iload_2 12: iaload 13: i2l 14: ladd 15: lstore_0 16: iinc 2, 1 19: iload_2 20: getstatic #28 // Field array:[I 23: arraylength 24: if_icmplt 7
  101. 101. @rrafols Loops - size cached static int[] array = new ... static long loopStandardArray () { long result = 0; int length = array.length; for(int i = 0; i < length; i++) { result += array[i]; } return result; }
  102. 102. @rrafols Loops - size cached static int[] array = new ... static long loopStandardArray () { long result = 0; int length = array.length; for(int i = 0; i < length; i++) { result += array[i]; } return result; } 12: lload_0 13: getstatic #28 // Field array:[I 16: iload_3 17: iaload 18: i2l 19: ladd 20: lstore_0 21: iinc 3, 1 24: iload_3 25: iload_2 26: if_icmplt 12
  103. 103. @rrafols Loops - size cached static int[] array = new ... static long loopStandardArray () { long result = 0; int length = array.length; for(int i = 0; i < length; i++) { result += array[i]; } return result; } 12: lload_0 13: getstatic #28 // Field array:[I 16: iload_3 17: iaload 18: i2l 19: ladd 20: lstore_0 21: iinc 3, 1 24: iload_3 25: iload_2 26: if_icmplt 12
  104. 104. @rrafols Loops - backwards static int[] array = new ... static long loopStandardArray () { long result = 0; for(int i = array.length - 1; i >= 0; i--) { result += array[i]; } return result; }
  105. 105. @rrafols Loops - backwards static int[] array = new ... static long loopStandardArray () { long result = 0; for(int i = array.length - 1; i >= 0; i--) { result += array[i]; } return result; } 12: lload_0 13: getstatic #28 // Field array:[I 16: iload_2 17: iaload 18: i2l 19: ladd 20: lstore_0 21: iinc 2, -1 24: iload_2 25: ifge 12
  106. 106. @rrafols Loops - backwards static int[] array = new ... static long loopStandardArray () { long result = 0; for(int i = array.length - 1; i >= 0; i--) { result += array[i]; } return result; } 12: lload_0 13: getstatic #28 // Field array:[I 16: iload_2 17: iaload 18: i2l 19: ladd 20: lstore_0 21: iinc 2, -1 24: iload_2 25: ifge 12
  107. 107. @rrafols
  108. 108. @rrafols
  109. 109. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; }
  110. 110. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; } 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0 array 4 - 1 - 5 - 2 - 6 - 3 - 7 -
  111. 111. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; } 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0 array 4 - 1 0 (result) 5 - 2 - 6 - 3 - 7 -
  112. 112. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; } 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0 array 4 - 1 0 (result) 5 - 2 - 6 array 3 - 7 -
  113. 113. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; } 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0 array 4 - 1 0 (result) 5 array.length 2 - 6 array 3 - 7 -
  114. 114. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; } 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0 array 4 0 (loop index) 1 0 (result) 5 array.length 2 - 6 array 3 - 7 -
  115. 115. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; } 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0 array 4 0 (loop index) 1 0 (result) 5 array.length 2 - 6 array 3 - 7 -
  116. 116. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; } 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0 array 4 0 (loop index) 1 0 (result) 5 array.length 2 - 6 array 3 array[index] 7 -
  117. 117. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; } 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0 array 4 0 (loop index) 1 0 + array[index] 5 array.length 2 - 6 array 3 array[index] 7 -
  118. 118. @rrafols Loops – foreach II static long loopForeachArray(int[] array) { long result = 0; for(int v : array) { result += v; } return result; } 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0 array 4 0 (loop index) + 1 1 0 + array[index] 5 array.length 2 - 6 array 3 array[index] 7 -
  119. 119. @rrafols
  120. 120. @rrafols
  121. 121. @rrafols Loops Use arrays instead of lists
  122. 122. @rrafols Loops When using lists, avoid foreach or iterator constructions if performance is a requirement
  123. 123. @rrafols Manual bytecode optimization Worth it?
  124. 124. @rrafols Loops – foreach II 0: lconst_0 1: lstore_1 2: aload_0 3: dup 4: astore 6 6: arraylength 7: istore 5 9: iconst_0 10: istore 4 12: goto 29 15: aload 6 17: iload 4 19: iaload 20: istore_3 21: lload_1 22: iload_3 23: i2l 24: ladd 25: lstore_1 26: iinc 4, 1 29: iload 4 31: iload 5 33: if_icmplt 15 0: aload_0 1: arraylength 2: istore_3 3: iconst_0 4: istore_1 5: lconst_0 6: goto 17 9: aload_0 10: iload_1 11: iaload 12: i2l 13: ladd 14: iinc 1, 1 17: iload_1 18: iload_3 19: if_icmplt 9 Manual bytecode optimization
  125. 125. @rrafols Manual bytecode optimization 0: aload_0 1: arraylength 2: istore_3 3: iconst_0 4: istore_1 5: lconst_0 6: goto 17 9: aload_0 10: iload_1 11: iaload 12: i2l 13: ladd 14: iinc 1, 1 17: iload_1 18: iload_3 19: if_icmplt 9 Local variables 0 array 3 - 1 - 4 - 2 - 5 - Stack - - -
  126. 126. @rrafols Manual bytecode optimization 0: aload_0 1: arraylength 2: istore_3 3: iconst_0 4: istore_1 5: lconst_0 6: goto 17 9: aload_0 10: iload_1 11: iaload 12: i2l 13: ladd 14: iinc 1, 1 17: iload_1 18: iload_3 19: if_icmplt 9 Local variables 0 array 3 array.length 1 - 4 - 2 - 5 - Stack - - -
  127. 127. @rrafols Manual bytecode optimization 0: aload_0 1: arraylength 2: istore_3 3: iconst_0 4: istore_1 5: lconst_0 6: goto 17 9: aload_0 10: iload_1 11: iaload 12: i2l 13: ladd 14: iinc 1, 1 17: iload_1 18: iload_3 19: if_icmplt 9 Local variables 0 array 3 array.length 1 0 (index) 4 - 2 - 5 - Stack - - -
  128. 128. @rrafols Manual bytecode optimization 0: aload_0 1: arraylength 2: istore_3 3: iconst_0 4: istore_1 5: lconst_0 6: goto 17 9: aload_0 10: iload_1 11: iaload 12: i2l 13: ladd 14: iinc 1, 1 17: iload_1 18: iload_3 19: if_icmplt 9 Local variables 0 array 3 array.length 1 0 (index) 4 - 2 - 5 - Stack 0 (result) - -
  129. 129. @rrafols Manual bytecode optimization 0: aload_0 1: arraylength 2: istore_3 3: iconst_0 4: istore_1 5: lconst_0 6: goto 17 9: aload_0 10: iload_1 11: iaload 12: i2l 13: ladd 14: iinc 1, 1 17: iload_1 18: iload_3 19: if_icmplt 9 Local variables 0 array 3 array.length 1 0 (index) 4 - 2 - 5 - Stack 0 (result) array[index] -
  130. 130. @rrafols Manual bytecode optimization 0: aload_0 1: arraylength 2: istore_3 3: iconst_0 4: istore_1 5: lconst_0 6: goto 17 9: aload_0 10: iload_1 11: iaload 12: i2l 13: ladd 14: iinc 1, 1 17: iload_1 18: iload_3 19: if_icmplt 9 Local variables 0 array 3 array.length 1 0 (index) 4 - 2 - 5 - Stack 0 + array[index] - -
  131. 131. @rrafols Manual bytecode optimization 0: aload_0 1: arraylength 2: istore_3 3: iconst_0 4: istore_1 5: lconst_0 6: goto 17 9: aload_0 10: iload_1 11: iaload 12: i2l 13: ladd 14: iinc 1, 1 17: iload_1 18: iload_3 19: if_icmplt 9 Local variables 0 array 3 array.length 1 0 (index) + 1 4 - 2 - 5 - Stack 0 + array[index] - -
  132. 132. @rrafols *reduced data set
  133. 133. @rrafols Worth it? Only in very specific, unique, peculiar cases. Too much effort involved.
  134. 134. @rrafols Calling a method Is there an overhead?
  135. 135. @rrafols Overhead of calling a method for(int i = 0; i < N; i++) { setVal(getVal() + 1); } for(int i = 0; i < N; i++) { val = val + 1; } vs
  136. 136. @rrafols
  137. 137. @rrafols String concatenation The evil + sign
  138. 138. @rrafols String concatenation String str = ""; for(int i = 0; i < N; i++) { str += OTHER_STR; }
  139. 139. @rrafols String concatenation String str = ""; for(int i = 0; i < N; i++) { str += OTHER_STR; } 0: ldc #13 // String 2: astore_1 3: iconst_0 4: istore_2 5: iload_2 6: sipush N 9: if_icmpge 40 12: new #14 // class java/lang/StringBuilder 15: dup 16: invokespecial #15 // Method java/lang/StringBuilder."<init>":()V 19: aload_1 20: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 23: aload_0 24: getfield #3 // Field OTHER_STR:Ljava/lang/String; 27: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 30: invokevirtual #17 // Method java/lang/StringBuilder.toString:()Ljava/lan 33: astore_1 34: iinc 2, 1 37: goto 5
  140. 140. @rrafols String concatenation String str = ""; for(int i = 0; i < N; i++) { str += OTHER_STR; } 0: ldc #13 // String 2: astore_1 3: iconst_0 4: istore_2 5: iload_2 6: sipush N 9: if_icmpge 40 12: new #14 // class java/lang/StringBuilder 15: dup 16: invokespecial #15 // Method java/lang/StringBuilder."<init>":()V 19: aload_1 20: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 23: aload_0 24: getfield #3 // Field OTHER_STR:Ljava/lang/String; 27: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 30: invokevirtual #17 // Method java/lang/StringBuilder.toString:()Ljava/lan 33: astore_1 34: iinc 2, 1 37: goto 5
  141. 141. @rrafols String concatenation String str = ""; for(int i = 0; i < N; i++) { str += OTHER_STR; } 0: ldc #13 // String 2: astore_1 3: iconst_0 4: istore_2 5: iload_2 6: sipush N 9: if_icmpge 40 12: new #14 // class java/lang/StringBuilder 15: dup 16: invokespecial #15 // Method java/lang/StringBuilder."<init>":()V 19: aload_1 20: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 23: aload_0 24: getfield #3 // Field OTHER_STR:Ljava/lang/String; 27: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 30: invokevirtual #17 // Method java/lang/StringBuilder.toString:()Ljava/lan 33: astore_1 34: iinc 2, 1 37: goto 5
  142. 142. @rrafols String concatenation String str = ""; for(int i = 0; i < N; i++) { str += OTHER_STR; } 0: ldc #13 // String 2: astore_1 3: iconst_0 4: istore_2 5: iload_2 6: sipush N 9: if_icmpge 40 12: new #14 // class java/lang/StringBuilder 15: dup 16: invokespecial #15 // Method java/lang/StringBuilder."<init>":()V 19: aload_1 20: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 23: aload_0 24: getfield #3 // Field OTHER_STR:Ljava/lang/String; 27: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 30: invokevirtual #17 // Method java/lang/StringBuilder.toString:()Ljava/lan 33: astore_1 34: iinc 2, 1 37: goto 5
  143. 143. @rrafols String concatenation String str = ""; for(int i = 0; i < N; i++) { str += OTHER_STR; } 0: ldc #13 // String 2: astore_1 3: iconst_0 4: istore_2 5: iload_2 6: sipush N 9: if_icmpge 40 12: new #14 // class java/lang/StringBuilder 15: dup 16: invokespecial #15 // Method java/lang/StringBuilder."<init>":()V 19: aload_1 20: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 23: aload_0 24: getfield #3 // Field OTHER_STR:Ljava/lang/String; 27: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 30: invokevirtual #17 // Method java/lang/StringBuilder.toString:()Ljava/lan 33: astore_1 34: iinc 2, 1 37: goto 5
  144. 144. @rrafols String concatenation String str = ""; for(int i = 0; i < N; i++) { str += OTHER_STR; } 0: ldc #13 // String 2: astore_1 3: iconst_0 4: istore_2 5: iload_2 6: sipush N 9: if_icmpge 40 12: new #14 // class java/lang/StringBuilder 15: dup 16: invokespecial #15 // Method java/lang/StringBuilder."<init>":()V 19: aload_1 20: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 23: aload_0 24: getfield #3 // Field OTHER_STR:Ljava/lang/String; 27: invokevirtual #16 // Method java/lang/StringBuilder.append:(Ljava/lang/S 30: invokevirtual #17 // Method java/lang/StringBuilder.toString:()Ljava/lan 33: astore_1 34: iinc 2, 1 37: goto 5
  145. 145. @rrafols String concatenation - equivalent String str = ""; for(int i = 0; i < N; i++) { StringBuilder sb = new StringBuilder(); sb.append(str); sb.append(OTHER_STR); str = sb.toString(); }
  146. 146. @rrafols String concatenation Object creation: String str = ""; for(int i = 0; i < N; i++) { StringBuilder sb = new StringBuilder(); sb.append(str); sb.append(OTHER_STR); str = sb.toString(); }
  147. 147. @rrafols String concatenation alternatives
  148. 148. @rrafols String.concat() • Concat cost is O(N) + O(M) • Concat returns a new String Object. String str = ""; for(int i = 0; i < N; i++) { str = str.concat(OTHER_STR); }
  149. 149. @rrafols String.concat() Object creation: String str = ""; for(int i = 0; i < N; i++) { str = str.concat(OTHER_STR); }
  150. 150. @rrafols StringBuilder • StringBuilder.append cost is O(M) [M being the length of appended String] StringBuilder sb = new StringBuilder() for(int i = 0; i < N; i++) { sb.append(OTHER_STR); } str = sb.toString();
  151. 151. @rrafols StringBuilder StringBuilder sb = new StringBuilder() for(int i = 0; i < N; i++) { sb.append(OTHER_STR); } str = sb.toString(); 0: ldc #13 // String 2: astore_1 3: new #14 // class java/lang/StringBuilder 6: dup 7: invokespecial #15 // Method java/lang/StringBuilder."<i 10: astore_2 11: iconst_0 12: istore_3 13: iload_3 14: sipush N 17: if_icmpge 35 20: aload_2 21: aload_0 22: getfield #3 // Field OTHER_STR:Ljava/lang/String; 25: invokevirtual #16 // Method java/lang/StringBuilder.ap 28: pop 29: iinc 3, 1 32: goto 13
  152. 152. @rrafols StringBuilder StringBuilder sb = new StringBuilder() for(int i = 0; i < N; i++) { sb.append(OTHER_STR); } str = sb.toString(); 0: ldc #13 // String 2: astore_1 3: new #14 // class java/lang/StringBuilder 6: dup 7: invokespecial #15 // Method java/lang/StringBuilder."<i 10: astore_2 11: iconst_0 12: istore_3 13: iload_3 14: sipush N 17: if_icmpge 35 20: aload_2 21: aload_0 22: getfield #3 // Field OTHER_STR:Ljava/lang/String; 25: invokevirtual #16 // Method java/lang/StringBuilder.ap 28: pop 29: iinc 3, 1 32: goto 13
  153. 153. @rrafols StringBuilder Object creation: StringBuilder sb = new StringBuilder(); for(int i = 0; i < N; i++) { sb.append(OTHER_STR); } str = sb.toString();
  154. 154. @rrafols String concatenation Use StringBuilder (properly) as much as possible. StringBuffer is the thread safe implementation.
  155. 155. @rrafols Strings in case statements
  156. 156. @rrafols public void taskStateMachine(String status) { switch(status) { case "PENDING": System.out.println("Status pending"); break; case "EXECUTING": System.out.println("Status executing"); break; } }
  157. 157. @rrafols
  158. 158. @rrafols
  159. 159. @rrafols Tooling
  160. 160. @rrafols Tooling - Disassembler Java • javap -c <classfile> Android: •Dexdump -d <dexfile>
  161. 161. @rrafols Tooling - Assembler Krakatau https://github.com/Storyyeller/Krakatau
  162. 162. @rrafols Tooling – Disassembler - ART adb pull /data/dalvik- cache/arm/data@app@<package>-1@base apk@classes.dex gobjdump -D <file>
  163. 163. @rrafols Tooling – Disassembler - ART adb shell oatdump --oat-file=/data/dalvik- cache/arm/data@app@<package>-1@base. apk@classes.dex
  164. 164. @rrafols Tooling – PrintAssembly - JIT -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:CompileCommand=print,com.raimon.test.Test::method Under the Hood of the JVM: From Bytecode Through the JIT to Assembly [CON3808] http://alblue.bandlem.com/2016/09/javaone-hotspot.html
  165. 165. @rrafols Always measure example - yuv2rgb optimization
  166. 166. @rrafols Source: Wikipedia
  167. 167. @rrafols
  168. 168. @rrafols Slightly optimized version precalc tables, fixed point operations, 2 pixels per loop…
  169. 169. @rrafols
  170. 170. @rrafols
  171. 171. @rrafols Lets compare: Normal, minified, minified with optimizations & jack Minified = obfuscated using Proguard
  172. 172. @rrafols Normal Minified Minified & optimized Jack 0 2000 4000 6000 8000 10000 12000 14000 16000 18000 20000 non-optimized optimized
  173. 173. @rrafols Performance measurements Avoid doing multiple tests in one run JIT might be evil!
  174. 174. @rrafols Thank you! http://blog.rafols.org @rrafols https://es.linkedin.com/in/raimonrafols

×