Inside the JVM - Follow the white rabbit!

How do we go from your Java code to the CPU assembly that actually runs it? Using high-level constructs has made us forget what happens behind the scenes, yet knowing it is key to writing efficient code.

Starting from a few lines of Java, we explore the different layers that contribute to running your code: JRE, bytecode, structure of the OpenJDK virtual machine, HotSpot, intrinsic methods, benchmarking.

An introductory presentation to these low-level concerns, based on the practical use case of optimizing 6 lines of code, so that hopefully you will want to explore further!

Presentation given at the Toulouse (FR) Java User Group.

Video (in French) at https://www.youtube.com/watch?v=rB0ElXf05nU
Slideshow with animations at https://docs.google.com/presentation/d/1eIcROfLpdTU2_Z_IKiMG-AwqZGZgbN1Bs2E0nGShpbk/pub?start=true&loop=false&delayms=60000

Published in: Software

  1. 1. Inside the JVM Follow the white rabbit! Sylvain Wallez - @bluxte Toulouse JUG - 2017-04-26
  2. 2. Who’s this guy? Software engineer at Elastic (cloud team) Previously: ● IoT tech lead at OVH ● CEO at Actoboard ● Backend architect at Sigfox ● CTO at Goojet/Scoop it ● Lead architect at Joost ● Member of the Apache Software Foundation ● Cofounder & CTO at Anyware Technologies (now part of Sierra Wireless)
  3. 3. Agenda ● How it started: let’s optimize 6 lines of (hot) code! ● Profiling memory usage ● What’s in a class file? ● Micro-benchmarking with JMH ● Exploration of the OpenJDK source code
  4. 4. How it started Let’s optimize 6 lines of (hot) code
  5. 5. On the CouchBase blog... “JVM Profiling - Lessons from the trenches”: optimize the conversion of a protocol error code into a readable message. public enum KeyValueStatus { UNKNOWN((short) -1, "Unknown code"), SUCCESS((short) 0x00, "The operation completed successfully"), ERR_NOT_FOUND((short) 0x01, "The key does not exists"), ERR_EXISTS((short) 0x02, "The key exists in the cluster"), ERR_TOO_BIG((short) 0x03, "The document exceeds the maximum size"), ERR_INVALID((short) 0x04, "Invalid request"), ERR_NOT_STORED((short) 0x05, "The document was not stored"), ... private final short code; private final String description; KeyValueStatus(short code, String description) { this.code = code; this.description = description; } public static KeyValueStatus valueOf(final short code) { for (KeyValueStatus value: values()) { if (value.code() == code) return value; } return UNKNOWN; } ...
  6. 6. On the CouchBase blog... Finding: values() is allocating memory. public static KeyValueStatus valueOf(final short code) { for (KeyValueStatus value: values()) { if (value.code() == code) return value; } return UNKNOWN; } Optimization: fast path on common values. public static KeyValueStatus valueOf(final short code) { if (code == SUCCESS.code) { return SUCCESS; } else if (code == ERR_NOT_FOUND.code) { return ERR_NOT_FOUND; } else if (code == ERR_EXISTS.code) { return ERR_EXISTS; } else if (code == ERR_NOT_MY_VBUCKET.code) { return ERR_NOT_MY_VBUCKET; } for (KeyValueStatus value : values()) { if (value.code() == code) { return value; } } return UNKNOWN; } If something goes wrong, it’ll make it worse: any code outside the fast path pays for the extra comparisons and still runs the allocating loop!
  7. 7. xkcd #386
  8. 8. Oh well, nobody cares... blog post
  9. 9. Profiling memory usage
  10. 10. Various kinds of memory optimization ● Memory usage / memory leaks ○ My application needs tons of heap ○ How many objects are held active? → Memory profiler / jmap ● Garbage collection pressure ○ My application spends a lot of time in the GC ○ How often are objects allocated? → Java Mission Control / jmap
  11. 11. jmap histograms: jmap -histo (code available on GitHub)
           num     #instances         #bytes  class name
          ----------------------------------------------
             1:       4217124      674740720  [Lnet.bluxte.experiments.couchbase_keyvalue.KeyValueStatus;
             2:           486       14947912  [I
             3:          5855         493864  [C
             4:          1461         166752  java.lang.Class
             5:          5848         140352  java.lang.String
             6:           503         136440  [B
             7:           968          62480  [Ljava.lang.Object;
             8:          1255          40160  java.util.HashMap$Node
             9:           991          39640  java.util.LinkedHashMap$Entry
            10:           258          30720  [Ljava.util.HashMap$Node;
            11:           259          22792  java.lang.reflect.Method
            12:           441          20952  [Ljava.lang.String;
            13:           229          16488  java.lang.reflect.Field
            14:           171           9576  java.util.LinkedHashMap
            15:           291           9312  java.util.concurrent.ConcurrentHashMap$Node
            16:           160           7680  java.util.HashMap
            17:           178           7120  java.lang.ref.SoftReference
            18:            89           7120  java.net.URI
  12. 12. jmap histograms: jmap -histo:live (performs a full GC first)
           num     #instances         #bytes  class name
          ----------------------------------------------
             1:          5855         493864  [C
             2:          1461         166752  java.lang.Class
             3:          5848         140352  java.lang.String
             4:           503         136440  [B
             5:           967          62456  [Ljava.lang.Object;
             6:          1255          40160  java.util.HashMap$Node
             7:           991          39640  java.util.LinkedHashMap$Entry
             8:           258          30720  [Ljava.util.HashMap$Node;
             9:           259          22792  java.lang.reflect.Method
            10:           441          20952  [Ljava.lang.String;
            11:           283          19272  [I
            12:           229          16488  java.lang.reflect.Field
            ....................
            51:            35           1400  javax.management.MBeanOperationInfo
            52:             3           1360  [Lnet.bluxte.experiments.couchbase_keyvalue.KeyValueStatus;
            53:            55           1320  java.io.ExpiringCache$Entry
            54:            29           1304  [Ljava.lang.reflect.Field;
            55:            36           1152  net.bluxte.experiments.couchbase_keyvalue.KeyValueStatus
  13. 13. Java Mission Control / Java Flight Recorder Lightweight monitoring agent ● Integrated into the (Oracle) JVM ● Very low overhead Continuously samples diagnostics data ● Thread activity ● GC activity ● Memory allocations
  14. 14. Java Mission Control / Java Flight Recorder Available only with Oracle JDK ● Free for development ● Commercial for use in production How to enable it? ● at launch time: java -XX:+UnlockCommercialFeatures ● after launch: jcmd <pid> VM.unlock_commercial_features
  15. 15. Original code Simple loop on the enum values public static KeyValueStatus valueOf(final short code) { for (KeyValueStatus value: values()) { if (value.code() == code) return value; } return UNKNOWN; }
  16. 16. Original code - Memory stats Looks good! No leak! Hmm… growing fast!
  17. 17. Original code - Allocations
  18. 18. Original code - GC activity
  19. 19. Iteration on constant array Still trivial, but reuse the values array private static final KeyValueStatus[] VALUES = values(); public static KeyValueStatus valueOf(final short code) { for (KeyValueStatus value: VALUES) { if (value.code() == code) return value; } return UNKNOWN; }
  20. 20. Constant array - Allocations
  21. 21. Constant array - GC activity
  22. 22. GC pressure collateral damage: a full GC clears weak references → clears some caches → additional load to repopulate them!
  23. 23. Enum.values() ? Exploring the bytecode
  24. 24. Enum.values() – a generated method The compiler automatically adds some special methods when it creates an enum. For example, they have a static values method that returns an array containing all of the values of the enum in the order they are declared. – The Java Tutorial /** * Returns an array containing the constants of this enum * type, in the order they're declared. This method may be * used to iterate over the constants as follows: * * for(E c : E.values()) * System.out.println(c); * * @return an array containing the constants of this enum * type, in the order they're declared */ public static E[] values(); – The Java Language Specification
  25. 25. Show me the (byte)code! public class SimpleMain { public static void main(String[] args) { System.out.println("Hello world!"); } } public class net.bluxte.experiments.talk.SimpleMain { public net.bluxte.experiments.talk.SimpleMain(); Code: 0: aload_0 1: invokespecial #1 // Method java/lang/Object."<init>":()V 4: return public static void main(java.lang.String[]); Code: 0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream; 3: ldc #3 // String Hello world! 5: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V 8: return } Default constructor javap -c SimpleMain.class or IntelliJ’s bytecode plugin
  26. 26. Show me the (byte)code! public class SimpleMain { static String hello = "Hello"; static String world = "world"; public static void main( String[] args ) { System.out.println( hello + " " + world ); } } public class net.bluxte.experiments.talk.SimpleMain { static java.lang.String hello; static java.lang.String world; public net.bluxte.experiments.talk.SimpleMain(); Code: 0: aload_0 1: invokespecial #1 // Method java/lang/Object."<init>":()V 4: return public static void main(java.lang.String[]); Code: 0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream; 3: new #3 // class java/lang/StringBuilder 6: dup 7: invokespecial #4 // Method java/lang/StringBuilder."<init>":()V 10: getstatic #5 // Field hello:Ljava/lang/String; 13: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder; 16: ldc #7 // String “ “ 18: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder; 21: getstatic #8 // Field world:Ljava/lang/String; 24: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder; 27: invokevirtual #9 // Method java/lang/StringBuilder.toString:()Ljava/lang/String; 30: invokevirtual #10 // Method java/io/PrintStream.println:(Ljava/lang/String;)V 33: return static {}; Code: 0: ldc #11 // String Hello 2: putstatic #5 // Field hello:Ljava/lang/String; 5: ldc #12 // String world 7: putstatic #8 // Field world:Ljava/lang/String; 10: return } String concat with StringBuilder Static initializer
  27. 27. Show me the (byte)code! public enum SimpleEnum { FIRST_ENUM, SECOND_ENUM } ... public static net.bluxte.experiments.talk.SimpleEnum[] values(); Code: 0: getstatic #1 // Field $VALUES:[Lnet/bluxte/experiments/talk/SimpleEnum; 3: invokevirtual #2 // Method "[Lnet/bluxte/experiments/talk/SimpleEnum;".clone:()Ljava/lang/Object; 6: checkcast #3 // class "[Lnet/bluxte/experiments/talk/SimpleEnum;" 9: areturn public static net.bluxte.experiments.talk.SimpleEnum valueOf(java.lang.String); Code: 0: ldc #4 // class net/bluxte/experiments/talk/SimpleEnum 2: aload_0 3: invokestatic #5 // Method java/lang/Enum.valueOf:(Ljava/lang/Class;Ljava/lang/String;)Ljava/lang/Enum; 6: checkcast #4 // class net/bluxte/experiments/talk/SimpleEnum 9: areturn ... Aha! We found the culprit!
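
The effect of the clone call in the generated values() method can be observed from plain Java. This is a minimal sketch, assuming a hypothetical two-constant enum named `Status` that is not part of the talk's code: two successive calls return distinct array objects that hold the same constants.

```java
import java.util.Arrays;

// Hypothetical enum, used only for this illustration.
enum Status { FIRST, SECOND }

class ValuesCloneDemo {
    // True if two successive calls return different array objects.
    static boolean returnsFreshArray() {
        return Status.values() != Status.values();
    }

    // True if both arrays hold the same constants in the same order.
    static boolean sameContent() {
        return Arrays.equals(Status.values(), Status.values());
    }

    public static void main(String[] args) {
        System.out.println("fresh array on each call: " + returnsFreshArray());
        System.out.println("same content: " + sameContent());
    }
}
```

Each call in a hot loop therefore produces one short-lived array, which is the garbage the histograms show.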
  28. 28. But why the clone? Java arrays are mutable: the caller can mess with the returned array, which would break other users → perform a defensive copy every time. How could it be prevented? Return an immutable List, but that is probably too high-level here.
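
Both halves of that argument can be sketched in a few lines. The enum `Color` and its `asList()` helper are hypothetical names chosen for this example; `asList()` is not a method the compiler generates. The first check shows that overwriting one caller's copy leaves later callers unaffected, the second shows that an unmodifiable list can be shared without copying.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical enum; asList() is an illustrative helper, not a standard enum method.
enum Color {
    RED, GREEN, BLUE;

    // Built once; the wrapper rejects any modification attempt.
    private static final List<Color> ALL =
        Collections.unmodifiableList(Arrays.asList(values()));

    static List<Color> asList() {
        return ALL;
    }
}

class DefensiveCopyDemo {
    // A caller overwriting its own copy does not change what the next caller sees.
    static boolean copyProtectsOtherCallers() {
        Color[] mine = Color.values();
        mine[0] = Color.BLUE;
        return Color.values()[0] == Color.RED;
    }

    // The shared list cannot be modified, so no per-call copy is needed.
    static boolean listRejectsWrites() {
        try {
            Color.asList().set(0, Color.BLUE);
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("copy protects other callers: " + copyProtectsOtherCallers());
        System.out.println("list rejects writes: " + listRejectsWrites());
        System.out.println("same list instance: " + (Color.asList() == Color.asList()));
    }
}
```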
  29. 29. More on the bytecode A class file is composed of: ● constant pool: strings, fields/methods name+type, class names, etc. ● fields and methods definitions and code ○ Access flags and attributes ○ Code ○ Line number table ○ Local variable table (type and name) ○ Exception table
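
The constant pool is preceded by a small fixed header, which is easy to read by hand. The sketch below is an illustration only: the class name `ClassFileHeader` is made up for this example, and it assumes the running JVM exposes `String.class` as a resource (which is the case on the JDKs I am aware of). It reads the magic number, the version pair and the constant pool count.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper that reads the first fields of a class file.
class ClassFileHeader {
    final int magic;
    final int minorVersion;
    final int majorVersion;
    final int constantPoolCount;

    ClassFileHeader(int magic, int minorVersion, int majorVersion, int constantPoolCount) {
        this.magic = magic;
        this.minorVersion = minorVersion;
        this.majorVersion = majorVersion;
        this.constantPoolCount = constantPoolCount;
    }

    // Reads the fixed-size fields that precede the constant pool entries.
    static ClassFileHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();              // u4 magic, always 0xCAFEBABE
        int minor = data.readUnsignedShort();    // u2 minor_version
        int major = data.readUnsignedShort();    // u2 major_version (52 for Java 8)
        int cpCount = data.readUnsignedShort();  // u2 constant_pool_count
        return new ClassFileHeader(magic, minor, major, cpCount);
    }

    // Header of java.lang.String as seen by the running JVM.
    static ClassFileHeader ofString() {
        try (InputStream in = String.class.getResourceAsStream("String.class")) {
            if (in == null) {
                throw new IllegalStateException("String.class is not readable as a resource");
            }
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassFileHeader header = ofString();
        System.out.printf("magic=0x%08X version=%d.%d constant pool count=%d%n",
            header.magic, header.majorVersion, header.minorVersion, header.constantPoolCount);
    }
}
```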
  30. 30. But wait… ...why would I want to know about this? ● Better understand low level diagnostics ● Check generated code ○ Java: enum values (!), for loops, etc ○ Scala, Kotlin: implementation of higher level constructs ○ Hibernate & co: how do they mangle your code? ● Grasping low level stuff allows writing better high-level code
  31. 31. #1 = Methodref #6.#20 // java/lang/Object."<init>":()V #2 = Fieldref #21.#22 // java/lang/System.out:Ljava/io/PrintStream; #3 = String #23 // Hello world #4 = Methodref #24.#25 // java/io/PrintStream.println:(Ljava/lang/String;)V #5 = Class #26 // net/bluxte/experiments/talk/SimpleMain #6 = Class #27 // java/lang/Object #7 = Utf8 <init> #8 = Utf8 ()V #9 = Utf8 Code #10 = Utf8 LineNumberTable #11 = Utf8 LocalVariableTable #12 = Utf8 this #13 = Utf8 Lnet/bluxte/experiments/talk/SimpleMain; #14 = Utf8 main #15 = Utf8 ([Ljava/lang/String;)V #16 = Utf8 args #17 = Utf8 [Ljava/lang/String; #18 = Utf8 SourceFile #19 = Utf8 SimpleMain.java #20 = NameAndType #7:#8 // "<init>":()V #21 = Class #28 // java/lang/System #22 = NameAndType #29:#30 // out:Ljava/io/PrintStream; #23 = Utf8 Hello world #24 = Class #31 // java/io/PrintStream #25 = NameAndType #32:#33 // println:(Ljava/lang/String;)V #26 = Utf8 net/bluxte/experiments/talk/SimpleMain #27 = Utf8 java/lang/Object Constant pool for SimpleMain public class net.bluxte.experiments.talk.SimpleMain { public net.bluxte.experiments.talk.SimpleMain(); Code: 0: aload_0 1: invokespecial #1 // Method java/lang/Object."<init>":()V 4: return public static void main(java.lang.String[]); Code: 0: getstatic #2 // Field java/lang/System.out:Ljava/io/PrintStream; 3: ldc #3 // String Hello world! 5: invokevirtual #4 // Method java/io/PrintStream.println:(Ljava/lang/String;)V 8: return }
  32. 32. Type encoding What is (Ljava/lang/String;)V ??? L<class path>; → class name; [ → array (one per dimension); I, J, S, B, C → int, long, short, byte, char; F, D → float, double; Z → boolean; V → void. Generic type parameters are erased. Example: public String foo(int a, char[] b, List<Integer> c, boolean d) → (I[CLjava/util/List;Z)Ljava/lang/String;
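
These descriptors do not have to be written by hand: `java.lang.invoke.MethodType` can produce them. A minimal sketch (the class name `DescriptorDemo` is made up for this example) that rebuilds the two descriptors from the slide:

```java
import java.lang.invoke.MethodType;
import java.util.List;

class DescriptorDemo {
    // Descriptor of: String foo(int a, char[] b, List<Integer> c, boolean d)
    // The generic parameter of List is erased, so only List.class is given.
    static String fooDescriptor() {
        return MethodType
            .methodType(String.class, int.class, char[].class, List.class, boolean.class)
            .toMethodDescriptorString();
    }

    // Descriptor of: void println(String s)
    static String printlnDescriptor() {
        return MethodType.methodType(void.class, String.class).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        System.out.println(fooDescriptor());      // (I[CLjava/util/List;Z)Ljava/lang/String;
        System.out.println(printlnDescriptor());  // (Ljava/lang/String;)V
    }
}
```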
  33. 33. The bytecode “language” Stack-based machine ● Easier to target a large variety of CPUs (Android/Dalvik is register based) Object-oriented assembler ● Method calls (static / virtual / interface / special) Controlled memory access ● Local variables ● Object fields
  34. 34. The bytecode “language” Very simple instruction set (about 200 instructions) Instruction groups: ● Load and store ● Arithmetic and logic ● Type conversion ● Object creation and manipulation ● Operand stack management ● Control transfer ● Method invocation and return Only addition since 1996: invokedynamic in Java 7
  35. 35. The bytecode “language” public static void main(String[] args) { long start = System.nanoTime(); while(System.nanoTime() - start < MAX_NANOS) { for (int i = 0; i < 1_000_000; i++) { resolved = resolve((short)rnd.nextInt(0x100)); } Thread.sleep(100); } } 0: invokestatic #2 // Method java/lang/System.nanoTime:()J 3: lstore_1 4: invokestatic #2 // Method java/lang/System.nanoTime:()J 7: lload_1 8: lsub 9: getstatic #3 // Field MAX_NANOS:J 12: lcmp 13: ifge 55 16: iconst_0 17: istore_3 18: iload_3 19: ldc #4 // int 1000000 21: if_icmpge 46 24: getstatic #5 // Field rnd:Ljava/util/Random; 27: sipush 256 30: invokevirtual #6 // Method java/util/Random.nextInt:(I)I 33: i2s 34: invokestatic #7 // Method resolve:(S)Lnet/bluxte/experiments/ couchbase_keyvalue/KeyValueStatus; 37: putstatic #8 // Field resolved:Lnet/bluxte/experiments/ couchbase_keyvalue/KeyValueStatus; 40: iinc 3, 1 43: goto 18 46: ldc2_w #9 // long 100l 49: invokestatic #11 // Method java/lang/Thread.sleep:(J)V 52: goto 4 55: return LocalVariableTable: Start Length Slot Name Signature 18 28 3 i I 0 56 0 args [Ljava/lang/String; 4 52 1 start J
  36. 36. Benchmarking with JMH (Back to good old Java)
  37. 37. Improving our solution We fixed the memory issue but it’s clearly not optimal: O(n) on constant data! Let’s benchmark it! private static final KeyValueStatus[] VALUES = values(); public static KeyValueStatus valueOf(final short code) { for (KeyValueStatus value: VALUES) { if (value.code() == code) return value; } return UNKNOWN; }
  38. 38. JMH: an OpenJDK project ● Provides drivers and guidance for writing tests ● Takes care of pre-warming the JVM, collecting results and computing stats ● Provides a Maven artifact type for benchmarking projects “JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targetting the JVM.”
  39. 39. Benchmark code
          @State(Scope.Benchmark)
          public class ValueOfBenchmark {
              @Param({
                  "0",    // 0x00, Success
                  "1",    // 0x01, Not Found
                  "134",  // 0x86 Temporary Failure
                  "255",  // undefined
                  "1024"  // undefined, out of bounds
              })
              public short code;
              @Benchmark
              public KeyValueStatus loopNoFastPath() {
                  return KeyValueStatus.valueOfLoop(code);
              }
              @Benchmark
              public KeyValueStatus loopFastPath() {
                  return KeyValueStatus.valueOf(code);
              }
              ...
          }
      Build and run:
          mvn clean install
          java -jar target/benchmarks.jar
      Output:
          # VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_121.jdk/Contents/Home/jre/bin/java
          # VM options: <none>
          # Warmup: 20 iterations, 1 s each
          # Measurement: 20 iterations, 1 s each
          # Threads: 1 thread, will synchronize iterations
          # Benchmark mode: Throughput, ops/time
          # Benchmark: net.bluxte.experiments.couchbase_keyvalue.ValueOfBenchmark.loopNoFastPath
          # Parameters: (code = 0)
          # Run progress: 0,00% complete, ETA 04:53:20
          # Fork: 1 of 10
          # Warmup Iteration 1: 152063982,769 ops/s
          # Warmup Iteration 2: 149808416,787 ops/s
          # Warmup Iteration 3: 210436722,740 ops/s
          # Warmup Iteration 4: 202906403,960 ops/s
          # Warmup Iteration 5: 204518647,481 ops/s
          # Warmup Iteration 6: 209602101,373 ops/s
          # Warmup Iteration 7: 204717066,594 ops/s
          # Warmup Iteration 8: 209156212,425 ops/s
          # Warmup Iteration 9: 215544157,049 ops/s
          # Warmup Iteration 10: 213919676,979 ops/s
          # Warmup Iteration 11: 211316588,650 ops/s
  40. 40. Benchmark-driven optimization: initial implementation
          public static KeyValueStatus valueOfLoop(final short code) {
              for (KeyValueStatus value: values()) {
                  if (value.code() == code) return value;
              }
              return UNKNOWN;
          }
          Benchmark       (code)  Mode  Samples   Score  Score error  Units
          loopNoFastPath       0  avgt       10  19.383        0.331  ns/op
          loopNoFastPath       1  avgt       10  19.243        0.376  ns/op
          loopNoFastPath     134  avgt       10  24.855        0.651  ns/op
          loopNoFastPath     255  avgt       10  30.587        0.833  ns/op
          loopNoFastPath    1024  avgt       10  30.619        1.209  ns/op
      Time grows linearly with value, even with out-of-bounds values
  41. 41. Benchmark-driven optimization: reuse the constant array
          private static final KeyValueStatus[] VALUES = values();
          public static KeyValueStatus valueOf(short code) {
              for (KeyValueStatus value: VALUES) {
                  if (value.code() == code) return value;
              }
              return UNKNOWN;
          }
          Benchmark            (code)  Mode  Samples   Score  Score error  Units
          loopOnConstantArray       0  avgt       10   2.975        0.086  ns/op
          loopOnConstantArray       1  avgt       10   3.035        0.080  ns/op
          loopOnConstantArray     134  avgt       10  10.215        0.269  ns/op
          loopOnConstantArray     255  avgt       10  16.856        0.679  ns/op
          loopOnConstantArray    1024  avgt       10  17.015        0.577  ns/op
      Still linear, removed ~15 ns allocation overhead
  42. 42. Benchmark-driven optimization: prepare a hashmap, then simple lookup
          private static final Map<Short, KeyValueStatus> code2statusMap = new HashMap<>();
          static {
              for (KeyValueStatus value: values()) {
                  code2statusMap.put(value.code(), value);
              }
          }
          public static KeyValueStatus valueOf(final short code) {
              return code2statusMap.getOrDefault(code, UNKNOWN);
          }
          Benchmark  (code)  Mode  Samples  Score  Score error  Units
          lookupMap       0  avgt       10  4.954        0.134  ns/op
          lookupMap       1  avgt       10  4.036        0.125  ns/op
          lookupMap     134  avgt       10  5.597        0.157  ns/op
          lookupMap     255  avgt       10  4.006        0.144  ns/op
          lookupMap    1024  avgt       10  6.752        0.228  ns/op
      More or less constant: worse on small values, way better on larger values
  43. 43. Oh wait… autoboxing! public static net.bluxte.experiments.couchbase_keyvalue.KeyValueStatus valueOfLookupMap(short); Code: 0: getstatic #17 // Field code2statusMap:Ljava/util/HashMap; 3: iload_0 4: invokestatic #18 // Method java/lang/Short.valueOf:(S)Ljava/lang/Short; 7: getstatic #15 // Field UNKNOWN:Lnet/bluxte/experiments/couchbase_keyvalue/KeyValueStatus; 10: invokevirtual #19 // Method java/util/HashMap.getOrDefault: (Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object; 13: checkcast #4 // class net/bluxte/experiments/couchbase_keyvalue/KeyValueStatus 16: areturn public static KeyValueStatus valueOf(final short code) { return code2statusMap.getOrDefault(code, UNKNOWN); }
  44. 44. Oh wait… autoboxing! Using Carrot HPPC (high performance primitive collections) avoids this
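
The boxing step that javap reveals can also be written out by hand. In this sketch the `NAMES` map and the class `BoxingDemo` are hypothetical stand-ins for the talk's code-to-status map. The two lookup methods perform the same work, and the only difference is whether the `Short.valueOf` call is visible in the source. `Short.valueOf` is specified to cache values from -128 to 127, so only codes outside that range can cost an allocation per lookup.

```java
import java.util.HashMap;
import java.util.Map;

class BoxingDemo {
    // Hypothetical stand-in for the talk's code-to-status map.
    static final Map<Short, String> NAMES = new HashMap<>();
    static {
        NAMES.put((short) 0x00, "SUCCESS");
        NAMES.put((short) 0x86, "TEMPORARY_FAILURE");
    }

    // What the source says: the compiler boxes the short argument silently.
    static String implicitBoxing(short code) {
        return NAMES.getOrDefault(code, "UNKNOWN");
    }

    // What the bytecode does: an explicit Short.valueOf call before the lookup.
    static String explicitBoxing(short code) {
        Short boxed = Short.valueOf(code);
        return NAMES.getOrDefault(boxed, "UNKNOWN");
    }

    // Values in -128..127 come from a cache, so the same object is returned.
    static boolean smallValuesAreCached() {
        return Short.valueOf((short) 1) == Short.valueOf((short) 1);
    }

    public static void main(String[] args) {
        System.out.println(implicitBoxing((short) 0x86));
        System.out.println(explicitBoxing((short) 0x86));
        System.out.println("small values cached: " + smallValuesAreCached());
    }
}
```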
  45. 45. Benchmarking variations: prepare a lookup array, then simple lookup
          private static final KeyValueStatus[] code2status = new KeyValueStatus[0x100];
          static {
              Arrays.fill(code2status, UNKNOWN);
              for (KeyValueStatus keyValueStatus : values()) {
                  if (keyValueStatus != UNKNOWN) {
                      code2status[keyValueStatus.code()] = keyValueStatus;
                  }
              }
          }
          public static KeyValueStatus valueOfLookupArray(short code) {
              if (code >= 0 && code < code2status.length) {
                  return code2status[code];
              } else {
                  return UNKNOWN;
              }
          }
          Benchmark    (code)  Mode  Samples  Score  Score error  Units
          lookupArray       0  avgt       10  3.061        0.126  ns/op
          lookupArray       1  avgt       10  3.048        0.127  ns/op
          lookupArray     134  avgt       10  3.070        0.084  ns/op
          lookupArray     255  avgt       10  3.035        0.113  ns/op
          lookupArray    1024  avgt       10  3.034        0.113  ns/op
      Constant fast time, no GC overhead. w00t!
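
Before comparing timings, it is worth checking that the variants return the same answers. The sketch below uses a hypothetical, shortened enum `MiniStatus` with four constants (the name `ERR_TEMP_FAIL` is an assumption, the talk only mentions "0x86 Temporary Failure") and compares the loop and the lookup array over every possible short value.

```java
import java.util.Arrays;

// Hypothetical, shortened version of the talk's enum.
enum MiniStatus {
    UNKNOWN((short) -1),
    SUCCESS((short) 0x00),
    ERR_NOT_FOUND((short) 0x01),
    ERR_TEMP_FAIL((short) 0x86);

    private final short code;

    MiniStatus(short code) { this.code = code; }

    short code() { return code; }

    private static final MiniStatus[] VALUES = values();
    private static final MiniStatus[] BY_CODE = new MiniStatus[0x100];
    static {
        // Every slot defaults to UNKNOWN, then known codes overwrite their slot.
        Arrays.fill(BY_CODE, UNKNOWN);
        for (MiniStatus status : VALUES) {
            if (status != UNKNOWN) {
                BY_CODE[status.code()] = status;
            }
        }
    }

    // Linear scan over the cached array.
    static MiniStatus valueOfLoop(short code) {
        for (MiniStatus status : VALUES) {
            if (status.code() == code) return status;
        }
        return UNKNOWN;
    }

    // Direct index with a bounds check for negative and too-large codes.
    static MiniStatus valueOfLookupArray(short code) {
        if (code >= 0 && code < BY_CODE.length) {
            return BY_CODE[code];
        }
        return UNKNOWN;
    }
}

class VariantsAgree {
    // Compares both implementations over the full short range.
    static boolean allShortsAgree() {
        for (int c = Short.MIN_VALUE; c <= Short.MAX_VALUE; c++) {
            short code = (short) c;
            if (MiniStatus.valueOfLoop(code) != MiniStatus.valueOfLookupArray(code)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("variants agree: " + allShortsAgree());
    }
}
```

An exhaustive check is affordable here because a short has only 65,536 values.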
  46. 46. Dangers of JMH Benchmark-driven iterations ● Can drive you to partial incremental improvements ● Take a step back, think outside of the box Optimizing for the sake of optimizing ● Time consuming ● No real effect if not on “hot” code
  47. 47. Diving into OpenJDK (This gets scary!)
  48. 48. The VM does a lot of things ● C1 “client” compiler ● C2 “server” compiler ● Interpreter ● Garbage collector
  49. 49. Finding your way in OpenJDK Main website http://openjdk.java.net/ Get the code: hg clone http://hg.openjdk.java.net/jdk8/jdk8/hotspot/ hg clone http://hg.openjdk.java.net/jdk8/jdk8/jdk/ Mercurial still alive!
  50. 50. [Figure: annotated source tree] Labels: garbage collectors; bytecode interpreter; server compiler (c2 / opto); client compiler (c1); LLVM-based JIT; OS and/or CPU specific code; root of shared code
  51. 51. CPU-independent target (works with shark JIT) Additional support in JDK9: ● ARM 32 & 64 bits ● PowerPC ● S390 ● AIX
  52. 52. Intrinsic methods What you see is not what you get ● The JVM “intercepts” some methods calls ○ String / StringBuffer methods, Math, Unsafe, array manipulation, etc. ● Replaced inline with native (assembly) code ○ Extremely fast and optimized ○ Not even JNI overhead ● Find them in hotspot/src/share/vm/classfile/vmSymbols.hpp
  53. 53. Intrinsic methods // IndexOf for constant substrings with size >= 8 chars // which don't need to be loaded through stack. void MacroAssembler::string_indexofC8(Register str1, Register str2, Register cnt1, Register cnt2, int int_cnt2, Register result, XMMRegister vec, Register tmp) { ShortBranchVerifier sbv(this); assert(UseSSE42Intrinsics, "SSE4.2 is required"); // This method uses pcmpestri inxtruction with bound registers // inputs: // xmm - substring // rax - substring length (elements count) // mem - scanned string // rdx - string length (elements count) // 0xd - mode: 1100 (substring search) + 01 (unsigned shorts) // outputs: // rcx - matched index in string assert(cnt1 == rdx && cnt2 == rax && tmp == rcx, "pcmpestri"); Label RELOAD_SUBSTR, SCAN_TO_SUBSTR, SCAN_SUBSTR, RET_FOUND, RET_NOT_FOUND, EXIT, FOUND_SUBSTR, MATCH_SUBSTR_HEAD, RELOAD_STR, FOUND_CANDIDATE; Example: String.indexOf on x86
  54. 54. Intrinsic methods In JDK9 beta, String.indexOf(String) is faster than String.indexOf(char)! This is because one is intrinsic, and not yet the other (from the jdk9-dev mailing-list)
          Benchmark                             Mode  Cnt       Score      Error  Units
          # JDK 8u121
          IndexOfBenchmark.StringIndexOfChar    thrpt    5  141857.332 ± 5530.472  ops/s
          IndexOfBenchmark.StringIndexOfString  thrpt    5  113091.517 ± 2241.533  ops/s
          # JDK 9b152
          IndexOfBenchmark.StringIndexOfChar    thrpt    5  154525.343 ± 3796.818  ops/s
          IndexOfBenchmark.StringIndexOfString  thrpt    5  185917.059 ± 3391.230  ops/s
  55. 55. Intrinsic methods ● “I can do it better than JDK source” – think twice! → Have a look at vmSymbols.hpp first! ● Can sometimes be indirect (especially with strings and arrays) ● When in doubt, benchmark (with the same JVM)
  56. 56. Conclusion
  57. 57. Conclusion ● Know your tools ● Be curious, and follow the white rabbit from time to time, you’ll learn a lot ● However… don’t go overboard and waste (too much) time!
  58. 58. Thanks! Questions? Sylvain Wallez - @bluxte Toulouse JUG - 2017-04-26
  59. 59. Bonus links to dive deeper ● Java MissionControl & FlightRecorder docs ● What the JIT!? Anatomy of the OpenJDK HotSpot VM ● Intrinsic Methods in HotSpot VM ● Zero and Shark (LLVM JIT)
