JVM Internals - NHJUG Jan 2012
Updated version of previous JVM Internals presentation - now covering features added in Java 7

Usage Rights: CC Attribution-ShareAlike License


JVM Internals - NHJUG Jan 2012 Document Transcript

  • 1. JVM Internals
    Douglas Q. Hawkins
    http://www.slideshare.net/dougqh
    http://www.dougqh.net
    dougqh@gmail.com
    Monday, January 23, 12
  • 2. Topics
    - Java Byte Code File Format
    - Byte Code Examples
    - How Java 5 & 7 Features Are Implemented
    - JVM Optimizations
  • 3. Why?
    Besides techie edification, why is this useful?
    A better understanding of the internals can help in deciphering some of the harder problems, but better still...
    You’ll know that the compiler and JVM are doing a lot for you, letting you focus on writing readable code.
  • 4. File Format
  • 5. Class File Format
    Layout: CA FE BA BE | Minor Version | Major Version | Constant Pool | Flags | This Class | Super Class | Interfaces | Fields | Methods | Attributes
    Every class file starts with the 4-byte magic number: CAFEBABE
    Followed by the minor and major version - major indicates Java 5, 6, 7, etc.
    Then a constant pool - which contains...
    - constants: int, long, String, etc.
    - references: method and field
    - descriptors: method and field
    Followed by flags: modifiers for this class/interface
    Followed by a reference to this class/interface
    Followed by the super class - which is an index into the constant pool
    Followed by a list of interface references - which are indices into the constant pool
    Followed by fields
    Followed by methods
    And, finally, attributes which are extra meta-information about the class...
    - the name of the original source file
    - annotation information
    - information on inner classes
    Class File Spec: http://java.sun.com/docs/books/jvms/second_edition/ClassFileFormat-Java5.pdf
    History of CAFEBABE: http://en.wikipedia.org/wiki/Java_class_file
  • 6. Class File Format
    Flags: public, private, protected, static, final, interface, abstract, strictfp, annotation, enum
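The start of that layout is easy to check by hand. Here is a minimal sketch, assuming the class file bytes are already in memory; `ClassFileHeader` and `readVersion` are hypothetical names made up for this illustration, not a JDK API.

```java
import java.nio.ByteBuffer;

final class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE;

    /** Returns {minor, major}; throws if the bytes do not start with CAFEBABE. */
    static int[] readVersion(byte[] classBytes) {
        // ByteBuffer is big-endian by default, which matches the class file format.
        ByteBuffer in = ByteBuffer.wrap(classBytes);
        int magic = in.getInt();                 // 4-byte magic number
        if (magic != MAGIC) {
            throw new IllegalArgumentException("Not a class file: 0x" + Integer.toHexString(magic));
        }
        int minor = in.getShort() & 0xFFFF;      // 2-byte minor version
        int major = in.getShort() & 0xFFFF;      // 2-byte major version
        return new int[] { minor, major };
    }
}
```

For reference, major version 49 corresponds to Java 5, 50 to Java 6, and 51 to Java 7.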
  • 7. Field Format
    Layout: Flags | Name | Descriptor | Attributes
    Flags: public, private, protected, static, final, volatile, transient
    Fields consist of...
    flags
    followed by name - actually an index to a string literal in the constant pool
    followed by descriptor - e.g. the field type - also an index into the constant pool - the type is the raw type
    followed by attributes
    - constant value
    - specific type information - List<String>, etc.
  • 8. Field Format
    Name: “name”
  • 9. Field Format
    Descriptor: “Ljava/lang/String;”
  • 10. Field Format
    Attributes: ConstantValue
  • 11. Method Format
    Layout: Flags | Name | Descriptor | Attributes
    Flags: public, private, protected, static, final, synchronized, native, strictfp, varargs
    Methods consist of...
    flags
    followed by name - actually an index to a string literal in the constant pool
    followed by descriptor - e.g. raw parameter types and return type
    followed by attributes
    - exceptions & code
    - specific type information - List<String>, etc.
    - specific exception information
    - debugging information
  • 12. Method Format
    Name: “main”
  • 13. Method Format
    Descriptor: “([Ljava/lang/String;)V”
  • 14. Method Format
    Attributes: Exceptions, Code
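The descriptor strings on these slides follow a small grammar that can be decoded mechanically. The following is a rough sketch of such a decoder; the `Descriptors` class and its methods are hypothetical helpers for illustration, and the output format is an arbitrary choice.

```java
final class Descriptors {
    /** Decodes one type starting at pos[0] and advances pos[0] past it. */
    private static String decode(String d, int[] pos) {
        char c = d.charAt(pos[0]++);
        switch (c) {
            case 'B': return "byte";
            case 'C': return "char";
            case 'D': return "double";
            case 'F': return "float";
            case 'I': return "int";
            case 'J': return "long";
            case 'S': return "short";
            case 'Z': return "boolean";
            case 'V': return "void";
            case '[': return decode(d, pos) + "[]";   // array of the following type
            case 'L': {                                // object type: L<class name>;
                int end = d.indexOf(';', pos[0]);
                String name = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                return name;
            }
            default: throw new IllegalArgumentException("Bad descriptor: " + d);
        }
    }

    /** Decodes a field descriptor such as "Ljava/lang/String;". */
    static String field(String descriptor) {
        return decode(descriptor, new int[] { 0 });
    }

    /** Decodes a method descriptor such as "([Ljava/lang/String;)V". */
    static String method(String descriptor) {
        int[] pos = { 1 };                             // skip '('
        StringBuilder params = new StringBuilder();
        while (descriptor.charAt(pos[0]) != ')') {
            if (params.length() > 0) params.append(", ");
            params.append(decode(descriptor, pos));
        }
        pos[0]++;                                      // skip ')'
        return decode(descriptor, pos) + " (" + params + ")";
    }
}
```

Note that only raw types appear here; generic information such as List<String> lives in a separate attribute, as the notes say.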
  • 15. Constant Pool
    (C = class, UTF = UTF-8 string with its length, M = method reference, N&T = name and type, F = field reference)
         1  C    2
         2  UTF  10 “HelloWorld”
         3  C    4
         4  UTF  16 “java/lang/Object”
         5  UTF  6 “<init>”
         6  UTF  3 “()V”
         7  UTF  4 “Code”
         8  M    3 9
         9  N&T  5 6
        10  UTF  4 “main”
        11  UTF  22 “([Ljava/lang/String;)V”
        12  F    13 15
        13  C    14
        14  UTF  16 “java/lang/System”
    Dissect the “Hello World” example a little...
    Entry 1 is a class entry - a 2-byte index to a UTF entry that contains the name
    Entry 2 is the name of the class
    Similarly...
    Entry 3 is a class entry for the parent class - it refers to Entry 4, which is the full name of the parent class
    Skip over the constructor “<init>” and focus on main
    Entry 10 is the name “main” & Entry 11 is the raw type descriptor for “main”
    The [Ljava/lang/String; indicates String[] - V indicates that it returns void
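The way entries point at each other by index can be sketched in a few lines. This is a toy in-memory model of entries 1 through 9 from the slide, not a real class file parser; the `ConstantPoolSketch` class, its `Entry` type, and the output format of `resolve` are all assumptions made for illustration.

```java
final class ConstantPoolSketch {
    /** One pool entry: a kind tag plus either text (UTF) or up to two indices. */
    static final class Entry {
        final String kind;
        final String text;
        final int first;
        final int second;

        Entry(String kind, String text, int first, int second) {
            this.kind = kind;
            this.text = text;
            this.first = first;
            this.second = second;
        }
    }

    // Entries 1-9 from the slide; index 0 is unused, as in a real constant pool.
    static final Entry[] POOL = {
        null,
        new Entry("C", null, 2, 0),
        new Entry("UTF", "HelloWorld", 0, 0),
        new Entry("C", null, 4, 0),
        new Entry("UTF", "java/lang/Object", 0, 0),
        new Entry("UTF", "<init>", 0, 0),
        new Entry("UTF", "()V", 0, 0),
        new Entry("UTF", "Code", 0, 0),
        new Entry("M", null, 3, 9),
        new Entry("N&T", null, 5, 6),
    };

    /** Follows the indices until only text is left. */
    static String resolve(int index) {
        Entry e = POOL[index];
        switch (e.kind) {
            case "UTF": return e.text;
            case "C":   return resolve(e.first);                             // class -> name
            case "N&T": return resolve(e.first) + ":" + resolve(e.second);   // name:descriptor
            case "M":   return resolve(e.first) + "." + resolve(e.second);   // class.name:descriptor
            default:    throw new IllegalArgumentException("Unknown kind: " + e.kind);
        }
    }
}
```

Resolving entry 8 walks through entries 3, 4, 9, 5 and 6, which is the call to the parent constructor that every class needs.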
  • 16. Browsing Class File Format
    JClassLib Viewer: http://www.ej-technologies.com/products/jclasslib/overview.html
  • 17. ConstantValue
        public final class HelloWorld {
            public static final String MESSAGE = "Hello, World!";

            public static final void main( final String... args ) {
                System.out.println( MESSAGE );
            }
        }
    Here, we can see that because the “MESSAGE” field is “static final”, its value is stored in a “ConstantValue” attribute on the “MESSAGE” field.
  • 18. Exceptions
        public interface InputStreamProvider {
            public abstract InputStream open() throws IOException;
        }
    Exception information is also stored in an attribute.
    As it turns out, the JVM makes no distinction between checked and unchecked exceptions, which has an interesting implication...
  • 19. Exceptions
        public final class NewInstance {
            public static void main(String... args) {
                try {
                    Class.
                        forName("net.dougqh.runtime.SomeClass").
                        newInstance();
                } catch (
                    InstantiationException |
                    IllegalAccessException |
                    ClassNotFoundException e)
                {
                    e.printStackTrace();
                }
            }
        }

        public class SomeClass {
            public SomeClass() throws SomeException {
                throw new SomeException();
            }
        }

        Exception in thread "main" net.dougqh.runtime.SomeClass$SomeException
            at net.dougqh.runtime.SomeClass.<init>
            at sun.reflect.NativeConstructorAccessorImpl.newInstance0
            at sun.reflect.NativeConstructorAccessorImpl.newInstance
            at sun.reflect.DelegatingConstructorAccessorImpl.newInstance
            at java.lang.reflect.Constructor.newInstance
            at java.lang.Class.newInstance0
            at java.lang.Class.newInstance
            at net.dougqh.runtime.NewInstance.main
    www.javapuzzlers.com
    Because of an oversight in the original reflection API, Class.newInstance can throw a checked exception that is not reported by the compiler
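To see the difference in isolation, here is a small sketch that contrasts Class.newInstance with Constructor.newInstance, which wraps the constructor's exception instead of letting it escape. The `NewInstanceDemo` and `Thrower` classes are hypothetical stand-ins for the slide's classes, and IOException is used only as a convenient checked exception.

```java
import java.io.IOException;

final class NewInstanceDemo {
    static final class Thrower {
        Thrower() throws IOException {
            throw new IOException("thrown by constructor");
        }
    }

    /** Class.newInstance lets the constructor's checked exception escape as-is. */
    @SuppressWarnings("deprecation") // Class.newInstance is deprecated in newer JDKs
    static Throwable viaClass() {
        try {
            Thrower.class.newInstance();
            return null;
        } catch (Throwable t) {
            return t; // an IOException, although newInstance does not declare it
        }
    }

    /** Constructor.newInstance wraps it in an InvocationTargetException instead. */
    static Throwable viaConstructor() {
        try {
            Thrower.class.getDeclaredConstructor().newInstance();
            return null;
        } catch (ReflectiveOperationException e) {
            return e;
        }
    }
}
```

With the second form the compiler forces the caller to deal with the wrapper, so nothing checked slips through undeclared.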
  • 20. Generics
        public final class Generics {
            public static final List<String> getStrings() {
                return Collections.singletonList("foo");
            }
        }
    Here, we can see that getStrings(), which returns List<String>, has a descriptor of the raw type List
    However, the exact type information is stored in the “Signature” attribute
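Both views of the return type are visible through reflection. A minimal sketch, assuming a copy of the slide's method inside a hypothetical `GenericsReflection` class:

```java
import java.lang.reflect.Method;
import java.util.Collections;
import java.util.List;

final class GenericsReflection {
    public static List<String> getStrings() {
        return Collections.singletonList("foo");
    }

    private static Method method() {
        try {
            return GenericsReflection.class.getDeclaredMethod("getStrings");
        } catch (NoSuchMethodException e) {
            throw new AssertionError(e);
        }
    }

    /** The raw type, which is all the descriptor records. */
    static String erasedReturnType() {
        return method().getReturnType().getName();
    }

    /** The full generic type, which reflection recovers from the extra generic signature information. */
    static String genericReturnType() {
        return method().getGenericReturnType().toString();
    }
}
```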
  • 21. Annotations
        @Inherited
        @Retention( RetentionPolicy.RUNTIME )
        public @interface Annotation {
            public int foo() default 20;
            public String bar();
        }

        @Annotation( bar="quux" )
        class Annotated {}
    An annotation is just an interface
    The default value for each method is stored in an AnnotationDefault attribute on that method
    The annotation information on a class or method is also stored in an attribute
    In this case, since the annotation has a RUNTIME RetentionPolicy, it is stored in the RuntimeVisibleAnnotations attribute
    The values supplied for the annotation are stored inside that attribute as element-value pairs
  • 22. Byte Code
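Because the retention policy is RUNTIME, the stored values can be read back with reflection. A small sketch; the `AnnotationReflection` class and the annotation name `Sample` are assumptions for illustration, and the slide's annotation is renamed only to avoid clashing with java.lang.annotation.Annotation.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

final class AnnotationReflection {
    @Retention(RetentionPolicy.RUNTIME)
    @interface Sample {
        int foo() default 20;   // default value, recorded on the method itself
        String bar();           // no default, so every use must supply it
    }

    @Sample(bar = "quux")
    static final class Annotated {}

    /** Reads the annotation back from the class at run time. */
    static Sample read() {
        return Annotated.class.getAnnotation(Sample.class);
    }
}
```

Had the retention policy been CLASS or SOURCE, getAnnotation would return null here.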
  • 23. Stack Based Virtual Machine
        0 iconst_1
        1 iconst_2
        2 iadd
        3 istore_0
        4 iload_0
    Local variable slots: 0 1 2 3
    The JVM byte code format is stack-based like many other VMs: CLR, PHP, and Python
    In this example, the green field is the heap, the bar above is the local variable slots, and the column to the left is the stack
    Let’s look at how to add 1 + 2 together and store the result into a local variable
    First, we use an iconst_1 instruction to load 1 onto the stack
    Java has special instructions for common numbers: -1 to 5.
    Next, an iconst_2 to place 2 on the stack
    Next, we use iadd, which pops the 1 & 2 off the stack, adds them together and pushes the result back onto the stack
    Next, we use an istore_0 to store the result into the first local variable slot
    To load the value back from the local variable slots, we use an iload_0
    Note: Similar to iconst, there are special istore/iload instructions for the most often used slots: 0-3
  • 24. Stack Based Virtual Machine
    After iconst_1 - stack: 1
  • 25. Stack Based Virtual Machine
    After iconst_2 - stack (top first): 2, 1
  • 26. Stack Based Virtual Machine
    During iadd - the 1 & 2 are popped and added: 1+2
  • 27. Stack Based Virtual Machine
    After iadd - stack: 3
  • 28. Stack Based Virtual Machine
    After istore_0 - local variable slot 0: 3; the stack is empty
  • 29. Stack Based Virtual Machine
    After iload_0 - local variable slot 0: 3; stack: 3
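The walk-through above can be replayed with a toy interpreter. This is only a sketch of the idea, assuming a hypothetical `TinyStackMachine` class that understands just the five instructions used on the slide; a real JVM works on encoded bytes, not strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class TinyStackMachine {
    final Deque<Integer> stack = new ArrayDeque<>(); // operand stack
    final int[] locals = new int[4];                 // local variable slots 0-3

    void run(String... program) {
        for (String op : program) {
            switch (op) {
                case "iconst_1": stack.push(1); break;
                case "iconst_2": stack.push(2); break;
                case "iadd":     stack.push(stack.pop() + stack.pop()); break; // pop two, push the sum
                case "istore_0": locals[0] = stack.pop(); break;               // stack -> slot 0
                case "iload_0":  stack.push(locals[0]); break;                 // slot 0 -> stack
                default: throw new IllegalArgumentException("Unsupported: " + op);
            }
        }
    }
}
```

Running the slide's five instructions leaves 3 in slot 0 and a copy of 3 on the stack, matching the last build of the slide.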
  • 30. Parameters and Local Variables
        static int volume( int width, int depth, int height )
        {
            int area = width * depth;
            int volume = area * height;
            return volume;
        }

         0 iload_0
         1 iload_1
         2 imul
         3 istore_3
         4 iload_3
         5 iload_2
         6 imul
         7 istore 4
         9 iload 4
        11 ireturn
    Local variable slots: width (0), depth (1), height (2), area (3), volume (4)
    Trace through a slightly more complicated example: calculating volume
    - arguments are passed into the low local variable slots - 0 - 2 in this case
    - first, to calculate area, load width and depth from slots 0 & 1 respectively
    - multiply the values on the stack, then store the result into slot 3: area
    - reload area & height - slots 3 & 2 respectively
    - multiply the values and store into slot 4: volume
    - reload volume and return
    Yes, the value is stored and then immediately reloaded in the byte code. Starting with Java 1.3, byte code is not optimized by javac; all optimizations are left to the JVM to perform.
  • 31. Static vs Virtual Methods
        int volume( int width, int depth, int height )
        {
            int area = width * depth;
            int volume = area * height;
            return volume;
        }

         0 iload_1
         1 iload_2
         2 imul
         3 istore 4
         5 iload 4
         7 iload_3
         8 imul
         9 istore 5
        11 iload 5
        13 ireturn
    Local variable slots: this (0), width (1), depth (2), height (3), area (4), volume (5)
    In the prior example, you may have noticed that the method was static.
    If the method isn’t static, then “this” is invisibly passed in the first slot.
    So, our arguments start at 1 and the loads and stores all change accordingly.
  • 32. Hello World
        System.out.println( "Hello World" );

        0 getstatic System.out
        3 ldc "Hello World"
        5 invokevirtual PrintStream.println
        8 return
    Stack before the invoke (top first): “Hello World”, System.out
    Now, we know enough to understand “Hello World”
    The first operation is a getstatic to load the value of System.out onto the stack
    We need this reference to invoke println
    Second, load the string “Hello World” onto the stack - the ldc indicates a load from the constant pool
    Now, since this is a non-static method on a class, use invokevirtual to invoke PrintStream.println
    This consumes the reference to System.out (which is the “this” for PrintStream.println) and the reference to “Hello World”
    These values are then mapped to local slots for “this” and “msg” in the new stack frame
  • 35. Hello World
    New stack frame for PrintStream.println - local variable slot 0: this (System.out), slot 1: msg (“Hello World”)
  • 36. Types of Method Invocations
    - invokestatic - invoke static methods
    - invokevirtual - invoke instance method from class
    - invokeinterface - invoke instance method from interface
    - invokespecial - invoke <init> / invoke super method
    - invokedynamic - optimized dynamic look-up (in Java 7)
    We’ve seen a call to invokevirtual, which is used for instance methods called through a class reference, but there are other invocation types, too.
    invokestatic - for static methods
    invokeinterface - for methods invoked through an interface reference (rather than a class reference)
    invokespecial - for direct targets - like constructors or invoking a super method where the call is not polymorphic
    invokedynamic - used by scripting languages like JRuby in Java 7 for improved performance
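One way to get a feel for the different invocation kinds from plain Java is the method handle API added in Java 7, whose lookup methods are documented as behaving like the corresponding invoke instructions. The sketch below is an analogy only, not what javac emits for ordinary calls; the `InvocationKinds` class and its `demo` method are made up for illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

final class InvocationKinds {
    private static final MethodHandles.Lookup LOOKUP = MethodHandles.lookup();

    static String demo() {
        try {
            // Behaves like invokestatic: Math.max(int, int)
            MethodHandle max = LOOKUP.findStatic(
                Math.class, "max", MethodType.methodType(int.class, int.class, int.class));
            // Behaves like invokevirtual: String.length() through a class reference
            MethodHandle length = LOOKUP.findVirtual(
                String.class, "length", MethodType.methodType(int.class));
            // Behaves like invokeinterface: CharSequence.length() through an interface reference
            MethodHandle ifaceLength = LOOKUP.findVirtual(
                CharSequence.class, "length", MethodType.methodType(int.class));
            // Behaves like new followed by invokespecial <init>
            MethodHandle ctor = LOOKUP.findConstructor(
                StringBuilder.class, MethodType.methodType(void.class, String.class));

            int a = (int) max.invoke(10, 20);
            int b = (int) length.invoke("hello");
            int c = (int) ifaceLength.invoke((CharSequence) "hello");
            Object d = ctor.invoke("abc");
            return a + " " + b + " " + c + " " + d;
        } catch (Throwable t) {
            throw new AssertionError(t);
        }
    }
}
```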
  • 37. New Object
        BigDecimal num = new BigDecimal("2.0");

        0 new BigDecimal
        3 dup
        4 ldc "2.0"
        6 invokespecial BigDecimal.<init>
        9 astore_0
    Local variable slots: num (0), 1, 2, 3
    Now, let’s look at an object allocation
    The first step is to “new” an object; however, this step does not yet invoke the constructor
    It just allocates space on the heap for the object and returns a pointer to uninitialized memory
    Unfortunately, since invoking the constructor will consume a reference to the newly allocated BigDecimal, we need to make a copy (“dup”) so that we’ll have a reference left to store into “num”.
    Next, we push “2.0” onto the stack
    Then we invoke BigDecimal.<init>, which is the BigDecimal constructor.
    It consumes the pointer to “2.0” and the duplicate reference, leaving us with one reference to assign into “num”.
    As you can see, construction is rather complicated; some of the past security holes in the byte code verifier involved object construction because the sequence is non-trivial.
    The CLR learned from this and has a single instruction (“newobj”) that both allocates and invokes the constructor, thus making byte code verification easier.
    From this example, you can also see why double-checked locking is broken in Java. Construction isn’t a single step, and with reordering it is possible for a pointer to an uninitialized object to be assigned to a field.
    As of Java 5, the use of volatile guarantees a “happens-before”, so the field will never be assigned before the constructor is done being invoked.
  • 38. New Object
    After new - the heap holds the newly allocated, not yet initialized BigDecimal
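For the double-checked locking remark in the notes, here is a minimal sketch of the form that is safe on Java 5 and later. The `LazyConstant` class is a hypothetical example; the important part is that the field is volatile.

```java
import java.math.BigDecimal;

final class LazyConstant {
    // Without volatile, another thread could observe a non-null reference
    // to an object whose constructor has not finished running.
    private static volatile BigDecimal instance;

    static BigDecimal get() {
        BigDecimal local = instance;              // first check, no lock
        if (local == null) {
            synchronized (LazyConstant.class) {
                local = instance;                 // second check, under the lock
                if (local == null) {
                    local = new BigDecimal("2.0"); // new + dup + invokespecial <init>
                    instance = local;              // published only after construction
                }
            }
        }
        return local;
    }
}
```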
  • 42. Demo
    javap -c
  • 43. Conditionals
    Original:
        if ( x > 0 ) {
            return true;
        } else {
            return false;
        }
    Byte Code:
        0: iload_0
        1: ifle 6
        4: iconst_1
        5: ireturn
        6: iconst_0
        7: ireturn
    Original:
        return x > 0 ? true : false;
    Byte Code:
        0: iload_0
        1: ifle 8
        4: iconst_1
        5: goto 9
        8: iconst_0
        9: ireturn
    Original:
        return ( x > 0 );
    Byte Code:
        0: iload_0
        1: ifle 8
        4: iconst_1
        5: goto 9
        8: iconst_0
        9: ireturn
    Three ways to write a method that checks if a number is greater than 0.
    The byte code is almost the same in all 3 cases.
  • 44. Invoke Static
    Original:
        Math.max(10, 20);
    Decompiled:
        0: bipush 10
        2: bipush 20
        4: invokestatic Math.max
        7: pop
        8: return
    Here, we see an extra pop after the invokestatic call.
    That’s because the return value of max is left on the stack. Since we don’t use it, the compiler generates a pop to discard it.
    If we store the value in a variable, the pop will be replaced with an istore
  • 45. Invocations
    Original:
        FileInputStream in = new FileInputStream("foo");
        in.close();
    Decompiled:
         0: new FileInputStream
         3: dup
         4: ldc "foo"
         6: invokespecial FileInputStream.<init>
         9: astore_0
        10: aload_0
        11: invokevirtual FileInputStream.close
        14: return
    Original:
        Closeable in = new FileInputStream("foo");
        in.close();
    Decompiled:
         0: new FileInputStream
         3: dup
         4: ldc "foo"
         6: invokespecial FileInputStream.<init>
         9: astore_0
        10: aload_0
        11: invokeinterface Closeable.close
        16: return
    In one example, close is called on a class type, FileInputStream; in the other it is called on an interface type, Closeable
    In the first case, the compiler generates an invokevirtual call
    In the second case, the compiler generates an invokeinterface call
  • 46. For Loop
        static int sum( int min, int max ){
            int sum = 0;
            for ( int i=min; i<max; ++i ){
                sum += i;
            }
            return sum;
        }

         0 iconst_0            (before loop)
         1 istore_2
         2 iload_0             (init & test)
         3 istore_3
         4 goto +10 //14
         7 iload_2             (loop body)
         8 iload_3
         9 iadd
        10 istore_2
        11 iinc 3 by 1         (inc)
        14 iload_3             (test)
        15 iload_1
        16 if_icmplt -9 //7
        19 iload_2             (after loop)
        20 ireturn
    Examine a for loop example
    The first 2 ops are the initialization of “sum”: load 0 and store it in “sum” (slot 2)
    The next 3 ops are the loop initialization and the jump to the initial test...
    - load the value of “min” (slot 0) and store it into “i” (slot 3)
    - then jump to the test
    The test is placed at the end since it is generally performed after the body and step portions of the loop
    The test...
    - loads “i” (slot 3) and “max” (slot 1)
    - if “i” is less than “max”, then it jumps back 9 bytes to the start of the loop body
    The loop body...
    - loads and adds “sum” and “i” (slots 2 and 3) and stores the result back into “sum” (slot 2)
    Then the step / increment part of the loop happens...
    - which just increments “i”
    Then we flow straight into the test portion
    If the test fails, we flow through to the after loop portion
    Here, we load “sum” (slot 2) and return the result
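The test-at-the-bottom layout can be approximated in source form. Java has no goto, so in this sketch the initial jump to the test is stood in for by an extra "if" in front of a do/while; the byte code itself has only the one test. The `LoopLowering` class is a hypothetical name.

```java
final class LoopLowering {
    /** The loop as written on the slide. */
    static int sum(int min, int max) {
        int sum = 0;
        for (int i = min; i < max; ++i) {
            sum += i;
        }
        return sum;
    }

    /** The same loop rearranged so the test comes after the body and increment. */
    static int sumTestAtBottom(int min, int max) {
        int sum = 0;                // before loop
        int i = min;                // init
        if (i < max) {              // stands in for the initial "goto test"
            do {
                sum += i;           // loop body
                ++i;                // inc
            } while (i < max);      // test: jump back to the body while it holds
        }
        return sum;                 // after loop
    }
}
```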
  • 47. Exception Handling
    Original:
      static int read( InputStream in ) {
        try {
          return in.read();
        } catch ( IOException e ) {
          return -1;
        } finally {
          IoUtils.closeQuietly( in );
        }
      }
    Byte code:
      try / finally
        0 aload_0
        1 invokevirtual InputStream.read
        4 istore_1
        5 aload_0
        6 invokestatic IoUtils.closeQuietly
        9 iload_1
        10 ireturn
      catch / finally
        11 pop
        12 aload_0
        13 invokestatic IoUtils.closeQuietly
        16 iconst_m1
        17 ireturn
      finally
        18 astore_2
        19 aload_0
        20 invokestatic IoUtils.closeQuietly
        23 aload_2
        24 athrow
    Exception Table:
      start  end  handler  Exception
      0      5    11       IOException
      0      5    18       any
      11     12   18       any
    Now, exception handling...
    Exceptions are handled through extra meta-information that says how to handle different types of exceptions over a range of byte-code instructions.
    The finally portion is inlined in the try, catch, and finally portions of the generated byte code.
    (Prior to Java 6, the regular javac compiler generated “jsr” and “ret” to jump to a single block of compiled “finally” code.)
    The “try / finally” section represents the normal flow.
    - invoke InputStream.read
    - store the result into an unnamed temporary variable (slot 1) because we need to run the finally code
    - run the finally code
    - reload the temporary variable and return
    The “catch / finally” section is the catching of the IOException...
    The exception table says that if an IOException is raised between instructions 0 and 5 (the try), jump to 11, this catch section.
    The first step is a “pop”. Pop what? In this case, the IOException, which was automatically placed on the stack. Since we don’t use it, we discard it. This implies that “e” is never assigned a slot by the compiler.
    Now, invoke IoUtils.closeQuietly (the finally block), then return -1.
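The behavior described in the notes above (the finally code runs on both the normal path and the catch path) can be checked with a small sketch. The class name FinallyDemo, the closeCalls counter, and the failing() helper are hypothetical, and closeQuietly here is only a stand-in for the slide's IoUtils.closeQuietly, whose real implementation is not shown in the deck.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class FinallyDemo {
    // Hypothetical counter: records how many times the finally code ran.
    static int closeCalls = 0;

    // Assumed stand-in for IoUtils.closeQuietly from the slide.
    static void closeQuietly(InputStream in) {
        closeCalls++;
        try {
            in.close();
        } catch (IOException ignored) {
            // deliberately swallowed, as the name suggests
        }
    }

    // Same shape as the slide's read method.
    static int read(InputStream in) {
        try {
            return in.read();
        } catch (IOException e) {
            return -1;
        } finally {
            closeQuietly(in);
        }
    }

    // Hypothetical helper: a stream whose read() always fails.
    static InputStream failing() {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                throw new IOException("simulated failure");
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(read(new ByteArrayInputStream(new byte[] {42})));
        System.out.println(read(failing()));
        System.out.println("finally ran " + closeCalls + " times");
    }
}
```

Running javap -c on the compiled class should show the closeQuietly call repeated in each path, although the exact offsets depend on the compiler version.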
  • 48. Synchronization
    Original:
      void inc() {
        synchronized ( this ) {
          ++this.num;
        }
      }
    Byte code:
      before try
        0 aload_0
        1 dup
        2 astore_1
        3 monitorenter
      try / finally
        4 aload_0
        5 dup
        6 getfield Counter.num
        9 iconst_1
        10 iadd
        11 putfield Counter.num
        14 aload_1
        15 monitorexit
        16 goto +6 //22
      finally
        19 aload_1
        20 monitorexit
        21 athrow
      22 return
    Exception Table:
      start  end  handler  Exception
      4      16   19       any
      19     21   19       any
    Interestingly enough, synchronization works the same way.
    To understand synchronization, it is better to look at it as a lock and unlock within a try / finally.
    And, that’s exactly how the byte code works.
    And, just like a regular try / finally, the finally is inlined in both the try and the finally portions.
  • 49. Synchronization
    Equivalent (pseudo-code):
      void inc() {
        lock( this );
        try {
          ++this.num;
        } finally {
          unlock( this );
        }
      }
    Byte code:
      before try
        0 aload_0
        1 dup
        2 astore_1
        3 monitorenter
      try / finally
        4 aload_0
        5 dup
        6 getfield Counter.num
        9 iconst_1
        10 iadd
        11 putfield Counter.num
        14 aload_1
        15 monitorexit
        16 goto +6 //22
      finally
        19 aload_1
        20 monitorexit
        21 athrow
      22 return
    Exception Table:
      start  end  handler  Exception
      4      16   19       any
      19     21   19       any
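The lock( this ) / unlock( this ) calls on the slide are pseudo-code. As a rough runnable analogue, the sketch below uses java.util.concurrent.locks.ReentrantLock in place of the object monitor; the class LockDemo and its members are hypothetical names. It shows the property that the try / finally shape guarantees: the lock is released even when the body throws.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockDemo {
    // Explicit lock standing in for the monitor (an assumption for illustration).
    private final ReentrantLock lock = new ReentrantLock();
    private int viaMonitor;
    private int viaLock;

    // The form from slide 48: the compiler emits monitorenter / monitorexit.
    void incSynchronized() {
        synchronized (this) {
            ++viaMonitor;
        }
    }

    // The form from slide 49, written with a real lock object.
    void incLocked(boolean fail) {
        lock.lock();
        try {
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            ++viaLock;
        } finally {
            lock.unlock(); // runs on both the normal and the exceptional path
        }
    }

    boolean isLocked() { return lock.isLocked(); }
    int getViaMonitor() { return viaMonitor; }
    int getViaLock() { return viaLock; }

    public static void main(String[] args) {
        LockDemo demo = new LockDemo();
        demo.incSynchronized();
        demo.incLocked(false);
        try {
            demo.incLocked(true);
        } catch (IllegalStateException expected) {
            System.out.println("still locked? " + demo.isLocked());
        }
    }
}
```

ReentrantLock and an object monitor are different mechanisms, so this only mirrors the control flow, not the byte code.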
  • 50. Demo
    Java 5
    Java 7
    In these demos, I demonstrate new language features by showing Java 5 and Java 7 code and then showing what it looks like when it’s decompiled back into Java 4 code.
    JAD - http://www.varaneckas.com/jad
  • 51. Java 5
    JAD - http://www.varaneckas.com/jad
  • 52. Auto-Boxing
    Original:
      public class AutoBoxing {
        public static void main(String[] args) {
          Integer foo = 20;
          Integer bar = 30;
          int sum = foo + bar;
          System.out.println(sum);
        }
      }
    Decompiled as Java 4:
      public class AutoBoxing {
        public static void main(String args[]) {
          Integer foo = Integer.valueOf(20);
          Integer bar = Integer.valueOf(30);
          int sum = foo.intValue() + bar.intValue();
          System.out.println(sum);
        }
      }
    Here, we see how auto-boxing works.
    The compiler injects the necessary calls to Integer.valueOf and Integer.intValue for us.
    NOTE: Even if you don’t like auto-boxing, please call Integer.valueOf rather than calling new Integer.
    Unlike new, Integer.valueOf returns cached instances of Integer for commonly used values.
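A minimal sketch of the caching mentioned in the note above. Integer.valueOf is documented to cache values from -128 to 127, so two calls with the same small value return the same instance. The class and method names (BoxingDemo, sameInstance, sum) are hypothetical.

```java
class BoxingDemo {
    // True when both calls hand back the very same Integer object.
    static boolean sameInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    // The slide's example: the compiler inserts valueOf / intValue calls here.
    static int sum() {
        Integer foo = 20;
        Integer bar = 30;
        return foo + bar;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance(100)); // inside the guaranteed cache range
        System.out.println(sum());
    }
}
```

Values outside the guaranteed range may or may not be cached, so identity comparison on boxed values is best avoided in real code.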
  • 53. Enhanced For
    Original:
      public class EnhancedFor {
        static void array(String[] args) {
          for ( String arg : args ) {
            System.out.println(arg);
          }
        }

        static void iterable(Iterable<String> args) {
          for ( String arg: args ) {
            System.out.println(arg);
          }
        }
      }
    Decompiled as Java 4:
      public class EnhancedFor {
        static void array(String args[]) {
          String arr$[] = args;
          int len$ = arr$.length;
          for (int i$ = 0; i$ < len$; i$++) {
            String arg = arr$[i$];
            System.out.println(arg);
          }
        }

        static void iterable(Iterable args) {
          String arg;
          for (Iterator i$ = args.iterator(); i$.hasNext(); ) {
            arg = (String) i$.next();
            System.out.println(arg);
          }
        }
      }
    In this slide, we see how the enhanced for gets handled by the compiler.
    The array for loop converts to the canonical C-style loop, with the slight difference that the array length is hoisted out of the loop as an invariant. (Although, this is a rather pointless optimization because the JVM would do this at runtime anyway.)
    For an Iterable, a loop that uses an iterator is generated. In this example, we can also see that the compiler injects a cast to the exact type String, too.
  • 54. Var-Args
    Original:
      public final class VarArgs {
        public static void main(String... args) {
          System.out.printf("Hello %s %s", "Jon", "Doe");
        }
      }
    Decompiled as Java 4:
      public final class VarArgs {
        public static transient void main(String[] args) {
          System.out.printf("Hello %s %s", new Object[] {"Jon", "Doe"});
        }
      }
    In this example, we see var-args being used both in the signature and in the call to printf.
    NOTE: I’ve declared a main method with var-args, since on a byte-code level this is still just a String[]. This actually works just fine.
    The “transient” modifier in the decompiled Java 4 is a bit amusing. This happens because Java ran out of flag bits to use in Java 5, so they overloaded the “transient” bit, which only applies to fields, to mean “var-args” when applied to methods.
    In the call to printf, we can see that the compiler injects a construction of a new Object[] and passes it as the last arg to printf.
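The shared flag bit described above can be observed through reflection. This sketch assumes nothing beyond java.lang.reflect; VarArgsDemo and its methods are hypothetical names. For a var-args method, Method.isVarArgs() is true, and the raw modifier bits include the same bit value that Modifier.TRANSIENT uses for fields.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class VarArgsDemo {
    // A var-args method; at the byte-code level the parameter is a String[].
    static int count(String... args) {
        return args.length;
    }

    // Looks the method up by its erased signature, which is just String[].
    static Method countMethod() {
        try {
            return VarArgsDemo.class.getDeclaredMethod("count", String[].class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when the bit that means "transient" on fields is set on the method.
    static boolean hasTransientBit() {
        return (countMethod().getModifiers() & Modifier.TRANSIENT) != 0;
    }

    public static void main(String[] args) {
        System.out.println(count("Jon", "Doe"));               // compiler builds the array
        System.out.println(count(new String[] {"Jon", "Doe"})); // or pass one explicitly
        System.out.println(countMethod().isVarArgs());
        System.out.println(hasTransientBit());
    }
}
```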
  • 55. Enum
    Original:
      public enum AnEnum {
        FOO,
        BAR,
        QUUX
      }
    Decompiled as Java 4:
      public static final class AnEnum extends Enum {
        public static final AnEnum FOO = new AnEnum("FOO", 0);
        public static final AnEnum BAR = new AnEnum("BAR", 1);
        public static final AnEnum QUUX = new AnEnum("QUUX", 2);

        private static final AnEnum[] $VALUES = new AnEnum[]{FOO, BAR, QUUX};

        public static AnEnum[] values() {
          return (AnEnum[]) $VALUES.clone();
        }

        public static AnEnum valueOf(String name){
          return (AnEnum)Enum.valueOf(AnEnum.class, name);
        }

        private AnEnum(String s, int i) {
          super(s, i);
        }
      }
    For Enum-s, the compiler does a great deal of work on your behalf -- even in the simplest case.
    The compiler generates a constructor that takes a label and an ordinal for each entry.
    It then initializes a static final field for each constant from the original file.
    These constants are all placed in a $VALUES array.
    Finally, the compiler generates a values() method and a valueOf() method for each enum class.
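One visible consequence of the generated values() method cloning its backing array is that every call returns a fresh copy, so callers cannot corrupt the enum's own table. A small sketch, with EnumDemo and valuesReturnsFreshArray as hypothetical names:

```java
class EnumDemo {
    enum AnEnum { FOO, BAR, QUUX }

    // Mutates one returned array and checks that a second call is unaffected.
    static boolean valuesReturnsFreshArray() {
        AnEnum[] first = AnEnum.values();
        first[0] = AnEnum.QUUX; // only changes the caller's copy
        AnEnum[] second = AnEnum.values();
        return first != second && second[0] == AnEnum.FOO;
    }

    public static void main(String[] args) {
        System.out.println(valuesReturnsFreshArray());
        System.out.println(AnEnum.valueOf("BAR").ordinal());
    }
}
```

The flip side is that values() allocates on every call, which is why some code caches the array in a private static final field.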
  • 56. Covariance
    Original:
      public interface Parent {
        Number calculate();
      }

      public class CovariantChild implements Parent {
        public Integer calculate() {
          return 10;
        }
      }
    Decompiled as Java 4:
      public static interface Parent {
        public abstract Number calculate();
      }

      public class CovariantChild implements Parent {
        public Integer calculate() {
          return Integer.valueOf(10);
        }

        public volatile Number calculate() {
          return calculate();
        }
      }
    A lesser known addition to Java 5 is the ability to have a covariant return type.
    Here, the child type returns a more specific type of Number -- namely Integer.
    The generated code is interesting. We end up with two “calculate” methods - one that returns Integer and another that returns Number. The one that returns Number satisfies the contract of the parent and simply calls the more specific version that returns Integer.
    Here, again, we see a curious modifier on a method: “volatile”. This is another situation where Java 5 overloaded an existing flag bit.
    For more information on why this is type-safe, look up the Liskov Substitution Principle.
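The two generated calculate methods can be listed with reflection. In this sketch (CovarianceDemo and its helper methods are hypothetical names), the compiler-generated method reports isBridge() as true, and its modifier bits include the value that Modifier.VOLATILE uses for fields, which is why a Java 4 decompiler prints “volatile”.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

class CovarianceDemo {
    interface Parent {
        Number calculate();
    }

    static class CovariantChild implements Parent {
        public Integer calculate() {
            return 10;
        }
    }

    // Counts the declared calculate methods that are (or are not) bridges.
    static int countCalculate(boolean bridge) {
        int count = 0;
        for (Method m : CovariantChild.class.getDeclaredMethods()) {
            if (m.getName().equals("calculate") && m.isBridge() == bridge) {
                count++;
            }
        }
        return count;
    }

    // True when the bridge method carries the bit that means "volatile" on fields.
    static boolean bridgeHasVolatileBit() {
        for (Method m : CovariantChild.class.getDeclaredMethods()) {
            if (m.isBridge()) {
                return (m.getModifiers() & Modifier.VOLATILE) != 0;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (Method m : CovariantChild.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName()
                + " " + m.getName() + " bridge=" + m.isBridge());
        }
    }
}
```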
  • 57. Java 7
  • 58. Multi-Catch
    Original:
      public final class EnhancedCatch {
        public static void main(String[] args){
          try {
            Class.forName("some.package.SomeClass").newInstance();
          } catch (
            InstantiationException |
            IllegalAccessException |
            ClassNotFoundException e)
          {
            throw new IllegalStateException(e);
          }
        }
      }
    Decompiled as Java 4:
      public final class EnhancedCatch {
        public static void main(String args[]) {
          try {
            Class.forName("some.package.SomeClass").newInstance();
          } catch (ReflectiveOperationException e){
            throw new IllegalStateException(e);
          }
        }
      }
    Java 7 adds the ability to handle multiple exception types in a single catch.
    Great for ugly reflection code.
    Here, the catch of all the reflection exceptions simplifies to a single catch of their common parent, ReflectiveOperationException (a new base class for reflection exceptions also introduced in Java 7).
  • 59. Try With Resources
    Original:
      public class EnhancedTry {
        public static void main(String[] args) throws IOException {
          Properties properties = new Properties();
          try (InputStream in = new FileInputStream("my.properties")) {
            properties.load(in);
          }
        }
      }
    Decompiled:
      public class EnhancedTry {
        public static void main(String args[]) throws IOException {
          Properties properties = new Properties();
          InputStream in = new FileInputStream("my.properties");
          Throwable throwable = null;
          try {
            properties.load(in);
          } catch (Throwable throwable1) {
            throwable = throwable1;
            throw throwable1;
          } finally {
            if (in != null) {
              if (throwable != null) {
                try {
                  in.close();
                } catch (Throwable x2) {
                  throwable.addSuppressed(x2);
                }
              } else {
                in.close();
              }
            }
          }
        }
      }
    Java 7 also enhances try by allowing it to automatically close resources.
    It generates a try / finally similar to what you’d write by hand, although it puts the resource acquisition outside the try (which is correct but uncommon among many Java programmers).
    However, it does one more thing: it also adds code so that if an exception happens when closing, the original exception from the body is still propagated. And, even better, the exception raised by close is added to the suppressed list of the original exception using the new Java 7 method Throwable.addSuppressed.
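The suppressed-exception behavior can be seen with a resource whose close() always fails. This is a minimal sketch; SuppressedDemo and NoisyResource are hypothetical names. The exception thrown by the body is the one that propagates, and the exception from close() is attached to it.

```java
class SuppressedDemo {
    // Hypothetical resource whose close() always throws.
    static class NoisyResource implements AutoCloseable {
        @Override
        public void close() {
            throw new IllegalStateException("close failed");
        }
    }

    // Returns the exception that escapes the try-with-resources statement.
    static Throwable run() {
        try (NoisyResource resource = new NoisyResource()) {
            throw new IllegalArgumentException("body failed");
        } catch (RuntimeException e) {
            return e;
        }
    }

    public static void main(String[] args) {
        Throwable primary = run();
        System.out.println("primary: " + primary.getMessage());
        for (Throwable suppressed : primary.getSuppressed()) {
            System.out.println("suppressed: " + suppressed.getMessage());
        }
    }
}
```

With a hand-written try / finally, the exception from close() would typically replace the original one, which is the failure mode this feature avoids.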
  • 60. String Switch
    Original:
      switch (args[0]) {
        case "Hello":
          System.out.println("Hello, World!");
          break;
        case "Bye":
          System.out.println("Good Bye, World!");
          break;
        case "9\uffe7":
          System.out.println("Collision");
          break;
      }
    Decompiled:
      byte byte0 = -1;
      switch(args[0].hashCode()) {
        case 69609650:
          ...
          break;
        case 67278:
          if(s.equals("9\uFFE7")) {
            byte0 = 2;
          } else if(s.equals("Bye")) {
            byte0 = 1;
          }
          break;
      }
      switch(byte0) {
        case 0:
          System.out.println("Hello, World!");
          break;
        case 1:
          System.out.println("Good Bye, World!");
          break;
        case 2:
          System.out.println("Collision");
          break;
      }
    One last example from Java 7 -- string switch.
    String switch is implemented as a switch on the String’s hashCode.
    However, hashCode is not unique, so the generated code must also perform an equals check.
    To handle this, string switch actually generates two switch statements.
    The first, on the hashCode, assigns a temporary variable the index of the matching case from the original code.
    Then the second switches on that index, each case containing code from the original Java 7 cases.
    Here, I’ve deliberately created a hash collision, so you can see how collisions are resolved.
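The collision on the slide can be reproduced directly: "Bye" and "9\uFFE7" both hash to 67278, the value in the decompiled case label, yet the switch still tells them apart because of the equals check. StringSwitchDemo and describe are hypothetical names; the default branch is an addition for this sketch.

```java
class StringSwitchDemo {
    // Same cases as the slide, returning the text instead of printing it.
    static String describe(String s) {
        switch (s) {
            case "Hello":
                return "Hello, World!";
            case "Bye":
                return "Good Bye, World!";
            case "9\uFFE7":
                return "Collision";
            default:
                return "Unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("Bye".hashCode());
        System.out.println("9\uFFE7".hashCode());
        System.out.println(describe("Bye"));
        System.out.println(describe("9\uFFE7"));
    }
}
```

String.hashCode is specified by the String documentation, which is what makes it safe for the compiler to bake these constants into the class file.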
  • 61. Compiler Optimizations
    In the next few examples, I show the original code and the code after it has been decompiled.
    By doing this, we can see some of the optimizations performed by the compiler.
    JAD - http://www.varaneckas.com/jad
  • 62. Constant Folding
    Original:
      public final class StaticInitializer {
        private static final String LOG_FORMAT = "Started at %d ms";
        private static final long START_TIME = System.currentTimeMillis();
        private static final long START_TIME_2;

        static {
          START_TIME_2 = System.currentTimeMillis();
        }
      }
    Decompiled:
      public final class StaticInitializer {
        private static final String LOG_FORMAT = "Started at %d ms";
        private static final long START_TIME = System.currentTimeMillis();
        private static final long START_TIME_2 = System.currentTimeMillis();
      }
    While modern Java compilers don’t do much optimization, they do some.
    One example is constant folding -- when possible, the compiler computes simple constant expressions at compile time.
    This even includes string concatenation.
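Constant folding of string concatenation has an observable side effect: a folded constant expression is interned, so it is the same object as the equivalent literal, while a concatenation computed at runtime is a new String. The sketch below relies on that rule from the Java Language Specification; FoldingDemo and its members are hypothetical names, and the identity comparisons are there only to expose the folding.

```java
class FoldingDemo {
    static final String PREFIX = "Started at ";      // a constant variable
    static final int SECONDS_PER_DAY = 60 * 60 * 24;  // folded to 86400 by the compiler

    // Every operand is a compile-time constant, so the compiler folds the whole expression.
    static boolean foldedIsSameInstance() {
        String folded = PREFIX + 5 + " ms";
        return folded == "Started at 5 ms";
    }

    // The local is not final, so the concatenation happens at runtime.
    static boolean runtimeConcatIsSameInstance() {
        String prefix = "Started at ";
        String joined = prefix + 5 + " ms";
        return joined == "Started at 5 ms";
    }

    public static void main(String[] args) {
        System.out.println(foldedIsSameInstance());
        System.out.println(runtimeConcatIsSameInstance());
        System.out.println(SECONDS_PER_DAY);
    }
}
```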
  • 63. Constant Inlining
    Original:
      public class Inlining {
        public static final String INLINED_VERSION = "1.1.0";
        public static final String NOT_INLINED_VERSION = identity("1.2.0");

        private static String identity(String value) {
          return value;
        }

        public static void print() {
          System.out.println(INLINED_VERSION);
          System.out.println(NOT_INLINED_VERSION);
        }
      }
    Decompiled:
      public class Inlining {
        public static final String INLINED_VERSION = "1.1.0";
        public static final String NOT_INLINED_VERSION = identity("1.2.0");

        private static String identity(String value) {
          return value;
        }

        public static void print() {
          System.out.println("1.1.0");
          System.out.println(NOT_INLINED_VERSION);
        }
      }
    Constants can also be inlined by the compiler.
    In this example, the compiler inlines INLINED_VERSION in the print method; however, it does not inline NOT_INLINED_VERSION.
    The reason is that NOT_INLINED_VERSION is a complex expression because a method is invoked.
    This has implications in the byte code, too.
    INLINED_VERSION will have its value set through a ConstantValue attribute.
    NOT_INLINED_VERSION will be initialized in a <clinit> method generated by the compiler and called automatically when the class is first loaded.
  • 64. Dead Code Elimination
    Original:
      public class DeadCodeElimination {
        public static final boolean DEBUG_OFF = false;
        public static final boolean DEBUG_ON = true;

        public static void main(String[] args) {
          if ( DEBUG_OFF ) {
            System.out.println("never");
          }
          if ( DEBUG_ON ) {
            System.out.println("always");
          }
        }
      }
    Decompiled:
      public class DeadCodeElimination {
        public static final boolean DEBUG_OFF = false;
        public static final boolean DEBUG_ON = true;

        public static void main(String args[]) {
          System.out.println("always");
        }
      }
    Along with inlining, the compiler can perform dead code elimination.
    In this case, DEBUG_OFF is never true, so the “never” print out is not generated by the compiler.
    Even in the DEBUG_ON case, the compiler realizes the if is always true and simply includes an unconditional print of “always”.
  • 65. Runtime Optimizations
  • 66. HotSpot Lifecycle
    1. Interpreted
    2. Profiling
    3. Dynamic Compilation
    4. Dynamic Decompilation
    Client compilation kicks in at invocation 3000.
    Server compilation kicks in at invocation 10000.
    Tiered compilation - C0, C1, C2
    Method Replacement vs. On-Stack Replacement
    http://java.sun.com/products/hotspot/whitepaper.html
    http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
    http://www.azulsystems.com/blog/cliff-click/2010-07-16-tiered-compilation
    http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
  • 67. Is This Optimized?
      double sumU = 0, sumV = 0;
      for ( int i = 0; i < 100; ++i ) {
        Vector2D vector = new Vector2D( i, i );
        synchronized ( vector ) {
          sumU += vector.getU();
          sumV += vector.getV();
        }
      }
    How many...?
      Loop Iterations: 100
      Heap Allocations: 100
      Method Invocations: 200
      Lock Acquisitions: 100
    Let’s start the runtime optimization discussion with a simple question.
    Is this optimized?
    How many loop iterations does it do? 100
    How many heap allocations? 100
    How many method invocations? 200
    How many lock acquisitions? 100
    Surprisingly enough, the answer to all of these may actually be zero.
  • 68. Is This Optimized?
      double sumU = 0, sumV = 0;
      for ( int i = 0; i < 100; ++i ) {
        Vector2D vector = new Vector2D( i, i );
        synchronized ( vector ) {
          sumU += vector.getU();
          sumV += vector.getV();
        }
      }
    How many...?
      Loop Iterations: 0
      Heap Allocations: 0
      Method Invocations: 0
      Lock Acquisitions: 0
  • 69. Common Sub-Expression Elimination
    Before:
      int x = a + b;
      int y = a + b;
    After:
      int tmp = a + b;
      int x = tmp;
      int y = tmp;
    Among the simplest optimizations is common sub-expression elimination.
    Here the VM optimizes the code by only performing the calculation of “a+b” once.
    http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
  • 70. Array Bounds Check Elimination
    As written:
      int[] nums = ...
      for ( int i = 0; i < nums.length; ++i ) {
        System.out.println( "nums[" + i + "]=" + nums[ i ] );
      }
    With the implied bounds check made explicit:
      int[] nums = ...
      for ( int i = 0; i < nums.length; ++i ) {
        if ( i < 0 || i >= nums.length ) {
          throw new ArrayIndexOutOfBoundsException();
        }
        System.out.println( "nums[" + i + "]=" + nums[ i ] );
      }
    One of the nice things about the VM is that we do not have to worry about buffer overruns because the VM checks array bounds for us, but how much is that costing us?
    In short, nothing. The VM recognizes common patterns and realizes that it does not need to generate the bounds checking code.
    http://www.cs.umd.edu/~vibha/330/array-bounds.pdf
  • 71. Loop Invariant Hoisting
    Before:
      for ( int i = 0; i < nums.length; ++i ) {
        ...
      }
    After:
      int length = nums.length;
      for ( int i = 0; i < length; ++i ) {
        ...
      }
    The VM can also realize that the length of the array does not change, so it can replace looking up the length of the array on each test with a single store to a temporary variable and compare against that instead.
    http://java.sun.com/products/hotspot/docs/whitepaper/Java_Hotspot_v1.4.1/Java_HSpot_WP_v1.4.1_1002_4.html
  • 72. Loop Unrolling
    Before:
      int sum = 0;
      for ( int i = 0; i < 10; ++i ) {
        sum += i;
      }
    After:
      int sum = 0;
      sum += 1;
      ...
      sum += 9;
    In some situations, the loop can even be unrolled into a simple linear code segment.
  • 73. Method Inlining
    Original:
      Vector vector = ...
      double magnitude = vector.magnitude();
    Inlined:
      Vector vector = ...
      double magnitude = Math.sqrt( vector.u*vector.u + vector.v*vector.v );
    Inlined with a type guard:
      Vector vector = ...
      double magnitude;
      if ( vector instanceof Vector2D ) {
        magnitude = Math.sqrt( vector.u*vector.u + vector.v*vector.v );
      } else {
        magnitude = vector.magnitude();
      }
    static: always
    final: always
    private: always
    virtual: often
    reflective: sometimes
    dynamic: often
    http://www.ibm.com/developerworks/library/j-jtp12214/
    http://openjdk.java.net/groups/hotspot/docs/HotSpotGlossary.html
    http://blog.headius.com/2009/01/my-favorite-hotspot-jvm-flags.html
    http://java.sun.com/developer/technicalArticles/Networking/HotSpot/inlining.html
  • 74. Lock Coarsening
    As written:
      StringBuffer buffer = ...
      buffer.append( "Hello" );
      buffer.append( name );
      buffer.append( "\n" );
    With the implied locks made explicit:
      StringBuffer buffer = ...
      lock( buffer ); buffer.append( "Hello" ); unlock( buffer );
      lock( buffer ); buffer.append( name ); unlock( buffer );
      lock( buffer ); buffer.append( "\n" ); unlock( buffer );
    After coarsening:
      StringBuffer buffer = ...
      lock( buffer );
      buffer.append( "Hello" );
      buffer.append( name );
      buffer.append( "\n" );
      unlock( buffer );
    Starting in Java 5, HotSpot optimizes locks by performing lock coarsening.
    The VM realizes that constantly acquiring and releasing the same lock is not performant, so it may take a single larger lock instead.
    http://java.sun.com/performance/reference/whitepapers/6_performance.html#2.1
  • 75. Other Lock Optimizations
    Biased Locking
    Adaptive Locking - Thread sleep vs. Spin lock
    And, even more lock optimizations are possible...
    - biased locking - makes it cheap for the thread that last acquired a lock to acquire it again
    - adaptive locking - dynamically detects whether a lock is usually held for a short or long period
      - if it is long, the thread is put to sleep
      - if it is short, the thread will simply spin
    http://java.sun.com/performance/reference/whitepapers/6_performance.html#2.1
  • 76. Escape Analysis
      Point p1 = new Point( x1, y1 ), p2 = new Point( x2, y2 );
      synchronized ( p1 ) {
        synchronized ( p2 ) {
          double dx = p1.getX() - p2.getX();
          double dy = p1.getY() - p2.getY();
          double distance = Math.sqrt( dx*dx + dy*dy );
        }
      }
    Finally, in Java 7, escape analysis is on by default.
    With escape analysis, the VM can realize that an object never escapes a stack frame, allowing it to...
    - elide heap allocation
    - elide locks
  • 77. Escape Analysis
    With the locks elided:
      Point p1 = new Point( x1, y1 ), p2 = new Point( x2, y2 );
      double dx = p1.getX() - p2.getX();
      double dy = p1.getY() - p2.getY();
      double distance = Math.sqrt( dx*dx + dy*dy );
  • 78. Escape Analysis
    With the locks elided:
      Point p1 = new Point( x1, y1 ), p2 = new Point( x2, y2 );
      double dx = p1.getX() - p2.getX();
      double dy = p1.getY() - p2.getY();
      double distance = Math.sqrt( dx*dx + dy*dy );
    With the heap allocations elided as well:
      double dx = x1 - x2;
      double dy = y1 - y2;
      double distance = Math.sqrt( dx*dx + dy*dy );
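Whether the VM actually elides the allocations and locks cannot be observed from Java code itself, so the sketch below only shows the two forms side by side and confirms they compute the same value. The Point class, EscapeDemo, and both method names are hypothetical. On HotSpot, one way to look for a difference is to time the first method with -XX:+DoEscapeAnalysis and again with -XX:-DoEscapeAnalysis, ideally under a benchmark harness.

```java
class EscapeDemo {
    // Small immutable value class; instances below never leave the method.
    static final class Point {
        private final double x;
        private final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }

        double getX() { return x; }
        double getY() { return y; }
    }

    // Form from slide 76: two allocations and two locks on non-escaping objects.
    static double distanceWithObjects(double x1, double y1, double x2, double y2) {
        Point p1 = new Point(x1, y1);
        Point p2 = new Point(x2, y2);
        synchronized (p1) {
            synchronized (p2) {
                double dx = p1.getX() - p2.getX();
                double dy = p1.getY() - p2.getY();
                return Math.sqrt(dx * dx + dy * dy);
            }
        }
    }

    // Form from slide 78: what the code reduces to once both are elided.
    static double distanceScalar(double x1, double y1, double x2, double y2) {
        double dx = x1 - x2;
        double dy = y1 - y2;
        return Math.sqrt(dx * dx + dy * dy);
    }

    public static void main(String[] args) {
        System.out.println(distanceWithObjects(0, 0, 3, 4));
        System.out.println(distanceScalar(0, 0, 3, 4));
    }
}
```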
  • 79. Runtime Demo
    http://code.google.com/p/caliper/
    To conclude the runtime optimization section, I’ll show some micro-benchmarks illustrating some of the optimizations.
    Writing micro-benchmarks for a dynamically optimizing VM is devilishly hard. Fortunately, Google created a tool called Caliper to make it easy. You can write JUnit 3-like Benchmark classes to compare various implementation options.
    http://www.slideshare.net/drorbr/so-you-want-to-write-your-own-benchmark-presentation
    http://code.google.com/p/caliper/
  • 80. Loop Variable Placement
    Inside:
      for ( int i = 0; i < ints.length; ++i ) {
        int x = ints[i];
        sum += x;
      }
    vs. Outside:
      int x;
      for ( int i = 0; i < ints.length; ++i ) {
        x = ints[i];
        sum += x;
      }
    vs. No Variable:
      for ( int i = 0; i < ints.length; ++i ) {
        sum += ints[i];
      }
    First, let’s look at loop variable placement -- declaring the loop variable inside the loop vs. outside vs. using no variable at all.
    All three take the same amount of time to run. In fact, declaring inside or outside produces the same byte code.
    My recommendation...
    For a one-line loop body, skip the variable.
    For a complicated loop body, declare the variable inside to keep the code easier to read and refactor.
  • 81. Loop Invariant Hoisting
    Regular For:
      for ( int i = 0; i < ints.length; ++i ) {
        sum += ints[i];
      }
    vs. Manual Hoisting:
      for ( int i = 0, len = ints.length; i < len; ++i ) {
        sum += ints[i];
      }
    vs. Enhanced For:
      for ( int x : ints ) {
        sum += x;
      }
    Now, we’ll compare...
    - the canonical loop, which checks i against array.length each time in the test
    - manually hoisting the length into a len temporary variable
    - using Java 5’s enhanced for
    Once again, they all take the same amount of time because the VM performs the hoisting for us.
  • 82. Field Access
    Direct:
      point.x
      point.y
    vs. Virtual Accessor:
      point.getX()
      point.getY()
    vs. Interface Accessor:
      point.getX()
      point.getY()
    Next, we’ll look at direct field access vs. using a virtual accessor method vs. using an interface accessor method.
    Once again, the VM can optimize all of these by performing method inlining, so all three take the same amount of time.
  • 83. Loop Variable Placement
    StringBuilder - no locks:
      StringBuilder builder = new StringBuilder();
      builder.append( "foo" );
      builder.append( "bar" );
      builder.append( "baz" );
    vs. StringBuffer - multiple locks:
      StringBuffer buffer = new StringBuffer();
      buffer.append( "foo" );
      buffer.append( "bar" );
      buffer.append( "baz" );
    vs. StringBuffer - single lock:
      StringBuffer buffer = new StringBuffer();
      synchronized( buffer ) {
        buffer.append( "foo" );
        buffer.append( "bar" );
        buffer.append( "baz" );
      }
    Now, revisiting locking - compare...
    - Java 5’s StringBuilder, which performs no locking
    - plain StringBuffer code with multiple separate appends
    - StringBuffer with a manually added bigger lock
    The no lock version does come out slightly ahead, but it is close.
    And, the attempt to manually improve performance by taking a bigger single lock actually comes in last.
  • 84. Heap Elision Benchmark
    Primitive Array:
      Arrays.sort(new int[]{...});
    vs. Boxed Array - no Comparator:
      Arrays.sort(new Integer[]{...});
    vs. Boxed Array - singleton Comparator:
      Arrays.sort(new Integer[]{...}, IntComparator.INSTANCE);
    vs. Boxed Array - anonymous Comparator:
      Arrays.sort(new Integer[]{...}, new Comparator<Integer>() { ... });
    Lastly, let’s look at heap elision by looking at sorting some arrays.
    No surprise, the primitive array is the most performant.
    But the no Comparator case, the singleton Comparator case, and the anonymous Comparator case all perform the same.
    Even creating an anonymous Comparator every time does not impact performance much -- in Java 7, no heap allocation may take place at all.
  • 85. Is This Optimized?
      double sumU = 0, sumV = 0;
      for ( int i = 0; i < 100; ++i ) {
        Vector2D vector = new Vector2D( i, i );
        synchronized ( vector ) {
          sumU += vector.getU();
          sumV += vector.getV();
        }
      }
    How many...?
      Loop Iterations: 0
      Heap Allocations: 0
      Method Invocations: 0
      Lock Acquisitions: 0
    So now, hopefully, you can see how this may truly be optimized already.
    Just write clean code and trust in the VM to make it fast.
    If you must optimize, always profile first and use a micro-benchmarking tool like Caliper.
  • 86. Recommended Reading
    Java Puzzlers, by Joshua Bloch and Neal Gafter
      http://www.javapuzzlers.com/
    Java Specialist Newsletter
      http://www.javaspecialists.eu
    Brian Goetz’s Articles
      http://www.ibm.com/developerworks/views/java/libraryview.jsp?contentarea_by=Java+technology&search_by=brian+goetz
  • 87. Q&A