2. Compiling Java - 10,000 Foot View
Write and debug Java code in an IDE (Eclipse)
Compile Java source into bytecode (class files)
Run the bytecode on any JVM on any platform
At runtime, JIT-compile the bytecode into native code for performance
Innovation for a smarter planet
3. IBM and Java
IBM has over 3000 products based on Java.
IBM sells hardware (PowerPC and System z), and these platforms must support
Java applications.
IBM Java is optimized to run IBM software, especially WebSphere Application
Server.
4. IBM and Java
IBM Java supports 12 different platforms and many other platforms in the
embedded space
Re-use is the only way to scale
Challenging to do things right across all platforms
If there is a bug… it will be found
Java is developed across multiple development sites
Code straddles the boundary of research and production
IBM develops tools for Java developers (based on Eclipse)
RAD – Rational Application Developer
5. Worldwide Java Development Team
Toronto: Dynamic/static compilation, XML parsing
Ottawa: J9 JVM, Eclipse IDE, J2ME libraries
Poughkeepsie: z/OS system test, S/390 specialists
Hursley: J2SE libraries and CORBA, J2SE integration and delivery, Customer service
Shanghai: Globalization, Specialized testing
Rochester: iSeries development
Phoenix: J2ME development, J2ME delivery
Austin: Java and XML security, AIX system test, PowerPC specialists
Bangalore: Integration testing, Customer service, Field release development
6. First Step - Write Java code in an IDE
7. What is an IDE?
IDE - Integrated Development Environment
Powerful editor for writing your programs
Makes writing software faster and easier
Increased developer productivity
Understands your code
Not just a text editor
Parses and analyzes the code
Provides an integrated environment for all your tools
Version Control (SVN, CVS, Jazz, etc.)
Debuggers
Performance Engineering
Documentation Tools
Databases
Etc.
8. Writing Code using an IDE
Modern Java IDEs have many advanced code editing features
Instant Feedback
Detect syntax errors as you type.
Code Navigation
Instantly jump from a method call to the method definition
Refactoring
Rename a method and the IDE will find everywhere the method is called and
rename all the calls.
Code Completion
Start typing and the IDE finishes it for you.
Visualizations
View a type hierarchy
View the structure outline of a class.
Quick Assist
Automatically fix coding errors for you.
And many more...
9. ECJ – Eclipse Compiler for Java
At the core of Eclipse there is a Java compiler.
Designed with the needs of an IDE in mind.
The compiler has three outputs:
Generate ASTs
Generate bytecode (class files)
Generate an on-disk index file
ASTs can be used directly by some features
e.g. Refactoring
Index is used for fast lookup of program elements.
e.g. Code navigation, Search, Generate Type Hierarchy
Compiler is designed to support recompilation while debugging
Incremental compilation
10. Incremental Compilation
An incremental compiler will only recompile the parts of the code that have
changed.
Avoid wasteful recompilation of unchanged parts.
Reduces the granularity of a language's translation units.
ECJ will only recompile files that have changed.
A standard C compiler will compile all the header files included by a source file.
The standard javac compiler is not an incremental compiler.
Very important for productivity.
Long compilation pauses are unacceptable.
The developer needs to be able to recompile code changes very quickly.
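The idea of recompiling only what changed can be sketched as follows. This is a minimal illustration, not ECJ's implementation: the class name IncrementalBuildSketch and its method are hypothetical, and change detection here simply compares source text with the text seen at the previous build.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch: decide which source files need recompiling. */
class IncrementalBuildSketch {
    // Source text seen at the previous build, keyed by file name.
    private final Map<String, String> lastBuilt = new HashMap<>();

    /** Returns the files whose contents differ from the previous call. */
    Set<String> filesToRecompile(Map<String, String> currentSources) {
        Set<String> dirty = new TreeSet<>();
        for (Map.Entry<String, String> entry : currentSources.entrySet()) {
            // A file that was never built has no previous text, so it is dirty too.
            if (!entry.getValue().equals(lastBuilt.get(entry.getKey()))) {
                dirty.add(entry.getKey());
            }
        }
        // Remember the current state for the next build.
        lastBuilt.clear();
        lastBuilt.putAll(currentSources);
        return dirty;
    }
}
```

The first call reports every file; later calls report only the edited ones. A real incremental compiler would presumably also recompile files that depend on a changed file, which this sketch leaves out.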
11. Parsing
Parse the code in the editor.
Supports different versions of Java.
Parser runs whenever the user stops typing for a few seconds.
Instantly reports syntax errors and warnings.
Parser is generated by an LALR parser generator.
Grammar file contains grammar rules in BNF form.
Most rules have actions associated with them.
Actions build the AST in a bottom-up fashion.
Leaf nodes created first.
Last node to be created is the root.
Unique challenges
Syntax error recovery needs to be really good.
Parse unsaved code in the editor.
Content assist.
Can't desugar.
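The bottom-up construction described above (leaves first, root last) can be sketched with a node stack. This is an illustration under assumptions: BottomUpAstSketch and its methods are hypothetical names, and the rule actions are replayed by hand rather than driven by a generated LALR parser.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of grammar-rule actions building an AST bottom-up. */
class BottomUpAstSketch {
    /** Minimal AST node: a label plus zero or more children. */
    static final class Node {
        final String label;
        final List<Node> children;

        Node(String label, List<Node> children) {
            this.label = label;
            this.children = children;
        }

        @Override
        public String toString() {
            if (children.isEmpty()) {
                return label;
            }
            StringBuilder sb = new StringBuilder("(").append(label);
            for (Node child : children) {
                sb.append(' ').append(child);
            }
            return sb.append(')').toString();
        }
    }

    private final Deque<Node> stack = new ArrayDeque<>();
    /** Labels in the order their nodes were created. */
    final List<String> creationOrder = new ArrayList<>();

    /** Action for a rule such as Literal ::= NUMBER: creates a leaf node. */
    void literal(String text) {
        push(new Node(text, Collections.<Node>emptyList()));
    }

    /** Action for a rule such as Expr ::= Expr OP Expr: pops the children, pushes the parent. */
    void binary(String operator) {
        Node right = stack.pop();
        Node left = stack.pop();
        push(new Node(operator, Arrays.asList(left, right)));
    }

    private void push(Node node) {
        creationOrder.add(node.label);
        stack.push(node);
    }

    /** The node on top of the stack; after a complete parse this is the root. */
    Node root() {
        return stack.peek();
    }

    /** Replays the reductions a bottom-up parser would perform for 1 + 2 * 3. */
    static BottomUpAstSketch parseExample() {
        BottomUpAstSketch parser = new BottomUpAstSketch();
        parser.literal("1");
        parser.literal("2");
        parser.literal("3");
        parser.binary("*"); // 2 * 3 is reduced first because * binds tighter
        parser.binary("+"); // the + node is created last and becomes the root
        return parser;
    }
}
```

After parseExample(), creationOrder lists the three literals before the operators, and root() is the "+" node.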
12. Content Assist
The IDE will complete the code for you.
Problem: user hasn't finished typing a full statement yet, therefore there is a syntax
error at the insertion point.
Must recover from the error and compute a list of possible completions.
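A much simplified sketch of computing completions at the insertion point follows. It sidesteps the hard part named above (parser error recovery) and only does prefix matching on the identifier being typed; ContentAssistSketch, complete, and the list of known names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: propose names that match the identifier before the cursor. */
class ContentAssistSketch {
    static List<String> complete(String source, int cursor, Collection<String> knownNames) {
        // Walk backwards from the cursor to find where the partial identifier starts.
        int start = cursor;
        while (start > 0 && Character.isJavaIdentifierPart(source.charAt(start - 1))) {
            start--;
        }
        String prefix = source.substring(start, cursor);

        // Keep every known name that begins with what the user has typed so far.
        List<String> proposals = new ArrayList<>();
        for (String name : knownNames) {
            if (name.startsWith(prefix)) {
                proposals.add(name);
            }
        }
        Collections.sort(proposals);
        return proposals;
    }
}
```

For the unfinished text list.ad with the cursor at the end, the prefix is "ad". A real IDE would first recover enough of the AST to know the type of list and offer only that type's members.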
13. Refactoring
Transforming code into a new form
that behaves the same as before
but is structured better.
Rename
Extract local variable
Inline expression
Inline method
Extract superclass
Extract interface
Change method signature
Etc...
Refactorings are performed on the
AST with the help of the index.
Rewrite rules
14. Desugaring
Syntactic Sugar
Syntax that is equivalent to some other syntax
in the language but is more convenient or
compact.
i++;
i += 1;
i = i + 1;
Desugaring
The parser produces the same AST fragment
for different syntax.
Convenient for code generation.
AST produced by IDE cannot be desugared.
The AST needs to represent exactly what is in
the user's source.
All source offsets must be preserved.
Comments must be preserved.
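Desugaring can be sketched as a rewrite from surface-only node types to one core shape. The node classes below are hypothetical and are not Eclipse's AST. The equivalence is assumed only for an int variable used in a standalone statement, since i++ as an expression yields the old value.

```java
/** Hypothetical sketch: three surface forms desugar to one core AST shape. */
class DesugarSketch {
    interface Expr { }

    static final class Name implements Expr {
        final String id;
        Name(String id) { this.id = id; }
        @Override public String toString() { return id; }
    }

    static final class IntLiteral implements Expr {
        final int value;
        IntLiteral(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    static final class Add implements Expr {
        final Expr left;
        final Expr right;
        Add(Expr left, Expr right) { this.left = left; this.right = right; }
        @Override public String toString() { return "(" + left + " + " + right + ")"; }
    }

    /** Core form: target = value. */
    static final class Assign implements Expr {
        final Name target;
        final Expr value;
        Assign(Name target, Expr value) { this.target = target; this.value = value; }
        @Override public String toString() { return target + " = " + value; }
    }

    /** Surface form: target++ used as a statement. */
    static final class PostIncrement implements Expr {
        final Name target;
        PostIncrement(Name target) { this.target = target; }
    }

    /** Surface form: target += value. */
    static final class AddAssign implements Expr {
        final Name target;
        final Expr value;
        AddAssign(Name target, Expr value) { this.target = target; this.value = value; }
    }

    /** Rewrites the surface forms into Assign(target, Add(target, value)). */
    static Expr desugar(Expr e) {
        if (e instanceof PostIncrement) {
            Name t = ((PostIncrement) e).target;
            return new Assign(t, new Add(t, new IntLiteral(1)));
        }
        if (e instanceof AddAssign) {
            AddAssign a = (AddAssign) e;
            return new Assign(a.target, new Add(a.target, a.value));
        }
        return e; // already in core form
    }
}
```

After desugar, all three statements print as the same tree, which is convenient for code generation but loses the information about which form the user actually typed. That loss is why an IDE cannot use a desugared AST.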
15. AST
Eclipse actually has two separate ASTs for Java.
“Internal” AST
May be desugared and extended by the parser.
Used to resolve compilation problems, perform type checking and generate
bytecode.
Example:
– In Java if you do not provide a constructor the compiler will provide a default
constructor for you.
– This is implemented by adding a constructor node under a class node.
“DOM” AST
Exactly represents the user's source code, no desugaring.
Generated from the internal AST.
– “Cleaned up”
Used for code completion, refactoring, and generating the index.
Example:
– The default constructor node is filtered out because it does not actually exist
in the source.
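The default-constructor example can be observed from plain Java with reflection. In this small demo the class names are made up; the source of Empty declares no constructor, yet the compiled class has one.

```java
/** Demo: a class with no constructor in its source still has one after compilation. */
class DefaultConstructorDemo {
    /** No constructor is written here. */
    static class Empty { }

    /** Number of constructors present in the compiled class. */
    static int constructorCount() {
        return Empty.class.getDeclaredConstructors().length;
    }

    /** Parameter count of the compiler-provided constructor. */
    static int parameterCount() {
        return Empty.class.getDeclaredConstructors()[0].getParameterCount();
    }
}
```

The constructor shows up in the class file, which corresponds to the node added in the internal AST, while a source-faithful AST has nothing to show for it.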
16. Bytecode Generation
Each AST node has a
generateCode() method.
Code generation is done
by a depth-first traversal
of the AST.
Each generateCode() method first
calls generateCode() on its children
then generates code for itself.
This works because the JVM
is a stack machine.
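The children-first traversal can be sketched as below. The instruction names (bipush, iadd, imul) are real JVM mnemonics, but the node classes and the simplified generateCode(List) signature are assumptions for illustration and are not ECJ's actual API.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of depth-first code generation for a stack machine. */
class StackCodeGenSketch {
    interface Node {
        void generateCode(List<String> out);
    }

    static final class IntLiteral implements Node {
        final int value;
        IntLiteral(int value) { this.value = value; }

        @Override
        public void generateCode(List<String> out) {
            out.add("bipush " + value); // push a small int constant onto the operand stack
        }
    }

    /** Binary expression; this sketch only handles '+' and '*'. */
    static final class Binary implements Node {
        final char op;
        final Node left;
        final Node right;
        Binary(char op, Node left, Node right) { this.op = op; this.left = left; this.right = right; }

        @Override
        public void generateCode(List<String> out) {
            left.generateCode(out);  // children first: leaves both operands on the stack
            right.generateCode(out);
            out.add(op == '+' ? "iadd" : "imul"); // then the node itself consumes them
        }
    }

    /** Generates code for 1 + 2 * 3. */
    static List<String> compileExample() {
        Node tree = new Binary('+', new IntLiteral(1),
                new Binary('*', new IntLiteral(2), new IntLiteral(3)));
        List<String> out = new ArrayList<>();
        tree.generateCode(out);
        return out;
    }
}
```

Because each operator finds its operands already on the stack, no register allocation or temporaries are needed at this stage.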
18. Dynamic Class Loading
Static languages have a linking step after compilation.
Java uses Dynamic Class Loading
All classes are resolved at runtime.
The first time a class name is encountered it is loaded by the JVM.
Searches the “classpath” for the class file to load.
Advantages:
Reflection
Load and use classes at runtime that were not known to the compiler.
Hotswap :)
Make code changes as you are debugging.
Incremental compiler recompiles the class file, unloads the old version of the class
and loads the new one.
Change the behaviour of the program while it is running without needing to restart.
Creates many challenges for the JIT compiler
Some optimizations are performed based on assumptions.
A class may be loaded at any time that invalidates these assumptions and requires the
optimization to be backed out.
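The reflection advantage listed above can be illustrated with the standard Class.forName call. The wrapper class DynamicLoadingSketch is a made-up name; the class to load is given only as a string, so the compiler of this snippet never sees it.

```java
/** Sketch: load a class by name at run time and instantiate it reflectively. */
class DynamicLoadingSketch {
    static Object instantiate(String className) {
        try {
            // The JVM locates and loads the class the first time its name is resolved.
            Class<?> cls = Class.forName(className);
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not load " + className, e);
        }
    }
}
```

Calling instantiate("java.util.ArrayList") returns a list even though ArrayList is never named as a type in this code.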
19. JIT Compilation
Also known as Dynamic Compilation
Java bytecode is compiled into native machine code while the application is
running.
Results in ~10x speed improvement over pure interpretation.
Compilation overhead is a runtime cost
There must be a payoff
The resulting speedup must outweigh the cost of compiling the method.
Only compile the “hottest” methods.
Granularity:
Method based JIT – compilation unit is a method
Tracing JIT – compilation unit is a trace (a frequently executed path, typically a loop)
IBM Java JIT compiler is method based
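The payoff condition can be written as a break-even inequality. As a sketch, assume a method costs C to compile, takes t_i per call when interpreted and t_c per call when compiled, and will be called n more times. These symbols are illustrative and do not come from the slides.

```latex
C + n\,t_c < n\,t_i
\quad\Longleftrightarrow\quad
n > \frac{C}{t_i - t_c} \qquad (t_i > t_c)

% With the ~10x figure quoted above, t_c \approx t_i / 10, so
n > \frac{C}{0.9\,t_i} \approx 1.1\,\frac{C}{t_i}
```

In words: compiling pays off once the method is called enough times that the accumulated per-call saving exceeds the one-time compilation cost, which is why only hot methods are worth compiling.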
20. JIT Compilation Control
A sampling thread wakes up every X milliseconds and records all the methods that
are currently executing.
When a method reaches some threshold it is queued for native code compilation.
The method is initially compiled at a low optimization level.
More optimization increases compilation overhead.
The jitted version of the method is used on subsequent calls
Note, the interpreted version of the method may still be executing somewhere.
If the jitted method is still hot it may get queued up again for compilation at higher
optimization levels.
JIT compilation happens in separate threads.
Good when you have underutilized cores available.
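The sampling and threshold mechanism can be sketched as a counter per method. This is a single-threaded illustration with hypothetical names (HotMethodSamplerSketch, recordSample); it models only the first queueing step, not the later recompilation at higher optimization levels.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: count samples per method and queue hot methods for compilation. */
class HotMethodSamplerSketch {
    private final int threshold;
    private final Map<String, Integer> sampleCounts = new HashMap<>();
    private final Deque<String> compileQueue = new ArrayDeque<>();

    HotMethodSamplerSketch(int threshold) {
        this.threshold = threshold;
    }

    /** Called on each sampling tick with the methods that are currently executing. */
    void recordSample(Collection<String> executingMethods) {
        for (String method : executingMethods) {
            int count = sampleCounts.merge(method, 1, Integer::sum);
            // Queue the method exactly once, when it first reaches the threshold.
            if (count == threshold) {
                compileQueue.add(method);
            }
        }
    }

    /** Methods waiting for native compilation, in the order they became hot. */
    List<String> queuedForCompilation() {
        return new ArrayList<>(compileQueue);
    }
}
```

In a real JVM the sampler and the compiler run on separate threads, so the counters and the queue would need to be thread-safe.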
21. JIT Characteristics
The JIT compiler can optimize for the target CPU and OS where the application is
running.
The JIT can detect if certain instruction sets are supported.
Knows the size of the data and instruction caches.
Knows how many registers are available.
In contrast a static compiler must generate code for the lowest common
denominator, or generate code separately for each possible target.
JIT compiler has access to profiling data which it can use when performing
optimizations.
Can perform aggressive optimizations based on runtime assumptions.
Can back out optimizations if an assumption is invalidated.
22. JIT Limitations
Compilation overhead is a runtime cost
Certain analyses are impractical to do because they are too slow
Escape analysis is only done at the highest optimization levels
Whole program analysis is not done at all.
Jitted code must often branch back into the interpreter.
Throwing an exception.
Garbage collection points.
Resolving references (i.e. triggering class loading).
Calling an interpreted method from a jitted method.
24. Optimization: Devirtualization
Java programs contain many virtual methods.
If a virtual method has no overrides then it may be devirtualized.
Observation based on the current state of loaded classes.
Removes the overhead of looking up the method implementation.
Enables inlining.
Problem: dynamic class loading
It's possible that at any time a class may be loaded that contains a method that
overrides a method that was devirtualized.
A table of assumptions is maintained.
Each assumption has a list of instructions that must be patched if the assumption is
invalidated.
Patched method may get queued up for recompilation.
25. Optimization: Patching when assumption invalidated
0: no-op
1: fast path- call method directly
2: more code
3: return
4: slow path- call virtual method
5: branch 2
Patch
0: branch 4
1: fast path- call method directly
2: more code
3: return
4: slow path- call virtual method
5: branch 2
Recompile
0: slow path- call virtual method
1: more code
2: return
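The assumption table and the patch step can be simulated in a few lines. This is a sketch under assumptions: DevirtualizationSketch and its members are hypothetical, and a boolean flag stands in for overwriting the no-op at instruction 0 with a branch to the slow path.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: track devirtualized call sites and patch them when an override is loaded. */
class DevirtualizationSketch {
    /** A call site that starts on the fast path and can be patched to the slow path. */
    static final class CallSite {
        private boolean patchedToSlowPath = false;

        /** Which kind of call this site currently performs. */
        String dispatch() {
            return patchedToSlowPath ? "virtual call" : "direct call";
        }
    }

    // Assumption "this method has no overrides" mapped to the call sites that rely on it.
    private final Map<String, List<CallSite>> assumptions = new HashMap<>();

    /** Registers a new devirtualized call site that depends on the method having no overrides. */
    CallSite devirtualize(String method) {
        CallSite site = new CallSite();
        assumptions.computeIfAbsent(method, key -> new ArrayList<>()).add(site);
        return site;
    }

    /** Called when a newly loaded class overrides the method: patch every dependent site. */
    void onOverrideLoaded(String method) {
        List<CallSite> sites = assumptions.remove(method);
        if (sites != null) {
            for (CallSite site : sites) {
                site.patchedToSlowPath = true; // stands in for rewriting "no-op" to "branch 4"
            }
        }
    }
}
```

Sites registered for other methods are untouched, so only code that relied on the invalidated assumption loses the fast path.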