JavaOne 2012 - JVM JIT for Dummies

JVM JIT for Dummies
    And the rest of you, too.
Intro
•   Charles Oliver Nutter
    •   “JRuby Guy”
    •   Sun Microsystems 2006-2009
    •   Engine Yard 2009-2012
    •   Red Hat 2012-
•   Primarily responsible for compiler, perf
    •   Looking inside JVM
What We Will Learn

• How the JVM’s JIT works
• Monitoring the JIT
• Finding problems
• Dumping assembly (don’t be scared!)
What We Won’t

• GC tuning
• GC monitoring with VisualVM
 • Google ‘visualgc’, it’s awesome
• OpenJDK internals
• JNI
Caveat

• Focusing on OpenJDK (Hotspot)
• Other JVMs will do things differently
 • But base principles usually apply
• Flags are specific to Hotspot
 • Internal, subject to change, etc
JIT

• Just-In-Time compilation
• Compiled when needed
 • Maybe immediately before execution
 • ...or when we decide it’s important
 • ...or never?

JavaOne 2012 - JVM JIT for Dummies

  • 7. Mixed-Mode
      • Interpreted
          • Bytecode-walking
          • Artificial stack machine
      • Compiled
          • Direct native operations
          • Native register machine
  • 8. Profiling
      • Gather data about code while interpreting
          • Invariants (types, constants, nulls)
          • Statistics (branches, calls)
      • Use that information to optimize
          • Educated guess
          • Guess can be wrong...
  • 9. The Golden Rule of Optimization
      • Don’t do unnecessary work.
  • 10. Optimization
      • Method inlining
      • Loop unrolling
      • Lock coarsening/eliding
      • Dead code elimination
      • Duplicate code elimination
      • Escape analysis
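Two of the items in the list above, dead code elimination and duplicate code elimination, are easiest to see side by side. The sketch below is hand-written Java, not JIT output: `asWritten` is what a programmer might type, and `asOptimized` is roughly what is left once the call is inlined, the never-taken branch is dropped, and the repeated `x + y` is computed once. The class and method names are made up for illustration.

```java
public class EliminationSketch {
    // Hypothetical helper: the verbose branch is dead whenever the caller passes false.
    static int scale(int v, boolean verbose) {
        if (verbose) {
            System.out.println("scaling " + v);
        }
        return v * 2;
    }

    // As written: a call with a constant flag, and x + y computed twice.
    static int asWritten(int x, int y) {
        return scale(x + y, false) + (x + y) * 3;
    }

    // Roughly what remains after inlining scale(), removing the dead
    // verbose branch, and reusing the duplicated x + y.
    static int asOptimized(int x, int y) {
        int sum = x + y;
        return sum * 2 + sum * 3;
    }

    public static void main(String[] args) {
        System.out.println(asWritten(3, 4) + " " + asOptimized(3, 4)); // prints "35 35"
    }
}
```

Both methods return the same value for every input; the second simply does less work to get there, which is the whole point of the golden rule.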
  • 11. Inlining?
      • Combine caller and callee into one unit
          • e.g. based on profile
          • Perhaps with a guard/test
      • Optimize as a whole
          • More code means better visibility
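The "perhaps with a guard/test" bullet can be pictured in plain Java. This is a hand-written sketch, assuming the profile has only ever seen one receiver class at the call site; `Shape`, `Square`, and `Circle` are invented for the example, and a real JIT would emit machine code and deoptimize on a failed guard rather than fall back to a virtual call as done here.

```java
public class GuardedInlineSketch {
    interface Shape { double area(); }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; }
    }

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        public double area() { return Math.PI * radius * radius; }
    }

    // What the source code says: an ordinary virtual call.
    static double plainCall(Shape s) {
        return s.area();
    }

    // What guarded inlining amounts to if the profile only saw Square.
    static double guardedInline(Shape s) {
        if (s.getClass() == Square.class) {   // the guard: cheap type test
            Square sq = (Square) s;
            return sq.side * sq.side;         // callee body pasted into the caller
        }
        return s.area();                      // guard failed: take the slow path
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(2.0), new Circle(1.0) };
        for (Shape s : shapes) {
            // Both paths must agree, whichever side of the guard is taken.
            System.out.println(plainCall(s) == guardedInline(s));
        }
    }
}
```

The guard is what makes the "educated guess" safe: if the guess is wrong, the result is still correct, only slower.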
  • 12. Inlining
      int addAll(int max) {
          int accum = 0;
          for (int i = 0; i < max; i++) {
              accum = add(accum, i);
          }
          return accum;
      }

      int add(int a, int b) {
          return a + b;
      }
  • 13. Inlining
      int addAll(int max) {
          int accum = 0;
          for (int i = 0; i < max; i++) {
              accum = add(accum, i);   // Only one target is ever seen
          }
          return accum;
      }

      int add(int a, int b) {
          return a + b;
      }
  • 14. Inlining
      int addAll(int max) {
          int accum = 0;
          for (int i = 0; i < max; i++) {
              accum = accum + i;       // Don’t bother making the call
          }
          return accum;
      }
  • 15. Loop Unrolling
      • Works for small, constant loops
      • Avoid tests, branching
      • Allow inlining a single call as many
  • 16. Loop Unrolling
      private static final String[] options = { "yes", "no", "maybe" };

      public void looper() {
          // Small loop, constant stride, constant size
          for (String option : options) {
              process(option);
          }
      }
  • 17. Loop Unrolling
      private static final String[] options = { "yes", "no", "maybe" };

      public void looper() {
          // Unrolled!
          process(options[0]);
          process(options[1]);
          process(options[2]);
      }
  • 18. Lock Coarsening
      public void needsLocks() {
          for (String option : options) {
              process(option);         // Repeatedly locking
          }
      }

      private synchronized String process(String option) {
          // some wacky thread-unsafe code
      }
  • 19. Lock Coarsening
      public void needsLocks() {
          synchronized (this) {        // Lock once
              for (String option : options) {
                  // some wacky thread-unsafe code
              }
          }
      }
  • 20. Lock Eliding
      public void overCautious() {
          List l = new ArrayList();
          synchronized (l) {           // Synchronize on new Object
              for (String option : options) {
                  l.add(process(option));
              }
          }
      }
      // But we know it never escapes this thread...
  • 21. Lock Eliding
      public void overCautious() {
          List l = new ArrayList();
          for (String option : options) {
              l.add( /* process()’s code */ );
          }
      }
      // No need to lock
  • 22. Escape Analysis
      private static class Foo {
          public final String a;
          public final String b;

          Foo(String a, String b) {
              this.a = a;
              this.b = b;
          }
      }
  • 23. Escape Analysis
      public void bar() {
          Foo f = new Foo("Hello", "JVM");
          baz(f);
      }

      public void baz(Foo f) {         // Same object all the way through
          System.out.print(f.a);
          System.out.print(", ");
          quux(f);
      }

      public void quux(Foo f) {        // Never “escapes” these methods
          System.out.print(f.b);
          System.out.println('!');
      }
  • 24. Escape Analysis
      public secret awesome inlinedBarBazQuux() {
          System.out.print("Hello");
          System.out.print(", ");
          System.out.print("JVM");
          System.out.println('!');
      }
      // Don’t bother allocating Foo object
  • 25. Escape Analysis
      • A bit tweaky on Hotspot
          • All paths must inline
          • No external view of object
      • JRockit was better here?
          • Now they can fix Hotspot!
  • 26. Perf Sinks
      • Memory accesses
          • By far the biggest expense
      • Calls
          • Memory ref + branch kills pipeline
          • Call stack, register juggling costs
      • Locks
  • 27. Volatile?
      • Each CPU maintains a memory cache
      • Caches may be out of sync
          • If it doesn’t matter, no problem
          • If it does matter, threads disagree!
      • Volatile forces synchronization of cache
          • Across cores and to main memory
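The "threads disagree" case is the classic stop flag. Below is a minimal sketch, assuming one worker thread spinning on a flag that the main thread clears; the class and method names are invented for illustration. With `volatile` on the flag the worker is guaranteed to observe the write and exit. Without it, the Java memory model allows the JIT to hoist the read out of the loop, and the worker may spin forever.

```java
public class VolatileFlag {
    // volatile: every read in the worker's loop must observe the latest write.
    private static volatile boolean running = true;
    private static long iterations = 0;

    // Spins a worker until the flag is cleared, then reports how far it got.
    static long spinUntilStopped(long sleepMillis) throws InterruptedException {
        running = true;
        iterations = 0;
        Thread worker = new Thread(() -> {
            long n = 0;
            while (running) {   // re-read on every pass because the field is volatile
                n++;
            }
            iterations = n;
        });
        worker.start();
        Thread.sleep(sleepMillis);
        running = false;        // volatile write, visible to the worker
        worker.join();          // join() also makes the worker's write to iterations visible
        return iterations;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stopped after " + spinUntilStopped(100) + " iterations");
    }
}
```

Removing the `volatile` keyword and rerunning is a reasonable experiment, though whether the hang actually shows up depends on the JVM, the flags, and how hot the loop gets.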
  • 28. Call Site • The place where you make a call • Monomorphic (“one shape”) • Single target class • Bimorphic (“two shapes”) • Polymorphic (“many shapes”) • Megamorphic (“you’re screwed”)
  • 29. Blah.java System.currentTimeMillis(); // static, monomorphic List list1 = new ArrayList(); // constructor, monomorphic List list2 = new LinkedList(); for (List list : new List[]{ list1, list2 }) { list.add("hello"); // bimorphic } for (Object obj : new Object[]{ "foo", list1, new Object() }) { obj.toString(); // polymorphic }
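The terminology can be illustrated with a hand-rolled "profile" that counts how many distinct receiver classes reach one call site. This is only a sketch of the vocabulary: `CallSiteShapes` and `shapeOf` are invented for the example, and it is not how Hotspot stores its profiles or where it draws the megamorphic line.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.Set;

public class CallSiteShapes {
    // Feed receivers through a single toString() call site and
    // classify the site by how many distinct classes showed up.
    static String shapeOf(Object... receivers) {
        Set<Class<?>> seen = new HashSet<>();
        for (Object receiver : receivers) {
            receiver.toString();            // the call site being "profiled"
            seen.add(receiver.getClass());
        }
        switch (seen.size()) {
            case 0:  return "never called";
            case 1:  return "monomorphic";
            case 2:  return "bimorphic";
            default: return "polymorphic";
        }
    }

    public static void main(String[] args) {
        System.out.println(shapeOf("a", "b"));
        System.out.println(shapeOf(new ArrayList<String>(), new LinkedList<String>()));
        System.out.println(shapeOf("foo", new ArrayList<String>(), new Object()));
    }
}
```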
  • 30. Hotspot • -client mode (C1) inlines, less aggressive • Fewer opportunities to optimize • -server mode (C2) inlines aggressively • Based on richer runtime profiling
  • 31. Tiered • Increasing tiers of interp, C1, and C2 • Level 0 = Interpreter • Level 1-3 = C1 • Level 4 = C2 • Kinda sorta works...
  • 32. system ~/projects/javaone2012-jit $ (pickjdk 4 ; time jruby -e 1) New JDK: jdk1.7.0_07.jdk real 0m1.251s user 0m2.128s sys 0m0.093s system ~/projects/javaone2012-jit $ (pickjdk 5 ; time jruby -e 1) New JDK: jdk1.8.0.jdk real 0m1.167s user 0m2.767s sys 0m0.143s system ~/projects/javaone2012-jit $ (pickjdk 5 ; time jruby -J-XX:TieredStopAtLevel=1 -e 1) New JDK: jdk1.8.0.jdk real 0m0.850s user 0m1.344s sys 0m0.114s
  • 33. C2 Compiler • Profile to find “hot spots” • Call sites • Branch statistics • Profile until 10k calls • Inline mono/bimorphic calls • Other mechanisms for polymorphic calls
  • 34. Now it gets fun!
  • 35. Monitoring the JIT • Dozens of flags • Reams of output • Always evolving • How can you understand it?
  • 36. public class Accumulator { public static void main(String[] args) { int max = Integer.parseInt(args[0]); System.out.println(addAll(max)); } static int addAll(int max) { int accum = 0; for (int i = 0; i < max; i++) { accum = add(accum, i); } return accum; } static int add(int a, int b) { return a + b; } }
  • 37. $ java -version openjdk version "1.7.0-b147" OpenJDK Runtime Environment (build 1.7.0-b147-20110927) OpenJDK 64-Bit Server VM (build 21.0-b17, mixed mode) $ javac Accumulator.java $ java Accumulator 1000 499500
  • 38. Print Compilation • -XX:+PrintCompilation • Print methods as they JIT • Class + name + size
  • 39. $ java -XX:+PrintCompilation Accumulator 1000 53 1 java.lang.String::hashCode (67 bytes) 499500
  • 40. $ java -XX:+PrintCompilation Accumulator 1000 53 1 java.lang.String::hashCode (67 bytes) 499500 Where’s our code?
  • 41. $ java -XX:+PrintCompilation Accumulator 1000 53 1 java.lang.String::hashCode (67 bytes) 499500 Where’s our code? Remember...10k calls before JIT
  • 42. 10k loop, 10k calls to add $ java -XX:+PrintCompilation Accumulator 10000 53 1 java.lang.String::hashCode (67 bytes) 64 2 Accumulator::add (4 bytes) 49995000 Hooray!
  • 43. But what’s this? $ java -XX:+PrintCompilation Accumulator 10000 53 1 java.lang.String::hashCode (67 bytes) 64 2 Accumulator::add (4 bytes) 49995000 Class loading, security logic, other stuff...
  • 44. Hotspot is making zombies? 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes)
  • 45. Hotspot is making zombies? 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes) Not entrant? What the heck?
  • 46. Optimistic Compilers • Assume profile is accurate • Aggressively optimize based on profile • Bail out if we’re wrong • ...and hope that we’re usually right
  • 47. Deoptimization • Bail out of running code • Monitoring flags describe process • “uncommon trap” - something’s changed • “not entrant” - don’t let new calls enter • “zombie” - on its way to deadness
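A sketch that tends to provoke this: warm up a loop while only one concrete subclass exists, then hand it a second one. All names are hypothetical. Run with `-XX:+PrintCompilation`, `drain` may show up first as compiled and later as "made not entrant", but the exact output depends on the JVM version and is not guaranteed.

```java
public class DeoptSketch {
    static abstract class Source {
        abstract int read();
    }

    static final class ByteSource extends Source {
        int read() { return 1; }
    }

    // Not loaded until run() first instantiates it.
    static final class StreamSource extends Source {
        int read() { return 2; }
    }

    static int drain(Source s, int n) {
        int total = 0;
        for (int i = 0; i < n; i++) {
            total += s.read();      // looks monomorphic during warm-up
        }
        return total;
    }

    static int run() {
        int total = 0;
        Source bytes = new ByteSource();
        for (int i = 0; i < 20_000; i++) {
            total += drain(bytes, 10);   // JIT may assume ByteSource is the only Source
        }
        Source stream = new StreamSource();
        total += drain(stream, 10);      // assumption broken: compiled code must bail out
        return total;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Whatever the JIT decides, the result is the same; deoptimization only changes which version of the code produces it.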
  • 49. JRuby red_black perf 4s Most code not JITed yet 3s 2s 1s 0s
  • 50. JRuby red_black perf 4s Most code not JITed yet 3s Back off 2s 1s 0s
  • 51. JRuby red_black perf 4s Most code not JITed yet 3s Back off Back off 2s 1s 0s
  • 52. No JIT At All? • Code is too big • Code isn’t called enough
  • 53. That looks exciting! 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes)
  • 54. Exception handling in here (boring!) 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes)
  • 55. Exception Handling • Unroll stack until someone stops us • Handler gets registered in JVM • Different treatment by JIT • Inlined throw + catch = jump • If no stack trace, essentially free
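The "no stack trace" point can be sketched with an exception that skips stack-trace capture, using the four-argument `RuntimeException` constructor available since Java 7. `StacklessExceptionSketch` and `Found` are invented names; whether the throw and catch really collapse into a jump depends on `scan()` being inlined into its caller.

```java
public class StacklessExceptionSketch {
    // Control-flow exception: no suppression, no writable stack trace.
    static final class Found extends RuntimeException {
        final int index;
        Found(int index) {
            super(null, null, false, false);
            this.index = index;
        }
    }

    static void scan(int[] values) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0) {
                throw new Found(i);
            }
        }
    }

    // If scan() inlines here, the throw and the catch sit in the same
    // compiled method and the JIT can treat them as a plain branch.
    static int firstNegative(int[] values) {
        try {
            scan(values);
            return -1;
        } catch (Found f) {
            return f.index;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNegative(new int[] { 3, 5, -2, 7 }));
        System.out.println(new Found(0).getStackTrace().length);
    }
}
```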
  • 56. What’s this “n” all about? 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes)
  • 57. This method is native 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes)
  • 58. And this one? 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes) Method has been replaced while running (OSR)
  • 59. On-Stack Replacement • Running method never exits? • But it’s getting really hot? • Generally means loops, back-branching • Compile and replace while running • Not typically useful in large systems • Looks great on benchmarks!
  • 60. public class Accumulator { public static void main(String[] args) { int max = Integer.parseInt(args[0]); System.out.println(addAll(max)); } addAll never exits... static int addAll(int max) { int accum = 0; loops until end for (int i = 0; i < max; i++) { accum = add(accum, i); } return accum; } static int add(int a, int b) { return a + b; } }
  • 61. system ~/projects/javaone2012-jit $ java -XX:+PrintCompilation Accumulator1 1000 63 1 java.lang.String::hashCode (55 bytes) 499500 system ~/projects/javaone2012-jit $ java -XX:+PrintCompilation Accumulator1 10000 63 1 java.lang.String::hashCode (55 bytes) 74 2 Accumulator1::add (4 bytes) 49995000 system ~/projects/javaone2012-jit $ java -XX:+PrintCompilation Accumulator1 100000 62 1 java.lang.String::hashCode (55 bytes) 73 2 Accumulator1::add (4 bytes) 74 1 % Accumulator1::addAll @ 4 (23 bytes) 704982704
  • 62. Millis from JVM start 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes) Sequence number of compilation
  • 63. Compiling 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes)
  • 64. Backing Off 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes)
  • 65. OSR 1401 70 java.util.concurrent.ConcurrentHashMap::hash (49 bytes) 1412 71 java.lang.String::indexOf (7 bytes) 1420 72 ! java.io.BufferedReader::readLine (304 bytes) 1420 73 sun.nio.cs.UTF_8$Decoder::decodeArrayLoop (543 bytes) 1422 42 java.util.zip.ZipCoder::getBytes (192 bytes) made not entrant 1435 74 n java.lang.Object::hashCode (0 bytes) 1443 29 ! sun.misc.URLClassPath$JarLoader::getResource (91 bytes) made zombie 1443 25 sun.misc.URLClassPath::getResource (74 bytes) made zombie 1443 36 sun.misc.URLClassPath::getResource (74 bytes) made not entrant 1443 43 java.util.zip.ZipCoder::encoder (35 bytes) made not entrant 1449 75 java.lang.String::endsWith (15 bytes) 1631 1 % sun.misc.URLClassPath::getResource @ 39 (74 bytes) 1665 76 java.lang.ClassLoader::checkName (43 bytes)
  • 66. system ~/projects/javaone2012-jit $ java -XX:+PrintCompilation -XX:+TieredCompilation Accumulator1 1000 55 1 3 java.lang.String::charAt (29 bytes) 57 2 3 java.lang.String::hashCode (55 bytes) 57 3 3 java.lang.Object::<init> (1 bytes) 57 4 n 0 java.lang.System::arraycopy (0 bytes) (static) 57 5 3 java.lang.String::indexOf (70 bytes) 57 6 3 java.lang.String::length (6 bytes) 58 7 3 java.lang.AbstractStringBuilder::ensureCapacityInternal (16 bytes) 59 8 3 java.lang.String::equals (81 bytes) ... 69 26 3 java.lang.Character::toLowerCase (6 bytes) 69 27 3 java.lang.AbstractStringBuilder::append (48 bytes) 70 28 3 java.lang.String::indexOf (7 bytes) 72 29 4 java.lang.String::charAt (29 bytes) 72 30 3 java.lang.StringBuilder::append (8 bytes) 73 31 1 java.net.URL::getProtocol (5 bytes) 73 32 3 java.lang.String::lastIndexOf (52 bytes) 74 33 3 java.io.UnixFileSystem::normalize (75 bytes) 75 1 3 java.lang.String::charAt (29 bytes) made not entrant 77 36 n 0 java.lang.Thread::currentThread (0 bytes) (static) 77 35 3 Accumulator1::add (4 bytes) 499500 Tier we’re at Only called 1k times
  • 67. Print Inlining • -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining • Display hierarchy of inlined methods • Include reasons for not inlining • More, better output on OpenJDK 7
  • 68. $ java -XX:+UnlockDiagnosticVMOptions > -XX:+PrintInlining > Accumulator 10000 49995000
  • 69. $ java -XX:+UnlockDiagnosticVMOptions > -XX:+PrintInlining > Accumulator 10000 49995000 Um...I don’t see anything inlining
  • 70. public class Accumulator { public static void main(String[] args) { int max = Integer.parseInt(args[0]); System.out.println(addAll(max)); } static int addAll(int max) { Called only once int accum = 0; for (int i = 0; i < max; i++) { accum = add(accum, i); } return accum; } static int add(int a, int b) { return a + b; } }
  • 71. public class Accumulator { public static void main(String[] args) { int max = Integer.parseInt(args[0]); System.out.println(addAll(max)); } static int addAll(int max) { Called only once int accum = 0; for (int i = 0; i < max; i++) { accum = add(accum, i); } return accum; Called 10k times } static int add(int a, int b) { return a + b; } }
  • 72. public class Accumulator { public static void main(String[] args) { int max = Integer.parseInt(args[0]); System.out.println(addAll(max)); } static int addAll(int max) { Called only once int accum = 0; for (int i = 0; i < max; i++) { accum = add(accum, i); } return accum; Called 10k times } static int add(int a, int b) { JITs as expected return a + b; } }
  • 73. public class Accumulator { public static void main(String[] args) { int max = Integer.parseInt(args[0]); System.out.println(addAll(max)); } static int addAll(int max) { Called only once int accum = 0; for (int i = 0; i < max; i++) { accum = add(accum, i); } return accum; Called 10k times } static int add(int a, int b) { JITs as expected return a + b; } } But makes no calls!
  • 74. static double addAllSqrts(int max) { double accum = 0; for (int i = 0; i < max; i++) { accum = addSqrt(accum, i); } return accum; } static double addSqrt(double a, int b) { return a + sqrt(b); } static double sqrt(int a) { return Math.sqrt(a); }
  • 75. $ java -XX:+UnlockDiagnosticVMOptions > -XX:+PrintInlining > -XX:+PrintCompilation > Accumulator 10000 53 1 java.lang.String::hashCode (67 bytes) 65 2 Accumulator::addSqrt (7 bytes) @ 3 Accumulator::sqrt (6 bytes) inline (hot) @ 2 java.lang.Math::sqrt (5 bytes) (intrinsic) 65 3 Accumulator::sqrt (6 bytes) @ 2 java.lang.Math::sqrt (5 bytes) (intrinsic) 666616.4591971082
  • 76. $ java -XX:+UnlockDiagnosticVMOptions > -XX:+PrintInlining HOT HOT HOT! > -XX:+PrintCompilation > Accumulator 10000 53 1 java.lang.String::hashCode (67 bytes) 65 2 Accumulator::addSqrt (7 bytes) @ 3 Accumulator::sqrt (6 bytes) inline (hot) @ 2 java.lang.Math::sqrt (5 bytes) (intrinsic) 65 3 Accumulator::sqrt (6 bytes) @ 2 java.lang.Math::sqrt (5 bytes) (intrinsic) 666616.4591971082
  • 77. $ java -XX:+UnlockDiagnosticVMOptions > -XX:+PrintInlining > -XX:+PrintCompilation > Accumulator 10000 53 1 java.lang.String::hashCode (67 bytes) 65 2 Accumulator::addSqrt (7 bytes) @ 3 Accumulator::sqrt (6 bytes) inline (hot) @ 2 java.lang.Math::sqrt (5 bytes) (intrinsic) 65 3 Accumulator::sqrt (6 bytes) @ 2 java.lang.Math::sqrt (5 bytes) (intrinsic) 666616.4591971082 Calls treated specially by JIT
  • 78. Intrinsic? • Known to the JIT • Don’t inline bytecode • Do insert “best” native code • e.g. kernel-level memory operation • e.g. optimized sqrt in machine code
  • 79. Common Intrinsics • String#equals • Most (all?) Math methods • System.arraycopy • Object#hashCode • Object#getClass • sun.misc.Unsafe methods
  • 80. LogCompilation • -XX:+LogCompilation • Dumps compiler events to hotspot.log • Tons and tons of output
  • 81. scopes_pcs_offset='1384' dependencies_offset='1576' handler_table_offset='1592' nul_chk_table_offset='1736' oops_offset='992' method='org/jruby/lexer/yacc/ByteArrayLexerSource$ByteArrayCursor read ()I' bytes='49' count='5296' backedge_count='1' iicount='10296' stamp='0.412'/> <writer thread='4425007104'/> <nmethod compile_id='21' compiler='C2' entry='4345862528' size='1152' address='4345862160' relocation_offset='288' insts_offset='368' stub_offset='688' scopes_data_offset='840' scopes_pcs_offset='904' dependencies_offset='1016' handler_table_offset='1032' oops_offset='784' method='org/jruby/lexer/yacc/ ByteArrayLexerSource forward (I)I' bytes='111' count='5296' backedge_count='1' iicount='10296' stamp='0.412'/> <writer thread='4300214272'/> <task_queued compile_id='22' method='org/jruby/lexer/yacc/ByteArrayLexerSource read ()I' bytes='10' count='5000' backedge_count='1' iicount='10000' stamp='0.433' comment='count' hot_count='10000'/> <writer thread='4426067968'/> <nmethod compile_id='22' compiler='C2' entry='4345885984' size='1888' address='4345885584' relocation_offset='288' insts_offset='400' stub_offset='912' scopes_data_offset='1104' scopes_pcs_offset='1496' dependencies_offset='1704' handler_table_offset='1720' nul_chk_table_offset='1864' oops_offset='1024' method='org/jruby/lexer/yacc/ByteArrayLexerSource read ()I' bytes='10' count='5044' backedge_count='1' iicount='10044' stamp='0.435'/> <writer thread='4300214272'/> <task_queued compile_id='23' method='java/util/HashMap hash (I)I' bytes='23' count='5000' backedge_count='1' iicount='10000' stamp='0.442' comment='count' hot_count='10000'/> <writer thread='4425007104'/> <nmethod compile_id='23' compiler='C2' entry='4345887808' size='440' address='4345887504' relocation_offset='288' insts_offset='304' stub_offset='368' scopes_data_offset='392' scopes_pcs_offset='400' dependencies_offset='432' method='java/util/HashMap hash (I)I' bytes='23' count='5039' backedge_count='1' iicount='10039' 
stamp='0.442'/> <writer thread='4300214272'/> <dependency_failed type='abstract_with_unique_concrete_subtype' ctxk='org/jruby/lexer/yacc/LexerSource' x='org/jruby/lexer/yacc/ByteArrayLexerSource' witness='org/jruby/lexer/yacc/InputStreamLexerSource' stamp='0.456'/> <dependency_failed type='abstract_with_unique_concrete_subtype' ctxk='org/jruby/lexer/yacc/LexerSource' x='org/jruby/lexer/yacc/ByteArrayLexerSource' witness='org/jruby/lexer/yacc/InputStreamLexerSource' stamp='0.456'/> <dependency_failed type='abstract_with_unique_concrete_subtype' ctxk='org/jruby/lexer/yacc/LexerSource' x='org/jruby/lexer/yacc/ByteArrayLexerSource' witness='org/jruby/lexer/yacc/InputStreamLexerSource' stamp='0.456'/> <dependency_failed type='abstract_with_unique_concrete_subtype' ctxk='org/jruby/lexer/yacc/LexerSource' x='org/jruby/lexer/yacc/ByteArrayLexerSource' witness='org/jruby/lexer/yacc/InputStreamLexerSource' stamp='0.456'/>
  • 83. Worst XML Evar • Relational structure in hierarchical form • Hotspot guys can read it...I cannot • <JDK>/hotspot/src/share/tools/LogCompilation • or http://github.com/headius/logc
  • 84. No flags, like PrintCompilation $ java -jar logc.jar hotspot.log 1 java.lang.String::hashCode (67 bytes) 2 Accumulator::addSqrt (7 bytes) 3 Accumulator::sqrt (6 bytes)
  • 85. -i flag, PrintCompilation and PrintInlining $ java -jar logc.jar -i hotspot.log 1 java.lang.String::hashCode (67 bytes) 2 Accumulator::addSqrt (7 bytes) @ 2 Accumulator::sqrt (6 bytes) (end time: 0.0660 nodes: 36) @ 2 java.lang.Math::sqrt (5 bytes) 3 Accumulator::sqrt (6 bytes) @ 2 java.lang.Math::sqrt (5 bytes)
  • 87. 8 sun.nio.cs.UTF_8$Encoder::encode (361 bytes) 6 uncommon trap null_check make_not_entrant @8 java/lang/String equals (Ljava/lang/Object;)Z 6 make_not_entrant 9 java.lang.String::equals (88 bytes) 10 java.util.LinkedList::indexOf (73 bytes)
  • 88. Hotspot sees it’s 100% String 10 java.util.LinkedList::indexOf (73 bytes) @ 52 java.lang.Object::equals (11 bytes) type profile java/lang/Object -> java/lang/String (100%) @ 52 java.lang.String::equals (88 bytes) 11 java.lang.String::indexOf (87 bytes) @ 83 java.lang.String::indexOfSupplementary too big Too big to inline! Could be bad?
  • 89. Tuning Inlining • -XX:MaxInlineSize=35 • Largest inlinable method (bytecode) • -XX:InlineSmallCode=# • Largest inlinable compiled method • -XX:FreqInlineSize=# • Largest frequently-called method...
  • 90. Tuning Inlining • -XX:MaxInlineLevel=9 • How deep does the rabbit hole go? • -XX:MaxRecursiveInlineLevel=# • Recursive inlining
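Before tuning any of these, it helps to see what the running JVM actually uses. The sketch below reads the values through `com.sun.management.HotSpotDiagnosticMXBean`, which is Hotspot-specific; the class name `InlineFlagsSketch` and the "unavailable" fallback are assumptions for the example, and defaults differ between JDK versions and platforms.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class InlineFlagsSketch {
    // Returns the current value of a -XX flag, or "unavailable" if this
    // JVM does not expose it (unknown flag, or not a Hotspot VM).
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        String[] flags = {
            "MaxInlineSize", "InlineSmallCode", "FreqInlineSize",
            "MaxInlineLevel", "MaxRecursiveInlineLevel"
        };
        for (String flag : flags) {
            System.out.println(flag + " = " + flagValue(flag));
        }
    }
}
```

Running it once with no flags and once with, say, `-XX:MaxInlineSize=50` is a quick way to confirm a setting was picked up.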
  • 93. The Red Pill • Knowing code compiles is good • Knowing code inlines is better • Seeing the actual assembly is best!
  • 94. Caveat • I don’t really know assembly. • But I fake it really well.
  • 95. Print Assembly • -XX:+PrintAssembly • Google “hotspot printassembly” • https://wikis.oracle.com/display/HotSpotInternals/PrintAssembly • Assembly-dumping plugin for Hotspot
  • 96. Alternative • -XX:+PrintOptoAssembly • Only in debug/fastdebug builds • Not as pretty
  • 97. Wednesday, July 27, 2011 ~/oscon ! java -XX:+UnlockDiagnosticVMOptions > -XX:+PrintAssembly > Accumulator 10000 OpenJDK 64-Bit Server VM warning: PrintAssembly is enabled; turning on DebugNonSafepoints to gain additional output Loaded disassembler from hsdis-amd64.dylib ...
  • 98. Decoding compiled method 11343cbd0: Code: [Disassembling for mach='i386:x86-64'] [Entry Point] [Verified Entry Point] [Constants] # {method} 'add' '(II)I' in 'Accumulator' # parm0: rsi = int # parm1: rdx = int # [sp+0x20] (sp of caller) 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq
  • 100. x86_64 Assembly 101 add Two’s complement add sub ...subtract mov* Move data from a to b jmp goto je, jne, jl, jge, ... Jump if ==, !=, <, >=, ... push, pop Call stack operations call*, ret* Call, return from subroutine eax, ebx, esi, ... 32-bit registers rax, rbx, rsi, ... 64-bit registers
  • 101. Register Machine • Instead of stack moves, we have “slots” • Move data into slots • Trigger operations that manipulate data • Get new data out of slots • JVM stack, locals end up as register ops
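To see the stack-versus-register difference side by side, here is a toy sketch. `interpretAdd` walks the four bytecodes javac emits for `add` (`iload_0`, `iload_1`, `iadd`, `ireturn`, the "4 bytes" in the PrintCompilation output) on an operand stack; `compiledAdd` mirrors the two x86 instructions from the dump, with a local variable standing in for `eax`. Both methods are illustrative, not how the interpreter or JIT is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StackVsRegisterSketch {
    // Stack-machine view: operands are pushed, iadd pops two and pushes one.
    static int interpretAdd(int a, int b) {
        int[] locals = { a, b };
        Deque<Integer> stack = new ArrayDeque<>();
        String[] code = { "iload_0", "iload_1", "iadd", "ireturn" };
        for (String op : code) {
            switch (op) {
                case "iload_0": stack.push(locals[0]); break;
                case "iload_1": stack.push(locals[1]); break;
                case "iadd":    stack.push(stack.pop() + stack.pop()); break;
                case "ireturn": return stack.pop();
                default: throw new IllegalStateException(op);
            }
        }
        throw new IllegalStateException("fell off the end of the method");
    }

    // Register-machine view: the same work with no stack traffic at all.
    static int compiledAdd(int a, int b) {
        int eax = a;    // mov %esi,%eax
        eax += b;       // add %edx,%eax
        return eax;     // retq
    }

    public static void main(String[] args) {
        System.out.println(interpretAdd(2, 40));
        System.out.println(compiledAdd(2, 40));
    }
}
```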
  • 102. Native Stack? • Native code has a stack too • Preserves registers from call to call • Various calling conventions • Caller preserves registers? • Callee preserves registers?
  • 103. Decoding compiled method 11343cbd0: <= address of new compiled code Code: [Disassembling for mach='i386:x86-64'] <= architecture [Entry Point] [Verified Entry Point] [Constants] # {method} 'add' '(II)I' in 'Accumulator' <= method, signature, class # parm0: rsi = int <= first parm to method goes in rsi # parm1: rdx = int <= second parm goes in rdx # [sp+0x20] (sp of caller) <= caller’s pointer into native stack
  • 104. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq rbp points at current stack frame, so we save it off.
  • 105. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq Two args, so we bump stack pointer by 0x10.
  • 106. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq Do nothing, e.g. to memory-align code.
  • 107. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq At the “-1” instruction of our add() method... i.e. here we go!
  • 108. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq Move parm0 (esi) into eax.
  • 109. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq Add parm0 and parm1, store result in eax.
  • 110. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq How nice, Hotspot shows us this is our “iadd” op!
  • 111. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq Put stack pointer back where it was.
  • 112. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq pop %rbp: restore rbp from the stack.
  • 113. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq test %eax,-0x1303fd15(%rip) ; {poll_return}: poll a “safepoint”, giving the JVM a chance to GC, etc.
  • 114. 11343cd00: push %rbp 11343cd01: sub $0x10,%rsp 11343cd05: nop ;*synchronization entry ; - Accumulator::add@-1 (line 16) 11343cd06: mov %esi,%eax 11343cd08: add %edx,%eax ;*iadd ; - Accumulator::add@2 (line 16) 11343cd0a: add $0x10,%rsp 11343cd0e: pop %rbp 11343cd0f: test %eax,-0x1303fd15(%rip) # 1003fd000 ; {poll_return} 11343cd15: retq retq: return to the caller. All done!
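For readers who want to reproduce a listing like the one walked through above, here is a minimal sketch of a class that could produce it. Only the names Accumulator and add (and the iadd at bytecode index 2) come from the disassembly comments; the addAll loop, the default iteration count, and the source line numbers are assumptions for illustration, so "line 16" will not necessarily match.

```java
public class Accumulator {
    // Tiny static method: its bytecode is iload_0, iload_1, iadd, ireturn,
    // so the iadd sits at bytecode index 2, as in the ";*iadd ... add@2" comment.
    static int add(int a, int b) {
        return a + b;
    }

    // Hypothetical driver loop: calls add() often enough for the JIT to treat it as hot.
    // Integer overflow simply wraps; the value is irrelevant to the exercise.
    static int addAll(int max) {
        int accum = 0;
        for (int i = 0; i < max; i++) {
            accum = add(accum, i);
        }
        return accum;
    }

    public static void main(String[] args) {
        // Assumed default; pass a larger count on the command line if nothing gets compiled.
        int max = args.length > 0 ? Integer.parseInt(args[0]) : 100_000;
        System.out.println(addAll(max));
    }
}
```

Running it with something like `java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly Accumulator 100000` should dump the generated code, provided the hsdis disassembler plugin is installed for your JVM. Addresses, register choices, and whether add() appears on its own or only inlined into addAll() will vary by JVM version and platform.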
  • 115. Things to Watch For • CALL operations • Indicate something failed to inline • LOCK operations • Cache-busting, e.g. volatile field writes
  • 116. CALL 1134858f5: xchg %ax,%ax 1134858f7: callq 113414aa0 ; OopMap{off=316} ;*invokespecial addAsBignum ; - org.jruby.RubyFixnum::addFixnum@29 (line 348) ; {optimized virtual_call} 1134858fc: jmpq 11348586d Ruby integer adds might overflow into Bignum, leading to an addAsBignum call. In this case, it has never been called, so Hotspot emits a plain callq instead of inlining it, assuming we won’t hit it.
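The shape of code that leads to a CALL like this can be sketched as a fast path with a rarely taken slow path. This is a hypothetical illustration, not JRuby's actual RubyFixnum code: the class FixnumAdd, the use of long and BigInteger, and the overflow test are all assumptions; only the method name addAsBignum is borrowed from the listing.

```java
import java.math.BigInteger;

public class FixnumAdd {
    // Fast path: plain long addition with an overflow check.
    static Object add(long a, long b) {
        long result = a + b;
        // Overflow happened iff both operands differ in sign from the result.
        if (((a ^ result) & (b ^ result)) < 0) {
            // Slow path: rarely (or never) reached, so the JIT may leave it as a call
            // rather than inlining it.
            return addAsBignum(a, b);
        }
        return result; // autoboxed to Long
    }

    // Slow path: arbitrary-precision addition.
    static Object addAsBignum(long a, long b) {
        return BigInteger.valueOf(a).add(BigInteger.valueOf(b));
    }

    public static void main(String[] args) {
        System.out.println(add(1, 2));
        System.out.println(add(Long.MAX_VALUE, 1));
    }
}
```

A CALL on a branch that profiling says is never taken is usually harmless. A CALL on the hot path is the one worth investigating.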
  • 117. LOCK Code from a RubyBasicObject’s default constructor. 11345d823: mov 0x70(%r8),%r9d ;*getstatic NULL_OBJECT_ARRAY ; - org.jruby.RubyBasicObject::<init>@5 (line 76) ; - org.jruby.RubyObject::<init>@2 (line 118) ; - org.jruby.RubyNumeric::<init>@2 (line 111) ; - org.jruby.RubyInteger::<init>@2 (line 95) ; - org.jruby.RubyFixnum::<init>@5 (line 112) ; - org.jruby.RubyFixnum::newFixnum@25 (line 173) 11345d827: mov %r9d,0x14(%rax) 11345d82b: lock addl $0x0,(%rsp) ;*putfield varTable ; - org.jruby.RubyBasicObject::<init>@8 (line 76) ; - org.jruby.RubyObject::<init>@2 (line 118) ; - org.jruby.RubyNumeric::<init>@2 (line 111) ; - org.jruby.RubyInteger::<init>@2 (line 95) ; - org.jruby.RubyFixnum::<init>@5 (line 112) ; - org.jruby.RubyFixnum::newFixnum@25 (line 173) The lock addl $0x0,(%rsp) after the putfield is a memory barrier, which Hotspot emits for a volatile write. Why are we doing a volatile write in the constructor?
  • 118. LOCK public class RubyBasicObject ... { private static final boolean DEBUG = false; private static final Object[] NULL_OBJECT_ARRAY = new Object[0]; // The class of this object protected transient RubyClass metaClass; // zeroed by jvm protected int flags; // variable table, lazily allocated as needed (if needed) private volatile Object[] varTable = NULL_OBJECT_ARRAY; Maybe it’s not such a good idea to pre-init a volatile?
  • 119. LOCK ~/projects/jruby ! git log 2f935de1e40bfd8b29b3a74eaed699e519571046 -1 | cat commit 2f935de1e40bfd8b29b3a74eaed699e519571046 Author: Charles Oliver Nutter <headius@headius.com> Date: Tue Jun 14 02:59:41 2011 -0500 Do not eagerly initialize volatile varTable field in RubyBasicObject; speeds object creation significantly. LEVEL UP!
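The idea behind that commit can be sketched as follows. This is a simplified, hypothetical illustration rather than the actual JRuby change: the classes VarTableDemo, Eager, and Lazy and their methods are invented here; only the field name varTable and NULL_OBJECT_ARRAY come from the slides.

```java
public class VarTableDemo {
    static final Object[] NULL_OBJECT_ARRAY = new Object[0];

    // Before: the field initializer is a volatile store executed in every constructor,
    // which is where the LOCK-prefixed instruction came from.
    static class Eager {
        private volatile Object[] varTable = NULL_OBJECT_ARRAY;

        Object[] getVarTable() {
            return varTable;
        }
    }

    // After: the field keeps its default null, so construction performs no volatile store.
    static class Lazy {
        private volatile Object[] varTable; // allocated on first write

        Object[] getVarTable() {
            Object[] table = varTable;
            return table == null ? NULL_OBJECT_ARRAY : table;
        }

        // Copy-on-write update; synchronized so concurrent writers do not lose updates.
        synchronized void setVariable(int index, Object value) {
            Object[] old = varTable;
            int oldLength = old == null ? 0 : old.length;
            Object[] table = new Object[Math.max(oldLength, index + 1)];
            if (old != null) {
                System.arraycopy(old, 0, table, 0, oldLength);
            }
            table[index] = value;
            varTable = table; // the only volatile store, paid only by objects that need it
        }
    }

    public static void main(String[] args) {
        Lazy lazy = new Lazy();
        System.out.println(lazy.getVarTable().length);
        lazy.setVariable(2, "x");
        System.out.println(lazy.getVarTable().length);
    }
}
```

The trade-off is a null check on reads in exchange for a cheaper constructor, which pays off when most objects never get a variable table at all.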
  • 120. What Have We Learned? • How Hotspot’s JIT works • How to monitor the JIT • How to find problems • How to fix problems we find
  • 121. What We Missed • Tuning GC settings in JVM • Monitoring GC with VisualVM • Google ‘visualgc’...it’s awesome
  • 122. You’re no dummy now! ;-)
  • 123. Thank you! • headius@headius.com, @headius • http://blog.headius.com • “java virtual machine specification” • “jvm opcodes”
