SlideShare a Scribd company logo
1 of 11
Download to read offline
<Insert Picture Here>
Tiered Compilation
Igor Veresov
Goals
• Server performance
• With client startup
• Adaptive policy with minimal or no machine-
dependent tuning: feedback on compilation queue
length as a measure of machine's speed/compilation
speed
Compilation levels
• level 0 - interpreter
• level 1 - C1 with full optimization (no profiling)
• level 2 - C1 with invocation and backedge counters
• level 3 - C1 with full profiling (level 2 + MDO)
• level 4 - C2
• level 3 doesn't do CEE or BE
• level 3 is 35% slower than level 2
Client is hard to beat
• Client compiles when i + b >= 1500 (level 1), needs
no profiling
• Tiered needs to compile in best case:
– level 3 version and profile (code 35% slower)
– compile level 4 version (compilation is slow)
• Only beneficial for workloads when it's possible to
compensate by producing level 4 version fast
• Fortunately most real word apps (and benchmarks)
are like that (run more than a second with high code
reuse factor).
Profiling
• Profiles stored in MethodDataOop (MDO)
• Existing counters used when not profiling(MethodOop)
• Separate counters in MDO (multiple versions can run
concurrently)
• Can profile at levels 0 and 3
• What: branches, call receiver types, typechecks
(checkcast, instanceof, aastore)
• Periodic runtime notifications.
– Invocations: 0:128, 2:2048, 3:1024,
– Backedges: 0:1024, 2:8192, 3:4096
State transitions I
• Starts at level 0
• Ideally needs to transition to level 3
• Scales threshold based on C1 queue length / number
of C1 threads
• May transition to level 2 based on C2 queue length
(spending too much time in level 3 hurts, 30% slower)
• Can start profiling at level 0
State Transitions II
• Profiling: level 3
• Monitors C2 queue length / number of C2 threads and
scales level 4 threshold
• If the method is trivial (small, nothing to profile)
compiles level 1 version immediately
In-queue optimizations
• Prioritization by (rate + 1) * (i + 1) * (b + 1),
rate = d(i + b) / dt
• And methods after deoptimization have advantage
• Stale methods removal
• In-queue level change (3->2).
Common transition patterns
• 0 -> 3 -> 4 (common).
• 0 -> 2 -> 3 -> 4 (C2 queue too long).
• 0 -> (3->2) -> 4 (in queue change).
• 0 -> 3 -> 1 or 0 -> 2 -> 1 (trivial, or can't compile in
C2).
• 0 -> 4 (can't compile in C1, fully profile in interpreter).
• (1,2,3,4) → 0 (deoptimization)
Level switching predicates
• Regular compile:
i > k1 * scale || i > k2 * scale && i + b > k3 * scale
• OSR: b > k4 * scale
• Profile maturity measured the same way
(scale=ProfileMaturityPercentage/100.0 = 0.2)
• Reprofiling support (additional start counters in MDO)
Performance (8 core x86, 1 C1, 2 C2)
Client vs tiered
benchmark: _227_mtrt
startup: 1.857 1.692 diff: 8.88%
settled: 0.5961 0.4107 diff: 31.10%
benchmark: _202_jess
startup: 4.066 3.615 diff: 11.09%
settled: 1.4405 1.2545 diff: 12.91%
benchmark: _201_compress
startup: 4.781 4.467 diff: 6.56%
settled: 3.9605 3.6429 diff: 8.01%
benchmark: _209_db
startup: 9.719 8.863 diff: 8.80%
settled: 6.2362 6.3588 diff: -1.96%
benchmark: _222_mpegaudio
startup: 2.656 2.424 diff: 8.73%
settled: 2.4167 2.2408 diff: 7.27%
benchmark: _228_jack
startup: 3.737 3.339 diff: 10.65%
settled: 1.7389 1.2857 diff: 26.06%
benchmark: _213_javac
startup: 3.714 3.649 diff: 1.75%
settled: 2.0299 1.6375 diff: 19.33%
Server vs tiered
benchmark: _227_mtrt
startup: 2.014 1.692 diff: 15.98%
settled: 0.4234 0.4107 diff: 2.99%
benchmark: _202_jess
startup: 4.633 3.615 diff: 21.97%
settled: 1.3518 1.2545 diff: 7.19%
benchmark: _201_compress
startup: 4.942 4.467 diff: 9.61%
settled: 4.1837 3.6429 diff: 12.92%
benchmark: _209_db
startup: 9.003 8.863 diff: 1.55%
settled: 6.4088 6.3588 diff: 0.78%
benchmark: _222_mpegaudio
startup: 3.035 2.424 diff: 20.13%
settled: 2.2615 2.2408 diff: 0.91%
benchmark: _228_jack
startup: 4.911 3.339 diff: 32.00%
settled: 1.5041 1.2857 diff: 14.52%
benchmark: _213_javac
startup: 5.681 3.649 diff: 35.76%
settled: 1.7092 1.6375 diff: 4.19%
SPECjvm98, each sub-benchmark ran in a fresh JVM for 20 iterations.
startup – first iteration, settled – average of last 10.

More Related Content

What's hot

Discover Quarkus and GraalVM
Discover Quarkus and GraalVMDiscover Quarkus and GraalVM
Discover Quarkus and GraalVMRomain Schlick
 
From Java 11 to 17 and beyond.pdf
From Java 11 to 17 and beyond.pdfFrom Java 11 to 17 and beyond.pdf
From Java 11 to 17 and beyond.pdfJosé Paumard
 
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...Monica Beckwith
 
Whitebox testing of Spring Boot applications
Whitebox testing of Spring Boot applicationsWhitebox testing of Spring Boot applications
Whitebox testing of Spring Boot applicationsYura Nosenko
 
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon
 
Awesome Traefik - Ingress Controller for Kubernetes - Swapnasagar Pradhan
Awesome Traefik - Ingress Controller for Kubernetes - Swapnasagar PradhanAwesome Traefik - Ingress Controller for Kubernetes - Swapnasagar Pradhan
Awesome Traefik - Ingress Controller for Kubernetes - Swapnasagar PradhanAjeet Singh Raina
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeperSaurav Haloi
 
Graal and Truffle: One VM to Rule Them All
Graal and Truffle: One VM to Rule Them AllGraal and Truffle: One VM to Rule Them All
Graal and Truffle: One VM to Rule Them AllThomas Wuerthinger
 
Jvm & Garbage collection tuning for low latencies application
Jvm & Garbage collection tuning for low latencies applicationJvm & Garbage collection tuning for low latencies application
Jvm & Garbage collection tuning for low latencies applicationQuentin Ambard
 
Java Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsBrendan Gregg
 
Github - Git Training Slides: Foundations
Github - Git Training Slides: FoundationsGithub - Git Training Slides: Foundations
Github - Git Training Slides: FoundationsLee Hanxue
 
Grafana optimization for Prometheus
Grafana optimization for PrometheusGrafana optimization for Prometheus
Grafana optimization for PrometheusMitsuhiro Tanda
 
Stop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsBrendan Gregg
 
Jvm tuning for low latency application & Cassandra
Jvm tuning for low latency application & CassandraJvm tuning for low latency application & Cassandra
Jvm tuning for low latency application & CassandraQuentin Ambard
 
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
Performance Tuning -  Memory leaks, Thread deadlocks, JDK toolsPerformance Tuning -  Memory leaks, Thread deadlocks, JDK tools
Performance Tuning - Memory leaks, Thread deadlocks, JDK toolsHaribabu Nandyal Padmanaban
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to PrometheusJulien Pivotto
 

What's hot (20)

Discover Quarkus and GraalVM
Discover Quarkus and GraalVMDiscover Quarkus and GraalVM
Discover Quarkus and GraalVM
 
From Java 11 to 17 and beyond.pdf
From Java 11 to 17 and beyond.pdfFrom Java 11 to 17 and beyond.pdf
From Java 11 to 17 and beyond.pdf
 
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...
 
Whitebox testing of Spring Boot applications
Whitebox testing of Spring Boot applicationsWhitebox testing of Spring Boot applications
Whitebox testing of Spring Boot applications
 
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
 
Awesome Traefik - Ingress Controller for Kubernetes - Swapnasagar Pradhan
Awesome Traefik - Ingress Controller for Kubernetes - Swapnasagar PradhanAwesome Traefik - Ingress Controller for Kubernetes - Swapnasagar Pradhan
Awesome Traefik - Ingress Controller for Kubernetes - Swapnasagar Pradhan
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Java 8 Workshop
Java 8 WorkshopJava 8 Workshop
Java 8 Workshop
 
Graal and Truffle: One VM to Rule Them All
Graal and Truffle: One VM to Rule Them AllGraal and Truffle: One VM to Rule Them All
Graal and Truffle: One VM to Rule Them All
 
Jvm & Garbage collection tuning for low latencies application
Jvm & Garbage collection tuning for low latencies applicationJvm & Garbage collection tuning for low latencies application
Jvm & Garbage collection tuning for low latencies application
 
Java Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame Graphs
 
Java On CRaC
Java On CRaCJava On CRaC
Java On CRaC
 
Github - Git Training Slides: Foundations
Github - Git Training Slides: FoundationsGithub - Git Training Slides: Foundations
Github - Git Training Slides: Foundations
 
Grafana optimization for Prometheus
Grafana optimization for PrometheusGrafana optimization for Prometheus
Grafana optimization for Prometheus
 
Stop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production Systems
 
Jvm tuning for low latency application & Cassandra
Jvm tuning for low latency application & CassandraJvm tuning for low latency application & Cassandra
Jvm tuning for low latency application & Cassandra
 
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
Performance Tuning -  Memory leaks, Thread deadlocks, JDK toolsPerformance Tuning -  Memory leaks, Thread deadlocks, JDK tools
Performance Tuning - Memory leaks, Thread deadlocks, JDK tools
 
Summary of C++17 features
Summary of C++17 featuresSummary of C++17 features
Summary of C++17 features
 
Introduction to Redis
Introduction to RedisIntroduction to Redis
Introduction to Redis
 
Introduction to Prometheus
Introduction to PrometheusIntroduction to Prometheus
Introduction to Prometheus
 

Viewers also liked

JVM JIT compilation overview by Vladimir Ivanov
JVM JIT compilation overview by Vladimir IvanovJVM JIT compilation overview by Vladimir Ivanov
JVM JIT compilation overview by Vladimir IvanovZeroTurnaround
 
JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?Doug Hawkins
 
OpenJDK HotSpot C1Compiler Overview
OpenJDK HotSpot C1Compiler OverviewOpenJDK HotSpot C1Compiler Overview
OpenJDK HotSpot C1Compiler Overviewnothingcosmos
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoValeriia Maliarenko
 
Understanding Garbage Collection
Understanding Garbage CollectionUnderstanding Garbage Collection
Understanding Garbage CollectionDoug Hawkins
 
Field theory pierre_bourdieu
Field theory pierre_bourdieuField theory pierre_bourdieu
Field theory pierre_bourdieuDavid Engelby
 

Viewers also liked (6)

JVM JIT compilation overview by Vladimir Ivanov
JVM JIT compilation overview by Vladimir IvanovJVM JIT compilation overview by Vladimir Ivanov
JVM JIT compilation overview by Vladimir Ivanov
 
JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?JVM Mechanics: When Does the JVM JIT & Deoptimize?
JVM Mechanics: When Does the JVM JIT & Deoptimize?
 
OpenJDK HotSpot C1Compiler Overview
OpenJDK HotSpot C1Compiler OverviewOpenJDK HotSpot C1Compiler Overview
OpenJDK HotSpot C1Compiler Overview
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey Kovalenko
 
Understanding Garbage Collection
Understanding Garbage CollectionUnderstanding Garbage Collection
Understanding Garbage Collection
 
Field theory pierre_bourdieu
Field theory pierre_bourdieuField theory pierre_bourdieu
Field theory pierre_bourdieu
 

Similar to Tiered Compilation in Hotspot JVM

IncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the CloudIncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the CloudGábor Szárnyas
 
running stable diffusion on android
running stable diffusion on androidrunning stable diffusion on android
running stable diffusion on androidKoan-Sin Tan
 
Application Assessment - Executive Summary Report
Application Assessment - Executive Summary ReportApplication Assessment - Executive Summary Report
Application Assessment - Executive Summary ReportCAST
 
Getting started with RISC-V verification what's next after compliance testing
Getting started with RISC-V verification what's next after compliance testingGetting started with RISC-V verification what's next after compliance testing
Getting started with RISC-V verification what's next after compliance testingRISC-V International
 
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIXCassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIXVinay Kumar Chella
 
Siemens s7 300 programming
Siemens s7 300 programming Siemens s7 300 programming
Siemens s7 300 programming satyajit patra
 
Sista: Improving Cog’s JIT performance
Sista: Improving Cog’s JIT performanceSista: Improving Cog’s JIT performance
Sista: Improving Cog’s JIT performanceESUG
 
Compiler design
Compiler designCompiler design
Compiler designsowfi
 
SOC Application Studies: Image Compression
SOC Application Studies: Image CompressionSOC Application Studies: Image Compression
SOC Application Studies: Image CompressionA B Shinde
 
Incremental Model Queries for Model-Dirven Software Engineering
Incremental Model Queries for Model-Dirven Software EngineeringIncremental Model Queries for Model-Dirven Software Engineering
Incremental Model Queries for Model-Dirven Software EngineeringÁkos Horváth
 
Vieworks - Hybrid TDI Cameras Technology (Wojciech Majewski)
Vieworks - Hybrid TDI Cameras Technology (Wojciech Majewski)Vieworks - Hybrid TDI Cameras Technology (Wojciech Majewski)
Vieworks - Hybrid TDI Cameras Technology (Wojciech Majewski)Charl Fontini
 
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudJourney to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudCeph Community
 
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudJourney to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudPatrick McGarry
 
Cloud computing_processing frameworks
Cloud computing_processing frameworksCloud computing_processing frameworks
Cloud computing_processing frameworksReem Abdel-Rahman
 
Principal Sources of Optimization in compiler design
Principal Sources of Optimization in compiler design Principal Sources of Optimization in compiler design
Principal Sources of Optimization in compiler design LogsAk
 
Single instruction multiple data
Single instruction multiple dataSingle instruction multiple data
Single instruction multiple dataSyed Zaid Irshad
 
Nokia kpi and_core_optimization
Nokia kpi and_core_optimizationNokia kpi and_core_optimization
Nokia kpi and_core_optimizationdebasish goswami
 

Similar to Tiered Compilation in Hotspot JVM (20)

IncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the CloudIncQuery-D: Incremental Queries in the Cloud
IncQuery-D: Incremental Queries in the Cloud
 
running stable diffusion on android
running stable diffusion on androidrunning stable diffusion on android
running stable diffusion on android
 
Application Assessment - Executive Summary Report
Application Assessment - Executive Summary ReportApplication Assessment - Executive Summary Report
Application Assessment - Executive Summary Report
 
Getting started with RISC-V verification what's next after compliance testing
Getting started with RISC-V verification what's next after compliance testingGetting started with RISC-V verification what's next after compliance testing
Getting started with RISC-V verification what's next after compliance testing
 
ASIC design verification
ASIC design verificationASIC design verification
ASIC design verification
 
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIXCassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
CassandraSummit2015_Cassandra upgrades at scale @ NETFLIX
 
Siemens s7 300 programming
Siemens s7 300 programming Siemens s7 300 programming
Siemens s7 300 programming
 
Sista: Improving Cog’s JIT performance
Sista: Improving Cog’s JIT performanceSista: Improving Cog’s JIT performance
Sista: Improving Cog’s JIT performance
 
Compiler design
Compiler designCompiler design
Compiler design
 
SOC Application Studies: Image Compression
SOC Application Studies: Image CompressionSOC Application Studies: Image Compression
SOC Application Studies: Image Compression
 
Incremental Model Queries for Model-Dirven Software Engineering
Incremental Model Queries for Model-Dirven Software EngineeringIncremental Model Queries for Model-Dirven Software Engineering
Incremental Model Queries for Model-Dirven Software Engineering
 
Vieworks - Hybrid TDI Cameras Technology (Wojciech Majewski)
Vieworks - Hybrid TDI Cameras Technology (Wojciech Majewski)Vieworks - Hybrid TDI Cameras Technology (Wojciech Majewski)
Vieworks - Hybrid TDI Cameras Technology (Wojciech Majewski)
 
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudJourney to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
 
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack CloudJourney to Stability: Petabyte Ceph Cluster in OpenStack Cloud
Journey to Stability: Petabyte Ceph Cluster in OpenStack Cloud
 
Cloud computing_processing frameworks
Cloud computing_processing frameworksCloud computing_processing frameworks
Cloud computing_processing frameworks
 
Principal Sources of Optimization in compiler design
Principal Sources of Optimization in compiler design Principal Sources of Optimization in compiler design
Principal Sources of Optimization in compiler design
 
embedded C.pptx
embedded C.pptxembedded C.pptx
embedded C.pptx
 
Java Decompiler
Java DecompilerJava Decompiler
Java Decompiler
 
Single instruction multiple data
Single instruction multiple dataSingle instruction multiple data
Single instruction multiple data
 
Nokia kpi and_core_optimization
Nokia kpi and_core_optimizationNokia kpi and_core_optimization
Nokia kpi and_core_optimization
 

Recently uploaded

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesZilliz
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 

Recently uploaded (20)

Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Vector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector DatabasesVector Databases 101 - An introduction to the world of Vector Databases
Vector Databases 101 - An introduction to the world of Vector Databases
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 

Tiered Compilation in Hotspot JVM

  • 1. <Insert Picture Here> Tiered Compilation Igor Veresov
  • 2. Goals • Server performance • With client startup • Adaptive policy with minimal or no machine- dependent tuning: feedback on compilation queue length as a measure of machine's speed/compilation speed
  • 3. Compilation levels • level 0 - interpreter • level 1 - C1 with full optimization (no profiling) • level 2 - C1 with invocation and backedge counters • level 3 - C1 with full profiling (level 2 + MDO) • level 4 - C2 • level 3 doesn't do CEE or BE • level 3 is 35% slower than level 2
  • 4. Client is hard to beat • Client compiles when i + b >= 1500 (level 1), needs no profiling • Tiered needs to compile in best case: – level 3 version and profile (code 35% slower) – compile level 4 version (compilation is slow) • Only beneficial for workloads when it's possible to compensate by producing level 4 version fast • Fortunately most real word apps (and benchmarks) are like that (run more than a second with high code reuse factor).
  • 5. Profiling • Profiles stored in MethodDataOop (MDO) • Existing counters used when not profiling(MethodOop) • Separate counters in MDO (multiple versions can run concurrently) • Can profile at levels 0 and 3 • What: branches, call receiver types, typechecks (checkcast, instanceof, aastore) • Periodic runtime notifications. – Invocations: 0:128, 2:2048, 3:1024, – Backedges: 0:1024, 2:8192, 3:4096
  • 6. State transitions I • Starts at level 0 • Ideally needs to transition to level 3 • Scales threshold based on C1 queue length / number of C1 threads • May transition to level 2 based on C2 queue length (spending too much time in level 3 hurts, 30% slower) • Can start profiling at level 0
  • 7. State Transitions II • Profiling: level 3 • Monitors C2 queue length / number of C2 threads and scales level 4 threshold • If the method is trivial (small, nothing to profile) compiles level 1 version immediately
  • 8. In-queue optimizations • Prioritization by (rate + 1) * (i + 1) * (b + 1), rate = d(i + b) / dt • And methods after deoptimization have advantage • Stale methods removal • In-queue level change (3->2).
  • 9. Common transition patterns • 0 -> 3 -> 4 (common). • 0 -> 2 -> 3 -> 4 (C2 queue too long). • 0 -> (3->2) -> 4 (in queue change). • 0 -> 3 -> 1 or 0 -> 2 -> 1 (trivial, or can't compile in C2). • 0 -> 4 (can't compile in C1, fully profile in interpreter). • (1,2,3,4) → 0 (deoptimization)
  • 10. Level switching predicates • Regular compile: i > k1 * scale || i > k2 * scale && i + b > k3 * scale • OSR: b > k4 * scale • Profile maturity measured the same way (scale=ProfileMaturityPercentage/100.0 = 0.2) • Reprofiling support (additional start counters in MDO)
  • 11. Performance (8 core x86, 1 C1, 2 C2) Client vs tiered benchmark: _227_mtrt startup: 1.857 1.692 diff: 8.88% settled: 0.5961 0.4107 diff: 31.10% benchmark: _202_jess startup: 4.066 3.615 diff: 11.09% settled: 1.4405 1.2545 diff: 12.91% benchmark: _201_compress startup: 4.781 4.467 diff: 6.56% settled: 3.9605 3.6429 diff: 8.01% benchmark: _209_db startup: 9.719 8.863 diff: 8.80% settled: 6.2362 6.3588 diff: -1.96% benchmark: _222_mpegaudio startup: 2.656 2.424 diff: 8.73% settled: 2.4167 2.2408 diff: 7.27% benchmark: _228_jack startup: 3.737 3.339 diff: 10.65% settled: 1.7389 1.2857 diff: 26.06% benchmark: _213_javac startup: 3.714 3.649 diff: 1.75% settled: 2.0299 1.6375 diff: 19.33% Server vs tiered benchmark: _227_mtrt startup: 2.014 1.692 diff: 15.98% settled: 0.4234 0.4107 diff: 2.99% benchmark: _202_jess startup: 4.633 3.615 diff: 21.97% settled: 1.3518 1.2545 diff: 7.19% benchmark: _201_compress startup: 4.942 4.467 diff: 9.61% settled: 4.1837 3.6429 diff: 12.92% benchmark: _209_db startup: 9.003 8.863 diff: 1.55% settled: 6.4088 6.3588 diff: 0.78% benchmark: _222_mpegaudio startup: 3.035 2.424 diff: 20.13% settled: 2.2615 2.2408 diff: 0.91% benchmark: _228_jack startup: 4.911 3.339 diff: 32.00% settled: 1.5041 1.2857 diff: 14.52% benchmark: _213_javac startup: 5.681 3.649 diff: 35.76% settled: 1.7092 1.6375 diff: 4.19% SPECjvm98, each sub-benchmark ran in a fresh JVM for 20 iterations. startup – first iteration, settled – average of last 10.