Java
Under the hood
Javac and JVM optimizations
Agenda
● Javac and JVM optimizations
○ JIT (Just In Time Compilation)
■ Profiling, Method Binding, Safepoints
○ Method Inlining
○ Loop Unrolling
○ Lock Coarsening
○ Lock Elision
○ Branch Prediction
○ Escape Analysis
○ OSR (On Stack Replacement)
○ TLAB (Thread Local Allocation Buffers)
Java program lifetime
JIT Compilation
Method Inlining
Loop Unrolling
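The slide's code is not part of this transcript. As a rough sketch of the idea, the following compares a plain loop with a hand-unrolled one. The class and method names are illustrative assumptions, and the JIT applies this transformation to hot loops on its own, so unrolling by hand is rarely worthwhile.

```java
// Illustrative only: shows the shape of the transformation, not HotSpot's output.
class LoopUnrollingSketch {
    // Plain loop: one loop-condition check and one jump per element.
    static long sumSimple(int[] data) {
        long sum = 0;
        for (int i = 0; i < data.length; i++) {
            sum += data[i];
        }
        return sum;
    }

    // Unrolled by 4: one loop-condition check and one jump per four elements.
    static long sumUnrolled(int[] data) {
        long sum = 0;
        int i = 0;
        int limit = data.length - (data.length % 4);
        for (; i < limit; i += 4) {
            sum += data[i];
            sum += data[i + 1];
            sum += data[i + 2];
            sum += data[i + 3];
        }
        for (; i < data.length; i++) { // leftover elements
            sum += data[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(sumSimple(data) == sumUnrolled(data));
    }
}
```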
Lock Coarsening
Lock Elision
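The code from these two slides is missing from the transcript. A minimal sketch of code the JIT may treat this way, assuming `StringBuffer` (whose methods are synchronized) as the lock-holding object; the class and method names are hypothetical, and whether either optimization fires depends on the JVM and its flags.

```java
class LockOptimizationSketch {
    // Coarsening candidate: three back-to-back synchronized calls on the same
    // object. The JIT may merge them into a single lock/unlock pair.
    static String coarseningCandidate(StringBuffer shared) {
        shared.append("a");
        shared.append("b");
        shared.append("c");
        return shared.toString();
    }

    // Elision candidate: the buffer never leaves this method, so no other
    // thread can ever lock it. The JIT may drop the locking entirely.
    static String elisionCandidate() {
        StringBuffer local = new StringBuffer();
        local.append("a");
        local.append("b");
        local.append("c");
        return local.toString();
    }

    public static void main(String[] args) {
        System.out.println(coarseningCandidate(new StringBuffer()));
        System.out.println(elisionCandidate());
    }
}
```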
Branch Prediction
● Performance of an if-statement depends on whether its
condition has a predictable pattern.
● A “bad” true-false pattern can make an if-statement up
to six times slower than a “good” pattern!
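A small sketch of what a "good" and a "bad" pattern can look like: the same conditional sum over random data (unpredictable branch) and over the sorted copy (long runs of false, then true). The names and sizes are assumptions for illustration. The code only checks that both give the same result; any timing difference should be measured with a benchmark harness such as JMH, and the JIT may compile the branch away entirely.

```java
import java.util.Arrays;
import java.util.Random;

class BranchPredictionSketch {
    // Sums every value at or above the threshold.
    static long sumAbove(int[] data, int threshold) {
        long sum = 0;
        for (int value : data) {
            if (value >= threshold) { // the branch whose pattern matters
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] random = new Random(42).ints(1_000_000, 0, 256).toArray();
        int[] sorted = random.clone();
        Arrays.sort(sorted);
        // Same elements, same result; only the true/false pattern differs.
        System.out.println(sumAbove(random, 128) == sumAbove(sorted, 128));
    }
}
```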
String concatenation within a single expression is
recognized by javac and replaced with the equivalent
StringBuilder calls (from JDK 9 on, with an
invokedynamic call instead).
String concatenation example
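The example from the slide is not in the transcript. As a sketch, the two methods below should produce the same string; the second is roughly what javac generated for the first before JDK 9. The class, method, and parameter names are illustrative assumptions.

```java
class ConcatSketch {
    // What you write.
    static String withPlus(String name, int count) {
        return "Hello, " + name + "! You have " + count + " messages.";
    }

    // Roughly what older javac versions emit for the expression above.
    static String withBuilder(String name, int count) {
        return new StringBuilder()
                .append("Hello, ")
                .append(name)
                .append("! You have ")
                .append(count)
                .append(" messages.")
                .toString();
    }

    public static void main(String[] args) {
        System.out.println(withPlus("Ann", 3).equals(withBuilder("Ann", 3)));
    }
}
```

Concatenation spread across loop iterations is a different case: each iteration builds a new string, so an explicit StringBuilder outside the loop is still worth writing by hand.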
Intrinsics
Intrinsics are methods KNOWN to the JIT. Their bytecode is
ignored, and the most performant native version for the target
platform is used instead...
● System::arraycopy
● String::equals
● Math::*
● Object::hashCode
● Object::getClass
● Unsafe::*
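A short sketch of why this matters in everyday code: a hand-written copy loop and `System.arraycopy` do the same job, but only the latter is on the list above. The class and method names are assumptions, and the JIT may well optimize the manual loop too.

```java
class IntrinsicsSketch {
    // Hand-written copy: compiled from its bytecode like any other loop.
    static int[] copyManually(int[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Same result through a method the JIT knows about.
    static int[] copyWithArraycopy(int[] src) {
        int[] dst = new int[src.length];
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;
    }

    public static void main(String[] args) {
        int[] src = {3, 1, 4, 1, 5, 9};
        System.out.println(java.util.Arrays.equals(copyManually(src), copyWithArraycopy(src)));
    }
}
```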
Escape Analysis
Any object that does not escape its creation
scope MAY be optimized away from the heap (in HotSpot,
by scalar replacement of its fields rather than
literal stack allocation).
Mostly lambdas, anonymous classes,
DateTime, StringBuilders, Optionals, etc.
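A minimal sketch of an escaping and a non-escaping object, with a hypothetical `Point` class. Whether the allocation is actually removed is up to the JIT.

```java
class EscapeAnalysisSketch {
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // 'p' never leaves the loop body: a candidate for having its
    // allocation removed and its fields kept in registers.
    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Point p = new Point(i, i + 1);
            total += (long) p.x * p.x + (long) p.y * p.y;
        }
        return total;
    }

    static Point lastPoint;

    // This Point escapes through a static field, so it must live on the heap.
    static void escaping(int i) {
        lastPoint = new Point(i, i + 1);
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1_000));
    }
}
```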
TLAB (Thread Local Allocation Buffers)
How to “see” JIT activity? - JitWatch
Conclusion
Before attempting to “optimize” something at a low level, make
sure you understand what the environment is already
optimizing for you…
Don't try to predict the performance (especially low-level
behavior) of your program by looking at the bytecode. When
the JIT compiler is done with it, there will not be many
similarities left.
Questions?
Concurrency : Level 0
Agenda
● Concurrency : Hardware level
○ CPU architecture evolution
○ Cache Coherency Protocols
○ Memory Barriers
○ Store Buffers
○ Cachelines
○ volatiles, monitors (locks, synchronization), atomics
CPU structure
Cache access latencies
CPUs are getting faster not through higher frequency but through lower latency
between the L1/L2/L3 caches, better cache coherency protocols, and smart optimizations.
Why Concurrency is HARD?
Problem 1 : VISIBILITY!
● Any processor can temporarily keep values in its L
caches instead of main memory, so another processor
might not see the changes made by the first one…
● Likewise, a processor that has been working out of its L
caches for some time might not see changes made by
another processor right away...
Why Concurrency is HARD?
Problem 2 : Reordering
Example : Non thread safe
JMM (Java Memory Model)
The Java Memory Model is a set of rules and
guarantees that allows Java programs to
behave predictably across different
memory architectures, CPUs, and operating
systems.
Thread safe version (visibility + reordering both solved)
cpu/x86/vm/c1_LIRGenerator_x86.cpp
Example : Thread safe
Happens Before
Understanding volatile
Conclusions on Volatile
● Volatile guarantees that changes made by one thread are visible
to other threads.
● Volatile reads and writes are never reordered with each other.
Ordinary instructions around them can still be reordered, except
that accesses before a volatile write stay before it and accesses
after a volatile read stay after it.
● Volatile without additional synchronization is enough if there
is only one writer to the volatile field; with more than one
writer you need to synchronize...
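The single-writer case can be sketched as follows: one thread writes an ordinary field and then a volatile flag, and the reader spins on the flag. The field and method names are assumptions for illustration.

```java
class VolatileFlagSketch {
    private static volatile boolean ready; // written by exactly one thread
    private static int payload;            // ordinary field, published via 'ready'

    static int publishAndRead() {
        ready = false;
        payload = 0;
        Thread writer = new Thread(() -> {
            payload = 42;  // ordinary write, cannot move below the volatile write
            ready = true;  // volatile write publishes everything written before it
        });
        writer.start();
        while (!ready) {   // volatile read: will eventually observe 'true'
            Thread.yield();
        }
        return payload;    // ordered after the volatile read, so it sees 42
    }

    public static void main(String[] args) {
        System.out.println(publishAndRead());
    }
}
```

Without `volatile` on `ready`, the reader loop would be allowed to spin forever, and even if it exited it could still see the old `payload`.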
Volatile Write/Read performance
Lazy Singleton (not thread safe)
Lazy Singleton (dumb thread safety)
Lazy Singleton (not thread safe)
Lazy Singleton (still not thread safe)
Lazy Singleton (thread safe yay!)
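The slide's code is not included in the transcript. A common thread-safe form, and presumably the one meant here, is double-checked locking with a volatile field; the sketch below uses a hypothetical class name.

```java
class LazySingleton {
    // 'volatile' is what makes this safe: it prevents other threads from
    // seeing a reference to a partially constructed instance.
    private static volatile LazySingleton instance;

    private LazySingleton() {}

    static LazySingleton getInstance() {
        LazySingleton local = instance;       // single volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;             // re-check under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance());
    }
}
```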
Happens Before
Lazy Singleton (CL trick)
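Assuming "CL trick" refers to the class-loader-based initialization-on-demand holder idiom (the slide's code is not in the transcript), a minimal sketch with hypothetical names looks like this. The JVM initializes `Holder` lazily, on first access, and guarantees that class initialization is thread safe.

```java
class HolderSingleton {
    private HolderSingleton() {}

    // Not initialized until getInstance() first touches it.
    private static final class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE; // no locks and no volatile in user code
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance());
    }
}
```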
False sharing (hidden contention)
False Sharing
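The slides' diagrams and code are not in the transcript. A rough sketch of the usual workaround, manual padding, is shown below. It assumes 64-byte cache lines and that the JVM keeps the fields in declaration order, which is not guaranteed; a tool such as JOL can confirm the actual layout. All names are hypothetical.

```java
class FalseSharingSketch {
    // Two counters, each written by its own thread. Without the padding they
    // would likely share a cache line and keep invalidating each other.
    static final class PaddedCounters {
        volatile long first;
        long p1, p2, p3, p4, p5, p6, p7; // padding, assuming 64-byte cache lines
        volatile long second;
    }

    // Runs one writer thread per counter and returns the combined total.
    static long incrementBoth(int perThread) {
        PaddedCounters c = new PaddedCounters();
        Thread t1 = new Thread(() -> { for (int i = 0; i < perThread; i++) c.first++; });
        Thread t2 = new Thread(() -> { for (int i = 0; i < perThread; i++) c.second++; });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return c.first + c.second;
    }

    public static void main(String[] args) {
        System.out.println(incrementBoth(1_000_000));
    }
}
```

Each counter has a single writer, so the non-atomic `++` on a volatile field is safe here; the padding affects only speed, not correctness.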
Monitors
Monitor Operations :
● monitorenter
● monitorexit
● wait
● notify/notifyAll
Monitor States :
● init
● biased
● thin
● fat (inflated)
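The monitor operations listed above map onto Java source as in this sketch of a one-slot mailbox; the class and method names are assumptions. A `synchronized` block compiles to `monitorenter`/`monitorexit`, and `wait`/`notifyAll` may only be called while holding the monitor.

```java
class MonitorSketch {
    private final Object lock = new Object();
    private String message; // guarded by 'lock'

    void put(String m) {
        synchronized (lock) {          // monitorenter
            message = m;
            lock.notifyAll();          // wake up threads blocked in wait()
        }                              // monitorexit
    }

    String take() throws InterruptedException {
        synchronized (lock) {
            while (message == null) {  // loop guards against spurious wakeups
                lock.wait();           // releases the monitor while waiting
            }
            String m = message;
            message = null;
            return m;
        }
    }

    // Hands one message from a producer thread to the calling thread.
    static String demo() {
        MonitorSketch box = new MonitorSketch();
        new Thread(() -> box.put("hello")).start();
        try {
            return box.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```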
Cost of Contention
Conclusion
● Volatile reads are not that bad
● Avoid sharing state
● Avoid writing to shared state
● Avoid Contention
Tools
● JMH: OpenJDK tool for writing correct benchmarks
● JMH Samples
● jcstress: tool for testing critical sections of concurrent code
● JOL (Java Object Layout): helps measure the size and layout of objects
JMH example
Jcstress example
Jcstress sample output
IMPORTANT!
Sometimes horizontal scaling is cheaper. Developing hardware-friendly code is hard: it breaks easily if a
new developer does not understand the existing code base, or if a new JVM version applies optimizations
you never expected (this happens a lot), and it is hard to test. If your product needs higher throughput,
you either make it more efficient or scale it. When the cost of scaling is too high, it makes perfect sense
to make the system more efficient (assuming the system is not fundamentally inefficient).
If you are scaling your product and a single node under peak load uses only a small percentage of its
resources (CPU, memory, etc.), then your system is inefficient.
Developing hardware-friendly code is all about efficiency. On most systems you might NEVER
need to go low level, but knowing the low-level semantics of your environment will enable you to
write more efficient code by default.
And most important NEVER EVER optimize without
BENCHMARKING!!!
Disruptor by LMAX
Example of Disruptor usage : Log4j2
In the test with 64 threads, asynchronous loggers are 12 times faster than
asynchronous appenders, and 68 times faster than synchronous loggers.
Why?
● Generally, any traditional queue is in one of two states : either it's filling
up, or it's draining.
● Most queues are unbounded, and any unbounded queue is a
potential OOM source.
● Queues write to memory on both put and pull, and writes are
expensive. During a write the queue is locked (or partially locked).
● Queues are the best way to create CONTENTION! That is often the
bottleneck of the system.
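To illustrate the bounded versus unbounded point, here is a small sketch using the JDK's `ArrayBlockingQueue`. The helper name and the numbers are assumptions. A bounded queue pushes back on the producer once it is full instead of growing until memory runs out.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class BoundedQueueSketch {
    // Offers 'attempts' items to a queue of the given capacity with no
    // consumer running, and returns how many were accepted.
    static int offerUntilFull(int capacity, int attempts) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            if (queue.offer(i)) { // returns false instead of growing when full
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println(offerUntilFull(4, 10));
    }
}
```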
Queue typical state
What is the Disruptor all about?
● Non-blocking. A write does not lock consumers, and consumers work in
parallel, with controlled access to data in the queue, and without
CONTENTION!
● GC free : the Disruptor does not create objects while running; instead it
preallocates all the memory it is configured to use.
● The Disruptor is bounded.
● Cache friendly. (Mechanical sympathy)
● It's hardware friendly. The Disruptor uses all the low-level semantics of the JMM
to achieve maximum performance and minimum latency.
● One thread per consumer.
Theory : understanding disruptor
Writing to Ring Buffer
Reading from Ring Buffer
Disruptor can coordinate consumers
Lmax architecture
Disruptor (Pros)
● Performance of course
● Holy BATCHING!!!
● Mechanical Sympathy
● Optionally GC Free
● Prevents False Sharing
● Easy to compose dependent consumers (concurrency)
● Synchronization-free code in consumers
● Data Structure (not a frickin framework!!!)
● Fits very well with CQRS and ES
Disruptor (Pros)
● Thread affinity (for more performance/throughput)
● Different strategies for Consumers (busy spin, sleep)
● Single/Multiple producer strategy
Avoid useless processing (Disruptor can batch)
Disruptor (Cons)
● Not as trivial as ABQ (ArrayBlockingQueue) or other queues
● The number of busy threads (consumers) must stay within a reasonable limit
● Not a drop-in replacement; it is a different approach to queues
Disruptor Implementation (simplified : single writer)
No locks at all ( Atomic.lazySet )
Why power of 2?
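The implementation shown on these slides is not in the transcript. The sketch below is not LMAX's code; it is a minimal single-producer, single-consumer ring buffer, with hypothetical names, that illustrates the two points in the titles. `AtomicLong.lazySet` publishes a sequence with an ordered store instead of a lock, and a power-of-2 size lets the slot index be computed with a cheap bit mask (`sequence & (size - 1)`) instead of a modulo.

```java
import java.util.concurrent.atomic.AtomicLong;

class RingBufferSketch {
    private final long[] slots;                               // preallocated storage
    private final int mask;                                   // size - 1, valid only for powers of 2
    private final AtomicLong published = new AtomicLong(-1);  // last sequence visible to the consumer
    private final AtomicLong consumed = new AtomicLong(-1);   // last sequence read by the consumer
    private long next = 0;                                    // next sequence to write; producer-only

    RingBufferSketch(int size) {
        if (Integer.bitCount(size) != 1) {
            throw new IllegalArgumentException("size must be a power of 2");
        }
        slots = new long[size];
        mask = size - 1;
    }

    // Call from the single producer thread only. Returns false when full.
    boolean tryPublish(long value) {
        if (next - consumed.get() > slots.length) {
            return false;                      // would overwrite an unread slot
        }
        slots[(int) (next & mask)] = value;    // write the data first
        published.lazySet(next);               // then publish it: ordered store, no lock
        next++;
        return true;
    }

    // Call from the single consumer thread only. Returns null when empty.
    Long tryConsume() {
        long sequence = consumed.get() + 1;
        if (sequence > published.get()) {
            return null;                       // nothing new has been published
        }
        long value = slots[(int) (sequence & mask)];
        consumed.lazySet(sequence);            // frees the slot for the producer
        return value;
    }

    public static void main(String[] args) {
        RingBufferSketch buffer = new RingBufferSketch(4);
        buffer.tryPublish(7);
        System.out.println(buffer.tryConsume());
    }
}
```

The real Disruptor adds, among other things, cache-line padding around its sequences, batching, multiple-producer support, and pluggable wait strategies.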
Ring Buffer customizations
● Producer strategies
○ Single producer
○ Multiple producer
● Wait Strategies
○ Sleeping Wait
○ Yielding Wait
○ Busy Spin
Resources
JitWatch
Peter Lawrey blog
Aleksey Shipilyov stuff
About TLAB
About Monitors
About Memory Barriers
And some stuff about high performance Java code
● https://www.youtube.com/watch?v=NEG8tMn36VQ
● https://www.youtube.com/watch?v=t49bfPLp0B0
● http://www.slideshare.net/PeterLawrey/writing-and-testing-high-frequency-trading-engines-in-java
● https://www.youtube.com/watch?v=ih-IZHpxFkY
Links for LMAX Disruptor
● https://www.youtube.com/watch?v=DCdGlxBbKU4
● https://www.youtube.com/watch?v=KrWxle6U10M
● https://www.youtube.com/watch?v=IsGBA9KEtTM
● https://www.youtube.com/watch?v=o_nXgoTxBsQ
● http://martinfowler.com/articles/lmax.html
● https://www.youtube.com/watch?v=eTeWxZvlCZ8
Coming next
Concurrency : Level 1
Concurrency primitives provided by language SDK. Everything that
provides manual control over concurrency.
- package java.util.concurrent.*
- Future
- CompletableFuture
- Phaser
- ForkJoinPool (in Java 8), ForkJoinTask, CountedCompleters
Concurrency : Level 2
High level approach to concurrency, when library or framework handles
concurrent execution of the code... (will cover only RxJava although
there is a bunch of other good stuff)
- Functional Programming approach (higher-order functions)
- Optional
- Streams
- Reactive Programming (RxJava)

Java JVM and JIT Optimizations

  • 2. Javac and JVM optimizations
  • 3.
  • 4. Agenda ● Javac and JVM optimizations ○ JIT (Just In Time Compilation) ■ Profiling, Method Binding, Safepoints ○ Method Inlining, ○ Loop Unrolling, ○ Lock Coarsening ○ Lock Eliding, ○ Branch Prediction, ○ Escape Analysis ○ OSR (On Stack Replacement) ○ TLAB (Thread Local Allocation Buffers)
  • 14. Branch Prediction ● Performance of an if-statement depends on whether its condition has a predictable pattern. ● A “bad” true-false pattern can make an if-statement up to six times slower than a “good” pattern!
  • 15. Doing string concatenation in one scope will be picked by javac and replaced with StringBuilder equivalent. String concatenation example
  • 17. Intrinsics Intrinsics are methods KNOWN to JIT. Bytecodes of those are ignored and native most performant versions for target platform is used... ● System::arraycopy ● String::equals ● Math::* ● Object::hashcode ● Object::getClass ● Unsafe::*
  • 18. Escape Analysis Any object that is not escaping its creation scope MAY be optimized to stack allocation. Mostly Lambdas, Anonymous classes, DateTime, String Builders, Optionals etc...
  • 20. TLAB (Thread Local Allocation Buffers)
  • 21. How to “see” JIT activity? - JitWatch
  • 22. Conclusion Before attempting to “optimize” something in low level, make sure you understand what the environment is already optimizing for you… Dont try to predict the performance (especially low-level behavior) of your program by looking at the bytecode. When the JIT Compiler is done with it, there will not be much similarities left.
  • 25. Agenda ● Concurrency : Hardware level ○ CPU architecture evolution ○ Cache Coherency Protocols ○ Memory Barriers ○ Store Buffers ○ Cachelines ○ volatiles, monitors (locks, synchronization), atomics
  • 27. Cache access latencies CPUs are getting faster not through higher clock frequency but through lower latency between cache levels (L1, L2, L3), better cache coherency protocols and smart optimizations.
  • 28. Why Concurrency is HARD? Problem 1 : VISIBILITY! ● Any processor can temporarily keep values in its caches (L1, L2, L3) instead of main memory, so another processor might not see changes made by the first processor… ● Likewise, a processor that works out of its caches for some time might not see changes made by another processor right away...
  • 29. Why Concurrency is HARD? Problem 2 : Reordering
  • 30. Example : Non thread safe
  • 31. JMM (Java Memory Model) The Java Memory Model is a set of rules and guidelines that allows Java programs to behave deterministically across different memory architectures, CPUs, and operating systems.
  • 32. Thread safe version (visibility + reordering both solved)
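The code from these slides is not in the transcript, so the following is a stand-in sketch rather than the original example. It assumes a hypothetical `StopFlagDemo` class with a stop flag shared between two threads. Declaring the flag `volatile` is what lets the worker reliably observe the write from the other thread; without it, the loop is allowed to keep spinning on a stale value.

```java
final class StopFlagDemo {

    // volatile: a write by one thread becomes visible to later reads by others.
    private volatile boolean running = true;

    // Touched only by the worker thread, so it needs no synchronization.
    private long iterations;

    void stop() {
        running = false;
    }

    long spinUntilStopped() {
        while (running) {
            iterations++;
        }
        return iterations;
    }

    // Starts a worker, asks it to stop, and reports whether it terminated.
    static boolean runDemo() {
        StopFlagDemo demo = new StopFlagDemo();
        Thread worker = new Thread(demo::spinUntilStopped);
        worker.start();
        try {
            Thread.sleep(50);
            demo.stop();
            worker.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }
}
```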
  • 38. Conclusions on Volatile ● Volatile guarantees that changes made by one thread are visible to other threads. ● Guarantees that reads and writes of a volatile field are never reordered with each other (ordinary instructions around them can still be reordered, within limits: nothing before a volatile write moves after it, and nothing after a volatile read moves before it). ● Volatile without additional synchronization is enough if there is only one writer to the volatile field; with more than one writer you need to synchronize...
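The last point can be sketched as follows, with hypothetical names (`CounterDemo` and its methods). Incrementing a `volatile long` is a read followed by a write, so two writers can lose updates; an `AtomicLong` is one way to add the missing synchronization when there is more than one writer.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CounterDemo {

    // Fine when exactly one thread writes and others only read.
    // With several writers, "singleWriterCounter++" can lose updates.
    static volatile long singleWriterCounter;

    // Two writer threads incrementing the same counter safely.
    static long incrementFromTwoThreads(int perThread) {
        AtomicLong counter = new AtomicLong();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.incrementAndGet(); // atomic read-modify-write
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```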
  • 40. Lazy Singleton (not thread safe)
  • 41. Lazy Singleton (dumb thread safety)
  • 42. Lazy Singleton (not thread safe)
  • 43. Lazy Singleton (still not thread safe)
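The singleton variants themselves are missing from the transcript. For reference, one commonly used thread-safe form is double-checked locking with a `volatile` field, sketched below under the assumption of a hypothetical `LazySingleton` class. This is not necessarily the version the deck ends with. The `volatile` keyword is the essential part: without it, another thread may see a non-null reference to an object whose constructor has not visibly finished.

```java
final class LazySingleton {

    // volatile prevents publishing a partially constructed instance.
    private static volatile LazySingleton instance;

    private LazySingleton() {
    }

    static LazySingleton getInstance() {
        LazySingleton local = instance;          // one volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;                // re-check under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;            // volatile write publishes the object
                }
            }
        }
        return local;
    }
}
```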
  • 47. False sharing (hidden contention)
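One way to keep a hot field away from its neighbours is to surround it with padding, sketched here with hypothetical class names. The padding is split across a class hierarchy because HotSpot typically lays out superclass fields before subclass fields; that is an implementation detail rather than a language guarantee, so the resulting layout is worth checking with a tool such as JOL. The sketch also assumes 64-byte cache lines.

```java
// 7 longs (56 bytes) in front of the hot field.
class LhsPadding {
    protected long p1, p2, p3, p4, p5, p6, p7;
}

// The field that several threads would otherwise contend on.
class HotValue extends LhsPadding {
    protected volatile long value;
}

// 7 longs (56 bytes) behind the hot field.
class RhsPadding extends HotValue {
    protected long p9, p10, p11, p12, p13, p14, p15;
}

// With padding on both sides, "value" should not share a cache line
// with another frequently written field.
final class PaddedCounter extends RhsPadding {

    long get() {
        return value;
    }

    void set(long newValue) {
        value = newValue;
    }
}
```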
  • 50. Monitors Monitor Operations : ● monitorenter ● monitorexit ● wait ● notify/notifyAll Monitor States : ● init ● biased ● thin ● fat (inflated)
  • 52. Conclusion ● Volatile reads are not that bad ● Avoid sharing state ● Avoid writing to shared state ● Avoid Contention
  • 53. Tools ● JMH: OpenJDK tool for writing correct benchmarks ● JMH Samples ● jcstress: tool for testing critical sections of concurrent code ● JOL (Java Object Layout): helps to measure the sizes of objects
  • 58. IMPORTANT! Sometimes horizontal scaling is cheaper. Developing hardware-friendly code is hard: it breaks easily if a new developer does not understand the existing code base or a new version of the JVM performs optimizations you never expected (happens a lot), and it is hard to test. If your product needs higher throughput, you either make it more efficient or scale. When the cost of scaling is too high, it makes perfect sense to make the system more efficient (assuming you don't have a fundamentally inefficient system). If you're scaling your product and a single node under its highest load uses only a low percentage of its resources (CPU, memory etc…), then you have an inefficient system. Developing hardware-friendly code is all about efficiency. On most systems you might NEVER need to go low level, but knowledge of the low-level semantics of your environment will enable you to write more efficient code by default. And most important: NEVER EVER optimize without BENCHMARKING!!!
  • 60. Example of Disruptor usage : Log4j2 In the test with 64 threads, asynchronous loggers are 12 times faster than asynchronous appenders, and 68 times faster than synchronous loggers.
  • 61. Why? ● Generally any traditional queue is in one of two states : either it's filling up, or it's draining. ● Most queues are unbounded : and any unbounded queue is a potential OOM source. ● Queues write to memory on both put and pull… and writes are expensive. During a write the queue is locked (or partially locked). ● Queues are the best way to create CONTENTION! That is often the bottleneck of the system.
  • 63. What is the Disruptor all about? ● Non-blocking. A write does not lock consumers, and consumers work in parallel, with controlled access to data in the queue, and without CONTENTION! ● GC free : Disruptor does not create any objects while running; instead it preallocates all the memory configured for it up front. ● Disruptor is bounded. ● Cache friendly. (Mechanical sympathy) ● It's hardware friendly. Disruptor uses all the low level semantics of the JMM to achieve maximum performance and minimum latency. ● One thread per consumer.
  • 71. Writing to Ring Buffer
  • 85. Disruptor (Pros) ● Performance of course ● Holy BATCHING!!! ● Mechanical Sympathy ● Optionally GC Free ● Prevents False Sharing ● Easy to compose dependent consumers (concurrency) ● Synchronization free code in consumers ● Data Structure (not a frickin framework!!!) ● Fits very well with CQRS and ES
  • 86. Disruptor (Pros) ● Thread affinity (for more performance/throughput) ● Different strategies for Consumers (busy spin, sleep) ● Single/Multiple producer strategy
  • 87. Avoid useless processing (Disruptor can batch)
  • 88. Disruptor (Cons) ● Not as trivial as ABQ (ArrayBlockingQueue) or other queues ● Reasonable limit for busy threads (consumers) ● Not a drop-in replacement; it is a different approach to queues
  • 90. No locks at all ( Atomic.lazySet )
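To make the idea concrete, here is a deliberately small single-producer, single-consumer ring buffer. It is NOT the LMAX Disruptor API; the class `SpscRingBuffer` and its methods are hypothetical and only illustrate the technique: preallocated slots, a bounded capacity, and publication through `AtomicLong.lazySet` (an ordered store) instead of a lock.

```java
import java.util.concurrent.atomic.AtomicLong;

final class SpscRingBuffer {

    private final long[] slots;   // preallocated once, never resized
    private final int mask;       // capacity - 1, valid because capacity is a power of two

    private final AtomicLong producerCursor = new AtomicLong(-1); // last published sequence
    private final AtomicLong consumerCursor = new AtomicLong(-1); // last consumed sequence

    SpscRingBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.slots = new long[capacity];
        this.mask = capacity - 1;
    }

    // Call from the single producer thread only.
    boolean offer(long value) {
        long next = producerCursor.get() + 1;
        if (next - consumerCursor.get() > slots.length) {
            return false;                       // buffer is full (bounded)
        }
        slots[(int) (next & mask)] = value;     // write the slot first...
        producerCursor.lazySet(next);           // ...then publish it with an ordered store
        return true;
    }

    // Call from the single consumer thread only.
    // Returns emptyValue when nothing has been published yet.
    long poll(long emptyValue) {
        long next = consumerCursor.get() + 1;
        if (next > producerCursor.get()) {
            return emptyValue;                  // nothing to read
        }
        long value = slots[(int) (next & mask)];
        consumerCursor.lazySet(next);           // release the slot back to the producer
        return value;
    }
}
```

The sketch relies on there being exactly one producer thread and one consumer thread; supporting several producers would need an atomic claim step (for example a compare-and-set on the cursor), which is omitted here.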
  • 92. Ring Buffer customizations ● Producer strategies ○ Single producer ○ Multiple producer ● Wait Strategies ○ Sleeping Wait ○ Yielding Wait ○ Busy Spin
  • 95. Resources ● JitWatch ● Peter Lawrey blog ● Aleksey Shipilyov stuff ● About TLAB ● About Monitors ● About Memory Barriers
  • 96. And some stuff about high performance Java code ● https://www.youtube.com/watch?v=NEG8tMn36VQ ● https://www.youtube.com/watch?v=t49bfPLp0B0 ● http://www.slideshare.net/PeterLawrey/writing-and-testing-high-frequency-trading-engines-in-java ● https://www.youtube.com/watch?v=ih-IZHpxFkY
  • 97. Links for LMAX Disruptor ● https://www.youtube.com/watch?v=DCdGlxBbKU4 ● https://www.youtube.com/watch?v=KrWxle6U10M ● https://www.youtube.com/watch?v=IsGBA9KEtTM ● https://www.youtube.com/watch?v=o_nXgoTxBsQ ● http://martinfowler.com/articles/lmax.html ● https://www.youtube.com/watch?v=eTeWxZvlCZ8
  • 98. Coming next ● Concurrency : Level 1 Concurrency primitives provided by the language SDK. Everything that provides manual control over concurrency. - package java.util.concurrent.* - Future - CompletableFuture - Phaser - ForkJoinPool (in Java 8), ForkJoinTask, CountedCompleters ● Concurrency : Level 2 High level approach to concurrency, when a library or framework handles concurrent execution of the code... (will cover only RxJava although there is a bunch of other good stuff) - Functional Programming approach (higher-order functions) - Optional - Streams - Reactive Programming (RxJava)