SlideShare a Scribd company logo
1 of 42
Download to read offline
Enabling Java
in
Latency Sensitive
Applications
Gil Tene, CTO & co-Founder, Azul Systems
©2013 Azul Systems, Inc.
About me: Gil Tene
co-founder, CTO
@Azul Systems
Have been working on
a “think different” GC
approaches since 2002
Created Pauseless & C4
core GC algorithms
(Tene, Wolf)

A Long history building
Virtual & Physical
Machines, Operating
Systems, Enterprise
apps, etc...
©2013 Azul Systems, Inc.	

	

	

	

	

* working on real-world trash compaction issues, circa 2004
About Azul
Vega

We make scalable Virtual
Machines
Have built “whatever it takes
to get job done” since 2002
3 generations of custom SMP
Multi-core HW (Vega)
Zing: Pure software for
commodity x86

C4

Known for Low Latency,
Consistent execution, and
Large data set excellence
©2013 Azul Systems, Inc.
High level agenda
Java in a low latency application world
The (historical) fundamental problems
What people have done to try to get around them
What if the fundamental problems were eliminated?
What 2013 looks like for Low latency Java developers
What’s next?
©2013 Azul Systems, Inc.
Java in the low latency world?

©2013 Azul Systems, Inc.
Java in the low latency world?
Why do people use Java for low latency apps?
Are they crazy?
No. There are good, easy to articulate reasons
Projected lifetime cost
Developer productivity
Time-to-product, Time-to-market, ...
Leverage, ecosystem, ability to hire
©2013 Azul Systems, Inc.
E.g. Customer answer to:
“Why do you use Java in Algo Trading?”
Strategies have a shelf life
We have to keep developing and deploying new ones
Only one out of N is actually productive
Profitability therefore depends on ability to
successfully deploy new strategies, and on the cost
of doing so
Our developers seem to be able to produce 2x-3x as
much when using a Java environment as they would
with C/C++ ...
©2013 Azul Systems, Inc.
So what is the problem?
Is Java Slow?
No...
A good programmer will get roughly the same speed
from both Java and C++
A bad programmer won’t get you fast code on either
The 50%‘ile and 90%‘ile are typically excellent...
It’s those pesky occasional stutters and stammers
and stalls that are the problem...
Ever hear of Garbage Collection?
©2013 Azul Systems, Inc.
Is “jitter” even the right word for this?
Hiccups&by&Time&Interval&
Max"per"Interval"

99%"

99.90%"

99.99%"

Max"

Hiccup&Dura*on&(msec)&

25"
20"
15"
10"
5"
0"
0"

100"

200"

300"

400"

500"

&Elapsed&Time&(sec)&

99%‘ile is
©2012 Azul Systems, Inc.	

25"

	

	

Max is ~30,000%
Hiccups&by&Percen*le&Distribu*on&
~60 usec
	

	

	

higher than “typical”

600"
Stop-The-World Garbage Collection:
Java’s Achilles heel
Let’s ignore the bad multi-second pauses for now...
Low latency applications regularly experience “small”,
“minor” GC events that range in the 10s of msec
Frequency directly related to allocation rate
So we have great 50%, 90%. Maybe even 99%
But 99.9%, 99.99%, Max, all “suck”
So bad that it affects risk, profitability, service
expectations, etc.
©2013 Azul Systems, Inc.
One way to deal with Stop-The-World GC
A common way to “deal” with STW-GC
Averages and Standard Deviation
Reality: Latency is usually
strongly “multi-modal”
Usually does’t look anything like a normal distribution
In software systems, usually sees periodic freezes

Complete shifts from one mode/behavior to another
Mode A: “good”.
Mode B: “Somewhat bad”
Mode C: “terrible”, ...
....
©2012 Azul Systems, Inc.
Another way to deal with STW-GC
Another way to cope: “Creative Language”
“Guarantee a worst case of 5 msec, 99% of the time”
“Mostly” Concurrent, “Mostly” Incremental
Translation: “Will at times exhibit long monolithic stopthe-world pauses”
“Fairly Consistent”
Translation: “Will sometimes show results well outside
this range”
“Typical pauses in the tens of milliseconds”
Translation: “Some pauses are much longer than tens of
milliseconds”
©2012 Azul Systems, Inc.
What do actual low latency developers
do about it?
They use “Java” instead of Java
They write “in the Java syntax”
They avoid allocation as much as possible
E.g. They build their own object pools for everything
They write all the code they use (no 3rd party libs)
They train developers for their local discipline
In short: They revert to many of the practices that
hurt productivity. They loose out on much of Java.
©2013 Azul Systems, Inc.
What do low latency (Java) developers
with all this effort?
They still see pauses (usually ranging to tens of msec)
They do get fewer (as in less frequent) pauses
And they see fewer people able to do the job
And they have to write EVERYTHING themselves
And they get to debug malloc/free patterns again
And they can only use memory in certain ways
...
Some call it “fun”... Others “duct tape engineering”...
©2013 Azul Systems, Inc.
was
It is an industry-wide problem
Stop-The-World GC mechanisms
contradict the fundamental
requirements of
low latency & low jitter apps
It’s 2013... We now have Zing.

©2013 Azul Systems, Inc.
The common GC behavior across ALL
currently shipping (non-Zing) JVMs
ALL use a Monolithic Stop-the-world NewGen
“small” periodic pauses (small as in 10s of msec)
pauses more frequent with higher throughput or allocation rates

Development focus for ALL is on Oldgen collectors
Focus is on trying to address the many-second pause problem
Usually by sweeping it farther and farther the rug
“Mostly X” (e.g. “mostly concurrent”) hides the fact that they refer
only to the OldGen part of the collector
E.g. CMS, G1, Balanced.... all are OldGen-only efforts

ALL use a Fallback to Full Stop-the-world Collection
Used to recover when other mechanisms (inevitably) fail
Also hidden under the term “Mostly”...
©2013 Azul Systems, Inc.
A Recipe: address STW-GC head-on
At Azul, we decided to focus on the core problems
Scale & productivity limited by responsiveness/latency
And it’s not the “typical” latency, it’s the outliers...
Even “short” GC pauses must be considered a problem
Responsiveness must be unlinked from key metrics:
Transaction Rate, Concurrent users, Data set size, etc.
Heap size, Live Set size, Allocation rate, Mutation rate
Responsiveness must be continually sustainable
Can’t ignore “rare but periodic” events

Eliminate ALL Stop-The-World Fallbacks
©2013 Azul Systems, Inc.
The Zing “C4” Collector
Continuously Concurrent Compacting Collector

Concurrent, compacting old generation
Concurrent, compacting new generation
No stop-the-world fallback
Always compacts, and always does so concurrently

©2013 Azul Systems, Inc.
Benefits

©2013 Azul Systems, Inc.
An example of “First day’s run” behavior
E-Commerce application

©2013 Azul Systems, Inc.
An example of behavior after 4 days of system tuning
Low latency application

©2013 Azul Systems, Inc.
Measuring Theory in Practice

jHiccup:
A tool that measures and reports
(as your application is running)
if your JVM is actually running
all the time
©2013 Azul Systems, Inc.
Discontinuities in Java platform execution - Easy To Measure
Incontinuities in Java platform execution
Hiccups"by"Time"Interval"

Max"per"Interval"

99%"

99.90%"

99.99%"

Max"

Hiccup&Dura*on&(msec)&

1800"

I call
these
“hiccups”

1600"
1400"
1200"
1000"
800"
600"
400"
200"
0"
0"

200"

400"

600"

800"

1000"

1200"

1400"

1600"

1800"

&Elapsed&Time&(sec)&

Hiccups"by"Percen@le"Distribu@on"

1800"

Hiccup&Dura*on&(msec)&

1600"

Max=1665.024&

1400"
1200"
1000"
800"
600"
400"
200"
0"

©2012 Azul Systems, Inc.	

	

	

0%"

	

90%"

	

99%"

	

&
99.9%"

&
Percen*le&

99.99%"

99.999%"

A telco
App with
a bit of a
“problem”
Fun with jHiccup

©2012 Azul Systems, Inc.
Oracle HotSpot (pure newgen)

Zing

Hiccups&by&Time&Interval&
Max"per"Interval"

99%"

99.90%"

Hiccups&by&Time&Interval&

99.99%"

Max"

Max"per"Interval"

20"
15"
10"
5"
0"

99.99%"

Max"

1.6"
1.4"
1.2"
1"
0.8"
0.6"
0.4"
0.2"
0"

0"

100"

200"

300"

400"

500"

600"

0"

100"

200"

&Elapsed&Time&(sec)&

300"

400"

500"

600"

&Elapsed&Time&(sec)&

Hiccups&by&Percen*le&Distribu*on&

Hiccups&by&Percen*le&Distribu*on&

25"

1.8"

Max=22.656&

20"

Hiccup&Dura*on&(msec)&

Hiccup&Dura*on&(msec)&

99.90%"

1.8"

Hiccup&Dura*on&(msec)&

Hiccup&Dura*on&(msec)&

25"

99%"

15"
10"
5"

1.6"

Max=1.568&

1.4"
1.2"
1"
0.8"
0.6"
0.4"
0.2"

0"

0%"

90%"

&

99%"

99.9%"

&
99.99%"

99.999%"

Percen*le&

0"

0%"

90%"

&

99%"

Low latency trading application
©2012 Azul Systems, Inc.	

	

	

	

	

	

99.9%"

&
99.99%"
Percen*le&

99.999%"
Oracle HotSpot (pure newgen)

Zing

Hiccups&by&Time&Interval&
Max"per"Interval"

99%"

99.90%"

Hiccups&by&Time&Interval&

99.99%"

Max"

Max"per"Interval"

20"
15"
10"
5"
0"

99.99%"

Max"

1.6"
1.4"
1.2"
1"
0.8"
0.6"
0.4"
0.2"
0"

0"

100"

200"

300"

400"

500"

600"

0"

100"

200"

&Elapsed&Time&(sec)&

300"

400"

500"

600"

&Elapsed&Time&(sec)&

Hiccups&by&Percen*le&Distribu*on&

Hiccups&by&Percen*le&Distribu*on&

25"

1.8"

Max=22.656&

20"

Hiccup&Dura*on&(msec)&

Hiccup&Dura*on&(msec)&

99.90%"

1.8"

Hiccup&Dura*on&(msec)&

Hiccup&Dura*on&(msec)&

25"

99%"

15"
10"
5"

1.6"

Max=1.568&

1.4"
1.2"
1"
0.8"
0.6"
0.4"
0.2"

0"

0%"

90%"

&

99%"

99.9%"

&
99.99%"

99.999%"

Percen*le&

0"

0%"

90%"

&

99%"

Low latency trading application
©2012 Azul Systems, Inc.	

	

	

	

	

	

99.9%"

&
99.99%"
Percen*le&

99.999%"
Oracle HotSpot (pure newgen)

Zing

Hiccups&by&Time&Interval&
Max"per"Interval"

99%"

99.90%"

Hiccups&by&Time&Interval&

99.99%"

Max"

Max"per"Interval"

20"
15"
10"
5"
0"

Max"

20"
15"
10"
5"

100"

200"

300"

400"

500"

600"

0"

100"

200"

&Elapsed&Time&(sec)&

300"

400"

500"

600"

&Elapsed&Time&(sec)&

Hiccups&by&Percen*le&Distribu*on&

Hiccups&by&Percen*le&Distribu*on&

25"

25"

Max=22.656&

20"

Hiccup&Dura*on&(msec)&

Hiccup&Dura*on&(msec)&

99.99%"

0"
0"

15"
10"
5"
0"

99.90%"

25"

Hiccup&Dura*on&(msec)&

Hiccup&Dura*on&(msec)&

25"

99%"

0%"

90%"

&

99%"

99.9%"

&
99.99%"

99.999%"

Percen*le&

20"
15"
10"
5"
0"

Max=1.568&
0%"

90%"

&

99%"

Low latency - Drawn to scale
©2012 Azul Systems, Inc.	

	

	

	

	

	

99.9%"

&
99.99%"
Percen*le&

99.999%"
Lets not forget about GC tuning

©2013 Azul Systems, Inc.
Java GC tuning is “hard”…
Examples of actual command line GC tuning parameters:
Java -Xmx12g -XX:MaxPermSize=64M -XX:PermSize=32M -XX:MaxNewSize=2g
-XX:NewSize=1g -XX:SurvivorRatio=128 -XX:+UseParNewGC
-XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=0
-XX:CMSInitiatingOccupancyFraction=60 -XX:+CMSParallelRemarkEnabled
-XX:+UseCMSInitiatingOccupancyOnly -XX:ParallelGCThreads=12
-XX:LargePageSizeInBytes=256m …
Java –Xms8g –Xmx8g –Xmn2g -XX:PermSize=64M -XX:MaxPermSize=256M
-XX:-OmitStackTraceInFastThrow -XX:SurvivorRatio=2 -XX:-UseAdaptiveSizePolicy
-XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled
-XX:+CMSParallelRemarkEnabled -XX:+CMSParallelSurvivorRemarkEnabled
-XX:CMSMaxAbortablePrecleanTime=10000 -XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=63 -XX:+UseParNewGC –Xnoclassgc …
©2013 Azul Systems, Inc.
A	
  few	
  GC	
  tuning	
  flags

Source:	
  Word	
  Cloud	
  created	
  by	
  Frank	
  Pavageau	
  in	
  his	
  Devoxx	
  FR	
  2012	
  presentaFon	
  Ftled	
  “Death	
  by	
  Pauses”
The complete guide to
Zing GC tuning

java -Xmx40g

©2013 Azul Systems, Inc.
So what’s next?
GC is only the biggest problem...

©2013 Azul Systems, Inc.
JVMs make many tradeoffs
often trading speed vs. outliers
Some speed techniques come at extreme outlier costs
E.g. (“regular”) biased locking
E.g. counted loops optimizations
Deoptimization
Lock deflation
Weak References, Soft References, Finalizers
Time To Safe Point (TTSP)
©2012 Azul Systems, Inc.
Time To Safepoint (TTSP)
Your new #1 enemy
(Once GC itself was taken care of)
Many things in a JVM (still) use a global safepoint
All threads brought to a halt, at a “safe to analuze”
point in code, and then released after work is done.
E.g. GC phase shifts, Deoptimization, Class unloading,
Thread Dumps, Lock Deflation, etc. etc.

A single thread with a long time-to-safepoint path can
cause an effective pause for all other threads
Many code paths in the JVM are long...
©2012 Azul Systems, Inc.
Time To Safepoint (TTSP)
the most common examples
Array copies and object clone()
Counted loops
Many other other variants in the runtime...

Measure, Measure, Measure...
Zing has a built-in TTSP profiler
At Azul, I walk around with a 0.5msec stick...
©2012 Azul Systems, Inc.
OS related stuff
(once GC and TTSP are taken care of)
OS related hiccups tend to dominate once GC and TTSP
are removed as issues.
Take scheduling pressure seriously (Duh?)
Hyper-threading (good? bad?)
Swapping (Duh!)
Power management
Transparent Huge Pages (THP).
...
©2012 Azul Systems, Inc.
Takeaway: In 2013, “Real” Java is finally
viable for low latency applications
GC is no longer a dominant issue, even for outliers
2-3msec worst observed case with “easy” tuning
< 1 msec worst observed case is very doable
No need to code in special ways any more
You can finally use “real” Java for everything
You can finally 3rd party libraries without worries
You can finally use as much memory as you want
You can finally use regular (good) programmers
©2012 Azul Systems, Inc.
One-liner Takeaway:

Zing: A cure for the Java hiccups

©2013 Azul Systems, Inc.
Q&A

One-liner Takeaway:
Zing: A cure for the Java hiccups
jHiccup:
http:/
/www.azulsystems.com/dev_resources/jhiccup
©2013 Azul Systems, Inc.

More Related Content

Similar to Enabling Java in Latency-Sensitive Applications

How NOT to Measure Latency, Gil Tene, London, Oct. 2013
How NOT to Measure Latency, Gil Tene, London, Oct. 2013How NOT to Measure Latency, Gil Tene, London, Oct. 2013
How NOT to Measure Latency, Gil Tene, London, Oct. 2013Azul Systems Inc.
 
Understanding Latency Response Time and Behavior
Understanding Latency Response Time and BehaviorUnderstanding Latency Response Time and Behavior
Understanding Latency Response Time and BehaviorzuluJDK
 
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and ToolsIntelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and ToolsAzul Systems Inc.
 
Enterprise Search Summit - Speeding Up Search
Enterprise Search Summit - Speeding Up SearchEnterprise Search Summit - Speeding Up Search
Enterprise Search Summit - Speeding Up SearchAzul Systems Inc.
 
The Java Evolution Mismatch - Why You Need a Better JVM
The Java Evolution Mismatch - Why You Need a Better JVMThe Java Evolution Mismatch - Why You Need a Better JVM
The Java Evolution Mismatch - Why You Need a Better JVMAzul Systems Inc.
 
The Java Evolution Mismatch by Gil Tene, CTO at Azul Systems
The Java Evolution Mismatch by Gil Tene, CTO at Azul SystemsThe Java Evolution Mismatch by Gil Tene, CTO at Azul Systems
The Java Evolution Mismatch by Gil Tene, CTO at Azul SystemszuluJDK
 
JVM Mechanics: A Peek Under the Hood
JVM Mechanics: A Peek Under the HoodJVM Mechanics: A Peek Under the Hood
JVM Mechanics: A Peek Under the HoodAzul Systems Inc.
 
Understanding Hardware Transactional Memory
Understanding Hardware Transactional MemoryUnderstanding Hardware Transactional Memory
Understanding Hardware Transactional MemoryC4Media
 
Winning With Java at Market Open
Winning With Java at Market OpenWinning With Java at Market Open
Winning With Java at Market OpenAzul Systems, Inc.
 
Understanding Zulu Garbage Collection by Matt Schuetze, Director of Product M...
Understanding Zulu Garbage Collection by Matt Schuetze, Director of Product M...Understanding Zulu Garbage Collection by Matt Schuetze, Director of Product M...
Understanding Zulu Garbage Collection by Matt Schuetze, Director of Product M...zuluJDK
 
Understanding Java Garbage Collection
Understanding Java Garbage CollectionUnderstanding Java Garbage Collection
Understanding Java Garbage CollectionAzul Systems Inc.
 
Understanding Java Garbage Collection - And What You Can Do About It
Understanding Java Garbage Collection - And What You Can Do About ItUnderstanding Java Garbage Collection - And What You Can Do About It
Understanding Java Garbage Collection - And What You Can Do About ItAzul Systems Inc.
 
Understanding GC, JavaOne 2017
Understanding GC, JavaOne 2017Understanding GC, JavaOne 2017
Understanding GC, JavaOne 2017Azul Systems Inc.
 
DC JUG: Understanding Java Garbage Collection
DC JUG: Understanding Java Garbage CollectionDC JUG: Understanding Java Garbage Collection
DC JUG: Understanding Java Garbage CollectionAzul Systems, Inc.
 
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVMQCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVMAzul Systems, Inc.
 
Advancements ingc andc4overview_linkedin_oct2017
Advancements ingc andc4overview_linkedin_oct2017Advancements ingc andc4overview_linkedin_oct2017
Advancements ingc andc4overview_linkedin_oct2017Azul Systems Inc.
 
GC @ jmaghreb2014
GC @ jmaghreb2014GC @ jmaghreb2014
GC @ jmaghreb2014Ivan Krylov
 
Beyond the RTOS: A Better Way to Design Real-Time Embedded Software
Beyond the RTOS: A Better Way to Design Real-Time Embedded SoftwareBeyond the RTOS: A Better Way to Design Real-Time Embedded Software
Beyond the RTOS: A Better Way to Design Real-Time Embedded SoftwareQuantum Leaps, LLC
 
Javaland 2017: "You´ll do microservices now". Now what?
Javaland 2017: "You´ll do microservices now". Now what?Javaland 2017: "You´ll do microservices now". Now what?
Javaland 2017: "You´ll do microservices now". Now what?André Goliath
 

Similar to Enabling Java in Latency-Sensitive Applications (20)

How NOT to Measure Latency, Gil Tene, London, Oct. 2013
How NOT to Measure Latency, Gil Tene, London, Oct. 2013How NOT to Measure Latency, Gil Tene, London, Oct. 2013
How NOT to Measure Latency, Gil Tene, London, Oct. 2013
 
Understanding Latency Response Time and Behavior
Understanding Latency Response Time and BehaviorUnderstanding Latency Response Time and Behavior
Understanding Latency Response Time and Behavior
 
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and ToolsIntelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
Intelligent Trading Summit NY 2014: Understanding Latency: Key Lessons and Tools
 
Enterprise Search Summit - Speeding Up Search
Enterprise Search Summit - Speeding Up SearchEnterprise Search Summit - Speeding Up Search
Enterprise Search Summit - Speeding Up Search
 
The Java Evolution Mismatch - Why You Need a Better JVM
The Java Evolution Mismatch - Why You Need a Better JVMThe Java Evolution Mismatch - Why You Need a Better JVM
The Java Evolution Mismatch - Why You Need a Better JVM
 
The Java Evolution Mismatch by Gil Tene, CTO at Azul Systems
The Java Evolution Mismatch by Gil Tene, CTO at Azul SystemsThe Java Evolution Mismatch by Gil Tene, CTO at Azul Systems
The Java Evolution Mismatch by Gil Tene, CTO at Azul Systems
 
JVM Mechanics: A Peek Under the Hood
JVM Mechanics: A Peek Under the HoodJVM Mechanics: A Peek Under the Hood
JVM Mechanics: A Peek Under the Hood
 
Understanding Hardware Transactional Memory
Understanding Hardware Transactional MemoryUnderstanding Hardware Transactional Memory
Understanding Hardware Transactional Memory
 
Winning With Java at Market Open
Winning With Java at Market OpenWinning With Java at Market Open
Winning With Java at Market Open
 
Understanding Zulu Garbage Collection by Matt Schuetze, Director of Product M...
Understanding Zulu Garbage Collection by Matt Schuetze, Director of Product M...Understanding Zulu Garbage Collection by Matt Schuetze, Director of Product M...
Understanding Zulu Garbage Collection by Matt Schuetze, Director of Product M...
 
Understanding Java Garbage Collection
Understanding Java Garbage CollectionUnderstanding Java Garbage Collection
Understanding Java Garbage Collection
 
Understanding Java Garbage Collection - And What You Can Do About It
Understanding Java Garbage Collection - And What You Can Do About ItUnderstanding Java Garbage Collection - And What You Can Do About It
Understanding Java Garbage Collection - And What You Can Do About It
 
Understanding GC, JavaOne 2017
Understanding GC, JavaOne 2017Understanding GC, JavaOne 2017
Understanding GC, JavaOne 2017
 
DC JUG: Understanding Java Garbage Collection
DC JUG: Understanding Java Garbage CollectionDC JUG: Understanding Java Garbage Collection
DC JUG: Understanding Java Garbage Collection
 
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVMQCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
QCon London: Low latency Java in the real world - LMAX Exchange and the Zing JVM
 
How NOT to Measure Latency
How NOT to Measure LatencyHow NOT to Measure Latency
How NOT to Measure Latency
 
Advancements ingc andc4overview_linkedin_oct2017
Advancements ingc andc4overview_linkedin_oct2017Advancements ingc andc4overview_linkedin_oct2017
Advancements ingc andc4overview_linkedin_oct2017
 
GC @ jmaghreb2014
GC @ jmaghreb2014GC @ jmaghreb2014
GC @ jmaghreb2014
 
Beyond the RTOS: A Better Way to Design Real-Time Embedded Software
Beyond the RTOS: A Better Way to Design Real-Time Embedded SoftwareBeyond the RTOS: A Better Way to Design Real-Time Embedded Software
Beyond the RTOS: A Better Way to Design Real-Time Embedded Software
 
Javaland 2017: "You´ll do microservices now". Now what?
Javaland 2017: "You´ll do microservices now". Now what?Javaland 2017: "You´ll do microservices now". Now what?
Javaland 2017: "You´ll do microservices now". Now what?
 

More from Azul Systems Inc.

Zulu Embedded Java Introduction
Zulu Embedded Java IntroductionZulu Embedded Java Introduction
Zulu Embedded Java IntroductionAzul Systems Inc.
 
What's New in the JVM in Java 8?
What's New in the JVM in Java 8?What's New in the JVM in Java 8?
What's New in the JVM in Java 8?Azul Systems Inc.
 
ObjectLayout: Closing the (last?) inherent C vs. Java speed gap
ObjectLayout: Closing the (last?) inherent C vs. Java speed gapObjectLayout: Closing the (last?) inherent C vs. Java speed gap
ObjectLayout: Closing the (last?) inherent C vs. Java speed gapAzul Systems Inc.
 
Priming Java for Speed at Market Open
Priming Java for Speed at Market OpenPriming Java for Speed at Market Open
Priming Java for Speed at Market OpenAzul Systems Inc.
 
Azul Systems open source guide
Azul Systems open source guideAzul Systems open source guide
Azul Systems open source guideAzul Systems Inc.
 
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!Azul Systems Inc.
 
The evolution of OpenJDK: From Java's beginnings to 2014
The evolution of OpenJDK: From Java's beginnings to 2014The evolution of OpenJDK: From Java's beginnings to 2014
The evolution of OpenJDK: From Java's beginnings to 2014Azul Systems Inc.
 
Push Technology's latest data distribution benchmark with Solarflare and Zing
Push Technology's latest data distribution benchmark with Solarflare and ZingPush Technology's latest data distribution benchmark with Solarflare and Zing
Push Technology's latest data distribution benchmark with Solarflare and ZingAzul Systems Inc.
 
Webinar: Zing Vision: Answering your toughest production Java performance que...
Webinar: Zing Vision: Answering your toughest production Java performance que...Webinar: Zing Vision: Answering your toughest production Java performance que...
Webinar: Zing Vision: Answering your toughest production Java performance que...Azul Systems Inc.
 
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)Azul Systems Inc.
 
Towards a Scalable Non-Blocking Coding Style
Towards a Scalable Non-Blocking Coding StyleTowards a Scalable Non-Blocking Coding Style
Towards a Scalable Non-Blocking Coding StyleAzul Systems Inc.
 
Experiences with Debugging Data Races
Experiences with Debugging Data RacesExperiences with Debugging Data Races
Experiences with Debugging Data RacesAzul Systems Inc.
 
Lock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash TableLock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash TableAzul Systems Inc.
 
How NOT to Write a Microbenchmark
How NOT to Write a MicrobenchmarkHow NOT to Write a Microbenchmark
How NOT to Write a MicrobenchmarkAzul Systems Inc.
 
The Art of Java Benchmarking
The Art of Java BenchmarkingThe Art of Java Benchmarking
The Art of Java BenchmarkingAzul Systems Inc.
 
Azul Zulu on Azure Overview -- OpenTech CEE Workshop, Warsaw, Poland
Azul Zulu on Azure Overview -- OpenTech CEE Workshop, Warsaw, PolandAzul Zulu on Azure Overview -- OpenTech CEE Workshop, Warsaw, Poland
Azul Zulu on Azure Overview -- OpenTech CEE Workshop, Warsaw, PolandAzul Systems Inc.
 
JVM Memory Management Details
JVM Memory Management DetailsJVM Memory Management Details
JVM Memory Management DetailsAzul Systems Inc.
 
Java at Scale, Dallas JUG, October 2013
Java at Scale, Dallas JUG, October 2013Java at Scale, Dallas JUG, October 2013
Java at Scale, Dallas JUG, October 2013Azul Systems Inc.
 

More from Azul Systems Inc. (20)

Zulu Embedded Java Introduction
Zulu Embedded Java IntroductionZulu Embedded Java Introduction
Zulu Embedded Java Introduction
 
What's New in the JVM in Java 8?
What's New in the JVM in Java 8?What's New in the JVM in Java 8?
What's New in the JVM in Java 8?
 
ObjectLayout: Closing the (last?) inherent C vs. Java speed gap
ObjectLayout: Closing the (last?) inherent C vs. Java speed gapObjectLayout: Closing the (last?) inherent C vs. Java speed gap
ObjectLayout: Closing the (last?) inherent C vs. Java speed gap
 
Priming Java for Speed at Market Open
Priming Java for Speed at Market OpenPriming Java for Speed at Market Open
Priming Java for Speed at Market Open
 
Azul Systems open source guide
Azul Systems open source guideAzul Systems open source guide
Azul Systems open source guide
 
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
Start Fast and Stay Fast - Priming Java for Market Open with ReadyNow!
 
The evolution of OpenJDK: From Java's beginnings to 2014
The evolution of OpenJDK: From Java's beginnings to 2014The evolution of OpenJDK: From Java's beginnings to 2014
The evolution of OpenJDK: From Java's beginnings to 2014
 
Push Technology's latest data distribution benchmark with Solarflare and Zing
Push Technology's latest data distribution benchmark with Solarflare and ZingPush Technology's latest data distribution benchmark with Solarflare and Zing
Push Technology's latest data distribution benchmark with Solarflare and Zing
 
Webinar: Zing Vision: Answering your toughest production Java performance que...
Webinar: Zing Vision: Answering your toughest production Java performance que...Webinar: Zing Vision: Answering your toughest production Java performance que...
Webinar: Zing Vision: Answering your toughest production Java performance que...
 
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
Speculative Locking: Breaking the Scale Barrier (JAOO 2005)
 
Java vs. C/C++
Java vs. C/C++Java vs. C/C++
Java vs. C/C++
 
What's Inside a JVM?
What's Inside a JVM?What's Inside a JVM?
What's Inside a JVM?
 
Towards a Scalable Non-Blocking Coding Style
Towards a Scalable Non-Blocking Coding StyleTowards a Scalable Non-Blocking Coding Style
Towards a Scalable Non-Blocking Coding Style
 
Experiences with Debugging Data Races
Experiences with Debugging Data RacesExperiences with Debugging Data Races
Experiences with Debugging Data Races
 
Lock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash TableLock-Free, Wait-Free Hash Table
Lock-Free, Wait-Free Hash Table
 
How NOT to Write a Microbenchmark
How NOT to Write a MicrobenchmarkHow NOT to Write a Microbenchmark
How NOT to Write a Microbenchmark
 
The Art of Java Benchmarking
The Art of Java BenchmarkingThe Art of Java Benchmarking
The Art of Java Benchmarking
 
Azul Zulu on Azure Overview -- OpenTech CEE Workshop, Warsaw, Poland
Azul Zulu on Azure Overview -- OpenTech CEE Workshop, Warsaw, PolandAzul Zulu on Azure Overview -- OpenTech CEE Workshop, Warsaw, Poland
Azul Zulu on Azure Overview -- OpenTech CEE Workshop, Warsaw, Poland
 
JVM Memory Management Details
JVM Memory Management DetailsJVM Memory Management Details
JVM Memory Management Details
 
Java at Scale, Dallas JUG, October 2013
Java at Scale, Dallas JUG, October 2013Java at Scale, Dallas JUG, October 2013
Java at Scale, Dallas JUG, October 2013
 

Recently uploaded

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 

Recently uploaded (20)

IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 

Enabling Java in Latency-Sensitive Applications

  • 1. Enabling Java in Latency Sensitive Applications Gil Tene, CTO & co-Founder, Azul Systems ©2013 Azul Systems, Inc.
  • 2. About me: Gil Tene co-founder, CTO @Azul Systems Have been working on a “think different” GC approaches since 2002 Created Pauseless & C4 core GC algorithms (Tene, Wolf) A Long history building Virtual & Physical Machines, Operating Systems, Enterprise apps, etc... ©2013 Azul Systems, Inc. * working on real-world trash compaction issues, circa 2004
  • 3. About Azul Vega We make scalable Virtual Machines Have built “whatever it takes to get job done” since 2002 3 generations of custom SMP Multi-core HW (Vega) Zing: Pure software for commodity x86 C4 Known for Low Latency, Consistent execution, and Large data set excellence ©2013 Azul Systems, Inc.
  • 4. High level agenda Java in a low latency application world The (historical) fundamental problems What people have done to try to get around them What if the fundamental problems were eliminated? What 2013 looks like for Low latency Java developers What’s next? ©2013 Azul Systems, Inc.
  • 5. Java in the low latency world? ©2013 Azul Systems, Inc.
  • 6. Java in the low latency world? Why do people use Java for low latency apps? Are they crazy? No. There are good, easy to articulate reasons Projected lifetime cost Developer productivity Time-to-product, Time-to-market, ... Leverage, ecosystem, ability to hire ©2013 Azul Systems, Inc.
  • 7. E.g. Customer answer to: “Why do you use Java in Algo Trading?” Strategies have a shelf life We have to keep developing and deploying new ones Only one out of N is actually productive Profitability therefore depends on ability to successfully deploy new strategies, and on the cost of doing so Our developers seem to be able to produce 2x-3x as much when using a Java environment as they would with C/C++ ... ©2013 Azul Systems, Inc.
  • 8. So what is the problem? Is Java Slow? No... A good programmer will get roughly the same speed from both Java and C++ A bad programmer won’t get you fast code on either The 50%‘ile and 90%‘ile are typically excellent... It’s those pesky occasional stutters and stammers and stalls that are the problem... Ever hear of Garbage Collection? ©2013 Azul Systems, Inc.
  • 9. Is “jitter” even the right word for this? Hiccups&by&Time&Interval& Max"per"Interval" 99%" 99.90%" 99.99%" Max" Hiccup&Dura*on&(msec)& 25" 20" 15" 10" 5" 0" 0" 100" 200" 300" 400" 500" &Elapsed&Time&(sec)& 99%‘ile is ©2012 Azul Systems, Inc. 25" Max is ~30,000% Hiccups&by&Percen*le&Distribu*on& ~60 usec higher than “typical” 600"
  • 10. Stop-The-World Garbage Collection: Java’s Achilles heel Let’s ignore the bad multi-second pauses for now... Low latency applications regularly experience “small”, “minor” GC events that range in the 10s of msec Frequency directly related to allocation rate So we have great 50%, 90%. Maybe even 99% But 99.9%, 99.99%, Max, all “suck” So bad that it affects risk, profitability, service expectations, etc. ©2013 Azul Systems, Inc.
  • 11. One way to deal with Stop-The-World GC
  • 12. A common way to “deal” with STW-GC Averages and Standard Deviation
  • 13. Reality: Latency is usually strongly “multi-modal” Usually does’t look anything like a normal distribution In software systems, usually sees periodic freezes Complete shifts from one mode/behavior to another Mode A: “good”. Mode B: “Somewhat bad” Mode C: “terrible”, ... .... ©2012 Azul Systems, Inc.
  • 14. Another way to deal with STW-GC
  • 15. Another way to cope: “Creative Language” “Guarantee a worst case of 5 msec, 99% of the time” “Mostly” Concurrent, “Mostly” Incremental Translation: “Will at times exhibit long monolithic stopthe-world pauses” “Fairly Consistent” Translation: “Will sometimes show results well outside this range” “Typical pauses in the tens of milliseconds” Translation: “Some pauses are much longer than tens of milliseconds” ©2012 Azul Systems, Inc.
  • 16. What do actual low latency developers do about it? They use “Java” instead of Java They write “in the Java syntax” They avoid allocation as much as possible E.g. They build their own object pools for everything They write all the code they use (no 3rd party libs) They train developers for their local discipline In short: They revert to many of the practices that hurt productivity. They loose out on much of Java. ©2013 Azul Systems, Inc.
  • 17. What do low latency (Java) developers with all this effort? They still see pauses (usually ranging to tens of msec) They do get fewer (as in less frequent) pauses And they see fewer people able to do the job And they have to write EVERYTHING themselves And they get to debug malloc/free patterns again And they can only use memory in certain ways ... Some call it “fun”... Others “duct tape engineering”... ©2013 Azul Systems, Inc.
  • 18. was It is an industry-wide problem Stop-The-World GC mechanisms contradict the fundamental requirements of low latency & low jitter apps It’s 2013... We now have Zing. ©2013 Azul Systems, Inc.
  • 19. The common GC behavior across ALL currently shipping (non-Zing) JVMs ALL use a Monolithic Stop-the-world NewGen “small” periodic pauses (small as in 10s of msec) pauses more frequent with higher throughput or allocation rates Development focus for ALL is on Oldgen collectors Focus is on trying to address the many-second pause problem Usually by sweeping it farther and farther the rug “Mostly X” (e.g. “mostly concurrent”) hides the fact that they refer only to the OldGen part of the collector E.g. CMS, G1, Balanced.... all are OldGen-only efforts ALL use a Fallback to Full Stop-the-world Collection Used to recover when other mechanisms (inevitably) fail Also hidden under the term “Mostly”... ©2013 Azul Systems, Inc.
  • 20. A Recipe: address STW-GC head-on At Azul, we decided to focus on the core problems Scale & productivity limited by responsiveness/latency And it’s not the “typical” latency, it’s the outliers... Even “short” GC pauses must be considered a problem Responsiveness must be unlinked from key metrics: Transaction Rate, Concurrent users, Data set size, etc. Heap size, Live Set size, Allocation rate, Mutation rate Responsiveness must be continually sustainable Can’t ignore “rare but periodic” events Eliminate ALL Stop-The-World Fallbacks ©2013 Azul Systems, Inc.
  • 21. The Zing “C4” Collector Continuously Concurrent Compacting Collector Concurrent, compacting old generation Concurrent, compacting new generation No stop-the-world fallback Always compacts, and always does so concurrently ©2013 Azul Systems, Inc.
  • 23. An example of “First day’s run” behavior E-Commerce application ©2013 Azul Systems, Inc.
  • 24. An example of behavior after 4 days of system tuning Low latency application ©2013 Azul Systems, Inc.
  • 25. Measuring Theory in Practice jHiccup: A tool that measures and reports (as your application is running) if your JVM is actually running all the time ©2013 Azul Systems, Inc.
  • 26. Discontinuities in Java platform execution - Easy To Measure Incontinuities in Java platform execution Hiccups"by"Time"Interval" Max"per"Interval" 99%" 99.90%" 99.99%" Max" Hiccup&Dura*on&(msec)& 1800" I call these “hiccups” 1600" 1400" 1200" 1000" 800" 600" 400" 200" 0" 0" 200" 400" 600" 800" 1000" 1200" 1400" 1600" 1800" &Elapsed&Time&(sec)& Hiccups"by"Percen@le"Distribu@on" 1800" Hiccup&Dura*on&(msec)& 1600" Max=1665.024& 1400" 1200" 1000" 800" 600" 400" 200" 0" ©2012 Azul Systems, Inc. 0%" 90%" 99%" & 99.9%" & Percen*le& 99.99%" 99.999%" A telco App with a bit of a “problem”
  • 27. Fun with jHiccup ©2012 Azul Systems, Inc.
  • 28. Oracle HotSpot (pure newgen) Zing Hiccups&by&Time&Interval& Max"per"Interval" 99%" 99.90%" Hiccups&by&Time&Interval& 99.99%" Max" Max"per"Interval" 20" 15" 10" 5" 0" 99.99%" Max" 1.6" 1.4" 1.2" 1" 0.8" 0.6" 0.4" 0.2" 0" 0" 100" 200" 300" 400" 500" 600" 0" 100" 200" &Elapsed&Time&(sec)& 300" 400" 500" 600" &Elapsed&Time&(sec)& Hiccups&by&Percen*le&Distribu*on& Hiccups&by&Percen*le&Distribu*on& 25" 1.8" Max=22.656& 20" Hiccup&Dura*on&(msec)& Hiccup&Dura*on&(msec)& 99.90%" 1.8" Hiccup&Dura*on&(msec)& Hiccup&Dura*on&(msec)& 25" 99%" 15" 10" 5" 1.6" Max=1.568& 1.4" 1.2" 1" 0.8" 0.6" 0.4" 0.2" 0" 0%" 90%" & 99%" 99.9%" & 99.99%" 99.999%" Percen*le& 0" 0%" 90%" & 99%" Low latency trading application ©2012 Azul Systems, Inc. 99.9%" & 99.99%" Percen*le& 99.999%"
  • 29. Oracle HotSpot (pure newgen) Zing Hiccups&by&Time&Interval& Max"per"Interval" 99%" 99.90%" Hiccups&by&Time&Interval& 99.99%" Max" Max"per"Interval" 20" 15" 10" 5" 0" 99.99%" Max" 1.6" 1.4" 1.2" 1" 0.8" 0.6" 0.4" 0.2" 0" 0" 100" 200" 300" 400" 500" 600" 0" 100" 200" &Elapsed&Time&(sec)& 300" 400" 500" 600" &Elapsed&Time&(sec)& Hiccups&by&Percen*le&Distribu*on& Hiccups&by&Percen*le&Distribu*on& 25" 1.8" Max=22.656& 20" Hiccup&Dura*on&(msec)& Hiccup&Dura*on&(msec)& 99.90%" 1.8" Hiccup&Dura*on&(msec)& Hiccup&Dura*on&(msec)& 25" 99%" 15" 10" 5" 1.6" Max=1.568& 1.4" 1.2" 1" 0.8" 0.6" 0.4" 0.2" 0" 0%" 90%" & 99%" 99.9%" & 99.99%" 99.999%" Percen*le& 0" 0%" 90%" & 99%" Low latency trading application ©2012 Azul Systems, Inc. 99.9%" & 99.99%" Percen*le& 99.999%"
  • 30. Oracle HotSpot (pure newgen) Zing Hiccups&by&Time&Interval& Max"per"Interval" 99%" 99.90%" Hiccups&by&Time&Interval& 99.99%" Max" Max"per"Interval" 20" 15" 10" 5" 0" Max" 20" 15" 10" 5" 100" 200" 300" 400" 500" 600" 0" 100" 200" &Elapsed&Time&(sec)& 300" 400" 500" 600" &Elapsed&Time&(sec)& Hiccups&by&Percen*le&Distribu*on& Hiccups&by&Percen*le&Distribu*on& 25" 25" Max=22.656& 20" Hiccup&Dura*on&(msec)& Hiccup&Dura*on&(msec)& 99.99%" 0" 0" 15" 10" 5" 0" 99.90%" 25" Hiccup&Dura*on&(msec)& Hiccup&Dura*on&(msec)& 25" 99%" 0%" 90%" & 99%" 99.9%" & 99.99%" 99.999%" Percen*le& 20" 15" 10" 5" 0" Max=1.568& 0%" 90%" & 99%" Low latency - Drawn to scale ©2012 Azul Systems, Inc. 99.9%" & 99.99%" Percen*le& 99.999%"
  • 31. Lets not forget about GC tuning ©2013 Azul Systems, Inc.
  • 32. Java GC tuning is “hard”… Examples of actual command line GC tuning parameters: Java -Xmx12g -XX:MaxPermSize=64M -XX:PermSize=32M -XX:MaxNewSize=2g -XX:NewSize=1g -XX:SurvivorRatio=128 -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:MaxTenuringThreshold=0 -XX:CMSInitiatingOccupancyFraction=60 -XX:+CMSParallelRemarkEnabled -XX:+UseCMSInitiatingOccupancyOnly -XX:ParallelGCThreads=12 -XX:LargePageSizeInBytes=256m … Java –Xms8g –Xmx8g –Xmn2g -XX:PermSize=64M -XX:MaxPermSize=256M -XX:-OmitStackTraceInFastThrow -XX:SurvivorRatio=2 -XX:-UseAdaptiveSizePolicy -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSParallelRemarkEnabled -XX:+CMSParallelSurvivorRemarkEnabled -XX:CMSMaxAbortablePrecleanTime=10000 -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=63 -XX:+UseParNewGC –Xnoclassgc … ©2013 Azul Systems, Inc.
  • 33. A  few  GC  tuning  flags Source:  Word  Cloud  created  by  Frank  Pavageau  in  his  Devoxx  FR  2012  presentaFon  Ftled  “Death  by  Pauses”
  • 34. The complete guide to Zing GC tuning java -Xmx40g ©2013 Azul Systems, Inc.
  • 35. So what’s next? GC is only the biggest problem... ©2013 Azul Systems, Inc.
  • 36. JVMs make many tradeoffs often trading speed vs. outliers Some speed techniques come at extreme outlier costs E.g. (“regular”) biased locking E.g. counted loops optimizations Deoptimization Lock deflation Weak References, Soft References, Finalizers Time To Safe Point (TTSP) ©2012 Azul Systems, Inc.
  • 37. Time To Safepoint (TTSP) Your new #1 enemy (Once GC itself was taken care of) Many things in a JVM (still) use a global safepoint All threads brought to a halt, at a “safe to analuze” point in code, and then released after work is done. E.g. GC phase shifts, Deoptimization, Class unloading, Thread Dumps, Lock Deflation, etc. etc. A single thread with a long time-to-safepoint path can cause an effective pause for all other threads Many code paths in the JVM are long... ©2012 Azul Systems, Inc.
  • 38. Time To Safepoint (TTSP) the most common examples Array copies and object clone() Counted loops Many other other variants in the runtime... Measure, Measure, Measure... Zing has a built-in TTSP profiler At Azul, I walk around with a 0.5msec stick... ©2012 Azul Systems, Inc.
  • 39. OS related stuff (once GC and TTSP are taken care of) OS related hiccups tend to dominate once GC and TTSP are removed as issues. Take scheduling pressure seriously (Duh?) Hyper-threading (good? bad?) Swapping (Duh!) Power management Transparent Huge Pages (THP). ... ©2012 Azul Systems, Inc.
  • 40. Takeaway: In 2013, “Real” Java is finally viable for low latency applications GC is no longer a dominant issue, even for outliers 2-3msec worst observed case with “easy” tuning < 1 msec worst observed case is very doable No need to code in special ways any more You can finally use “real” Java for everything You can finally 3rd party libraries without worries You can finally use as much memory as you want You can finally use regular (good) programmers ©2012 Azul Systems, Inc.
  • 41. One-liner Takeaway: Zing: A cure for the Java hiccups ©2013 Azul Systems, Inc.
  • 42. Q&A One-liner Takeaway: Zing: A cure for the Java hiccups jHiccup: http:/ /www.azulsystems.com/dev_resources/jhiccup ©2013 Azul Systems, Inc.