Optimizing Java Performance
When App is on Fire

by Konstantin Pavlov
blog.konstantinpavlov.net
kpavlov

What Is High Load?
• There is no exact definition of “High Load”
• Let's say, over 10000 incoming events per second, over 1000 events per second at peak
• Example: Market quotes processing

“20% of the effort gives 80% of the success”
– Pareto Principle

How to Optimize
• Measure, don't guess
• Consider not doing things
• Minimize I/O
• Reduce database operations
• Reduce network operations
• Reduce memory allocations and garbage collection
• Use logging judiciously
• Optimizing is an iterative process

Measure, Don't Guess
• Record and compare throughput before and after changes
• Samplers and profilers are your best friends: Java VisualVM (jvisualvm), JProfiler, etc.
• Different environments give different results
• Try to test on a production-like environment
• If a production-like environment is not available, compare the relative performance change on a single environment
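
The "record and compare" step can be sketched as follows. This is a minimal illustration, not part of the original talk: the class ThroughputMeter and its methods are hypothetical names. It runs a warm-up phase first so that JIT compilation does not dominate the timing; for serious measurements a dedicated harness such as JMH is likely more reliable.

```java
/** Hypothetical helper (not from the talk): measures throughput of a task. */
final class ThroughputMeter {

    /**
     * Runs the task warmupIterations times untimed, then iterations times timed,
     * and returns the measured throughput in operations per second.
     */
    static double opsPerSecond(Runnable task, int warmupIterations, int iterations) {
        for (int i = 0; i < warmupIterations; i++) {
            task.run(); // warm-up: give the JIT a chance to compile the hot path
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        // Guard against a zero interval on very fast tasks.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        return iterations * 1_000_000_000.0 / elapsedNanos;
    }

    /** Relative change of "after" versus "before"; 0.25 means 25% more throughput. */
    static double relativeChange(double before, double after) {
        return (after - before) / before;
    }
}
```

Measuring the same task on the same machine before and after a change, and comparing the two numbers with relativeChange, gives the relative figure the last bullet refers to even when the absolute numbers differ from production.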

Consider Not Doing
• Use caching proxies for web applications
• Review your algorithms
• In some cases, exit early: there is no need to prepare data that will not be used later
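
The early-exit idea can be illustrated with a small sketch. The class EarlyExit, the subscriber flag and the report format are all hypothetical, chosen only to show the pattern; the counter exists just to make the skipped work visible.

```java
import java.util.List;
import java.util.Optional;

/** Hypothetical example (not from the talk): skip expensive work nobody needs. */
final class EarlyExit {

    /** Counts how often the expensive step actually ran (for illustration only). */
    static int expensiveCalls = 0;

    /** Stands in for any costly preparation step. */
    static String buildReport(List<String> items) {
        expensiveCalls++;
        return String.join(",", items);
    }

    /** Returns a report only when someone will consume it. */
    static Optional<String> reportIfNeeded(boolean hasSubscribers, List<String> items) {
        if (!hasSubscribers || items.isEmpty()) {
            return Optional.empty(); // exit early: the result would not be used
        }
        return Optional.of(buildReport(items));
    }
}
```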

Minimize I/O
• If the CPU is idle, I/O is likely the bottleneck
• File system operations
• Network operations, data serialization/deserialization
• Logging
• Database operations

Distributing Objects
• Main rule of distributing objects: “Do not distribute objects, if possible” (Martin Fowler, “Patterns of Enterprise Application Architecture”)
• CAP Theorem: “It's impossible in a distributed system to achieve Consistency, Availability and Partition Tolerance at the same time”
• Pros of keeping objects local: less I/O and serialization, simpler interfaces, no copying of data between POJOs
• Cons: higher resource consumption on a single node, may not scale well, temptation to make an architectural mess

Reduce Database I/O
• Make sure your DBMS has enough resources
• Minimize DB access. Cache objects in memory, if possible
• Consider using stored procedures to reduce communication between the server and the DB. A DBMS is designed to process data
• Find the slowest query, optimize it, then take the next one, and so on…
• Read/write only the necessary columns
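
One possible shape for the in-memory cache mentioned above is a read-through cache on top of ConcurrentHashMap. This is a sketch under assumptions: the class ReadThroughCache is a hypothetical name, and the loader function stands in for whatever database query the application would run. It has no eviction or expiry, so it only suits data that is small and rarely changes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical read-through cache (not from the talk); no eviction or expiry. */
final class ReadThroughCache<K, V> {

    private final Map<K, V> cache = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // stands in for a database query

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, calling the loader only on a cache miss. */
    V get(K key) {
        return cache.computeIfAbsent(key, loader);
    }

    /** Drops an entry, e.g. after the underlying row has been updated. */
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

Repeated get calls for the same key reach the loader once; after invalidate, the next get loads the value again.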

Use Logging Judiciously
• Consider what to log. Review toString() methods
• Consider when to log. The decision whether to log takes time itself: even Logger.isDebugEnabled() has a cost
• Choose appropriate log levels
• When using {} placeholders, don't evaluate expensive expressions in the arguments: they are computed even if the message is never logged
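
The point about evaluated arguments applies to any logging API, because Java evaluates method arguments before the call. The sketch below uses java.util.logging instead of a {}-placeholder library so that it stays self-contained; that substitution, the class LazyLogging and the describe helper are assumptions made for illustration. The Supplier overload defers building the message until the logger has confirmed the level is enabled.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example (not from the talk): eager versus deferred log arguments. */
final class LazyLogging {

    static final Logger LOG = Logger.getLogger("LazyLogging");

    /** Counts how often the costly description was built (for illustration only). */
    static int describeCalls = 0;

    /** Stands in for an expensive toString() or formatting step. */
    static String describe(Object quote) {
        describeCalls++;
        return "Quote[" + quote + "]";
    }

    /** The argument is built before fine() is called, even when FINE is disabled. */
    static void eager(Object quote) {
        LOG.fine("Received " + describe(quote));
    }

    /** The supplier runs only if the logger accepts FINE messages. */
    static void lazy(Object quote) {
        LOG.fine(() -> "Received " + describe(quote));
    }
}
```

With the logger at INFO, eager still pays for describe while lazy does not.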

Reduce GC
• Do not create unnecessary objects
• Use builders carefully. HashCodeBuilder and EqualsBuilder may produce GC overhead for objects stored in collections, because every hashCode() or equals() call allocates a builder
• Review comparators used in TreeSets
• Reduce boxing/unboxing
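
A small sketch of the boxing and comparator points. The class BoxingDemo and its Order type are hypothetical. Whether the boxed variants really allocate depends on the JVM (the JIT may remove some allocations), so treat this as an illustration of what to look for in a profiler rather than a guaranteed saving.

```java
import java.util.Comparator;

/** Hypothetical example (not from the talk): boxed versus primitive code paths. */
final class BoxingDemo {

    static final class Order {
        final int priority;
        final String name;

        Order(int priority, String name) {
            this.priority = priority;
            this.name = name;
        }
    }

    /** Boxes the running total on every iteration. */
    static long sumBoxed(int n) {
        Long sum = 0L;
        for (int i = 0; i < n; i++) {
            sum += i; // unbox, add, box again
        }
        return sum;
    }

    /** Same result with primitives only. */
    static long sumPrimitive(int n) {
        long sum = 0L;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    /** Boxes both priorities on each comparison. */
    static final Comparator<Order> BOXING =
            (a, b) -> Integer.valueOf(a.priority).compareTo(b.priority);

    /** Compares the int fields directly, without boxing. */
    static final Comparator<Order> PRIMITIVE =
            Comparator.comparingInt((Order o) -> o.priority);
}
```

A TreeSet calls its comparator on every insertion and lookup, so the comparator is a natural place to check first.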

Reduce CPU Load
• hashCode() and equals() may become a bottleneck when called frequently
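
One common way to make a hot hashCode() cheap is to compute it once for an immutable key. The class InstrumentKey and its fields are hypothetical, picked to match the market-quotes example; the technique only works when the fields never change after construction.

```java
import java.util.Objects;

/** Hypothetical immutable map key (not from the talk) with a precomputed hash. */
final class InstrumentKey {

    private final String exchange;
    private final String symbol;
    private final int hash; // computed once; valid because the fields are final

    InstrumentKey(String exchange, String symbol) {
        this.exchange = Objects.requireNonNull(exchange);
        this.symbol = Objects.requireNonNull(symbol);
        this.hash = 31 * exchange.hashCode() + symbol.hashCode();
    }

    @Override
    public int hashCode() {
        return hash; // no allocation and no recomputation per call
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof InstrumentKey)) {
            return false;
        }
        InstrumentKey that = (InstrumentKey) other;
        // Compare the cheap int first, then the strings.
        return hash == that.hash
                && symbol.equals(that.symbol)
                && exchange.equals(that.exchange);
    }
}
```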

Iterative Optimization
Result p = measurePerformance();
while (!satisfied(p)) {
    optimize();
    p = measurePerformance();
}

Thank you!
