PAC 2020 Santorini - Giovanni Paolo Gibilisco

Continuous Performance Optimization of Java Applications

  1. Performance Is Not a Myth
     Performance Advisory Council, Santorini, Greece, February 26-27, 2020
     Continuous Performance Optimization of Java Applications
     Giovanni Paolo Gibilisco
  2. Introduction
     • Automation and CI have done a great job of simplifying deployment and environment provisioning
     • Performance testing tools are evolving to ease integration
     • We should take the baton and go further, evolving the way we do performance testing and optimization
  3. Agenda
     • Continuous Performance Testing
     • Continuous Performance Optimization
     • Use Case
  4. Continuous Performance Testing
     This is a hot topic for performance engineers:
     • Stephen Townshend – "Preparing for the Pipeline"
     • Alexander Podelko – "Continuous Performance Testing: Myths and Realities"
     • Bruno Da Silva – "Continuous Performance Innovative Approach"
     • Amir Rozenberg – "Shifting Left Performance Testing"
     • Tingting Zong – "Integrate Performance Test with DevOps"
  5. A common toolset
  6. A common pipeline
     • Performance tests usually focus on a subset of components or functionalities
     • Focus on finding performance regressions
     [Pipeline diagram: Build → Unit Test → Integration Test → End 2 End Test → Deploy → Performance Test → Release; the Performance Test stage comprises Prepare Test Data, Provision Load Infrastructure, Run Test, Analyze Results]
  7. Why stop here?
     • Each release might change the way the application uses resources
     • Performance regression tests make sure you don't do worse than the previous iteration
     • What about doing better than before? Performance improvements are left on the table
     "I've changed my data structure, should I switch to another GC type?"
     "Has my heap usage pattern changed?"
  8. Configuration is key
     • Treated as application code (see the sketch after this slide)
     • Has a huge impact on performance
     • Most of the time the vendor default is used
     • Growing complexity
     [Chart: number of configuration options by year (2006, 2009, 2013, 2016, 2019), growing from 50 to 847]
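     To make "treated as application code" concrete, here is a minimal sketch of the kind of version-controlled JVM options file the use case below relies on. The flag names are real HotSpot options; the values are illustrative assumptions, not the configuration from the talk.

         # Sketch of a committed JVM options file (real flags, assumed values)
         -Xms4g                      # initial heap size
         -Xmx4g                      # maximum heap size
         -XX:+UseG1GC                # garbage collector selection
         -XX:MaxGCPauseMillis=200    # GC pause-time target
         -XX:ParallelGCThreads=8     # threads for stop-the-world GC phases

     Each of these lines is one of the hundreds of tunable options the chart above counts.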
  9. Agenda recap: Continuous Performance Testing | Continuous Performance Optimization | Use Case
  10. Continuous Performance Optimization
      • A new kid in the pipeline block (a pipeline sketch follows this slide)
      • See performance improvements of application changes immediately, not just regressions
      • Adapt configuration to new application features and releases
      [Pipeline diagram: Build → Unit Test → Integration Test → End 2 End Test → Deploy → Performance Test → Release; the Performance Test stage now comprises Prepare Test Data, Provision Load Infrastructure, Run Test, Analyze Results, Optimize]
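      Since the use case below deploys through GitLab, here is a hedged sketch of how the extra Optimize step could appear in a .gitlab-ci.yml. The stage names, job name, and helper scripts are all hypothetical assumptions, not the talk's actual pipeline.

          # .gitlab-ci.yml sketch: a performance stage with an optimize step
          stages: [build, test, deploy, performance, release]

          performance_test:
            stage: performance
            script:
              - ./prepare_test_data.sh       # hypothetical helper scripts
              - ./provision_load_infra.sh
              - ./run_test.sh
              - ./analyze_results.sh
              - ./optimize.sh                # new: feed results back into the configuration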
  11. Why is it so hard?
      • Configuration selection
        • Manual (slow, requires expertise, trial and error)
        • One parameter change at a time
        • Huge configuration space: is random exploration effective? (see the sketch after this slide)
        • Machine learning comes to the aid
      • Analyzing performance tests and assigning a performance score
        • Usually manual (slow, requires expertise)
        • Sometimes a Boolean score based on thresholds (e.g. alarms, gateways)
        • Challenges: startup, noise, stability, ...
        • Multiple KPIs (throughput, latency, utilization, ...)
      [Diagram: the Prepare Test Data / Run Test / Analyze Results / Optimize loop]
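      As a back-of-the-envelope illustration of the "huge configuration space" bullet, here is a small Java sketch (not the speaker's tooling): four parameters with a handful of values each already yield almost 200 combinations. The flag names are real JVM options; the value ranges are assumptions.

          import java.util.List;
          import java.util.Map;
          import java.util.Random;

          public class ConfigSpace {
              public static void main(String[] args) {
                  // Hypothetical tuning domains: real flag names, assumed value ranges.
                  Map<String, List<String>> params = Map.of(
                          "-Xmx", List.of("2g", "4g", "6g", "8g"),
                          "-XX:MaxGCPauseMillis", List.of("50", "100", "200", "400"),
                          "-XX:ParallelGCThreads", List.of("2", "4", "8", "16"),
                          "-XX:NewRatio", List.of("1", "2", "3"));

                  long combos = params.values().stream()
                          .mapToLong(List::size).reduce(1L, (a, b) -> a * b);
                  System.out.println(combos + " possible configurations"); // 192

                  // Draw one random configuration: pick a value per parameter.
                  Random rnd = new Random();
                  params.forEach((flag, values) -> {
                      String v = values.get(rnd.nextInt(values.size()));
                      // -XX: flags take "=value"; -Xmx concatenates its value
                      System.out.println(flag.startsWith("-XX:") ? flag + "=" + v : flag + v);
                  });
              }
          }

      With 192 points and performance tests that take minutes each, exhaustive search is already impractical, which is why the slide points at machine learning.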
  12. Agenda recap: Continuous Performance Testing | Continuous Performance Optimization | Use Case
  13. The App
      • Core business application in charge of searching for flight combinations
      • "Searcho" is a microservice composed of Tomcat and a JVM application
      • The JVM configuration is written in a file and committed to a Git repo
      • A GitLab pipeline takes care of deploying the configuration changes
      • The Kubernetes deployment takes care of restarting the services (the change flow is sketched after this slide)
      [Diagram: the Searcho service running in the Dev namespace]
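      A hedged sketch of the change flow just described; the branch, file, namespace, and deployment names are assumptions, not taken from the talk.

          # Edit the committed JVM configuration and let the pipeline roll it out
          git checkout -b tune-gc
          $EDITOR jvm.options                 # e.g. switch -XX:+UseG1GC to -XX:+UseParallelGC
          git commit -am "Searcho: try Parallel GC"
          git push origin tune-gc             # the GitLab pipeline deploys the change
          kubectl -n dev rollout status deployment/searcho   # Kubernetes restarts the pods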
  14. The Goal
      Maximize Searcho transactions per second (TPS), complying with the following constraints over a 5-minute evaluation window (a scoring sketch follows this slide):
      • stability: TPS std. dev. within the window must be < 3
      • response time: avg. response time within the window must be < 4 sec
      • errors:
        • HTTP error rate within the window must be < 1%
        • log errors within the window must be < 10
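      These constraints translate naturally into a scoring function over one evaluation window. Below is a minimal Java sketch of that idea (the speaker's actual tooling is not shown in the slides): a window that violates any constraint is discarded; otherwise its mean TPS is the score to maximize.

          import java.util.List;

          public class WindowScore {
              // KPIs collected over one 5-minute evaluation window.
              record Window(List<Double> tpsSamples, double avgResponseTimeSec,
                            double httpErrorRate, long logErrors) {}

              static boolean satisfiesConstraints(Window w) {
                  double mean = w.tpsSamples().stream()
                          .mapToDouble(Double::doubleValue).average().orElse(0);
                  double var = w.tpsSamples().stream()
                          .mapToDouble(t -> (t - mean) * (t - mean)).average().orElse(0);
                  return Math.sqrt(var) < 3.0            // stability: TPS std. dev. < 3
                          && w.avgResponseTimeSec() < 4.0 // avg response time < 4 sec
                          && w.httpErrorRate() < 0.01     // HTTP error rate < 1%
                          && w.logErrors() < 10;          // fewer than 10 log errors
              }

              // Objective: among valid windows, maximize mean throughput.
              static double score(Window w) {
                  if (!satisfiesConstraints(w)) return Double.NEGATIVE_INFINITY;
                  return w.tpsSamples().stream()
                          .mapToDouble(Double::doubleValue).average().orElse(0);
              }
          }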
  15. Optimization scope
      Which is the best garbage collection (GC) algorithm for this workload?
      • Baseline: reference value of Searcho with a manually tuned JVM configuration
      • G1: study on 12 JVM parameters using the G1 GC algorithm
      • Parallel: optimized 11 JVM parameters using the Parallel GC algorithm
      [Chart: peak TPS by configuration - Baseline 11.03, Parallel 13.6 (+23%), G1 11.69 (+5%)]
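      The slides do not list the exact 12 G1 and 11 Parallel parameters, so the following is only a representative sketch of the two flag families such a study explores. The flag names are real HotSpot options; the values are deliberately left as placeholders.

          # G1 study: the collector plus G1-specific knobs
          -XX:+UseG1GC
          -XX:MaxGCPauseMillis=<target>
          -XX:InitiatingHeapOccupancyPercent=<pct>
          -XX:G1HeapRegionSize=<size>

          # Parallel study: the throughput collector plus its knobs
          -XX:+UseParallelGC
          -XX:ParallelGCThreads=<n>
          -XX:NewRatio=<ratio>
          -XX:MaxTenuringThreshold=<age>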
  16. Peak throughput
      • Best window that satisfies all the constraints in the baseline experiment: 11.03 TPS
      • Best window that satisfies all the constraints in the best experiment: 12.61 TPS
  17. Lower response time
      Baseline: 7.5+ sec @ 63 VU; Best: 5.5 sec @ 63 VU
  18. Lower GC time
      Baseline: 200 ms @ 63 VU; Best: 40 ms @ 63 VU
  19. Lower throttling
      Baseline: 1.7+ sec @ 63 VU; Best: 0.5 sec @ 63 VU
  20. Higher memory efficiency
      • Overall heap utilization has been reduced
      • Compilation parameters optimized to reduce CPU consumption
      • Optimized heap inner space
      • Startup memory allocation to avoid noisy neighbors
      • Increased GC parallelism to reduce pauses
      (representative flags for these levers are sketched after this slide)

      Configuration      Peak TPS   Heap Size (MB)   TPS/Heap
      Searcho default    11.03      6656             0.0017
      High TPS           13.61      5714             0.0023 (+35%)
      Efficiency         12.31      4152             0.0030 (+76%)
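      As a sketch only, here is the kind of HotSpot flag behind each lever named above; the flag names are real, the values are assumptions, and the winning configuration itself is not listed on the slide.

          -XX:ReservedCodeCacheSize=128m  # compilation: bound the JIT code cache
          -XX:CICompilerCount=2           # compilation: fewer JIT threads, less CPU
          -XX:NewRatio=2                  # heap inner space: young/old generation split
          -XX:SurvivorRatio=8             # heap inner space: eden/survivor split
          -Xms4g -Xmx4g                   # startup allocation: fixed heap, no resizing
          -XX:+AlwaysPreTouch             # startup allocation: commit pages up front
          -XX:ParallelGCThreads=8         # GC parallelism: shorter pauses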
  21. A stability study
      Can we serve traffic without degrading quality when the workload increases?
      ... complying with the following constraints:
      • response time: avg. response time within the window must be < 7.5 sec
      • throughput: TPS within the window must be > 10
      Experiments (a Tomcat configuration sketch follows this slide):
      • Baseline: reference value of Searcho with a tuned configuration
      • Tomcat: optimized 3 Tomcat parameters related to threading and queuing
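      The slide does not name the three Tomcat parameters; a plausible sketch is the HTTP connector's threading and queuing attributes in server.xml. The attribute names are real Connector attributes; the values here are assumptions.

          <!-- server.xml sketch: threading and queuing on the HTTP connector -->
          <Connector port="8080" protocol="HTTP/1.1"
                     maxThreads="200"
                     minSpareThreads="25"
                     acceptCount="100" />
          <!-- maxThreads: request-processing threads; minSpareThreads: threads
               kept ready for bursts; acceptCount: connection queue length before
               new connections are refused -->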
  22. Keep serving traffic
      Baseline: 100% errors from 90 VU; Best: 13.61 TPS @ 100 VU
      Same behavior up to the breaking point
  23. Reduce response time degradation
      Baseline: 11+ sec @ 90 VU; Best: 7.5 sec @ 100 VU
      Same behavior up to the breaking point
  24. Lower GC time
      Baseline: 150+ ms @ 90 VU; Best: 20 ms @ 100 VU
      Same behavior up to the breaking point
  25. Lower throttling
      Baseline: 3 ms @ 90 VU; Best: 0.2 ms @ 100 VU
      Same behavior up to the breaking point
  26. Takeaways
      • Automation in CI is great, but it is just a starting point
      • Continuous Performance Testing lets us avoid regressions and sleep well
      • Manual optimization is hard and tedious, and it ages as fast as the software evolves
      • Continuous Performance Optimization lets us find performance improvements driven by application changes
