
Performance in Geode: How Fast Is It, How Is It Measured, and How Can It Be Improved?

SpringOne Platform 2019
Session Title: Performance in Geode: How Fast Is It, How Is It Measured, and How Can It Be Improved?
Speakers: Helena Bales, Software Engineer, Pivotal
YouTube: https://youtu.be/awQ4byzC2LM

Published in: Software

  1. Performance In Geode: How Fast Is It, How Is It Measured, and How Can It Be Improved? Helena Bales, Senior Software Engineer at Pivotal
  2. What Is The Performance Of Geode?
  3. Performance of Geode 1.9.0: 203,855; 244,463; 181,655; 207,697 [benchmark labels not captured in the transcript]
  4. What do those numbers mean?
        ● 200,000 operations per second means nothing to a person.
            ○ Is that good?
            ○ Is the performance consistent and accurate?
            ○ Has it improved or regressed since the last version?
            ○ Can it be better?
  5. What do those numbers mean?
        ● 200,000 operations per second means nothing to a person.
            ○ Is that good? Pretty good, yes.
            ○ Is the performance consistent and accurate? Not yet.
            ○ Has it improved since the last version? Yes, slightly.
            ○ Can it be better? YES.
  6. What do those numbers mean?
        ● 200,000 operations per second means nothing to a person.
            ○ Is that good? Pretty good, yes.
            ○ Is the performance consistent and accurate? Not yet.
            ○ Has it improved since the last version? Yes, slightly.
            ○ Can it be better? YES.
        How do you know???
  7. How Is Performance Measured?
  8. Creating the Geode Benchmark – Features
        ● On demand
        ● Against any revision of Geode
        ● On AWS cluster deployment of Geode
        ● On any dev machine in the office
        ● From Concourse CI pipeline
        ● With a profiler attached
        ● Compare two runs of benchmarks for performance changes
  9. Creating the Geode Benchmark – Goals
        ● Run by anyone interested in Geode
        ● Have others create benchmarks
        ● Visualize benchmark results over time
        ● Increase benchmark coverage of Geode
  10. Tests Currently in the Benchmarks
        ○ ReplicatedGetBenchmark
        ○ ReplicatedGetLongBenchmark
        ○ ReplicatedPutBenchmark
        ○ ReplicatedPutLongBenchmark
        ○ ReplicatedPutAllBenchmark
        ○ ReplicatedPutAllLongBenchmark
        ○ ReplicatedFunctionExecutionBenchmark
        ○ ReplicatedFunctionExecutionWithArgumentsBenchmark
        ○ ReplicatedFunctionExecutionWithFiltersBenchmark
        ○ PartitionedGetBenchmark
        ○ PartitionedGetLongBenchmark
        ○ PartitionedPutBenchmark
        ○ PartitionedPutLongBenchmark
        ○ PartitionedPutAllBenchmark
        ○ PartitionedPutAllLongBenchmark
        ○ PartitionedIndexedQueryBenchmark
        ○ PartitionedFunctionExecutionWithArgumentsBenchmark
        ○ PartitionedFunctionExecutionWithFiltersBenchmark
  11. Other Tested Configurations
        ● With SSL
        ● With JDKs: 8, 11, 12, 13
        ● With Security Manager
        ● With Garbage Collectors:
            ○ CMS
            ○ G1
            ○ Z
            ○ Shenandoah
        ● Adjustable max heap size
  12. How Can Performance Be Improved?
  13. Finding Performance Bottlenecks
        ● Monitor locks
        ● Thread park/unpark (reentrant locks)
        ● Allocations/GC
        ● Overuse of synchronization
        ● Getting a system property in a hot path
        ● Lazy initialization of objects in a hot path
        ● Synchronization on a container (e.g., a hash map)
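
One item from slide 13, getting a system property in a hot path, can be illustrated with a minimal Java sketch. The class and the property name `sketch.batchSize` are hypothetical and are not taken from Geode. The per-call variant performs a property lookup and a string parse on every invocation (on JDK 8, `Properties` is backed by a synchronized `Hashtable`, so the lookup also takes a monitor), while the cached variant does that work once at class initialization.

```java
final class HotPathPropertySketch {
    // Hypothetical property name used only for illustration; not a Geode setting.
    static final String PROP = "sketch.batchSize";

    // Slow variant: looks up and parses the property on every call.
    static int batchSizePerCall() {
        return Integer.parseInt(System.getProperty(PROP, "100"));
    }

    // Faster variant: read and parse once, when the class is initialized.
    private static final int BATCH_SIZE = Integer.getInteger(PROP, 100);

    static int batchSizeCached() {
        return BATCH_SIZE;
    }

    public static void main(String[] args) {
        // Both variants agree as long as the property is not changed at runtime.
        System.out.println(batchSizePerCall() == batchSizeCached());
    }
}
```

The trade-off is that the cached value no longer reflects changes made to the property after class initialization, which is usually acceptable for startup-time configuration.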
  14. Case Study – The Connection Pool
        ● Why were we even looking for anything?
            ○ Couldn’t saturate network, CPU, or memory, no matter the available resources
            ○ Profiler gave us no suspect hot spots
        ● How did we find the issue?
            ○ Found the secret profiler option to measure zero-time reentrant locks
            ○ Thread.park() became a hot spot, with reentrant lock and connection pool as callers
            ○ The connection pool was holding a reentrant lock in a hot path while using a deque.
  15. Case Study – Finding the Problem
  16. Case Study – Finding the Problem
  17. Case Study – Finding the Problem
  18. Case Study – Finding the Problem
  19. [slide has no transcribed text]
  20. Case Study – Solving the Problem: no lock!
  21. Case Study – Solving the Problem: lock free structure
  22. Case Study – Solving the Problem: no locks!
  23. Case Study – Solving the Problem
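
The transcript does not capture the code shown on the case-study slides, so the following is only a sketch of the general idea: a pool that guards a deque with a `ReentrantLock` in its hot path, next to a variant built on a lock-free deque. The class names, the `borrow`/`giveBack` methods, and the choice of `ConcurrentLinkedDeque` are assumptions made for illustration; they are not Geode's actual connection pool.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.ConcurrentLinkedDeque;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

final class PoolSketch {
    /** Before: every borrow and return serializes on a single lock. */
    static final class LockedPool<T> {
        private final ReentrantLock lock = new ReentrantLock();
        private final Deque<T> idle = new ArrayDeque<>();
        private final Supplier<T> factory;

        LockedPool(Supplier<T> factory) { this.factory = factory; }

        T borrow() {
            lock.lock(); // contended threads end up parked here
            try {
                T item = idle.pollFirst();
                return item != null ? item : factory.get();
            } finally {
                lock.unlock();
            }
        }

        void giveBack(T item) {
            lock.lock();
            try {
                idle.addFirst(item);
            } finally {
                lock.unlock();
            }
        }
    }

    /** After: a non-blocking deque, so the hot path takes no lock. */
    static final class LockFreePool<T> {
        private final ConcurrentLinkedDeque<T> idle = new ConcurrentLinkedDeque<>();
        private final Supplier<T> factory;

        LockFreePool(Supplier<T> factory) { this.factory = factory; }

        T borrow() {
            T item = idle.pollFirst(); // returns null when the deque is empty
            return item != null ? item : factory.get();
        }

        void giveBack(T item) {
            idle.addFirst(item);
        }
    }

    public static void main(String[] args) {
        LockFreePool<Object> pool = new LockFreePool<>(Object::new);
        Object first = pool.borrow();
        pool.giveBack(first);
        // The returned item is handed out again instead of creating a new one.
        System.out.println("reused: " + (first == pool.borrow()));
    }
}
```

A real connection pool would also need size limits, idle timeouts, and validation of returned connections, all of which this sketch omits.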
  24. Case Study – Profiling
  25. Case Study – Testing
        ● Unit testing
        ● Integration Testing
        ● Distributed Testing
        ● Concurrency Testing
        ● Performance Testing
  26. Case Study – Performance Testing: 197,686 before; 659,980 after
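
To put the before and after figures from slide 26 in relative terms, a small Java sketch can compute the ratio and the percentage change. It assumes both figures are throughput values in the same unit (the earlier slides speak of operations per second); the class and method names are hypothetical.

```java
import java.util.Locale;

final class ThroughputComparisonSketch {
    // Ratio of the new throughput to the old one.
    static double speedup(double before, double after) {
        return after / before;
    }

    // Relative change in percent; positive means an improvement in throughput.
    static double percentChange(double before, double after) {
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        double before = 197_686; // slide 26, "before"
        double after = 659_980;  // slide 26, "after"
        System.out.printf(Locale.ROOT, "speedup: %.2fx, change: %+.1f%%%n",
                speedup(before, after), percentChange(before, after));
        // prints "speedup: 3.34x, change: +233.9%"
    }
}
```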
  27. Case Study – Performance Testing
  28. Other Bottlenecks – Over Eager Allocations: 2 potentially unused objects per call – new HashSet() => 1 HashSet & 1 HashMap
  29. Other Bottlenecks – Over Eager Allocations (fixed)
        ● Do not allocate eagerly
        ● Allocate near first use
        ● Allocate after early returns that don’t use the allocated object
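
The advice on slides 28 and 29 can be sketched in Java as follows. The methods are hypothetical examples, not Geode code. In the eager variant, a `HashSet` (which is backed by a `HashMap`) is created before the early return, so two objects are allocated and discarded on every call that takes that path; the lazy variant allocates only after the early return.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class LazyAllocationSketch {
    // Eager: allocates a HashSet and its backing HashMap even when they go unused.
    static Set<String> distinctEager(List<String> names) {
        Set<String> result = new HashSet<>();
        if (names == null || names.isEmpty()) {
            return Collections.emptySet(); // result was allocated for nothing
        }
        result.addAll(names);
        return result;
    }

    // Lazy: the early return comes first, allocation happens near first use.
    static Set<String> distinctLazy(List<String> names) {
        if (names == null || names.isEmpty()) {
            return Collections.emptySet(); // shared immutable instance, no allocation
        }
        return new HashSet<>(names);
    }

    public static void main(String[] args) {
        System.out.println(distinctLazy(java.util.Arrays.asList("a", "a", "b")).size());
    }
}
```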
  30. Other Bottlenecks – Know Your Structures: methods are called for every operation, resulting in 1 add and 1 remove per op
  31. Other Bottlenecks – Know Your Structures (fixed): methods are still called for every operation but no longer allocate or cause GC
  32. How much has performance improved?
  33. Comparing Performance of 1.9.0 & 1.10.0 [benchmark labels not captured in the transcript]
        ● 1.9.0: 203,855; 244,463; 181,655; 207,697
        ● 1.10.0: 692,725; 736,022; 357,507; 372,430
  34. Comparing Performance of 1.9.0 & 1.10.0 [benchmark labels not captured in the transcript; version labels as extracted]
        ● 1.9.0: 1,764,765; 1,980,391; 1,471,434; 1,731,946
        ● 1.10.0: 518,534; 488,051; 1,005,730; 965,404
  35. Why Upgrade to Geode 1.10.0?
  36. Comparing Performance of 1.9.0 & 1.10.0: PartitionedGetBenchmark (v. 1.10.0 vs. v. 1.9.0)
  37. Relevant Links
        ● Geode repo: https://github.com/apache/geode
        ● Benchmark repo: https://github.com/apache/geode-benchmarks
        ● JIRA query for Performance Issues: https://issues.apache.org/jira/browse/GEODE-7134?jql=project%20%3D%20GEODE%20AND%20labels%20%3D%20performance
  38. Thank You
