E2E Performance Testing, Profiling, and Analysis at Redis
Open Source Experience Conference, Nov 2022
Filipe Oliveira
Senior Performance Engineer
@Redis
IF YOU DON’T MEASURE IT
YOU CAN’T IMPROVE IT
THE SOONER
THE BETTER, THE CHEAPER, THE EASIER
PERFORMANCE @REDIS
OSS REDIS + Redis Ltd Projects
1. foster benchmark and observability standards
2. support contributions to the OSS projects
3. optimize an industry-leading solution
Ordinarily, on our Company's Core Products
5
We have...
● extensive automated tests to catch functional failures
...but when
● we accidentally commit a performance regression, nothing intercepts it*!
A Real Case From 2019
Simple request
1. RediSearch minor version bump
2. Required multiple patches
a. Feedback cycle took us at least 1 day
b. Prioritized over other projects
c. Siloed
d. Jul. 30, Nov. 27, 2019
You can relate to...
● your team runs performance tests before releasing
Ordinarily, on our Company's Core Products
You can state...
● your team runs performance tests before releasing
...but solving slowdowns just before releasing is...
● dangerous
● time-consuming
● one of the most difficult tasks to estimate
...checking performance only at release time just buffers up potential issues!
Goal: Reduce Feedback Cycle. Avoid Silos
Requirements for valid tests
- Stable testing environment
- Deterministic testing tools
- Deterministic outcomes
- Reduced testing/probing overhead
- Reduce tested changes to the minimum
Requirements for acceptance in products
- Acceptable duration
- No manual work
- Actionable items
- Well defined key performance indicators
[Pipeline diagrams. Before: CODE REVIEW → PREVIEW/UNSTABLE → RELEASE, with a single MANUAL PERF CHECK only at release. After: a ZERO TOUCH PERF CHECK at every stage.]
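In practice, a zero-touch perf check is just a CI step that runs a fixed workload and compares well-defined KPIs against a stored baseline, failing the build on a regression. Below is a minimal sketch in Python; redis-benchmark and its --csv flag are real, but the baseline file name, the covered commands, and the 5% tolerance are illustrative assumptions, not the exact Redis pipeline.

```python
import json
import subprocess
import sys

# Illustrative KPI tolerance: fail the build on a >5% throughput drop.
REGRESSION_TOLERANCE = 0.05

def run_benchmark() -> dict:
    """Run a fixed, deterministic workload and return throughput KPIs."""
    out = subprocess.run(
        ["redis-benchmark", "-t", "set,get", "-n", "100000", "--csv"],
        capture_output=True, text=True, check=True,
    ).stdout
    kpis = {}
    for line in out.strip().splitlines():
        cols = line.replace('"', "").split(",")
        try:
            kpis[cols[0]] = float(cols[1])  # requests per second
        except ValueError:
            continue  # skip the CSV header row, if present
    return kpis

def main() -> None:
    current = run_benchmark()
    # Baseline produced by an earlier run on the stable branch (assumed file name).
    with open("baseline_kpis.json") as f:
        baseline = json.load(f)
    failed = False
    for kpi, base in baseline.items():
        cur = current.get(kpi, 0.0)
        change = (cur - base) / base
        print(f"{kpi}: baseline={base:.0f} rps, current={cur:.0f} rps ({change:+.1%})")
        if change < -REGRESSION_TOLERANCE:
            failed = True
    sys.exit(1 if failed else 0)  # a non-zero exit blocks the merge

if __name__ == "__main__":
    main()
```

Run on every commit to unstable, a check like this replaces the one-off manual inspection at release time.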
This is Not New/Disruptive
Elastic
https://elasticsearch-benchmarks.elastic.co/#
Lucene
https://home.apache.org/~mikemccand/lucenebench/
This is Not New/Disruptive
MongoDB
Our Approach
Redis Developers Group + Partners
Our Approach
[Dashboard screenshots: benchmark results by branch, by version, and scalability analysis]
1. Initial focus on OSS deployments
2. Local and remote triggers (see the sketch below)
3. Used for testing and profiling
a. Regression analysis and fix
b. Approval of features
c. Proactive optimization
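To make tests this easy to add, benchmark definitions are best kept declarative. The sketch below is a hypothetical, simplified Python analogue of such a definition plus a trigger that dispatches it locally or to a remote runner; every name in it (BenchmarkSpec, trigger, the KPI thresholds) is an illustrative assumption, though DEBUG POPULATE and memtier_benchmark's --command/--key-maximum flags are real.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified analogue of a declarative benchmark definition.
@dataclass
class BenchmarkSpec:
    name: str
    setup_commands: list[str] = field(default_factory=list)  # dataset preload
    client_tool: str = "memtier_benchmark"                   # load generator
    client_args: str = "--command='GET __key__' --key-maximum=1000000"
    kpi_thresholds: dict[str, str] = field(default_factory=dict)

# A developer adds a test by declaring it, not by scripting a whole run.
spec = BenchmarkSpec(
    name="get-1M-string-keys",
    setup_commands=["DEBUG POPULATE 1000000"],  # real Redis command
    kpi_thresholds={"rps": ">= 180000", "p50_latency_ms": "<= 1.0"},
)

def trigger(spec: BenchmarkSpec, remote: bool = False) -> None:
    """Dispatch a run locally (dev laptop) or to a stable remote runner (CI)."""
    target = "remote bare-metal runner" if remote else "local environment"
    print(f"dispatching '{spec.name}' to {target}")

trigger(spec)               # local, while iterating on a change
trigger(spec, remote=True)  # remote, triggered automatically per commit
```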
Our Approach
Approval of features detail:
Our Approach
1. Full-process Flame Graph + main-thread Flame Graph
2. perf report per dso
3. perf report per dso,sym (with/without call graph)
4. perf report per dso,sym,srcline (with/without call graph)
5. Identical stacks collapsed
6. Hot-path call graph
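All six artifact types can be produced with stock Linux perf plus Brendan Gregg's FlameGraph scripts. A rough sketch of the collection step; the 30-second window, the output file names, and having stackcollapse-perf.pl / flamegraph.pl on the PATH are assumptions.

```python
import subprocess

def sh(cmd: str) -> None:
    """Run a shell pipeline; argument lists would be stricter in production."""
    subprocess.run(cmd, shell=True, check=True)

def collect_profiling_artifacts(redis_pid: int) -> None:
    # Capture call stacks from the running redis-server for 30 seconds.
    sh(f"perf record -g --pid {redis_pid} -o perf.data -- sleep 30")

    # Text reports at increasing granularity (dso/symbol/srcline are real
    # perf sort keys); '-g none' gives the without-call-graph variants.
    sh("perf report -i perf.data --stdio --sort dso > report_dso.txt")
    sh("perf report -i perf.data --stdio --sort dso,symbol > report_dso_sym.txt")
    sh("perf report -i perf.data --stdio --sort dso,symbol,srcline > report_srcline.txt")

    # Collapse identical stacks, then render the flame graph
    # (stackcollapse-perf.pl and flamegraph.pl are from the FlameGraph repo).
    sh("perf script -i perf.data | stackcollapse-perf.pl > stacks.collapsed")
    sh("flamegraph.pl stacks.collapsed > flamegraph.svg")
```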
What We’ve Gained
● Up to 68% performance boost on the covered commands
● Deeply reduced the feedback cycle (days → 1 hour)
● Devs can easily add tests (243 full suites)
● Scaled the team + more challenging work!
What We’ve Gained
● Finding performance improvements is now everyone's power/responsibility
● A/B test new tech/state-of-the-art HW/SW components
● Continuous up-to-date numbers for use-cases that matter
● Foster openness
What’s Next
● aggregate performance data across a group of benchmarks
● better statistical analysis methods (see the sketch below)
● more visibility across the API
● Increase OSS / Company adoption
○ expose data in the docs
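On "better statistical analysis": the core idea is to flag a change only when it exceeds run-to-run noise. A minimal, stdlib-only Python sketch, assuming several repeated runs per build and comparing means against their combined standard error; a production pipeline would likely use more runs and a proper non-parametric test such as Mann-Whitney U.

```python
import statistics

def is_significant_regression(baseline: list[float], candidate: list[float],
                              z: float = 2.0) -> bool:
    """Flag a regression only when the throughput drop exceeds the noise band.

    Crude two-sample check: the means must differ by more than z combined
    standard errors. Illustrative only; not the method used at Redis.
    """
    mean_b, mean_c = statistics.mean(baseline), statistics.mean(candidate)
    se_b = statistics.stdev(baseline) / len(baseline) ** 0.5
    se_c = statistics.stdev(candidate) / len(candidate) ** 0.5
    noise = z * (se_b ** 2 + se_c ** 2) ** 0.5
    return mean_c < mean_b - noise  # slower than noise alone can explain

# Example: five runs each (ops/sec); this drop exceeds the noise band.
baseline = [201_000.0, 203_500.0, 199_800.0, 202_200.0, 200_900.0]
candidate = [188_000.0, 190_500.0, 187_200.0, 189_900.0, 188_700.0]
print(is_significant_regression(baseline, candidate))  # True
```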
@fcosta_oliveira
Questions?
filipe@redis.com or performance@redis.com
Try Redis Cloud for free!
https://redis.com/try-free/