How Fast Is My App?
Performance Testing 101
Gene Gotimer
Principal DevOps Engineer at Praeses, LLC
@OtherDevOpsGene
June 27, 2024 #KCDC2024
What is performance testing?
“I Love Lucy | Lucy And Ethel At The Chocolate Factory (S2, E1)”
Paramount Plus, © 1952
What are we measuring?
• Time: How long does it take?
• Resources: How big a system do we need?
• Capacity: How much can we handle at once?
Common performance problems
• Long load times
• Poor response time
• Poor scalability
• Bottlenecks
  • CPU
  • memory
  • network
  • storage
Types of performance testing
• Load testing
• Stress testing
• Endurance testing
• Spike testing
• Scalability testing
Load testing
• Can we handle the realistic, expected load?
  • with some margin, possibly
• Could be very quick as a smoke test
• Or longer as a non-functional quality gate
[Figure: load over time, with the expected load marked]
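A minimal sketch of the idea behind a load test, assuming the system under test can be wrapped in a plain Python callable. The names `timed_call`, `run_load`, and `sample_operation` are hypothetical and are not taken from the talk or its demo app; a real test would more likely use a tool such as Gatling or JMeter.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(operation):
    """Run one operation and return (elapsed_seconds, succeeded)."""
    start = time.perf_counter()
    try:
        operation()
        succeeded = True
    except Exception:
        succeeded = False
    return time.perf_counter() - start, succeeded


def run_load(operation, users, requests_per_user):
    """Simulate `users` concurrent users, each issuing requests back to back."""
    def one_user(_):
        return [timed_call(operation) for _ in range(requests_per_user)]

    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(one_user, range(users)))
    # Flatten the per-user lists into one list of samples.
    return [sample for samples in per_user for sample in samples]


def sample_operation():
    # Hypothetical stand-in for a real request to the app under test.
    time.sleep(0.001)


if __name__ == "__main__":
    samples = run_load(sample_operation, users=5, requests_per_user=4)
    print(len(samples), "requests completed")
```

The expected load here is just the `users` and `requests_per_user` arguments; adding "some margin" means running the same scenario with somewhat larger values.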
Stress testing
• How much traffic can we handle until we break?
  • aka breakpoint testing
[Figure: load over time, with the expected load marked]
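One way to picture breakpoint testing is as a stepped ramp: raise the load until the error rate crosses a limit, then report the last healthy step and the first broken one. The sketch below assumes a 5% error-rate limit and uses a made-up `simulated_error_rate` function in place of a real system; both are illustrative assumptions, not part of the talk.

```python
def find_breakpoint(measure_error_rate, start, step, max_load, max_error_rate=0.05):
    """Step the load upward; return (last healthy load, first load that broke).

    The second value is None if nothing broke before max_load was reached.
    """
    last_good = None
    load = start
    while load <= max_load:
        if measure_error_rate(load) > max_error_rate:
            return last_good, load
        last_good = load
        load += step
    return last_good, None


def simulated_error_rate(load, capacity=450):
    """Hypothetical system: no errors up to capacity, then the excess fails."""
    if load <= capacity:
        return 0.0
    return (load - capacity) / load


if __name__ == "__main__":
    print(find_breakpoint(simulated_error_rate, start=100, step=100, max_load=1000))
```

A smaller `step` narrows the gap between the two reported loads at the cost of a longer test.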
Endurance testing
• How long can we handle typical traffic and usage?
  • aka soak testing
• This should be a long test: overnight or over a weekend
[Figure: load over time, with the expected load marked]
Spike testing
• Can we support a sudden peak load?
  • sudden transient traffic
  • not seasonal, predictable growth
[Figure: load over time, with the expected load marked]
Scalability testing
• Can we provide acceptable service as the traffic, number of users, or amount of data grows?
  • time can be built in to allow additional resources to start
  • especially for horizontal scaling
• Could be combined with stress testing
[Figure: load over time, with the expected load marked]
What metrics should we collect?
• Response time
• Requests or transactions per second
• Error rate
• System behavior
  • CPU, memory, network, storage
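As an illustration of the first three metrics, the sketch below summarizes a list of request samples into response-time percentiles, throughput, and error rate. The sample format `(elapsed_seconds, succeeded)` and the function names are assumptions made for this example; the percentile uses the simple nearest-rank method, which is only one of several common definitions.

```python
import math


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted list."""
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize(samples, wall_clock_seconds):
    """Summarize samples given as (elapsed_seconds, succeeded) tuples."""
    times = sorted(elapsed for elapsed, _ in samples)
    errors = sum(1 for _, succeeded in samples if not succeeded)
    return {
        "count": len(samples),
        "median_s": percentile(times, 50),
        "p95_s": percentile(times, 95),
        "max_s": times[-1],
        "throughput_rps": len(samples) / wall_clock_seconds,
        "error_rate": errors / len(samples),
    }


if __name__ == "__main__":
    demo = [(0.12, True), (0.10, True), (0.45, False), (0.11, True)]
    print(summarize(demo, wall_clock_seconds=2.0))
```

System-level metrics (CPU, memory, network, storage) usually come from monitoring on the host rather than from the test client, so they are left out of this sketch.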
How do we design perf tests?
• Identify the environment
  • isolated?
  • production-sized?
• Pick the scenarios
  • representative actions?
  • full-throttle load or more realistic?
• Select the metrics
• Decide on the goal
  • what is passing?
  • measure the metrics or pass a threshold?
  • or just an experiment to understand the behavior?
Collecting data, not finding flaws
• Goal should be learning the app’s behavior under different scenarios
• So that you can plan how to deal with it if and when it happens
• Not focused on trying to flag shortcomings
• Not all performance “failures” will be real problems
• We won’t be able to do much about some limits
• Some will result in business discussions
Taylor Swift and Ticketmaster
“When the sale went online on November 15, 2022, the website crashed in an hour, with users logged out or in a frozen queue; however, 2.4 million tickets were sold, breaking the record for the highest single-day ticket sales ever by an artist.”
“Taylor Swift–Ticketmaster controversy”, https://en.wikipedia.org/wiki/Taylor_Swift%E2%80%93Ticketmaster_controversy
Demo app
https://github.com/OtherDevOpsGene/sorting
Gatling demo
JMeter demo
Unit test demo
Balance
• Early rapid feedback
• No late surprises
Testing pyramid
From the top of the pyramid (late and slower) to the bottom (early and fast):
• Full transaction metrics via the UI, endurance, and scalability
• User performance testing, stress, and spike testing
• Load testing via the UI
• Load testing via the API
• Baseline and trends of a subset of functional tests
• Unit test performance
Not just web performance
• We were adding a lot of audit logging to the database
• Used pgBadger to analyze PostgreSQL query logs
• 4 queries represented 85% of the traffic
• Wrote 4 JDBC Sampler tests in JMeter to match those queries
• Looping in roughly the same proportion
• Created baseline, then watched trends
• Database changes were tested for performance impacts
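The "looping in roughly the same proportion" step can be sketched as a weighted schedule of queries. The query names and weights below are hypothetical placeholders; the talk does not list the four queries or their individual shares, and in the original setup the looping was done inside JMeter rather than in Python.

```python
import random
from collections import Counter

# Hypothetical query names and relative weights, for illustration only.
QUERY_MIX = {
    "query_a": 50,
    "query_b": 25,
    "query_c": 15,
    "query_d": 10,
}


def build_schedule(mix, total, seed=0):
    """Return `total` query names drawn in proportion to their weights."""
    rng = random.Random(seed)  # seeded so repeated runs use the same schedule
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=total)


if __name__ == "__main__":
    schedule = build_schedule(QUERY_MIX, total=1000)
    print(Counter(schedule).most_common())
```

Keeping the schedule reproducible makes it easier to compare a run against the baseline, since any change in the numbers then comes from the database rather than from a different mix.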
Actionable tests
• Make the early tests actionable.
• Metrics aren’t actionable unless you have a cutoff.
• Fail builds if the performance slows.
• Gathering performance metrics is important for planning.
• Testing is a means to an end, not a goal.
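A cutoff can be as simple as comparing the current numbers to a stored baseline and returning a failing exit code when something slows down too much. The sketch below assumes a 10% tolerance and made-up metric names; both are illustrative choices, not values from the talk.

```python
import sys


def regressions(current, baseline, tolerance=0.10):
    """Return metrics that are more than `tolerance` slower than the baseline."""
    failed = {}
    for name, base in baseline.items():
        now = current[name]
        if now > base * (1 + tolerance):
            failed[name] = (base, now)
    return failed


def gate(current, baseline, tolerance=0.10):
    """Exit code for a build step: 0 passes the build, 1 fails it."""
    failed = regressions(current, baseline, tolerance)
    for name, (base, now) in sorted(failed.items()):
        print(f"{name}: {now:.3f}s vs baseline {base:.3f}s", file=sys.stderr)
    return 1 if failed else 0


if __name__ == "__main__":
    # Hypothetical metric names and timings, in seconds.
    baseline = {"login_p95_s": 0.200, "search_p95_s": 0.500}
    current = {"login_p95_s": 0.210, "search_p95_s": 0.700}
    sys.exit(gate(current, baseline))
```

The tolerance exists because timings vary from run to run; too tight a cutoff produces noisy failures, and too loose a cutoff lets gradual slowdowns through.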
Wrap up
Key takeaways
• Understand what you are trying to test for.
• Performance testing is a field, not a particular kind of test.
• Use a balance of types and trends to add confidence that you won’t find performance issues late in the process.
• Fail builds for performance issues.
• Testing is a means to an end, not a goal.
Questions?
Gene Gotimer
Principal DevOps Engineer at Praeses, LLC
@OtherDevOpsGene
