An Automated Approach for Recommending When to Stop Performance Tests
Hammam AlGhamdi, Weiyi Shang, Mark D. Syer, Ahmed E. Hassan
An Automated Approach for Recommending When to Stop Performance Tests

Abstract: Performance issues are often the cause of failures in today's large-scale software systems. These issues make performance testing essential during software maintenance. However, performance testing faces many challenges. One challenge is determining how long a performance test must run. Although performance tests often run for hours or days to uncover performance issues (e.g., memory leaks), much of the data that is generated during a performance test is repetitive. Performance analysts can stop their performance tests (to reduce the time to market and the costs of performance testing) if they know that continuing the test will not provide any new information about the system's performance. To assist performance analysts in deciding when to stop a performance test, we propose an automated approach that measures how much of the data generated during a performance test is repetitive. Our approach then recommends stopping the test when the data becomes highly repetitive and the repetitiveness has stabilized (i.e., little new information about the system's performance is generated).

  1. An Automated Approach for Recommending When to Stop Performance Tests. Hammam AlGhamdi, Weiyi Shang, Mark D. Syer, Ahmed E. Hassan
  2. Failures in ultra-large-scale systems are often due to performance issues rather than functional issues
  3. A 25-minute service outage in 2013 cost Amazon approximately $1.7M
  4. Performance testing is essential to prevent these failures: a pre-defined workload sends requests to the system under test in a performance testing environment, while performance counters (e.g., CPU, memory, I/O and response time) are recorded
  5. Determining the length of a performance test is challenging: repetitive data is generated from the test after the optimal stopping time
  6. Determining the length of a performance test is challenging: stopping too early misses performance issues; stopping too late delays the release and wastes testing resources
  7. Our approach for recommending when to stop a performance test: 1) collect the already-generated data; 2) measure the likelihood of repetitiveness; 3) extrapolate the likelihood of repetitiveness; 4) determine whether to stop the test (stop if yes, otherwise keep collecting)
  11. Step 1: Collect the data that the test generates, i.e., performance counters such as CPU, memory, I/O and response time
  12. Step 2: Measure the likelihood of repetitiveness. Select a random time period A (e.g., 30 min) from the collected data
  13. Search for another non-overlapping time period B that is NOT statistically significantly different from A
  14. Run a Wilcoxon test between the distributions of every performance counter across both periods
  15. Example: p-values for response time, CPU, memory and I/O of 0.0258, 0.313, 0.687 and 0.645 mean the two periods are statistically significantly different in response time
  16. Example: p-values of 0.67, 0.313, 0.687 and 0.645 mean period B is NOT statistically significantly different from A in any performance metric
  17. Did we find a period that is NOT statistically significantly different? Yes: repetitive! No: not repetitive!
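The period-comparison step can be sketched in pure Python. This is a minimal illustration, not the authors' implementation: it uses the normal approximation of the Wilcoxon rank-sum test (without a tie correction), and the 0.05 significance level is an assumed default.

```python
import math

def _ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        # extend j over the run of tied values
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_p(x, y):
    """Two-sided Wilcoxon rank-sum p-value via the normal approximation
    (no tie correction; adequate as a sketch for large samples)."""
    n1, n2 = len(x), len(y)
    ranks = _ranks(list(x) + list(y))
    w = sum(ranks[:n1])                       # rank sum of sample x
    mu = n1 * (n1 + n2 + 1) / 2               # mean of w under H0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))   # two-sided p-value

ALPHA = 0.05  # assumed significance level

def periods_repetitive(period_a, period_b):
    """period_a/period_b map counter name -> samples; the periods are
    repetitive iff NO counter is statistically significantly different."""
    return all(rank_sum_p(period_a[c], period_b[c]) > ALPHA for c in period_a)
```

For instance, two identical counter distributions yield a p-value of 1.0 (repetitive), while two clearly shifted distributions fall below 0.05 (not repetitive).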
  18. Repeat this process a large number of times (e.g., 1,000) to calculate the likelihood of repetitiveness
  19. A new likelihood of repetitiveness is measured periodically (e.g., every 10 min) in order to get more frequent feedback on the repetitiveness
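The sampling loop (pick a random period, search for a non-overlapping match, repeat many times) can be sketched as a small Monte Carlo routine. This is an illustrative sketch, not the paper's code: the statistical test is passed in as a `differs(x, y)` predicate (the paper uses a Wilcoxon test per counter), and the trial count is an assumed default.

```python
import random

def likelihood_of_repetitiveness(series, period_len, differs, trials=200):
    """Estimate the fraction of randomly chosen periods that have a
    non-overlapping, statistically similar counterpart.

    series: dict mapping counter name -> list of samples (one per tick).
    differs(x, y): True if two sample lists are significantly different."""
    n = min(len(v) for v in series.values())
    hits = 0
    for _ in range(trials):
        a = random.randrange(0, n - period_len + 1)   # random period start
        for b in range(0, n - period_len + 1):        # candidate matches
            if abs(b - a) < period_len:               # skip overlapping periods
                continue
            if not any(differs(series[c][a:a + period_len],
                               series[c][b:b + period_len])
                       for c in series):
                hits += 1                             # repetitive match found
                break
    return hits / trials
```

With a crude stand-in test (difference of means), a flat steady-state series yields a likelihood of 1.0, while a strongly trending series yields 0.0.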
  20. The likelihood of repetitiveness eventually starts stabilizing (little new information is generated)
  21. Step 3: Extrapolate the likelihood of repetitiveness. To know when the repetitiveness stabilizes, we calculate the first derivative
  22. Step 4: Determine whether to stop the test. Stop the test if the first derivative is close to 0
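The stopping rule can be sketched with a finite-difference stand-in. Note the simplification: the paper extrapolates the likelihood curve before taking the first derivative, whereas this sketch simply differences the most recent measurements; the window size and the near-zero threshold are assumed values, not from the paper.

```python
def should_stop(likelihoods, window=3, eps=0.01):
    """Recommend stopping once the likelihood-of-repetitiveness curve has
    stabilized, i.e., its first derivative stays close to 0.

    likelihoods: values measured periodically (e.g., every 10 min).
    window: how many successive derivatives must be near zero (assumption).
    eps: threshold for "close to 0" (assumption)."""
    if len(likelihoods) < window + 1:
        return False                      # not enough measurements yet
    recent = likelihoods[-(window + 1):]
    derivs = [b - a for a, b in zip(recent, recent[1:])]  # finite differences
    return all(abs(d) <= eps for d in derivs)
```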
  23. Recap of our approach: 1) collect the already-generated data; 2) measure the likelihood of repetitiveness; 3) extrapolate the likelihood of repetitiveness; 4) determine whether to stop the test
  24. We conduct 24-hour performance tests on three systems: PetClinic, Dell DVD Store and CloudStore
  25. We evaluate whether our approach 1) stops the test too early or 2) stops the test too late
  26. Does our approach stop the test too early? 1) Select a random time period from the post-stopping data; 2) check whether the random time period has a repetitive one in the pre-stopping data; repeat 1,000 times. The test is likely to generate little new data after the stopping times (preserving more than 91.9% of the information)
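The RQ1 check can be sketched the same way: sample periods from the post-stopping data and count how often the pre-stopping data already contains a similar period. Again a sketch under assumptions; `matches(x, y)` stands in for the statistical similarity test and the trial count is illustrative.

```python
import random

def information_preserved(pre, post, period_len, matches, trials=200):
    """Fraction of randomly sampled post-stopping periods that already have
    a similar period in the pre-stopping data (higher = less lost by stopping).

    pre/post: dict mapping counter name -> samples.
    matches(x, y): True if the two sample lists look statistically alike."""
    n_post = min(len(v) for v in post.values())
    n_pre = min(len(v) for v in pre.values())
    hits = 0
    for _ in range(trials):
        a = random.randrange(0, n_post - period_len + 1)  # random post period
        for b in range(0, n_pre - period_len + 1):        # scan pre periods
            if all(matches(post[c][a:a + period_len],
                           pre[c][b:b + period_len]) for c in post):
                hits += 1                                 # already represented
                break
    return hits / trials
```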
  27. Does our approach stop the test too late? We apply our evaluation approach from RQ1 at the end of every hour during the test to find the most cost-effective stopping time
  28. The most cost-effective stopping time has 1) a big difference to the previous hour and 2) a small difference to the next hour
  29. There is a short delay between the recommended stopping times and the most cost-effective stopping times (the majority are under a 4-hour delay)
  30. Takeaway: determining the length of a performance test is challenging; stopping too early misses performance issues, while stopping too late delays the release and wastes testing resources
