Continuous delivery while minimizing performance risks

OBJECTIVE

Put working software into production as quickly as possible,
whilst minimising risk of load-related problems:

› Bad response times
› Insufficient capacity
› Low availability
› Excessive system resource use

RISK PREVENTION IS A BIG SUBJECT;
WHAT FOLLOWS IS TAKEN FROM OUR EXPERIENCE

Photo by chillihead: www.flickr.com/photos/chillihead/1778980935

CONTINUOUS DELIVERY LITERATURE PROVIDES METHODS
THAT HELP REDUCE RISK

› Blue-green deployments
› Dark launching
› Feature toggles
› Canary releasing
› Production immune systems

Jez Humble, http://continuousdelivery.com
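
Feature toggles and canary releasing both come down to exposing a change to a small, stable slice of users before everyone gets it. The following is a minimal sketch of a percentage-based toggle. The function names, the feature name `new-checkout` and the 5% figure are illustrative assumptions, not something taken from the talk or from the cited site.

```python
import hashlib


def rollout_bucket(user_id: str, feature: str) -> int:
    """Map a user to a stable bucket in the range 0..99 for a given feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(user_id: str, feature: str, rollout_percent: int) -> bool:
    """Return True when the user's bucket falls inside the rollout percentage."""
    return rollout_bucket(user_id, feature) < rollout_percent


if __name__ == "__main__":
    # Hypothetical user ids; roughly 5% of them should see the new feature.
    users = [f"user-{n}" for n in range(10_000)]
    exposed = sum(is_enabled(u, "new-checkout", 5) for u in users)
    print(f"{exposed} of {len(users)} users see the new checkout")
```

Because the bucket is derived from a hash rather than a random draw, a user stays in the same group between requests, and raising the percentage only ever adds users to the exposed group.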

BUT LEGACY SYSTEMS OFTEN LACK THE REQUIRED
RESILIENCE

WHILE WE WORK ON OUR RESILIENCE, WE USE LOAD TESTS
TO HELP IDENTIFY THE BIGGEST RISKS

PRE-PROD LOAD TESTING IS NOT FREE

› Extra code to maintain

› Test runs usually last several hours

› A production-like environment is expensive

› Realistic testing is hard

› Not all developers like writing (performance) tests

USE IT WISELY, WHERE PRODUCTION TESTING IS STILL
INAPPROPRIATE

› It provides no guarantee

› Use it to find any showstoppers you can

› Essentially, an optional service that teams can use

USE IT AS A PLAYGROUND TO TRY RISKY CHANGES

Photo by vastateparksstaff: www.flickr.com/photos/vastateparksstaff/5330257235

Pipeline diagram (stages and how often each one runs):

› Build, unit test, etc.: very often
› Functional integration tests: less often
› Load tests: about once a day (at night)

A second version of the diagram adds a “Load test script check” stage.

THE AIM IS NOT PERFECTION: GO FOR “AS REALISTIC AS
NEEDED”

SET UP TEST DATA AT THE WEEKEND, TO MINIMISE
DISRUPTION

WHEN IS A PROBLEM REALLY A PROBLEM?

FIND AN OBJECTIVE WAY TO JUDGE YOUR FINDINGS

ESTABLISH REQUIREMENTS TO MAKE CLEAR WHAT IS
ACCEPTABLE

› Seen from the main stakeholders’ perspective
– Response time: users
– System resources: ops
– Capacity: business
› Specific
› Measurable
› Achievable
› Relevant

Concurrent users

Fail: < 100k    Now: 150k    Target: 200k

Stakeholder: Business

Intention: The website should at least be able to manage our typical daily
load, but we would like some margin for growth and marketing campaigns.

Scale: Maximum load in a day, while response times are still according to
spec.

Meter: Session table row count.
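
A requirement with a fail level and a target can be judged mechanically, which makes findings easier to discuss. The sketch below assumes a small `Requirement` class (a hypothetical name, as are the `judge` method and the three verdict strings) and uses the fail and target figures from the slides. It infers the direction from the two levels: when the target is above the fail level, higher is better (capacity); otherwise lower is better (response time).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    name: str
    stakeholder: str
    fail: float    # beyond this level the result is unacceptable
    target: float  # the level we are aiming for

    def judge(self, measured: float) -> str:
        """Classify a measurement as 'fail', 'acceptable' or 'on target'."""
        higher_is_better = self.target > self.fail
        if higher_is_better:
            if measured < self.fail:
                return "fail"
            return "on target" if measured >= self.target else "acceptable"
        if measured > self.fail:
            return "fail"
        return "on target" if measured <= self.target else "acceptable"


if __name__ == "__main__":
    capacity = Requirement("Concurrent users", "Business", fail=100_000, target=200_000)
    homepage = Requirement("Homepage.FV", "Users", fail=6.0, target=2.0)
    print(capacity.judge(150_000))  # acceptable
    print(homepage.judge(3.9))      # acceptable
```

A nightly load test could print one such verdict per requirement, so a regression shows up as a changed label rather than a number someone has to interpret.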

FOR RESPONSE TIMES TOO, FOCUS ON THE
MAIN STAKEHOLDER!

SO USE A REAL BROWSER TO TEST
A REAL USER’S EXPERIENCE

Response time    Fail       Today      Target
Homepage.FV      > 6 sec    3.9 sec    2 sec
Homepage.RV      > 5 sec    2.8 sec    1 sec
Checkout.FV      > 8 sec    6.5 sec    2 sec
Details.FV       > 6 sec    1.9 sec    2 sec
Details.RV       > 5 sec    1.7 sec    1 sec
Search.FV        > 6 sec    4.8 sec    2 sec
Search.RV        > 5 sec    3.7 sec    1 sec
Cart.FV          > 6 sec    4.4 sec    2 sec
Cart.RV          > 5 sec    3.4 sec    1 sec
LoginForm.FV     > 6 sec    3.5 sec    2 sec
LoginForm.RV     > 5 sec    2.5 sec    1 sec

TO MAKE COMPARISONS MEANINGFUL, MAKE YOUR TESTS
DETERMINISTIC

Stub systems that you have no control over
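
One way to stub a system you do not control is a tiny HTTP server that always returns the same answer after the same delay, so differences between test runs come from your own code. This is a minimal sketch using only the Python standard library. The payload, the 50 ms latency and the `/payment` path are hypothetical values chosen for illustration.

```python
import json
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

FIXED_LATENCY_SECONDS = 0.05              # assumed constant delay, keeps runs comparable
CANNED_RESPONSE = {"status": "APPROVED"}  # hypothetical payload


class StubHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        time.sleep(FIXED_LATENCY_SECONDS)
        body = json.dumps(CANNED_RESPONSE).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the load test output quiet


def start_stub(port: int = 0) -> ThreadingHTTPServer:
    """Start the stub on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


if __name__ == "__main__":
    server = start_stub()
    url = f"http://127.0.0.1:{server.server_port}/payment"
    with urllib.request.urlopen(url) as response:
        print(json.loads(response.read()))
    server.shutdown()
    server.server_close()
```

A stub like this hides the real dependency's behaviour under load, so it supports comparing runs against each other rather than predicting production numbers.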

LOAD TESTING SHOULD BE OPTIONAL; THE ONLY THING
THAT COUNTS IS PRODUCTION!

› Your definition of done should reflect that

› The aim is to get early feedback from a safe environment

ANYTHING YOU FIND IS AN OPPORTUNITY TO FIX MORE
THAN ONE PROBLEM

SO WHAT MONITORING IS TYPICALLY NEEDED?

› Be able to pinpoint where latency comes from!
– For every system, all incoming and outgoing calls (count and
time-spent statistics)
› Finite resources (pools, CPU, I/O, etc.)
› Number of active users
› Response size, where possible
› Add whatever you need

It should be identical in all environments!
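
Count and time-spent statistics per call can be collected with very little code. The sketch below wraps each incoming or outgoing call in a context manager. The names `CallStats`, `STATS` and `timed_call`, and the `out:inventory-service` label, are hypothetical. A real system would normally export these numbers to its monitoring tool instead of printing them, and the in-memory dictionary here is not thread-safe.

```python
import time
from collections import defaultdict
from contextlib import contextmanager
from dataclasses import dataclass


@dataclass
class CallStats:
    count: int = 0
    total_seconds: float = 0.0
    max_seconds: float = 0.0

    @property
    def mean_seconds(self) -> float:
        return self.total_seconds / self.count if self.count else 0.0


STATS = defaultdict(CallStats)  # keyed by call name, e.g. "out:inventory-service"


@contextmanager
def timed_call(name: str):
    """Record count and time spent for one incoming or outgoing call."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Runs even when the wrapped call raises, so failures are timed too.
        elapsed = time.perf_counter() - start
        stats = STATS[name]
        stats.count += 1
        stats.total_seconds += elapsed
        stats.max_seconds = max(stats.max_seconds, elapsed)


if __name__ == "__main__":
    for _ in range(3):
        with timed_call("out:inventory-service"):
            time.sleep(0.01)  # stands in for a real remote call
    for name, stats in STATS.items():
        print(f"{name}: count={stats.count} mean={stats.mean_seconds:.3f}s")
```

Comparing the time recorded for an incoming call with the sum of the outgoing calls made while serving it shows whether latency originates in this system or further downstream.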

SET CLEAR TARGETS, SO YOU KNOW YOUR SITUATION

› How many errors would be OK in production?
› What kind of errors do we care about?

Chart: number of stale server session requests per minute over one day
(00:00 to 00:00, y-axis 0 to 300), annotated with “Other servers taken out
of LB” and “Other servers back in LB”.
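
For an error metric such as stale session requests per minute, a target can be expressed as a per-minute budget per error kind. The following is a minimal sketch of that check. The budget values, the error kind names and the sample timestamps are hypothetical and only show the mechanics.

```python
from collections import Counter
from datetime import datetime

# Hypothetical budgets: the error kinds we care about and how many per minute are OK.
BUDGET_PER_MINUTE = {"stale_session": 50, "http_5xx": 5}


def minutes_over_budget(events, budgets=BUDGET_PER_MINUTE):
    """Return {(minute, kind): count} for every minute that exceeds its budget.

    `events` is an iterable of (timestamp, kind) pairs; kinds without a
    budget are ignored.
    """
    counts = Counter(
        (ts.replace(second=0, microsecond=0), kind)
        for ts, kind in events
        if kind in budgets
    )
    return {key: n for key, n in counts.items() if n > budgets[key[1]]}


if __name__ == "__main__":
    # Illustrative data: 60 stale session requests and 3 server errors in one minute.
    events = [(datetime(2024, 1, 1, 13, 0, s), "stale_session") for s in range(60)]
    events += [(datetime(2024, 1, 1, 13, 0, s), "http_5xx") for s in range(3)]
    for (minute, kind), n in minutes_over_budget(events).items():
        print(f"{minute:%H:%M} {kind}: {n} per minute, over budget")
```

Running the same check against load test results and production logs keeps the answer to "is this a problem?" consistent between environments.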

CONCLUSION

Support your continuous delivery process with optional load
tests and strong specs.

Use the load tests to identify some pain points, so you can
modify the code and add monitoring, making it safer to do
dark releases and canary testing in production.

Constantly ask yourself: what would it take to only do this in
production? Is it adding value?

QUESTIONS?

athomas@xebia.com
@a32an
www.xebia.com
blog.xebia.com

(we’re hiring)
