CONTINUOUS DELIVERY WHILE MINIMISING PERFORMANCE RISKS
INTRODUCTION
OBJECTIVE

Put working software into production as quickly as possible,
whilst minimising the risk of load-related problems:

›   Slow response times
›   Insufficient capacity
›   Low availability
›   Excessive system resource use
RISK PREVENTION IS A BIG SUBJECT;
WHAT FOLLOWS IS TAKEN FROM OUR EXPERIENCE

Photo by chillihead: www.flickr.com/photos/chillihead/1778980935
CONTINUOUS DELIVERY LITERATURE PROVIDES METHODS
THAT HELP REDUCE RISK

›   Blue-green deployments
›   Dark launching
›   Feature toggles
›   Canary releasing
›   Production immune systems




Source: Jez Humble, http://continuousdelivery.com
BLUE-GREEN DEPLOYMENTS

[Diagram: Amazon Route 53 in front of two Elastic Load Balancers;
one balancer serves the instances running version n, the other
serves the instances running version n+1]
DARK LAUNCHING

[Diagram, built up over three slides: the web page reads from the
DB; a Weather SP is then added next to the DB]
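
A minimal sketch of the dark launching idea, assuming the new
dependency is a weather lookup: the page is rendered exactly as
before, while the new call is exercised in the background and only
its outcome and duration are recorded. The names render_page,
fetch_weather_stub and dark_calls are hypothetical and not taken
from the slides.

```python
import time


def fetch_weather_stub(city):
    """Hypothetical stand-in for the call to the new weather provider."""
    return {"city": city, "temp_c": 18}


def render_page(city, dark_calls, weather_client=fetch_weather_stub,
                dark_enabled=True):
    """Render the existing page and exercise the new call invisibly."""
    html = f"<h1>Welcome to {city}</h1>"  # visible output is unchanged
    if dark_enabled:
        start = time.perf_counter()
        try:
            weather_client(city)  # the result is deliberately discarded
            outcome = "ok"
        except Exception:  # a dark call must never break the page
            outcome = "error"
        dark_calls.append((outcome, time.perf_counter() - start))
    return html
```

The recorded outcomes and timings show how the new dependency
behaves under production load before any user sees the feature.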
FEATURE TOGGLES
CANARY RELEASING

[Figure: scale running from 0% to 100%]
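
One common way to implement a canary release is to route a fixed
percentage of users to the new version and raise that percentage
step by step. The sketch below assumes users are identified by a
string ID; bucket and serves_canary are hypothetical helper names.

```python
import hashlib


def bucket(user_id: str) -> int:
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def serves_canary(user_id: str, canary_percent: int) -> bool:
    """Return True if this user should get the new version."""
    return bucket(user_id) < canary_percent
```

Because the bucket depends only on the user ID, a user who is in
the canary group at 10% stays in it when the share is raised.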
PRODUCTION IMMUNE SYSTEMS
USE CONTROLLED LOAD TESTING TO HELP CAPACITY PLANNING

[Diagram: Amazon Route 53 and an Elastic Load Balancer in front of
three instances, backed by an RDS DB instance and an RDS DB
instance read replica]
WORK WITH FAILURE

›   Optimise for MTTD and MTTR, not MTBF
›   Game day exercises
›   Chaos monkey
›   Go / NoGo meetings
›   Retrospectives
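
As a rough illustration of the first bullet, MTTD and MTTR can be
computed from an incident log. The sketch assumes MTTD is measured
from the start of a failure to its detection and MTTR from the
start of a failure to recovery (definitions vary between teams);
the incident figures are made up for the example.

```python
from statistics import mean

# (started, detected, resolved) in minutes; illustrative values only
incidents = [(0, 4, 30), (600, 602, 615), (1500, 1512, 1560)]


def mttd(records):
    """Mean time to detect: failure start until detection."""
    return mean(detected - started for started, detected, _ in records)


def mttr(records):
    """Mean time to recover: failure start until resolution."""
    return mean(resolved - started for started, _, resolved in records)
```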
BUT LEGACY SYSTEMS OFTEN LACK THE REQUIRED
RESILIENCE
WHILE WE WORK ON OUR RESILIENCE, WE USE LOAD TESTS
TO HELP IDENTIFY THE BIGGEST RISKS
PRE-PROD LOAD TESTING IS NOT FREE


›   Extra code to maintain

›   Test runs usually last several hours

›   A production-like environment is expensive

›   Realistic testing is hard

›   Not all developers like writing (performance) tests
USE IT WISELY, WHERE PRODUCTION TESTING IS STILL
INAPPROPRIATE

›   It provides no guarantee

›   Use it to find any showstoppers you can

›   Essentially, an optional service that teams can use
USE IT AS A PLAYGROUND TO TRY RISKY CHANGES




        Photo by vastateparksstaff: www.flickr.com/photos/vastateparksstaff/5330257235
Load tests

[Figure: pipeline stages laid out along a frequency axis.
Stages: build, unit test, etc.; functional integration tests;
load test script check (added in the second build of the slide);
load tests.
Axis labels: very often; less often; at least once a day (at night).]
THE AIM IS NOT PERFECTION; GO FOR “AS REALISTIC AS
NEEDED”
SET UP TEST DATA AT THE WEEKEND, TO MINIMISE
DISRUPTION
WHEN IS A PROBLEM REALLY A PROBLEM?
FIND AN OBJECTIVE WAY TO JUDGE YOUR FINDINGS
ESTABLISH REQUIREMENTS TO MAKE CLEAR WHAT IS
ACCEPTABLE

›   Seen from the main stakeholders’ perspective
    – Response time: users
    – System resources: ops
    – Capacity: business
›   Specific
›   Measurable
›   Achievable
›   Relevant
Concurrent users

Fail: < 100k      Now: 150k      Target: 200k

Stakeholder: Business

Intention: The website should at least be able to manage our
typical daily load, but we would like some margin for growth
and marketing campaigns.

Scale: Maximum load in a day, while response times are still
according to spec.

Meter: Session table row count.
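
A requirement in this form can be written down as data so that a
test result is judged mechanically. The sketch below is one
possible encoding; the class name CapacityRequirement and the
verdict labels are assumptions, while the thresholds come from the
slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CapacityRequirement:
    """A higher-is-better requirement with a fail level and a target."""
    name: str
    fail_below: int
    target: int

    def judge(self, measured: int) -> str:
        if measured < self.fail_below:
            return "fail"
        if measured >= self.target:
            return "target met"
        return "acceptable"


concurrent_users = CapacityRequirement(
    name="Concurrent users", fail_below=100_000, target=200_000
)
```

With the "Now" value from the slide, concurrent_users.judge(150_000)
falls between the fail level and the target.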
SO USE A REAL BROWSER TO TEST
A REAL USER’S EXPERIENCE
Response time   Fail      [Today]    Target
Homepage.FV     > 6 sec    3.9 sec    2 sec
Homepage.RV     > 5 sec    2.8 sec    1 sec
Checkout.FV     > 8 sec    6.5 sec    2 sec
Details.FV      > 6 sec    1.9 sec    2 sec
Details.RV      > 5 sec    1.7 sec    1 sec
Search.FV       > 6 sec    4.8 sec    2 sec
Search.RV       > 5 sec    3.7 sec    1 sec
Cart.FV         > 6 sec    4.4 sec    2 sec
Cart.RV         > 5 sec    3.4 sec    1 sec
LoginForm.FV    > 6 sec    3.5 sec    2 sec
LoginForm.RV    > 5 sec    2.5 sec    1 sec
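
The table can be checked automatically after each test run. The
sketch below copies the rows from the table and classifies each
"Today" value; the verdict labels are assumptions, and FV and RV
are taken to mean first view and repeat view, which the slides do
not spell out.

```python
# (page, fail_above_s, today_s, target_s), copied from the table
ROWS = [
    ("Homepage.FV", 6, 3.9, 2), ("Homepage.RV", 5, 2.8, 1),
    ("Checkout.FV", 8, 6.5, 2),
    ("Details.FV", 6, 1.9, 2), ("Details.RV", 5, 1.7, 1),
    ("Search.FV", 6, 4.8, 2), ("Search.RV", 5, 3.7, 1),
    ("Cart.FV", 6, 4.4, 2), ("Cart.RV", 5, 3.4, 1),
    ("LoginForm.FV", 6, 3.5, 2), ("LoginForm.RV", 5, 2.5, 1),
]


def judge(fail_above_s, measured_s, target_s):
    """Classify a lower-is-better response time measurement."""
    if measured_s > fail_above_s:
        return "fail"
    if measured_s <= target_s:
        return "target met"
    return "acceptable"


verdicts = {page: judge(fail, today, target)
            for page, fail, today, target in ROWS}
```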
TO MAKE COMPARISONS MEANINGFUL, MAKE YOUR TESTS
DETERMINISTIC

Stub systems that you have no control over
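
A stub can be made deterministic by drawing its simulated latency
from a seeded random generator, so that two test runs see the same
behaviour from the stubbed system. The class below is a hypothetical
example for an external payment provider; it reports the latency
instead of sleeping, which a real stub would normally do.

```python
import random


class StubPaymentProvider:
    """Hypothetical stand-in for an external system we do not control."""

    def __init__(self, seed=42, mean_latency_s=0.2, jitter_s=0.05):
        self._rng = random.Random(seed)  # fixed seed: repeatable runs
        self._mean = mean_latency_s
        self._jitter = jitter_s

    def charge(self, amount_cents):
        """Return a canned approval with a reproducible latency value."""
        latency = self._mean + self._rng.uniform(-self._jitter,
                                                 self._jitter)
        return {"status": "approved",
                "amount_cents": amount_cents,
                "simulated_latency_s": latency}
```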
LOAD TESTING SHOULD BE OPTIONAL; THE ONLY THING
THAT COUNTS IS PRODUCTION!

›   Your definition of done should reflect that

›   The aim is to get early feedback from a safe environment
ANYTHING YOU FIND IS AN OPPORTUNITY TO FIX MORE
THAN ONE PROBLEM
SO WHAT MONITORING IS TYPICALLY NEEDED?

›   Be able to localise where latency is coming from!
    – For every system, all incoming and outgoing calls (count and
       time spent stats)
›   Finite resources (pools, CPU, I/O, etc.)
›   Number of active users
›   Response size, where possible
›   Add whatever you need


Monitoring should be identical in all environments!
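
Count and time-spent statistics per call can be collected with a
small wrapper around each incoming or outgoing call. The sketch
below keeps the numbers in memory; timed, CALL_STATS and load_cart
are hypothetical names, and a real setup would ship the values to
whatever monitoring system is in use.

```python
import time
from collections import defaultdict
from functools import wraps

# call name -> number of calls and total seconds spent
CALL_STATS = defaultdict(lambda: {"count": 0, "total_s": 0.0})


def timed(name):
    """Record count and time spent for every call to the wrapped function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:  # failed calls are counted as well
                stats = CALL_STATS[name]
                stats["count"] += 1
                stats["total_s"] += time.perf_counter() - start
        return wrapper
    return decorator


@timed("db.load_cart")
def load_cart(user_id):
    """Hypothetical outgoing call to the database."""
    return {"user_id": user_id, "items": []}
```

Using the same wrapper in every environment keeps the numbers from
load tests and from production directly comparable.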
CONCLUSION

In order to put code live without pre-prod load testing, at least
the following need to be in place:
› Culture
› State-of-the-art monitoring
› Resilience
Without these, support your continuous delivery process with
optional load tests and strong specs.

Use the load tests to identify some pain points, so you can
modify the code and add monitoring, making it safer to do
(incremental) dark releases and canary testing in production.
QUESTIONS?



             athomas@xebia.com
             @a32an
             www.xebia.com
             blog.xebia.com

             (we’re hiring)

Continuous delivery while minimizing performance risks (dutch web ops meetup)
