Release Often Release Safely
  Sergejus Barinovas (@sergejusb)
  http://sergejus.blogas.lt
This is not a theoretical presentation
This presentation is based on real-life experience
Successful software workflow
Dilemma: Innovative or Stable?
  • Innovative
      • Frequent (bi-weekly) releases of new features
      • Higher risk of bugs and downtime
  • Stable
      • Higher uptime and better customer perception
      • Seasonal releases of new features
We wanted both …
  … to be innovative and agile while staying as stable as possible
Stability in our terms
  • 99.999% uptime for serving ads
  • 2 datacenters + clouds
  • 500 M requests / day
Let’s learn the Kung Fu of releasing often and safely
Challenges we ha(d/ve)
  • Detect issues in production as soon as possible
  • Test new features in production while reducing the impact on customers
  • Roll out new features in a controlled manner
Detect issues in production ASAP: Monitoring
  • Choose the monitoring system carefully
      • It took us about 1 year (Zabbix)
      • First list all your possible monitoring use cases
  • Prepare your software for monitoring
      • Logging is a must-have!
      • Performance / SLA counters help to measure and understand the software better
      • Create a clear baseline to compare with after releases
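
The "clear baseline" idea can be sketched in a few lines. This is only an illustration under stated assumptions: the counter names, the sample values and the 10% tolerance are all hypothetical, and in practice the numbers would come from the monitoring system rather than from hard-coded lists.

```python
from statistics import mean

def regressions(baseline, current, tolerance=0.10):
    """Return counters whose post-release mean exceeds the baseline mean
    by more than `tolerance` (a fraction, so 0.10 means 10%)."""
    flagged = {}
    for name, before in baseline.items():
        after = current.get(name)
        if not before or not after:
            continue  # nothing to compare for this counter
        base_avg, cur_avg = mean(before), mean(after)
        if base_avg > 0 and (cur_avg - base_avg) / base_avg > tolerance:
            flagged[name] = (base_avg, cur_avg)
    return flagged

# Hypothetical counters sampled before and after a release.
baseline = {"response_ms": [20, 22, 21], "errors_per_min": [1, 1, 1]}
after_release = {"response_ms": [30, 31, 29], "errors_per_min": [1, 1, 1]}

for name, (before, after) in regressions(baseline, after_release).items():
    print(f"{name}: baseline {before}, after release {after}")
```

Only counters where a higher value is worse fit this simple check; counters such as throughput would need the comparison reversed.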
Detect issues in production ASAP: Automated functional tests
  • Designed to detect end-user issues
      • Unlike unit and integration tests
      • UI / business logic
  • Still not as many as we want (Selenium UI / C#)
      • Ongoing process of unifying automated QA tests
  • Run after each release and on a periodic basis
      • Very important if you have > 1 server
      • Huge time saver if tests are repetitive
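
Why this matters with more than one server: the same checks have to be repeated against every machine. The sketch below is not the Selenium UI / C# suite mentioned on the slide; it swaps in a much simpler content check, and the server names, the path and the `fetch` function are hypothetical stand-ins (a real `fetch` would issue an HTTP request or drive a browser).

```python
def run_smoke_checks(servers, checks, fetch):
    """Run every (path, expected_text) check against every server.

    `fetch(server, path)` must return the page body as a string.
    Returns a list of (server, path, reason) tuples for failed checks.
    """
    failures = []
    for server in servers:
        for path, expected in checks:
            try:
                body = fetch(server, path)
            except Exception as exc:
                failures.append((server, path, f"error: {exc}"))
                continue
            if expected not in body:
                failures.append((server, path, "unexpected content"))
    return failures

# Hypothetical stand-in for a real HTTP or browser call.
def fake_fetch(server, path):
    pages = {
        ("web-01", "/login"): "<h1>Sign in</h1>",
        ("web-02", "/login"): "<h1>Error 500</h1>",
    }
    return pages[(server, path)]

failures = run_smoke_checks(["web-01", "web-02"], [("/login", "Sign in")], fake_fetch)
print(failures)
```

Passing `fetch` in as a parameter keeps the loop identical whether it runs right after a release or on a periodic schedule.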
Though unit tests help find bugs during coding, they become even more vital as the software evolves!
Test new features in production
  • Even an ideal staging environment is not equal to the production environment
  • Before rolling out a new feature, it is important to check its
      • Resource consumption: CPU / RAM / HDD / IO / Network
      • Performance impact on existing functionality: response times / SLA
      • Stability: errors / memory leaks
Test new features in production: Use Case #1
  • Safely roll out a new feature that integrates into the core data collection pipeline
Test new features in production: Dark releases
  • Work best with brand-new features
  • Release the new feature to one or several servers
  • The new feature gets real load, but is not available to customers
  • Have an automated rollback package in case something goes wrong
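
One possible shape of a dark release inside a request handler is sketched below. It is an assumption about how such a hook could look, not a description of the production system: `handle_request`, `serve_ad` and `new_collector` are hypothetical names. The new code sees the real request, but its result is thrown away and its failures never reach the customer.

```python
import logging

log = logging.getLogger("dark_release")

def handle_request(request, current_handler, dark_handler=None):
    """Serve the request with the current code; feed the same request
    to the dark feature without exposing its result or its errors."""
    response = current_handler(request)
    if dark_handler is not None:
        try:
            dark_handler(request)  # result deliberately discarded
        except Exception:
            log.exception("dark feature failed; customer response unaffected")
    return response

# Hypothetical handlers for illustration.
def serve_ad(request):
    return {"ad": "banner-1"}

def new_collector(request):
    raise RuntimeError("new pipeline not ready")

print(handle_request({"user": 1}, serve_ad, new_collector))
```

In this sketch the dark code runs inline, so it still adds latency to each request; running it asynchronously or on a copy of the traffic would avoid that at the cost of more plumbing.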
Test new features in production
  • Dark release notes from our release plan
Test new features in production: Use Case #2
  • Safely migrate to the new SQL connection pooling mechanism
Test new features in production: Feature flags and switches
  • Work both for brand-new features and for updates
  • A feature can be switched on / off at any time
      if (FeatureEnabled) then …
      if (UseNewLogic) then … else …
  • Can affect existing customers
  • Possible to test servers one by one by switching the feature on / off
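
A minimal in-process version of such a switch might look like the sketch below. The `Switches` class and the pool names are hypothetical; only the `UseNewLogic` flag name comes from the slide. A real deployment would also need a way to flip the switch from outside the process.

```python
import threading

class Switches:
    """Thread-safe on/off switches that can be flipped at runtime."""

    def __init__(self, **initial):
        self._lock = threading.Lock()
        self._flags = dict(initial)

    def is_on(self, name):
        with self._lock:
            # Unknown switches count as off, so new code stays dark by default.
            return self._flags.get(name, False)

    def set(self, name, enabled):
        with self._lock:
            self._flags[name] = bool(enabled)

switches = Switches(UseNewLogic=False)

def get_connection_source():
    # Mirrors "if (UseNewLogic) then ... else ..." from the slide.
    if switches.is_on("UseNewLogic"):
        return "new-pool"
    return "old-pool"

print(get_connection_source())      # old code path
switches.set("UseNewLogic", True)   # flipped without a redeploy
print(get_connection_source())      # new code path
```

Because the flag is read on every call, switching back is as quick as switching on, which is what makes server-by-server testing practical.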
Test new features in production: Use Case #3
  • Safely migrate to the brand-new intelligent targeting subsystem
Test new features in production: Valves
  • Very similar to switches
  • A feature can get anywhere from 0% to 100% of the real load
  • Very handy for gradually rolling out new features on each server, one by one
  • They have helped us a lot so far, though they require extra development effort
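
A valve can be sketched as a switch with a percentage. The version below is an assumption about one common way to build it: each request key (for example a user id, a hypothetical choice here) is hashed into one of 100 buckets, and the valve lets through the buckets below its current setting.

```python
import hashlib

def bucket(key):
    """Map a key to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100

class Valve:
    """Lets a configurable percentage of traffic reach the new feature."""

    def __init__(self, percent=0):
        self.set(percent)

    def set(self, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.percent = percent

    def allows(self, key):
        return bucket(key) < self.percent

valve = Valve(25)
keys = [f"user-{i}" for i in range(10000)]
share = sum(valve.allows(k) for k in keys) / len(keys)
print(f"share of traffic on the new feature: {share:.1%}")
```

Hashing instead of drawing a random number keeps each key on the same side of the valve between requests, and opening the valve further only adds keys; it never moves anyone back to the old code.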
Test new features in production: Caveats we had so far
  • Make sure you can turn features on / off without affecting connected users
  • Create a simple interface to display the current status of all switches and valves on each affected server
  • Secure access to switches and valves
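
The "simple interface" can start as a plain-text table. The sketch below assumes each server can report its switch and valve state as a dictionary; the server names and the `targeting_valve` entry are hypothetical, and access control is left out entirely.

```python
def status_report(servers):
    """Render switch / valve state per server as an aligned text table.

    `servers` maps a server name to a dict of {switch_or_valve: state}.
    Entries missing on a server are shown as "-".
    """
    names = sorted({name for state in servers.values() for name in state})
    header = ["server"] + names
    rows = [
        [server] + [str(state.get(name, "-")) for name in names]
        for server, state in sorted(servers.items())
    ]
    table = [header] + rows
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]
    return "\n".join(
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    )

# Hypothetical state collected from two servers.
report = status_report({
    "web-01": {"UseNewLogic": "on", "targeting_valve": "25%"},
    "web-02": {"UseNewLogic": "off"},
})
print(report)
```

Showing every server side by side makes it easy to spot a machine that was left in a different state after a test.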
Controlling roll-out of a new feature
  • Switches and valves enable a very smooth and controlled roll-out
  • Partial roll-out to different datacenters / clouds
      • Different datacenters / clouds can have different versions of the feature released
  • Redirect all traffic to the new or the old version of the feature
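
A partial roll-out can be described as one valve setting per datacenter or cloud. The plan below is purely illustrative: the location names and percentages are hypothetical, and `request_bucket` is assumed to be a stable number from 0 to 99 derived from the request.

```python
# Hypothetical roll-out plan: percent of traffic on the new version.
ROLLOUT = {"dc-1": 100, "dc-2": 10, "cloud": 0}

def version_for(datacenter, request_bucket, rollout=ROLLOUT):
    """Pick the feature version for a request.

    Unknown locations fall back to 0%, so they stay on the old version.
    """
    percent = rollout.get(datacenter, 0)
    return "new" if request_bucket < percent else "old"

def redirect_all(rollout, to_new):
    """Return a plan that sends all traffic to the new or the old version."""
    return {location: (100 if to_new else 0) for location in rollout}

print(version_for("dc-1", 42))       # new
print(version_for("dc-2", 42))       # old
print(redirect_all(ROLLOUT, False))  # everything back on the old version
```

Keeping the whole plan in one table means that redirecting all traffic, in either direction, is a single data change rather than a code change.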
Controlling roll-out of a new feature: Future research
  • Application-level load balancing
      • The load balancer can act as a switch / valve without programming the load distribution logic ourselves
      • Ability to automatically redirect users to the new version of the application while preserving the old one
Summary
  • A monitoring system is very important, but your software must be prepared for it
  • Automated functional tests are functional monitoring of your software
  • Switches and valves are a very powerful concept for testing in production and for roll-outs, but they require extra development and maintenance time
  • Dark releases and partial roll-outs are the most cost-effective safety mechanisms
Thanks! Questions?
  Sergejus Barinovas (@sergejusb)
  http://sergejus.blogas.lt
