Release Often Release Safely

Kung-Fu of releasing often but safely for high loaded systems

Release Often Release Safely: Presentation Transcript

  • Release Often Release Safely
    Sergejus Barinovas (@sergejusb)
    http://sergejus.blogas.lt
  • This is not a theoretical presentation
  • This presentation is based on real-life experience
  • Successful software workflow
  • Dilemma: Innovative or Stable?
    Innovative
    Frequent (bi-weekly) releases of new features
    Higher risk of bugs and downtime
    Stable
    Higher uptime and better customer perception
    Seasonal releases of new features
  • We wanted both …
    … to be innovative and agile while staying as stable as possible
  • Stability in our terms
    99.999% uptime for serving ads
    2 datacenters + clouds
    500 M requests / day
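To put the targets above in perspective, here is a small back-of-the-envelope calculation. It only uses the two figures from the slide (99.999% uptime, 500 M requests / day); the function names are made up for this sketch, and the request rate is a flat daily average, not a measured peak.

```python
# Back-of-the-envelope numbers for the stability targets on the slide.
UPTIME_TARGET = 0.99999          # 99.999% uptime
REQUESTS_PER_DAY = 500_000_000   # 500 M requests / day

SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def allowed_downtime_seconds(uptime: float, period_seconds: float) -> float:
    """Downtime budget for a period, given an uptime fraction."""
    return (1.0 - uptime) * period_seconds


def average_requests_per_second(requests_per_day: int) -> float:
    """Flat daily average; real traffic will have peaks above this."""
    return requests_per_day / SECONDS_PER_DAY


yearly = allowed_downtime_seconds(UPTIME_TARGET, SECONDS_PER_DAY * DAYS_PER_YEAR)
daily = allowed_downtime_seconds(UPTIME_TARGET, SECONDS_PER_DAY)
rate = average_requests_per_second(REQUESTS_PER_DAY)

print(f"Downtime budget per year: {yearly / 60:.2f} minutes")
print(f"Downtime budget per day:  {daily:.3f} seconds")
print(f"Average load:             {rate:.0f} requests/second")
```

The budget works out to roughly five minutes of downtime per year at an average of several thousand requests per second, which is why every release has to be reversible quickly.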
  • Let’s learn the Kung Fu of releasing often and safely
  • Challenges we ha(d/ve)
    Detect issues in production as soon as possible
    Test new features in production while reducing impact for customers
    Roll out new features in a controlled manner
  • Detect issues in production ASAP
    Monitoring
    Choose a monitoring system carefully
    It took us about 1 year (Zabbix)
    First list all your possible monitoring use cases
    Prepare your software for monitoring
    Logging is a must-have!
    Performance / SLA counters help to measure and understand software better
    Create a clear baseline to compare with after releases
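The baseline idea above can be sketched as a simple comparison of counters recorded before and after a release. This is only an illustration: the metric names, the sample values, and the 20% tolerance are assumptions, not figures from the talk, and a real setup would read the samples from the monitoring system (Zabbix in the speaker's case) instead of hard-coded lists.

```python
from statistics import mean


def regressions(baseline, after_release, tolerance=0.20):
    """Return metrics whose post-release mean rose by more than `tolerance`.

    Both arguments map a metric name to a list of samples.
    The result maps each flagged metric to (baseline_mean, new_mean).
    """
    flagged = {}
    for name, base_samples in baseline.items():
        new_samples = after_release.get(name)
        if not base_samples or not new_samples:
            continue  # nothing to compare for this metric
        base = mean(base_samples)
        new = mean(new_samples)
        if base > 0 and (new - base) / base > tolerance:
            flagged[name] = (base, new)
    return flagged


# Hypothetical samples, for illustration only.
baseline = {"response_ms": [40, 42, 41], "cpu_percent": [35, 36, 34]}
after = {"response_ms": [60, 62, 61], "cpu_percent": [36, 35, 37]}

flagged = regressions(baseline, after)
for name, (base, new) in flagged.items():
    print(f"{name}: baseline {base}, after release {new}")
```

Here only the response time is flagged, since CPU usage stays within the tolerance.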
  • Detect issues in production ASAP
    Automated functional tests
    Designed to detect end-user issues
    Unlike unit and integration tests
    UI / business logic
    Still not as many as we want (Selenium UI / C#)
    Ongoing process of unifying automated QA tests
    Run after each release and on a periodic basis
    Very important if you have > 1 server
    Huge time saver if tests are repetitive
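A minimal sketch of why such tests pay off with more than one server: the same end-user checks run against every server, and the runner reports which server fails which check. The talk mentions Selenium UI tests in C#; this sketch does not use Selenium. The check names, paths, and the `fetch` callable are hypothetical, and `fake_fetch` stands in for a real HTTP client so the example runs offline.

```python
from typing import Callable, Dict, List

# (server, path) -> response body; a real implementation would do an HTTP GET.
Fetch = Callable[[str, str], str]


def check_ad_is_served(server: str, fetch: Fetch) -> bool:
    """Hypothetical end-user check: the ad endpoint returns ad markup."""
    return "<ad" in fetch(server, "/serve?placement=1")


def check_health_page(server: str, fetch: Fetch) -> bool:
    """Hypothetical end-user check: the health page says OK."""
    return fetch(server, "/health").strip() == "OK"


CHECKS = [check_ad_is_served, check_health_page]


def run_checks(servers: List[str], fetch: Fetch) -> Dict[str, List[str]]:
    """Run every check on every server; return failed check names per server."""
    failures: Dict[str, List[str]] = {}
    for server in servers:
        failed = []
        for check in CHECKS:
            try:
                ok = check(server, fetch)
            except Exception:
                ok = False  # an unreachable server counts as a failed check
            if not ok:
                failed.append(check.__name__)
        if failed:
            failures[server] = failed
    return failures


def fake_fetch(server: str, path: str) -> str:
    """Stand-in for a real HTTP client: web2 has a broken health page."""
    if path == "/health":
        return "ERROR" if server == "web2" else "OK"
    return "<ad id='1'></ad>"


result = run_checks(["web1", "web2"], fake_fetch)
print(result)
```

Running the same function from a scheduler covers the "on a periodic basis" part; running it from the deployment script covers "after each release".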
  • Though unit tests help find bugs during coding, they are even more vital as software evolves!
  • Test new features in production
    Even an ideal staging environment is not equal to the production environment
    Before starting to roll out a new feature, it is important to check its:
    Resource consumption
    CPU / RAM / HDD / IO / Network
    Performance impact on existing functionality
    Response times / SLA
    Stability
    Errors / memory leaks
  • Test new features in production
    Use Case #1:
    Safely roll out a new feature that integrates into the core data collection pipeline
  • Test new features in production
    Dark releases
    Works best with brand-new features
    Release the new feature to one or several servers
    The new feature gets real load, but is not available to customers
    Have an automated rollback package in case something goes wrong
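One way to picture a dark release in code: the new feature runs on real requests, but its result is never returned to the customer and its failures are swallowed and logged. This is a sketch under assumptions, not the speaker's implementation; `current_pipeline`, `new_feature`, and `DARK_FEATURE_ENABLED` are hypothetical names.

```python
import logging

logger = logging.getLogger("dark_release")

# Hypothetical toggle: set only on the servers chosen for the dark release.
DARK_FEATURE_ENABLED = True


def current_pipeline(event):
    """Existing, customer-visible behaviour (placeholder logic)."""
    return {"stored": event["id"]}


def new_feature(event):
    """New code under test (placeholder logic); may fail on real data."""
    return {"enriched": event["id"], "ip_prefix": event["ip"].split(".")[0]}


def handle_event(event, dark_results=None):
    """Serve the customer from the current pipeline; run the new feature in the dark."""
    result = current_pipeline(event)
    if DARK_FEATURE_ENABLED:
        try:
            shadow = new_feature(event)
            if dark_results is not None:
                dark_results.append(shadow)  # kept for inspection only
        except Exception:
            # The new feature must never break the customer-facing response.
            logger.exception("dark feature failed; customer response unaffected")
    return result


collected = []
ok_response = handle_event({"id": 1, "ip": "10.0.0.1"}, collected)
bad_response = handle_event({"id": 2}, collected)  # new feature raises KeyError
print(ok_response, bad_response, collected)
```

The second call shows the point of the pattern: the new feature crashes on an event without an `ip` field, yet the customer still gets the normal response.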
  • Test new features in production
    Dark release notes from our release plan
  • Test new features in production
    Use Case #2:
    Safely migrate to the new SQL connection pooling mechanism
  • Test new features in production
    Feature flags and switches
    Work both for brand-new features and updates
    A feature can be switched on / off at any time
    if (FeatureEnabled) then …
    if (UseNewLogic) then … else …
    Can affect existing customers
    Possible to test servers one by one by switching the feature on / off
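The `if (UseNewLogic) then … else …` pattern above can be sketched as a small in-process switch registry. The class, the flag name, and the two connection paths are hypothetical; the flag name only loosely mirrors Use Case #2, and a real system would load the flag values from configuration that can be changed at runtime.

```python
import threading


class FeatureSwitches:
    """Thread-safe on/off switches that can be flipped while the server runs."""

    def __init__(self, defaults):
        self._lock = threading.Lock()
        self._flags = dict(defaults)

    def is_enabled(self, name):
        with self._lock:
            return self._flags.get(name, False)  # unknown switches are off

    def set(self, name, enabled):
        with self._lock:
            self._flags[name] = bool(enabled)

    def snapshot(self):
        """Current state of all switches, e.g. for a status page."""
        with self._lock:
            return dict(self._flags)


switches = FeatureSwitches({"use_new_connection_pool": False})


def open_connection():
    """Pick the old or new code path based on the switch (placeholder logic)."""
    if switches.is_enabled("use_new_connection_pool"):
        return "connection from new pool"
    return "connection from old pool"


before = open_connection()
switches.set("use_new_connection_pool", True)   # switch on, no redeploy
after_switch = open_connection()
switches.set("use_new_connection_pool", False)  # instant rollback
print(before, "->", after_switch, "->", open_connection())
```

Because each server holds its own switch state, the feature can be turned on for one server at a time and turned off again without a new deployment.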
  • Test new features in production
    Use Case #3:
    Safely migrate to the brand-new intelligent targeting subsystem
  • Test new features in production
    Valves
    Very similar to switches
    A feature can get from 0% to 100% of real load
    Very handy for gradually rolling out new features on each server one by one
    So far they have helped us a lot, though they require extra development effort
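A valve can be sketched as a switch with a percentage instead of a boolean. The version below hashes a request or user key into one of 100 buckets, so the same key always takes the same path and raising the percentage only adds keys. The hashing scheme and the `Valve` class are assumptions for illustration; the talk does not say how load was split.

```python
import hashlib


class Valve:
    """Routes a configurable percentage (0-100) of keys to the new feature."""

    def __init__(self, percent=0):
        self.set_percent(percent)

    def set_percent(self, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.percent = percent

    def is_open_for(self, key: str) -> bool:
        # Stable bucket in [0, 100): the same key always maps to the same bucket.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        return bucket < self.percent


valve = Valve(percent=25)
keys = [f"user-{i}" for i in range(10_000)]
share = sum(valve.is_open_for(k) for k in keys) / len(keys)
print(f"share of traffic on the new feature: {share:.1%}")
```

Opening the valve from 25% to 50% keeps everyone who was already on the new path there, which helps when turning features up without disturbing connected users.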
  • Test new features in production
    Caveats we had so far
    Make sure you can turn features on / off without affecting connected users
    Create a simple interface to display the current status of all switches and valves on each affected server
    Secure access to switches and valves
  • Controlling roll-out of new feature
    Switches and valves enable a very smooth and controlled roll-out
    Partial roll-out to different datacenters / clouds
    Different datacenters / clouds can have different versions of a feature released
    Redirect all traffic to the new or old version of a feature
  • Controlling roll-out of new feature
    Future research: application level load balancing
    A load balancer can act as a switch / valve without load distribution logic having to be programmed into the application
    Ability to automatically redirect users to the new version of the application while preserving the old one
  • Summary
    A monitoring system is very important, but your software should be prepared for it
    Automated functional tests are functional monitoring of your software
    Switches and valves are a very powerful concept for testing in production and roll-outs, but require extra development and maintenance time
    Dark releases and partial roll-outs are the most cost-effective safety mechanisms
  • Thanks! Questions?
    Sergejus Barinovas (@sergejusb)
    http://sergejus.blogas.lt