Safely Deploying on the cutting edge
Eric Holscher
Urban Airship
DjangoCon 2011



Wednesday, September 7, 2011
Talk Contents



         •   Company culture & process
         •   Deployment environment
         •   Tools for deploying
         •   Verifying deployment




Process

         •   Deploy out of git
         •   Standard git-based production/master branch model
         •   Releases on the production branch are tagged with a timestamp
             •   deploy-2011-08-03_14-37-38
         •   Feature branches
         •   http://nvie.com/posts/a-successful-git-branching-model/




Features



         •   Easily allows you to hot-fix production
         •   Keep a stable master
         •   Run CI on the master branch or long-lived feature branches




Services

         •   Everything that we deploy is conceptualized as a service
         •   Services all live in /mnt/services/<slug> (thanks, EC2)
         •   A service is an instance of a repository on a machine
         •   A repository might have multiple services
             •   e.g. Airship is deployed into “celery” and “web” services
         •   This maps really well onto Chef cookbooks




QA Environment


         •   Run all of your master branches
         •   Allow you to get a copy of what will become production
         •   Catch errors before they are seen by customers
         •   Spawn new ones for long-lived feature branches
         •   `host web-0` and figure out based on IP




Deployment Design Goals




Jump machine




         •   Have a standard place for all deployments to happen
         •   Log all commands run




No External Services



         •   Chishop
         •   No external server required to deploy code
         •   All branches are checked out on an admin server




Services look the same



         •   Python
         •   Java
         •   “Unix”




Composable




         •   Small pieces that you can build into better things
         •   Useful when trying to do something you didn’t plan for




Environment


         •   Where code lands on the remote machine
         •   Mimics a chroot
         •   Uses virtualenv & supervisord
         •   Owned by the service-user
         •   Managed by Chef




File Structure

         •   /mnt/services/airship
             •   bin/
             •   current -> deploy-2011-08-03_14-37-38
             •   deploy-2011-08-03_14-37-38
             •   etc/
             •   var/




etc/



         •   supervisord.conf
             •   [include]
             •   files = *.conf
         •   airship.conf




bin/



         •   start
         •   stop
         •   restart
         •   logs




# Start supervisord if it is not running, otherwise start all of its processes.
SCRIPT_DIR=$(dirname "$0")
SERVICE_DIR=$(cd "$SCRIPT_DIR/.." && pwd)

cd "$SERVICE_DIR"
# The check relies on the exit status of `supervisorctl pid`.
if ! supervisorctl pid > /dev/null 2>&1; then
    echo "Supervisord not running, starting."
    supervisord
else
    echo "Supervisord running, starting all processes."
    supervisorctl start all
fi
cd - > /dev/null 2>&1




Bin scripts


         •   All of the process-level binscripts wrap supervisord
         •   bin/start -> supervisorctl start all
         •   bin/start foo -> supervisorctl start foo
         •   bin/stop -> supervisorctl stop all
         •   bin/stop shutdown -> supervisorctl shutdown




var/



         •   data/
         •   log/
         •   run/
         •   tmp/




Init.d



         •   All services share a common init.d script
         •   This init.d script calls into the service’s bin/
         •   /etc/init.d/airship start -> /mnt/services/airship/bin/start




SERVICE_USER='<%= @service %>'
SERVICE_NAME='<%= @service %>'
SERVICE_PATH=/mnt/services/$SERVICE_NAME
RET_CODE=0
case "$1" in
    start|stop|restart|status)
        # Run the matching bin/ script as the service user.
        # No `set -e` here: it would exit before RET_CODE is captured.
        sudo su - "$SERVICE_USER" -c "$SERVICE_PATH/bin/$1" || RET_CODE=$?
        ;;
    *)
        echo "$SERVICE_NAME service usage: $0 {start|stop|restart|status}"
        # Unknown action: report failure to the caller.
        RET_CODE=1
        ;;
esac

exit $RET_CODE


Tools



         •   Fabric
         •   Rsync
         •   Pip
         •   Virtualenv




Low-level verbs

         •   pull
         •   build
         •   tag
         •   sync
         •   install
         •   rollback
         •   start/stop/restart/reload



Pull


         •   Update the code from the source repository
         •   Defaults to the “production” branch
             •   def pull(repo=None, ref='origin/production')
         •   Can pass in a specific revision/branch/tag/hashish
             •   local('git reset --hard %s' % ref, capture=False)
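
A minimal sketch of what a pull verb along these lines might look like. The slide shows only the signature and the `git reset --hard` call made through Fabric's `local`; this version swaps in the standard library's `subprocess` so it runs without Fabric. The `git fetch` step, the `repo_dir` argument, and the injectable `run` parameter are assumptions added for illustration.

```python
import subprocess


def pull(repo_dir, ref="origin/production", run=subprocess.check_call):
    """Bring the checkout in repo_dir to `ref`, discarding local changes."""
    # Assumed step: refresh remote-tracking refs so origin/production is current.
    run(["git", "fetch", "origin"], cwd=repo_dir)
    # From the slide: hard-reset to the requested revision, branch, or tag.
    run(["git", "reset", "--hard", ref], cwd=repo_dir)
```

Passing `ref="deploy-2011-08-03_14-37-38"` would reset the checkout to that tagged release instead of the tip of the production branch.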




Build



         •   Could be called “prepare”
         •   Do local-specific things to get repo into a ready state
         •   Mostly used for compiling in java-land
         •   Useful in Python for running pre-install tasks




Tag


         •   Set a tag for the deploy in the git repo
         •   If the current commit already has a tag, use that instead
             •   git tag --contains HEAD
         •   deploy-2011-08-03_14-37-38
             •   strftime('%Y-%m-%d_%H-%M-%S')
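
The tag-naming rule above can be sketched as follows. The `strftime` pattern is the one on the slide; the function names and the `tags_on_head` argument (the output of the git tag lookup, already split into a list) are hypothetical.

```python
from datetime import datetime


def deploy_tag(now=None):
    """Build a tag name such as deploy-2011-08-03_14-37-38."""
    now = now or datetime.now()
    return "deploy-" + now.strftime("%Y-%m-%d_%H-%M-%S")


def tag_for_deploy(tags_on_head, now=None):
    """Reuse a tag already on the current commit, else make a new one."""
    if tags_on_head:
        return tags_on_head[0]
    return deploy_tag(now)
```

Because every field is zero-padded and ordered from year down to second, sorting the tag names as plain strings also sorts them chronologically.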




Sync



         •   Move the code from the local to the remote box
         •   Uses rsync to put it into the remote service directory
         •   Also places a copy of the synced code on the admin box
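
As an illustration of the sync step, here is a sketch that only builds the rsync command line. The talk does not show the actual flags or destination, so the `-az --delete` options and the per-tag destination directory (inferred from the file structure shown earlier) are assumptions.

```python
def rsync_command(local_dir, host, service, tag):
    """Return the argv for copying a checkout into a remote service directory."""
    # Assumed layout: each release lands in /mnt/services/<service>/<tag>/.
    dest = "%s:/mnt/services/%s/%s/" % (host, service, tag)
    # A trailing slash on the source copies the directory's contents,
    # not the directory itself.
    source = local_dir.rstrip("/") + "/"
    return ["rsync", "-az", "--delete", source, dest]
```

Returning the argument list rather than running it keeps the piece small and composable; a caller can hand it to `subprocess.check_call` or log it first.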




Install



         •   Make the new code the active code on the machine
         •   Generally means installing the code into a virtualenv
         •   Update the “current” symlink in the service directory
         •   Symlink the Django settings file based on environment
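
One way the “current” symlink update could be done without a window where the link is missing is to create a temporary link and rename it over the old one. This is a sketch under that assumption, not necessarily how the talk's tooling does it; `activate_release` and the `.tmp` suffix are hypothetical names.

```python
import os


def activate_release(service_dir, release):
    """Point <service_dir>/current at the given release directory name."""
    current = os.path.join(service_dir, "current")
    tmp = current + ".tmp"
    # Clear out a leftover temporary link from an interrupted run.
    if os.path.lexists(tmp):
        os.remove(tmp)
    # Relative target, matching "current -> deploy-..." in the file structure.
    os.symlink(release, tmp)
    # Renaming over the old link swaps it in a single step on POSIX systems.
    os.replace(tmp, current)
```

For example, `activate_release("/mnt/services/airship", "deploy-2011-08-03_14-37-38")` would leave `current` pointing at that release.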




Rollback


         •   When you break things, you need to undo quickly
         •   Reset the repository to the previous deployed tag
             •   git tag | grep deploy | sort -r | head -2 | tail -1
         •   Deploy that
         •   Very few moving pieces
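
The tag selection in the pipeline above can be sketched in Python as follows, assuming the list of tag names has already been read from git. The function name is hypothetical, and matching on the `deploy-` prefix is slightly stricter than the `grep deploy` in the slide.

```python
def previous_deploy_tag(tags):
    """Return the second-newest deploy tag, i.e. the release before the current one."""
    # Zero-padded timestamps sort chronologically as plain strings.
    deploys = sorted((t for t in tags if t.startswith("deploy-")), reverse=True)
    if len(deploys) < 2:
        raise ValueError("no previous deploy tag to roll back to")
    return deploys[1]
```

Rolling back is then a matter of resetting the repository to the returned tag and running the normal deploy.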




Start/Stop/Reload




         •   Allow you to bounce services as part of deployment
         •   Allow reload for services that support it




CLI UI

         •   Have nice wrapper commands that do common tasks
         •   deploy host:web-0 full_deploy:airship
             ➡ pull, build, tag, sync, install
         •   deploy host:web-1 deploy:airship
             ➡ tag, sync, install
         •   deploy host:web-2 sync:airship
             ➡ sync
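
The relationship between wrapper commands and low-level verbs can be written down as a small table. The verb lists are taken from the slide; the `WRAPPERS` dictionary and the `expand` helper are hypothetical and stand in for however the real Fabric tasks are wired together.

```python
# Wrapper command -> ordered low-level verbs, as listed on the slide.
WRAPPERS = {
    "full_deploy": ["pull", "build", "tag", "sync", "install"],
    "deploy": ["tag", "sync", "install"],
    "sync": ["sync"],
}


def expand(task):
    """Turn 'full_deploy:airship' into [('pull', 'airship'), ...]."""
    name, _, service = task.partition(":")
    return [(verb, service) for verb in WRAPPERS[name]]
```

Keeping the wrappers as plain data makes it easy to add a new combination of verbs later without touching the verbs themselves.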




UI cont.




         •   deploy host:web-0 full_deploy:airship restart:airship




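#!/bin/bash

cd ~/airdeploy
# Timestamp for the log file name: year_month_day-hour-minute-second.
DATE=$(date +%Y_%-m_%-d-%H-%M-%S)
# Record the command line before running it.
echo "deploy" "$@" > "logs/$DATE.log"
fab "$@"
cd - > /dev/null 2>&1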




Meta-commands

         •   Hard-code the correct deployment behavior
         •   “Make easy things easy, and wrong things hard”
         •   Knows what machine each service is deployed to
         •   deploy airship
             ➡ deploy pull:airship
             ➡ deploy type:web deploy:airship
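
A sketch of how a meta-command might expand into the lower-level invocations shown above. The two output commands for `airship` come from the slide; the `SERVICE_TYPES` mapping and the `meta_deploy` function are hypothetical illustrations of "knows what machine each service is deployed to".

```python
# Hypothetical registry: which host types each service is deployed to.
SERVICE_TYPES = {
    "airship": ["web"],
}


def meta_deploy(service):
    """Expand 'deploy <service>' into the underlying deploy commands."""
    # Pull once on the jump machine, then deploy to each host type.
    commands = ["deploy pull:%s" % service]
    for host_type in SERVICE_TYPES[service]:
        commands.append("deploy type:%s deploy:%s" % (host_type, service))
    return commands
```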




Magicifying




         •   Now that we have a solid base, we can automate on top
         •   When you do a meta deploy, it should be a “smart deploy”




Workflow



         •   Deploy to one web server, preferably with one worker
         •   Restart it
         •   Run it against heuristics to determine if it’s broken
         •   If it’s broken, rollback, otherwise continue on
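
The workflow above can be sketched as a small orchestration function. Each step is passed in as a callable, since the talk does not show the real implementations; `smart_deploy` and its parameter names are hypothetical.

```python
def smart_deploy(hosts, deploy, restart, is_healthy, rollback):
    """Deploy to one host first; continue to the rest only if it looks healthy."""
    canary, rest = hosts[0], hosts[1:]
    deploy(canary)
    restart(canary)
    if not is_healthy(canary):
        # Broken: undo the canary and leave the other hosts untouched.
        rollback(canary)
        return False
    for host in rest:
        deploy(host)
        restart(host)
    return True
```

The return value tells the caller whether the release went out everywhere or was rolled back after the first host.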




Heuristics

         •   Any 500s
         •   Ratio of 200s to non-200s
         •   Ratio of 500s to 200s
         •   Requests per second
         •   Response time
         •   $$$ (Business metrics)
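
To make the heuristics concrete, here is a sketch that applies a few of them to one snapshot of metrics. The metric key names and every threshold are hypothetical; real values would depend on the service. Business metrics are left out because the talk gives no detail on them.

```python
def looks_broken(metrics, max_non_200_ratio=0.05, min_rps=1.0, max_response_ms=500.0):
    """Return True if a snapshot of service metrics suggests a bad deploy."""
    # Any 500s at all count as broken, which also covers the 500s-to-200s ratio.
    if metrics["count_500"] > 0:
        return True
    non_200 = metrics["count_other"] + metrics["count_500"]
    total = metrics["count_200"] + non_200
    # Too many non-200 responses relative to all responses.
    if total and non_200 / float(total) > max_non_200_ratio:
        return True
    # Throughput has collapsed.
    if metrics["requests_per_second"] < min_rps:
        return True
    # Responses have become too slow.
    return metrics["response_time_ms"] > max_response_ms
```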




How it works

         •   Tell load balancer to take machine out of pool
             •   /take_me_out_of_the_lb -> 200
         •   Start your code with 1 worker and a different port
             •   supervisorctl start canary
         •   Expose metrics from your services over json
         •   Make sure your load balancer weights it appropriately
         •   Poll your metrics for X time before considering it functional
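
The last step, polling metrics for a fixed time before trusting the canary, might look like the sketch below. How metrics are fetched is left to the caller (for instance, an HTTP request to the service's JSON endpoint); the function name, the default duration and interval, and the injectable `sleep` and `clock` parameters are assumptions.

```python
import time


def poll_until_trusted(fetch_metrics, looks_broken, duration_s=60.0, interval_s=5.0,
                       sleep=time.sleep, clock=time.monotonic):
    """Poll metrics for duration_s seconds; fail fast on the first bad sample."""
    deadline = clock() + duration_s
    while clock() < deadline:
        if looks_broken(fetch_metrics()):
            return False
        sleep(interval_s)
    # No bad sample within the window: consider the canary functional.
    return True
```

Injecting `sleep` and `clock` keeps the loop testable without waiting for real time to pass.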



Thanks



         •   Alex Kritikos
         •   Erik Onnen
         •   Schmichael




Questions?



         •   Eric Holscher
         •   Urban Airship (Hiring and whatnot)
         •   eric@ericholscher.com



