DISQUS
                         Practicing Continuous Deployment



                                    David Cramer
                                       @zeeg




Saturday, March 10, 12
Shipping new code as soon
                 as it’s ready




Continuous Deployment




                         # Update the site every 5 minutes
                         */5 * * * * cd /www/example.com && git pull && service apache restart




When it’s ready




When is it ready?




                -        Reviewed by peers


                -        Passes automated tests


                -        Some level of QA




Focus on Stability and Iteration




Workflow


                          Review                  Commit




                         Integration             Failed Build




                          Deploy                 Reporting




                                                  Rollback
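
One reading of the workflow diagram, sketched as code: a change passes through review, commit and integration; a failed build stops it, and an unhealthy deploy ends in a rollback. The arrows of the original diagram are not preserved here, so the stage order is an assumption, and `Change`, `run_pipeline` and both callables are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Change:
    """A hypothetical unit of work moving through the pipeline."""
    name: str
    history: List[str] = field(default_factory=list)


def run_pipeline(change: Change,
                 build_passes: Callable[[Change], bool],
                 deploy_healthy: Callable[[Change], bool]) -> str:
    """Walk a change through the stages named on the slide.

    Returns the last stage reached: 'failed build', 'rollback' or 'reporting'.
    """
    for stage in ("review", "commit", "integration"):
        change.history.append(stage)

    # A broken build never reaches production.
    if not build_passes(change):
        change.history.append("failed build")
        return "failed build"

    change.history.append("deploy")
    change.history.append("reporting")

    # Reporting decides whether the deploy stays or is rolled back.
    if not deploy_healthy(change):
        change.history.append("rollback")
        return "rollback"
    return "reporting"
```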




The Good
      -     Develop features
            incrementally
      -     Release frequently
      -     Smaller doses of QA
                                      The Bad
                                  -   Culture Shock
                                  -   Stability depends on
                                      test coverage
                                  -   Initial time investment

Keep Development Simple




Development




                -        Automate testing of complicated
                         processes and architecture
                -        Simple can be better than complete
                         -   Especially for local development
                -        python setup.py {develop,test}
                -        Puppet, Chef, Buildout, Fabric, etc.




                                Production                              Staging
                                PostgreSQL                              PostgreSQL
                                Memcache                                Memcache
                                Redis                                   Redis
                                Solr                                    Solr
                                Apache                                  Apache
                                Nginx                                   Nginx
                                RabbitMQ                                RabbitMQ
                         (and 100 other painful-to-configure services)



                                CI Server                               MacBook
                                PostgreSQL                              PostgreSQL
                                Memcache                                Apache
                                Redis                                   Memcache
                                Solr                                    Redis
                                Apache                                  Solr
                                Nginx                                   Nginx
                                RabbitMQ                                RabbitMQ


Bootstrapping Local



                -        Simplify local setup
                         -   git clone dcramer@disqus:disqus.git
                         -   make
                         -   python manage.py runserver


                -        Need to test dependencies?
                         -   virtualbox + vagrant up



Progressive Rollout

                         We actively use early versions of features
                                   before public release




Deploy features to portions of a user base at a
                          time to ensure smooth, measurable releases




                                  https://github.com/disqus/gargoyle


                •        Iterate quickly by hiding features
                •        Early adopters are free QA



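                         from django.http import HttpResponse
                         from gargoyle import gargoyle

                         def my_view(request):
                             # Serve the new code path only where the 'awesome' switch is on
                             if gargoyle.is_active('awesome', request):
                                 return HttpResponse('new happy version :D')
                             else:
                                 return HttpResponse('old sad version :(')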




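                         SWITCHES = {
                             # enable my_feature for 50% of buckets (0-49)
                             'my_feature': range(0, 50),
                         }

                         def is_active(switch, ip_address):
                             """Return True if the switch is on for this IPv4 address."""
                             try:
                                 pct_range = SWITCHES[switch]
                             except KeyError:
                                 # Unknown switches are always off
                                 return False

                             # Derive a bucket from the dotted-quad address
                             ip_hash = sum(int(x) for x in ip_address.split('.'))

                             return ip_hash % 100 in pct_range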
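
Bucketing by IP address, as above, moves a user to a different bucket whenever their address changes and puts everyone behind a shared address in the same bucket. A possible alternative, not taken from the talk or from Gargoyle, is to hash a stable user identifier together with the switch name. `bucket_for`, `is_active_for_user` and the percentage-valued `SWITCHES` layout are hypothetical.

```python
import hashlib

SWITCHES = {
    # percentage of users (0-100) that should see the feature
    "my_feature": 50,
}


def bucket_for(switch: str, user_id: str) -> int:
    """Map a (switch, user) pair to a stable bucket in 0..99."""
    digest = hashlib.sha256(f"{switch}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_active_for_user(switch: str, user_id: str) -> bool:
    """Return True if the user falls inside the switch's rollout percentage."""
    percent = SWITCHES.get(switch, 0)  # unknown switches are off
    return bucket_for(switch, user_id) < percent
```

Including the switch name in the hash means a user who lands in the early group for one feature is not automatically in the early group for every other feature.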




Review ALL the Commits




                            phabricator.org


Integration
                         (or as we like to call it)




Integration Requirements




                -        Developers must know when they’ve
                         broken something
                         -   IRC, Email, IM
                -        Support proper reporting
                         -   XUnit, Pylint, Coverage.py
                -        Painless setup
                         -   apt-get install jenkins *

                             https://wiki.jenkins-ci.org/display/JENKINS/Installing+Jenkins+on+Ubuntu


Shortcomings


                -        False positives
                         -   Reporting isn't accurate
                         -   Services fail
                         -   Bad Tests
                -        Test coverage
                         -   Regressions on untested code
                -        Feedback delay
                         -   Integration tests vs Unit tests


Fixing False Positives




                -        Re-run tests several times on a failure


                -        Report continually failing tests


                -        Replace external service tests with a
                         functional test suite
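
Re-running on failure can be sketched as a small wrapper that also records whether a test only passed after retrying, so that flaky tests can be reported rather than silently tolerated. `run_with_retries` and the shape of its result are assumptions made for illustration, not part of any test runner.

```python
from typing import Callable


def run_with_retries(test: Callable[[], None], attempts: int = 3) -> dict:
    """Run a test callable up to `attempts` times.

    The result distinguishes a flaky test (passed after at least one
    failure) from a test that failed on every attempt.
    """
    failures = 0
    for _ in range(attempts):
        try:
            test()
        except Exception:
            # AssertionError and ordinary errors both count as a failed attempt.
            failures += 1
        else:
            return {"passed": True, "failures": failures, "flaky": failures > 0}
    return {"passed": False, "failures": failures, "flaky": False}
```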




Maintaining Coverage




                -        Raise awareness with reporting
                         -   Fail/alert when coverage drops on a build
                -        Commit tests with code
                         -   Coverage against commit diff for
                             untested regressions
                -        Utilize code review
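
Both coverage checks above reduce to simple comparisons once the numbers are available. A minimal sketch, assuming the build already knows the previous and current coverage percentages and has the changed and executed line numbers per file; `coverage_dropped` and `untested_changes` are hypothetical helpers, not Coverage.py functions.

```python
from typing import Dict, Set


def coverage_dropped(previous_pct: float, current_pct: float,
                     tolerance: float = 0.0) -> bool:
    """True if coverage fell by more than `tolerance` percentage points."""
    return (previous_pct - current_pct) > tolerance


def untested_changes(changed: Dict[str, Set[int]],
                     covered: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """Return, per file, the changed lines that no test executed."""
    missing = {}
    for path, lines in changed.items():
        gap = lines - covered.get(path, set())
        if gap:
            missing[path] = gap
    return missing
```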




Speeding Up Tests




                -        Write true unit tests
                         -   vs slower integration tests
                -        Mock external services
                -        Distributed and parallel testing
                         -   Matrix builds
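
Mocking an external service might look like the following, using the standard library's `unittest.mock`. `CommentService` and `describe_thread` are invented for the example and stand in for whatever client and code are under test.

```python
from unittest import mock


class CommentService:
    """Hypothetical client for an external HTTP API."""

    def fetch_count(self, thread_id: int) -> int:
        raise RuntimeError("network access is not available in unit tests")


def describe_thread(service: CommentService, thread_id: int) -> str:
    """Code under test: formats the count returned by the service."""
    count = service.fetch_count(thread_id)
    return f"{count} comment" + ("" if count == 1 else "s")


def test_describe_thread() -> None:
    # Replace the slow, unreliable dependency with a mock of the same shape.
    service = mock.Mock(spec=CommentService)
    service.fetch_count.return_value = 1
    assert describe_thread(service, 42) == "1 comment"
    service.fetch_count.assert_called_once_with(42)
```

Passing `spec=CommentService` makes the mock reject attribute names the real class does not have, which catches tests that drift away from the real interface.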




Reporting




<You> Why is mongodb-1 down?




             <Ops> It’s down? Must have crashed again

Meaningful Metrics




                -        Rate of traffic (not just hits!)
                         -   Business vs system
                -        Response time (database, web)
                -        Exceptions
                -        Social media
                         -   Twitter
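
Response times and exception counts can be collected with very little code. A minimal in-memory sketch; the `timed` helper and the metric names are assumptions, and a real setup would forward these values to a metrics backend instead of keeping them in dictionaries.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)   # metric name -> list of durations in seconds
counters = defaultdict(int)   # metric name -> event count


@contextmanager
def timed(name: str):
    """Record how long the wrapped block takes and count exceptions raised in it."""
    start = time.perf_counter()
    try:
        yield
    except Exception:
        counters[name + ".exceptions"] += 1
        raise
    finally:
        # Runs on success and on failure, so slow failures are measured too.
        timings[name].append(time.perf_counter() - start)
```

Usage would be `with timed("db.query"): ...` around a database call or a request handler.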




Graphite




                         graphite.wikidot.com
                         (Traffic across a cluster of servers)
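
Feeding a graph like this one requires very little: Graphite's plaintext protocol is one line per data point, in the form `path value timestamp`. A minimal sketch; the metric path, the host and the helper names are assumptions, and 2003 is Carbon's default plaintext port.

```python
import socket
import time
from typing import Optional


def graphite_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one data point as 'path value timestamp' followed by a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {timestamp}\n"


def send_metric(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send a formatted line to a Carbon listener over TCP."""
    with socket.create_connection((host, port), timeout=2) as sock:
        sock.sendall(line.encode("ascii"))
```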


Sentry




                         sentry.readthedocs.org


Wrap Up




Getting Started




                -        Package your app
                -        Value code review
                -        Ease deployment; fast rollbacks
                -        Set up automated tests
                -        Gather some easy metrics




Going Further




                -        Build an immune system
                         -   Automate deploys, rollbacks (maybe)
                -        Adjust to your culture
                         -   There is no “right way”
                -        SOA == great success
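
One way to picture an "immune system" is a deploy step that watches a health metric and rolls back by itself when the metric gets worse. A minimal sketch under that assumption; `deploy_with_rollback`, its callables and the 1% threshold are all hypothetical, and a real system would sample the error rate over a period of time instead of once.

```python
from typing import Callable


def deploy_with_rollback(deploy: Callable[[], None],
                         rollback: Callable[[], None],
                         error_rate: Callable[[], float],
                         baseline: float,
                         max_increase: float = 0.01) -> bool:
    """Deploy, then roll back if the error rate rises too far above baseline.

    Returns True if the deploy was kept, False if it was rolled back.
    """
    deploy()
    if error_rate() - baseline > max_increase:
        rollback()
        return False
    return True
```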




DISQUS
                          Questions?




                          psst, we’re hiring
                          disqus.com/jobs

References



                -        Gargoyle (feature switches)
                         https://github.com/disqus/gargoyle

                -        Sentry (log aggregation)
                         https://github.com/dcramer/sentry

                -        Jenkins CI (continuous integration)
                         http://jenkins-ci.org/

                -        Phabricator (code reviews, bug tracking)
                         https://phabricator.org

                -        Graphite (metrics)
                         http://graphite.wikidot.com/




                                              code.disqus.com