Burn Down
 The Silos
  Lindsay Holmwood
DevOps
Case study
Applying with technology
High profile
fundraising website
Strong siloisation
Devs + Ops in different companies
100% uptime
3 Concepts
Consistency
Repeatability
Visibility
Consistency
ensuring
 identical
behaviour
within an
environment
or across multiple
  environments
configuration
management
testing
Puppet
   language
    library
 client/server
package { "apache2":
  ensure => installed
}

service { "apache2":
  enable => true,
  ensure => running
}
class apache2 {
    package { "apache2":
      ensure => installed
    }

    service { "apache2":
      enable => true,
      ensure => running
    }
}
Puppet workflow
 write   apply   debug
class apache2 {
    package { "apache2":
      ensure => installed
    }

    service { "apache2":
      enable => true,
      ensure => running
    }
}
class apache2 {
    package { "apache2":
      ensure => installed
    }

    service { "apache2":
      enable => true,
      ensure => running,
      require => [ Package["apache2"] ]
    }
}
Complex manifests
Proprietary software
Lots of debugging
VMware snapshots
Multiple
deploy environments
Configuration drift
“roles”
define app_server($collectd_destination, $logging_destination) {
  server { $fqdn:
    logging_destination => $logging_destination
  }
  include apache2
  include mysql::client
  collectd::client { $fqdn:
    collection_destination => $collectd_destination
  }

  if $environment == "production" {
    include production::only::module
  }
}
node "app-01.stage.charity.com" {
  app_server { $fqdn:
    collectd_destination => "stats-01.stage.charity.com",
    logging_destination => "log-01.stage.charity.com"
  }
}

node "app-01.production.charity.com" {
  app_server { $fqdn:
    collectd_destination => "stats-01.production.charity.com",
    logging_destination => "log-01.production.charity.com"
  }
}
node /app-\d+\.stage\.charity\.com/ {
  app_server { $fqdn:
    collectd_destination => "stats-01.stage.charity.com",
    logging_destination => "log-01.stage.charity.com"
  }
}

node /app-\d+\.production\.charity\.com/ {
  app_server { $fqdn:
    collectd_destination => "stats-01.production.charity.com",
    logging_destination => "log-01.production.charity.com"
  }
}
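Puppet node patterns use Ruby regular-expression syntax, so the matching can be tried out in plain Ruby. The following is a minimal sketch, not part of the original manifests: `NODE_PATTERNS` and `environment_for` are hypothetical names, and the patterns are anchored here as an extra precaution.

```ruby
# Hypothetical helper: map a hostname to its environment using anchored
# patterns shaped like the node regexes above.
NODE_PATTERNS = {
  "stage"      => /\Aapp-\d+\.stage\.charity\.com\z/,
  "production" => /\Aapp-\d+\.production\.charity\.com\z/
}.freeze

def environment_for(hostname)
  # Hash#find yields [environment, pattern] pairs and returns the first match.
  match = NODE_PATTERNS.find { |_env, pattern| pattern.match?(hostname) }
  match && match.first
end

puts environment_for("app-01.stage.charity.com")       # stage
puts environment_for("app-12.production.charity.com")  # production
p environment_for("db-01.stage.charity.com")           # nil
```

Escaping the dots matters: an unescaped `.` matches any character, so a looser pattern could match hostnames it was never meant to.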
30 minute builds
Consistency
Repeatability
Function of
consistency
automate,
 to remove
human error
increase speed
 by shortening
feedback loops
automated
deployments
configuration
management
Capistrano
Ruby DSL around
SSH-in-a-for-loop
Simple, powerful,
can blow your legs off
Not a substitute for
  configuration
  management
railsless-deploy
   http://bit.ly/i56ra9
Removes Rails-ism
Great for PHP
capistrano-multistage
     (part of capistrano-ext)
          http://bit.ly/i2moIp
# config/deploy.rb

set   :user, "deploy"
set   :application, "charity.com"
set   :keep_releases, 10
set   :deploy_to, "/srv/#{application}"

set :stages, %w(uat staging production)
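To illustrate what a setting like `keep_releases` implies, here is a rough sketch of the pruning arithmetic, assuming release directories are named by timestamp. `releases_to_prune` is a hypothetical helper and the release names are made-up sample data; this is not Capistrano's own implementation.

```ruby
# Hypothetical helper: given release directory names (timestamps), return the
# ones that fall outside the newest `keep` releases.
def releases_to_prune(release_names, keep)
  sorted = release_names.sort            # timestamp names sort chronologically
  return [] if sorted.size <= keep
  sorted.first(sorted.size - keep)
end

# Made-up sample data.
releases = %w(20110301093000 20110302101500 20110303120000 20110304083000)
p releases_to_prune(releases, 3)    # ["20110301093000"]
p releases_to_prune(releases, 10)   # []
```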
# config/deploy/staging.rb

role :app,    "app-01.stage.charity.com",
              "app-02.stage.charity.com"
role :static, "static-01.stage.charity.com"
# config/deploy/production.rb

role :app,    "app-01.prod.charity.com",
              "app-02.prod.charity.com",
              "app-03.prod.charity.com"
role :static, "static-01.prod.charity.com"
cap staging deploy    # deploy to staging
# test
cap production deploy # deploy to production
Capistrano
bootstrap w/ Puppet
class capistrano::user {

    group { "deploy":
      gid => 499
    }

    user { "deploy":
      uid => 499,
      gid => 499,
      home => "/home/deploy",
      shell => "/bin/bash",
      require => Group["deploy"]
    }

    file { "/home/deploy/.ssh/authorized_keys":
      source => "puppet:///modules/capistrano/authorized_keys",
      mode   => "0644",
      owner => "deploy",
      group => "deploy",
      require => [ User["deploy"] ]
    }

}
define capistrano::site {

  include capistrano::user

  file { "/srv/$name":
    ensure => directory,
    owner   => deploy,
    group   => deploy,
    require => [ User["deploy"] ]
  }

  file { "/srv/$name/releases":
    ensure => directory,
    owner   => deploy,
    group   => deploy,
    require => [ File["/srv/$name"] ]
  }

...
...

    file { "/srv/$name/shared/log":
      ensure => directory,
      owner => www-data,
      group => www-data,
    }

    file { "/etc/$name":
      source => "puppet:///modules/capistrano/etc/$name",
      recurse => true,
      mode    => "644",
      owner   => root,
      group   => root
    }

}
define app_server($collectd_destination, $logging_destination) {

    include apache2
    include mysql::client
    capistrano::site { "charity.com": }

}
Deploying to a new
app server is as easy as:
# config/deploy/production.rb

role :app,    "app-01.prod.charity.com",
              "app-02.prod.charity.com",
              "app-03.prod.charity.com"
role :static, "static-01.prod.charity.com"
# config/deploy/production.rb

role :app,    "app-01.prod.charity.com",
              "app-02.prod.charity.com",
              "app-03.prod.charity.com",
              "app-04.prod.charity.com"
role :static, "static-01.prod.charity.com"
or...
# deploy/production.rb

role :app,    "app-01.prod.charity.com",
              "app-02.prod.charity.com",
              "app-03.prod.charity.com",
              "app-04.prod.charity.com"
role :static, "static-01.prod.charity.com"
# deploy/production.rb

role :app, *(1..4).map { |n|
  "app-%02d.prod.charity.com" % n
}

role :static, "static-01.prod.charity.com"
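Generating the host list is easy to check outside Capistrano. A small sketch follows, with `app_hosts` as a hypothetical helper name. Note that the block is attached to `map` with braces: a `do ... end` block would bind to the outermost method call instead, so `map` would never see it.

```ruby
# Hypothetical helper: build zero-padded app server hostnames for a stage.
def app_hosts(count, stage)
  (1..count).map { |n| format("app-%02d.%s.charity.com", n, stage) }
end

p app_hosts(4, "prod")
# ["app-01.prod.charity.com", "app-02.prod.charity.com",
#  "app-03.prod.charity.com", "app-04.prod.charity.com"]
```

In a deploy file the result could be splatted into the role declaration, for example `role :app, *app_hosts(4, "prod")`.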
git-svn mirror
182MB * 20 == PAIN
remote_cache
bad with svn tags
git-svn + cron
fast clones
 commit access
21st century tech
Repeatability
Visibility
one eye on the
     past
one eye on the
    future
metric collection
code changes
monitoring
reports
metric collection
collectd
lightweight
  statistics
 collection
  daemon
platform for
   collecting
time series data
plugin based
network aware
well defined APIs
curl_json
<Plugin curl_json>
  <URL "http://localhost:5984/_stats">
    Instance "httpd"
    <Key "httpd/requests/count">
      Type "http_requests"
    </Key>

    <Key "httpd_status_codes/*/count">
      Type "http_response_codes"
    </Key>
  </URL>
</Plugin>
/metrics
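An application can expose its own counters as JSON for curl_json to poll. The sketch below only builds and inspects such a document; the counter names and values are invented for illustration, `metrics_body` and `lookup` are hypothetical helpers, and serving the body over HTTP is left to whatever framework the application already uses.

```ruby
require "json"

# Hypothetical in-process counters; a real application would update these
# as it serves requests.
COUNTERS = {
  "httpd" => { "requests" => { "count" => 1024 } },
  "httpd_status_codes" => {
    "200" => { "count" => 1000 },
    "500" => { "count" => 24 }
  }
}.freeze

# Render the counters as the JSON body of a /metrics response.
def metrics_body(counters = COUNTERS)
  JSON.generate(counters)
end

# Resolve a slash-separated key path (no wildcards) against a parsed document,
# mirroring how a Key path names a single value.
def lookup(document, path)
  path.split("/").reduce(document) { |node, key| node.fetch(key) }
end

body = metrics_body
puts body
p lookup(JSON.parse(body), "httpd/requests/count")   # 1024
```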
code changes
application
       &

config management
Your new best friend
monitoring
sudo mmm_control show                   # blocks under high IO
echo -en "show\nquit\n" | nc 127.1 9988 # instant

require "socket"

# Query the MMM daemon directly over TCP instead of shelling out to mmm_control.
socket = ::TCPSocket.new("127.0.0.1", 9988)
socket.print("show\nquit\n")
output = socket.read.split("\n")
hosts = output.map do |line|
  parts = line.scan(/nasty regex/).flatten

  { "hostname"            =>   parts[0],
    "address"             =>   parts[1],
    "mode"                =>   parts[2],
    "state"               =>   parts[3],
    "role"                =>   parts[5],
    "role_address"        =>   parts[6] }
end
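The parsing step can be exercised without a live socket. The sketch below assumes each status line looks like `db1(192.168.0.11) master/ONLINE. Roles: writer(192.168.0.100)`; that line shape, the `LINE` pattern, the `parse_status` helper and the sample hosts are all assumptions for illustration, not the regex used in the original code.

```ruby
# Assumed line shape: "host(address) mode/STATE. Roles: role(role_address)"
LINE = %r{
  \A\s*(?<hostname>[^\s(]+)\((?<address>[^)]+)\)\s+
  (?<mode>\w+)/(?<state>\w+)\.\s+Roles:\s*
  (?:(?<role>\w+)\((?<role_address>[^)]+)\))?
}x

# Hypothetical helper: turn status output into an array of hashes,
# skipping any line that does not match the assumed shape.
def parse_status(output)
  output.each_line.map { |line| LINE.match(line) }.compact.map do |m|
    { "hostname"     => m[:hostname],
      "address"      => m[:address],
      "mode"         => m[:mode],
      "state"        => m[:state],
      "role"         => m[:role],
      "role_address" => m[:role_address] }
  end
end

# Made-up sample output.
sample = "  db1(192.168.0.11) master/ONLINE. Roles: writer(192.168.0.100)\n" \
         "  db3(192.168.0.13) slave/REPLICATION_FAIL. Roles: \n"
parse_status(sample).each { |host| p host }
```

Named captures keep the hash construction readable and avoid counting positional groups by hand.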
reports
mk-query-digest
      &

   logrotate
# prerotate

SLOWLOG_FILENAME="/var/log/mysql/mysql-slow.log"
OPTIONS="--report --group-by distill --order-by Query_time:max
--timeline --report-format query_report,profile"
DATE="$(date +%Y-%m-%dT%H:%M:%S%z)"
REPORT_FILENAME="/tmp/$(hostname)-mysql-slow-query-report-$DATE"
mk-query-digest $SLOWLOG_FILENAME $OPTIONS > $REPORT_FILENAME

SUBJECT="MySQL Slow Queries Report for $(hostname) as of $DATE"
RECIPIENTS="developers@charity.com,ops@bulletproof.net"
cat $REPORT_FILENAME | nail -n -E -s "$SUBJECT" "$RECIPIENTS"
Retrospectives
Slave explosion
Background:
MySQL replication
MMM
2 masters + 4 slaves
REPLICATION_FAIL
     on one slave
Down to 3 nodes
Increased cluster load
REPLICATION_DELAY
     on another slave
Down to 2 nodes
Inspection
of REPLICATION_DELAY slave
Swapping like mad
Half the memory allocated
Shutdown,
 upgrade,
   boot
Visibility
metric collection
Consistency
Database connectivity
Soft launch
PHP connection errors
Config parses + loads
Add config dump url
Visibility
curl_json
Redeploy
...
Typo
2 reviewers
of config management
both in ops team
Visibility
code changes
Your new best friend
reviewer diversity
devs should have visibility of ops changes
Data consistency
New release
Database migrations
Release promotion
uat ✓

  stage ✓

production ✗
CREATE TABLE foo
           should have been


CREATE TABLE IF NOT EXISTS foo
Consistency
uat

  stage

production
Repeatability
Consistency
ensuring
 identical
behaviour
within an
environment
or across multiple
  environments
Repeatability
Function of
consistency
automate,
 to remove
human error
increase speed
 by shortening
feedback loops
Visibility
one eye on the
     past
one eye on the
    future
Communicate!
Thank you!
Credits:
http://www.flickr.com/photos/48722974@N07/4682302824/
http://www.flickr.com/photos/acediscovery/3030548744/
http://www.flickr.com/photos/andrew_wertheimer/5268407700/
http://www.flickr.com/photos/azrasta/4528604334/
http://www.flickr.com/photos/boliston/2351083198/
http://www.flickr.com/photos/brianwestcott/1497708345/
http://www.flickr.com/photos/brunogirin/73014722/
http://www.flickr.com/photos/eole/4500783172/
http://www.flickr.com/photos/jacockshaw/1811056252/
http://www.flickr.com/photos/jenny-pics/2719309611/
http://www.flickr.com/photos/ldsykora/2414497811/
http://www.flickr.com/photos/listed_crime/1342164481/
http://www.flickr.com/photos/localsurfer/369116556/
http://www.flickr.com/photos/lyza/4144764381/
http://www.flickr.com/photos/matchew/424026531/
http://www.flickr.com/photos/mrwoodnz/4289893182/
http://www.flickr.com/photos/myprofe/4396178084/
http://www.flickr.com/photos/nnova/4834954885/
http://www.flickr.com/photos/pjern/2150874047/
http://www.flickr.com/photos/rubodewig/5161937181/
http://www.flickr.com/photos/rutty/460520720/
http://www.flickr.com/photos/sarah_lincoln/4740037328/
http://www.flickr.com/photos/shindotv/3835365695/
http://www.flickr.com/photos/thalamus/306881919/
http://www.flickr.com/photos/traviscrawford/323366600/
http://www.flickr.com/photos/webtreatsetc/5303216304/

Burn down the silos! Helping dev and ops gel on high availability websites