Burn down the silos! Helping dev and ops gel on high availability websites

HA websites are where the rubber meets the road - at 200km/h. Traditional separation of dev and ops just doesn't cut it.

Everything is related to everything. Code relies on performant and resilient infrastructure, but highly performant infrastructure will only get a poorly written application so far. Worse still, root cause analysis in HA sites will more often than not identify problems that don't clearly belong to either devs or ops.

There are two options: collaborate or die.

This talk introduces 3 core principles for improving collaboration between operations and development teams: consistency, repeatability, and visibility. Each principle is explored through real-world case studies and the technologies behind them, all of which audience members can start using now. In particular, the talk focuses on:

- fast provisioning of test environments with configuration management
- reliable and repeatable automated deployments
- application and infrastructure visibility with statistics collection, logging, and visualisation

Published in: Technology, Business

  1. Burn Down The Silos - Lindsay Holmwood
  2. DevOps
  3. Case study
  4. Applying with technology
  5. High profile fundraising website
  6. Strong siloisation: Devs + Ops in different companies
  7. 100% uptime
  8. 3 Concepts
  9. Consistency
  10. Repeatability
  11. Visibility
  12. Consistency
  13. ensuring identical behaviour
  14. within an environment
  15. or across multiple environments
  16. configuration management
  17. testing
  18. Puppet: language, library, client/server
  19. package { "apache2":
        ensure => installed
      }
      service { "apache2":
        enable => true,
        ensure => running
      }
  20. class apache2 {
        package { "apache2":
          ensure => installed
        }
        service { "apache2":
          enable => true,
          ensure => running
        }
      }
  21. Puppet workflow: write, apply, debug
  22. class apache2 {
        package { "apache2":
          ensure => installed
        }
        service { "apache2":
          enable => true,
          ensure => running
        }
      }
  23. class apache2 {
        package { "apache2":
          ensure => installed
        }
        service { "apache2":
          enable  => true,
          ensure  => running,
          require => [ Package["apache2"] ]
        }
      }
  24. Complex manifests
  25. Proprietary software
  26. Lots of debugging
  27. VMware snapshots
  28. Multiple deploy environments
  29. Configuration drift
  30. “roles”
  31. define app_server($collectd_destination, $logging_destination) {
        server { $fqdn:
          logging_destination => $logging_destination
        }
        include apache2
        include mysql::client
        collectd::client { $fqdn:
          collection_destination => $collectd_destination
        }
        if $environment == "production" {
          include production::only::module
        }
      }
  32. node "app-01.stage.charity.com" {
        app_server { $fqdn:
          collectd_destination => "stats-01.stage.charity.com",
          logging_destination  => "log-01.stage.charity.com"
        }
      }
      node "app-01.production.charity.com" {
        app_server { $fqdn:
          collectd_destination => "stats-01.production.charity.com",
          logging_destination  => "log-01.production.charity.com"
        }
      }
  33. node /app-\d+\.stage\.charity\.com/ {
        app_server { $fqdn:
          collectd_destination => "stats-01.stage.charity.com",
          logging_destination  => "log-01.stage.charity.com"
        }
      }
      node /app-\d+\.production\.charity\.com/ {
        app_server { $fqdn:
          collectd_destination => "stats-01.production.charity.com",
          logging_destination  => "log-01.production.charity.com"
        }
      }
  34. 30 minute builds
  35. Consistency
  36. Repeatability
  37. Function of consistency
  38. automate, to remove human error
  39. increase speed by shortening feedback loops
  40. automated deployments
  41. configuration management
  42. Capistrano
  43. Ruby DSL around SSH-in-a-for-loop
  44. Simple, powerful, can blow your legs off
  45. Not a substitute for configuration management
  46. railsless-deploy http://bit.ly/i56ra9
  47. Removes Rails-ism
  48. Great for PHP
  49. capistrano-multistage (part of capistrano-ext) http://bit.ly/i2moIp
  50. # config/deploy.rb
      set :user, "deploy"
      set :application, "charity.com"
      set :keep_releases, 10
      set :deploy_to, "/srv/#{application}"
      set :stages, %w(uat staging production)
  51. # config/deploy/stage.rb
      role :app,    "app-01.stage.charity.com",
                    "app-02.stage.charity.com"
      role :static, "static-01.stage.charity.com"
  52. # config/deploy/production.rb
      role :app,    "app-01.prod.charity.com",
                    "app-02.prod.charity.com",
                    "app-03.prod.charity.com"
      role :static, "static-01.prod.charity.com"
  53. cap staging deploy     # deploy to staging
      # test
      cap production deploy  # deploy to production
  54. Capistrano bootstrap w/ Puppet
  55. class capistrano::user {
        group { "deploy":
          gid => 499
        }
        user { "deploy":
          uid     => 499,
          gid     => 499,
          home    => "/home/deploy",
          shell   => "/bin/bash",
          require => Group["deploy"]
        }
        file { "/home/deploy/.ssh/authorized_keys":
          source  => "puppet:///modules/capistrano/authorized_keys",
          mode    => 644,
          owner   => "deploy",
          group   => "deploy",
          require => [ User["deploy"] ]
        }
      }
  56. define capistrano::site {
        include capistrano::user
        file { "/srv/$name":
          ensure  => directory,
          owner   => deploy,
          group   => deploy,
          require => [ User["deploy"] ]
        }
        file { "/srv/$name/releases":
          ensure  => directory,
          owner   => deploy,
          group   => deploy,
          require => [ File["/srv/$name"] ]
        }
        ...
  57.   ...
        file { "/srv/$name/shared/log":
          ensure => directory,
          owner  => www-data,
          group  => www-data,
        }
        file { "/etc/$name":
          source  => "puppet:///modules/capistrano/etc/$name",
          recurse => true,
          mode    => "644",
          owner   => root,
          group   => root
        }
      }
  58. define app_server($collectd_destination, $logging_destination) {
        include apache2
        include mysql::client
        capistrano::site { "charity.com": }
      }
  59. Deploying to a new app server is as easy as:
  60. # config/deploy/production.rb
      role :app,    "app-01.prod.charity.com",
                    "app-02.prod.charity.com",
                    "app-03.prod.charity.com"
      role :static, "static-01.prod.charity.com"
  61. # config/deploy/production.rb
      role :app,    "app-01.prod.charity.com",
                    "app-02.prod.charity.com",
                    "app-03.prod.charity.com",
                    "app-04.prod.charity.com"
      role :static, "static-01.prod.charity.com"
  62. or...
  63. # deploy/production.rb
      role :app,    "app-01.prod.charity.com",
                    "app-02.prod.charity.com",
                    "app-03.prod.charity.com",
                    "app-04.prod.charity.com"
      role :static, "static-01.prod.charity.com"
  64. # deploy/production.rb
      # Braces bind the block to map; a do...end block would bind to role.
      role :app, *(1..4).map { |n| "app-%.2d.prod.charity.com" % n }
      role :static, "static-01.prod.charity.com"
  65. git-svn mirror
  66. 182MB * 20 == PAIN
  67. remote_cache
  68. bad with svn tags
  69. git-svn + cron
  70. fast clones, commit access, 21st century tech
  71. Repeatability
  72. Visibility
  73. one eye on the past
  74. one eye on the future
  75. metric collection
  76. code changes
  77. monitoring
  78. reports
  79. metric collection
  80. collectd
  81. lightweight statistics collection daemon
  82. platform for collecting time series data
  83. plugin based
  84. network aware
  85. well defined APIs
  86. curl_json
  87. <Plugin curl_json>
        <URL "http://localhost:5984/_stats">
          Instance "httpd"
          <Key "httpd/requests/count">
            Type "http_requests"
          </Key>
          <Key "httpd_status_codes/*/count">
            Type "http_response_codes"
          </Key>
        </URL>
      </Plugin>
  88. /metrics
  89. code changes
  90. application & config management
  91. Your new best friend
  92. monitoring
  93. sudo mmm_control show                     # blocks under high IO
      echo -en "show\nquit\n" | nc 127.1 9988   # instant
  94. sudo mmm_control show                     # blocks under high IO
      echo -en "show\nquit\n" | nc 127.1 9988   # instant

      require "socket"

      socket = ::TCPSocket.new("127.0.0.1", 9988)
      socket.print("show\nquit\n")
      output = socket.read.split("\n")
      hosts = output.map do |line|
        parts = line.scan(/nasty regex/).flatten
        { "hostname"     => parts[0],
          "address"      => parts[1],
          "mode"         => parts[2],
          "state"        => parts[3],
          "role"         => parts[5],
          "role_address" => parts[6] }
      end
  95. reports
  96. mk-query-digest & logrotate
  97. # prerotate
      SLOWLOG_FILENAME="/var/log/mysql/mysql-slow.log"
      OPTIONS="--report --group-by distill --order-by Query_time:max --timeline --report-format query_report,profile"
      DATE="$(date +%Y-%m-%dT%H:%M:%S%z)"
      REPORT_FILENAME="/tmp/$(hostname)-mysql-slow-query-report-$DATE"
      mk-query-digest $SLOWLOG_FILENAME $OPTIONS > $REPORT_FILENAME
      SUBJECT="MySQL Slow Queries Report for $(hostname) as of $DATE"
      RECIPIENTS="developers@charity.com,ops@bulletproof.net"
      cat $REPORT_FILENAME | nail -n -E -s "$SUBJECT" "$RECIPIENTS"
  98. Retrospectives
  99. Slave explosion
  100. Background:
  101. Background: MySQL replication
  102. Background: MySQL replication, MMM
  103. Background: MySQL replication, MMM, 2 masters + 4 slaves
  104. REPLICATION_FAIL on one slave
  105. Down to 3 nodes
  106. Increased cluster load
  107. REPLICATION_DELAY on another slave
  108. Down to 2 nodes
  109. Inspection of REPLICATION_DELAY slave
  110. Swapping like mad. Half the memory allocated
  111. Shutdown, upgrade, boot
  112. Visibility
  113. metric collection
  114. Consistency
  115. Database connectivity
  116. Soft launch
  117. PHP connection errors
  118. Config parses + loads
  119. Add config dump url
  120. Visibility
  121. curl_json
  122. Redeploy
  123. ...
  124. Typo
  125. 2 reviewers of config management
  126. both in ops team
  127. Visibility
  128. code changes
  129. Your new best friend
  130. reviewer diversity: devs should have visibility of ops changes
  131. Data consistency
  132. New release
  133. Database migrations
  134. Release promotion
  135. uat, stage, production
  136. uat ✓, stage, production
  137. uat ✓, stage ✓, production
  138. uat ✓, stage ✓, production ✗
  139. uat ✓, stage ✓, production ✗
  140. CREATE TABLE foo should have been CREATE TABLE IF NOT EXISTS foo
  141. Consistency
  142. uat, stage, production
  143. uat, stage, production
  144. uat, stage, production
  145. Repeatability
  146. Consistency
  147. ensuring identical behaviour
  148. within an environment
  149. or across multiple environments
  150. Repeatability
  151. Function of consistency
  152. automate, to remove human error
  153. increase speed by shortening feedback loops
  154. Visibility
  155. one eye on the past
  156. one eye on the future
  157. Communicate!
  158. Thank you! Credits:
       http://www.flickr.com/photos/48722974@N07/4682302824/
       http://www.flickr.com/photos/lyza/4144764381/
       http://www.flickr.com/photos/acediscovery/3030548744/
       http://www.flickr.com/photos/matchew/424026531/
       http://www.flickr.com/photos/andrew_wertheimer/5268407700/
       http://www.flickr.com/photos/mrwoodnz/4289893182/
       http://www.flickr.com/photos/azrasta/4528604334/
       http://www.flickr.com/photos/myprofe/4396178084/
       http://www.flickr.com/photos/boliston/2351083198/
       http://www.flickr.com/photos/nnova/4834954885/
       http://www.flickr.com/photos/brianwestcott/1497708345/
       http://www.flickr.com/photos/pjern/2150874047/
       http://www.flickr.com/photos/brunogirin/73014722/
       http://www.flickr.com/photos/rubodewig/5161937181/
       http://www.flickr.com/photos/eole/4500783172/
       http://www.flickr.com/photos/rutty/460520720/
       http://www.flickr.com/photos/jacockshaw/1811056252/
       http://www.flickr.com/photos/sarah_lincoln/4740037328/
       http://www.flickr.com/photos/jenny-pics/2719309611/
       http://www.flickr.com/photos/shindotv/3835365695/
       http://www.flickr.com/photos/ldsykora/2414497811/
       http://www.flickr.com/photos/thalamus/306881919/
       http://www.flickr.com/photos/listed_crime/1342164481/
       http://www.flickr.com/photos/traviscrawford/323366600/
       http://www.flickr.com/photos/localsurfer/369116556/
       http://www.flickr.com/photos/webtreatsetc/5303216304/
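
Slide 64 generates the production app hostnames instead of listing them by hand. A minimal sketch of that idea in plain Ruby follows. The `numbered_hosts` helper and the stand-in `role` method are assumptions made for illustration so the example runs without Capistrano; they are not part of the talk or of Capistrano itself.

```ruby
# Hypothetical helper, not part of Capistrano: builds zero-padded hostnames
# such as "app-01.prod.charity.com" for each number in the given range.
def numbered_hosts(prefix, domain, range)
  # Braces bind this block to map. A do...end block written directly in
  # a `role :app, *(1..4).map do ... end` call would bind to role instead.
  range.map { |n| format("%s-%02d.%s", prefix, n, domain) }
end

# Stand-in for Capistrano's role, so the sketch runs on its own.
# It only records which hosts were assigned to which role name.
ROLES = Hash.new { |hash, key| hash[key] = [] }

def role(name, *hosts)
  ROLES[name].concat(hosts)
end

role :app, *numbered_hosts("app", "prod.charity.com", 1..4)
role :static, "static-01.prod.charity.com"

ROLES.each { |name, hosts| puts "#{name}: #{hosts.join(', ')}" }
```

Under these assumptions, adding a fifth app server means changing `1..4` to `1..5`, which keeps the role list in step with the `app-\d+` node pattern used on the Puppet side.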
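
Slides 86 to 88 pair collectd's curl_json plugin with a JSON document served by the application. The sketch below imitates, in plain Ruby, how a slash-separated key path with a `*` wildcard (as in the slide 87 config) selects values from nested JSON. The counter values and the `resolve` helper are hypothetical, and this is an illustration of the addressing scheme rather than collectd's own implementation.

```ruby
require "json"

# Hypothetical counters; a real application would report its own state.
stats = {
  "httpd" => { "requests" => { "count" => 1024 } },
  "httpd_status_codes" => {
    "200" => { "count" => 1000 },
    "500" => { "count" => 24 }
  }
}

# The body a metrics endpoint might return.
body = JSON.generate(stats)

# Resolve a list of path segments against a parsed document, expanding "*"
# to every key at that level. Returns a hash of full path => value.
def resolve(node, segments, prefix = [])
  return { prefix.join("/") => node } if segments.empty?
  return {} unless node.is_a?(Hash)

  head, *rest = segments
  keys = head == "*" ? node.keys : [head]
  keys.each_with_object({}) do |key, found|
    found.merge!(resolve(node[key], rest, prefix + [key])) if node.key?(key)
  end
end

document = JSON.parse(body)
requests = resolve(document, "httpd/requests/count".split("/"))
statuses = resolve(document, "httpd_status_codes/*/count".split("/"))

puts requests
puts statuses
```

Keeping the application's JSON shallow and stable makes the matching `<Key>` blocks easy to write and review, which is the kind of shared visibility between devs and ops that the talk argues for.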