Cloud Operations Bootcamp: Culture - Jesse Robbins

Usage Rights

CC Attribution-ShareAlike License

    Presentation Transcript

    • Operations Culture. Speaker: Jesse Robbins, CEO ‣ jesse@opscode.com ‣ @jesserobbins ‣ www.opscode.com
    • Today ‣ Operations is Culture ‣ Failure Happens ‣ The OODA Loop ‣ Do Fire Drills
    • Operations is Culture
    • “You don’t choose the moment, the moment chooses you. You only get to choose how prepared you are when it does.” - Fire Chief Mike Burtch
    • Cloud Operations is the ability to consistently create and deploy reliable software to an unreliable platform that scales horizontally. http://radar.oreilly.com/2007/10/operations-is-a-competitive-ad.html
    • “It’s not my code, it’s your machines!” Spock: little bit weird; sits closer to the boss; thinks too hard. Scotty: pulls levers & turns knobs; easily excited; yells a lot in emergencies. http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr
    • No fingerpointing http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr http://www.flickr.com/photos/rocketjim54/2955889085/ Copyright © 2010 Opscode, Inc - All Rights Reserved
    • Fingerpointyness (over time): problem!!! argggh! → freaking out, not talking, finding fault → blaming, covering ass → whining, hiding, hurt egos → figuring it out → fixing things → fixed http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr
    • Being productive (over time): problem!!! argggh! → figuring it out → fixing things → fixed → feeling guilty → move on with life http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr
    • This will be on the test: FAILURE HAPPENS!
    • Good Book!
    • Catastrophic Potential. Axes: Complexity (Simple to Complex) and Coupling (Tight to Loose). Tight coupling: KEEP OUT!!! Created by Jesse Robbins; "Catastrophic Potential" adapted from Normal Accidents by Charles Perrow
    • define: Nines (roughly, downtime per year): 99% = 5256 min (3.65 days); 99.9% = 526 min (8.8 hours); 99.99% = 53 min; 99.999% = 5 min; 99.9999% = 30 seconds; 99.99999% = 3 seconds
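The downtime figures on this slide follow from multiplying the minutes in a year by the allowed failure fraction. A minimal sketch of that arithmetic, assuming a 365-day year; the constant and function names are illustrative and do not come from the talk:

```python
# Assumption: a non-leap year of 365 days.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for a given availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Print the budget for two through seven nines.
    for pct in (99, 99.9, 99.99, 99.999, 99.9999, 99.99999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct}% -> {minutes:.2f} min/year ({minutes * 60:.1f} seconds)")
```

Each extra nine shrinks the budget by a factor of ten, which is why the slide drops from days to hours to minutes to seconds.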
    • 99.9% * 99.9% * 99.9% = 99.7%
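The multiplication on this slide can be read as the availability of a chain of components that must all be up at once. A short sketch under that assumption (independent failures, every component required); `serial_availability` is a hypothetical helper name, not something from the talk:

```python
from math import prod  # available in Python 3.8+


def serial_availability(components):
    """Combined availability when every component must be up.

    Assumes failures are independent, so the fractions simply multiply.
    """
    return prod(components)


if __name__ == "__main__":
    # Three components at "three nines" each, as on the slide.
    combined = serial_availability([0.999, 0.999, 0.999])
    print(f"combined availability: {combined:.1%}")
```

The more dependencies a service chains together, the lower the combined figure, even when each part looks reliable on its own.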
    • Internet Routing... won’t.
    • http://radar.oreilly.com/2008/10/sprint-blocking-cogent-network.html
    • #googlefail
    • YOU
    • Continuous Power... isn’t
    • 365 Main SF
    • 365 364.96 Main SF
    • http://radar.oreilly.com/2007/07/failure-happens-a-summary-of-t.html
    • Failure happens. A single datacenter is the problem, since they all fail at some point. Recovery procedures after failure: power was gone ~45 minutes; most services took hours to come back; some unnamed ones took more than 12 hours.
    • Geography is a Single Point of Failure
    • Providers are baskets too.
    • Failure Happens. Anyone promising otherwise is either foolish or lying (or both).
    • OODA: Observe, Orient, Decide, Act http://en.wikipedia.org/wiki/OODA_loop
    • http://www.slideshare.net/jallspaw/10-deploys-per-day-dev-and-ops-cooperation-at-flickr http://www.flickr.com/photos/dnorman/2678090600
    • Speaker: Jesse Robbins, CEO ‣ jesse@opscode.com ‣ @jesserobbins ‣ www.opscode.com