6. Jesse Robbins CloudCamp 5minute Presentation


    Notes on slide 1

    Firefighters are usually considered to be about 75% paranoid and about 25% pyromaniac.

    Which means this sort of thing made perfect sense to me at the time.

    The 365 Main site does not have a typical battery backup system. Instead it relies on Continuous Power Supplies (CPS), which use a flywheel-driven alternator to generate electricity. The flywheel is connected to both a large diesel motor and an electric motor that runs on utility power. The flywheel is normally turned by the electric motor, and stores enough kinetic energy to power the alternator for up to 15 seconds. When utility power fails, the diesel motor is supposed to start in under 5 seconds, well before the flywheel's kinetic energy is exhausted, providing uninterrupted electrical power.

    The advantage of a CPS over a battery-based system is that the power going to the datacenter is decoupled from the utility power. This eliminates the complex electrical switching required by most battery-based systems, making many CPS systems simpler and sometimes more reliable.

    In this incident, latent defects caused three generators to fail during start-up. No customers were affected until a fourth generator failed 30 seconds later, which overloaded the surviving backup system and caused power failures to 3 of 8 customer areas.

    What's most interesting is that the redundant design of the system is what caused it to fail so completely. The failure of the fourth generator should have brought down only one area instead of three. This kind of cascade failure is common in complex and tightly coupled systems. In my experience, these sorts of failure modes are often identified and then promptly dismissed as being "nearly impossible". Unfortunately, the impossible often becomes reality.

    To put it another way... Failure Happens.

    Hurricane Katrina landed, and like many people I wanted to help.


    6. Jesse Robbins CloudCamp 5minute Presentation - Presentation Transcript

    1. Failure Happens F***, the F***ing thing is F***king F***ed jesse@oreilly.com
    2. This will be on the test: FAILURE HAPPENS! FAILURE HAPPENS! FAILURE HAPPENS!
    3. Pyromaniac Paranoid
    4. Good Book!
    5. “multiple and unexpected interactions of failures are inevitable” -Charles Perrow
    6. Failure Happens
    7. define: Nines (roughly)
        • 99%: 5256 min (3.5 days)
        • 99.9%: 528 min (8.8 hours)
        • 99.99%: 53 min
        • 99.999%: 5 min
        • 99.9999%: 30 seconds
        • 99.99999%: 3 seconds
    8. Internet Routing... won’t.
    9. #googlefail
    10. YOU
    11. Continuous Power... isn’t
    12. 365 Main SF
    13. 365 364.96 Main SF
    14. Failure happens
        • A single datacenter is the problem
          • Since they all fail at some point
        • Recovery procedures after failure
          • Power was gone ~45 minutes
          • Most services took hours to come back
          • Some unnamed ones more than 12 hours
        • Communication
          • All DNS servers in the same datacenter!
    15. Truck 1, Rackspace 0
    16. Geography is a Single Point of Failure
    17. Taser-wielding robbers: C I Hosts' Chicago facility robbed twice! (the other two times were merely "break-ins where things were sto )
    18. Providers are baskets too.
    19. Failure Happens. Anyone promising otherwise is either foolish or lying (or both).
    20. Go Here! June 22-24, 2009 Jesse Robbins . jesse@oreilly.com
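
The "nines" on slide 7 come from simple arithmetic: the yearly downtime budget is one year multiplied by the unavailable fraction. The following is a minimal Python sketch of that calculation, not something from the talk. The function names and the 365-day year are assumptions made for illustration. Exact values differ slightly from the slide's rounded ones, which the slide itself labels "roughly".

```python
# Illustrative sketch: names and the 365-day year are assumptions, not from the talk.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years ignored


def downtime_seconds_per_year(nines: int) -> float:
    """Yearly downtime budget for an availability of `nines` nines (2 means 99%)."""
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    return SECONDS_PER_YEAR / 10 ** nines


def humanize(seconds: float) -> str:
    """Render a duration in the largest unit that keeps the value at or above 1."""
    for unit, size in (("days", 86400), ("hours", 3600), ("minutes", 60)):
        if seconds >= size:
            return f"{seconds / size:.2f} {unit}"
    return f"{seconds:.1f} seconds"


# Print the budget for two through seven nines, matching the range on the slide.
for n in range(2, 8):
    availability = 100 * (1 - 10 ** -n)
    print(f"{availability:.{n - 2}f}% -> {humanize(downtime_seconds_per_year(n))}")
```

Under these assumptions, 99% works out to 3.65 days per year and 99.9% to 8.76 hours, so each extra nine cuts the allowed downtime by a factor of ten.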
    Lightning Talk Presentation at CloudCamp @ Interop


    © All Rights Reserved
