SPOF - Single "Person" of Failure

A talk from the DevOps Days Silicon Valley 2015 conference that describes the signs of having, or being, a single-point-of-failure expert on your system, and ways to solve the problem.

  1. Single Point of Failure… Expert. Sasha Rosenbaum, @DivineOps
  2. Who am I? Sasha Rosenbaum. Azure & DevOps consultant at 10th Magnitude for 4 years. Co-organizer of the DevOps Days Chicago conference and the Chicago Azure meetup.
  3. What is a Single Point of Failure?
  4. A single point of failure (SPOF) is a part of a system that, if it fails, will stop the entire system from working.
  5. High Availability: achieving redundancy by removing single points of failure; having reliable cross-over capabilities to switch between components; detecting failures as they occur, so that cross-over can be initiated.
  6. This is complicated
  7. Architecting for HA
  8. How is the entire system down?
  9. We forgot a dependency!
  10. Oh…
  11. Just imagine buying a server that has an uptime of roughly 16 hours a day, with interruptions; is the single one of its kind; and cannot be replicated!
  12. Humans are NOT highly available
  13. How did we get here? Lack of budget. Lack of people. Human nature.
  14. How to recognize that you have a problem?
  15. 1
  16. Keys to the Kingdom
  17. TO MY PRODUCTION SERVER
  18. Even when the systems are automated, there are still humans who manage them
  19. Why is there a single admin? The situation evolved organically from having a small team, or someone took over deliberately.
  20. Role Based Access: grant access based on a role/group; admin group size > 1; service accounts.
  21. Make sure that the person on call has the necessary access to fix the problem
  22. TRUST YOUR PEOPLE!!!
  23. 2
  24. Beware of the Expert!
  25. “This will take 15 minutes to fix and 8 hours to explain”
  26. We cannot afford the loss of productivity!
  27. Can you afford losing this knowledge?
  28. Delegate to Juniors
  29. Juniors are wonderful people. They ask tough questions.
  30. Your new hires haven’t yet caught the “This is how it’s always been” virus
  31. You are emotionally invested in your code. It is hard not to get protective of it.
  32. Documentation: documents, readme, comments, tests, automation, features.
  33. 3
  34. “I cannot afford to take vacation!”
  35. Job security?
  36. Productivity?
  37. Hours / Productivity
  38. Research shows that working longer hours DOES NOT increase productivity
  39. You need rest to be at your best!
  40. Cell phones are the single worst thing that happened to people AND businesses in the last century
  41. If people were actually unreachable, we would find a more reliable way to solve problems
  42. Mandatory Vacation
  43. Game Days
  44. Say NO to having a Single PERSON of Failure ;-)
  45. Great job, DoD Silicon Valley!
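
The role-based access advice on slides 20 and 21 (grant access by role, keep the admin group larger than one, make sure the on-call person can fix the problem) could be checked automatically. The following is only a minimal sketch under stated assumptions: the `audit_access` helper, the group names, and the user names are hypothetical and do not come from the talk, and a real check would read group membership from your identity provider instead of a hard-coded dictionary.

```python
from typing import Dict, List, Set


def audit_access(groups: Dict[str, Set[str]], on_call: str,
                 required_roles: List[str]) -> List[str]:
    """Return human-readable findings; an empty list means no problem was found."""
    findings = []
    for role in required_roles:
        members = groups.get(role, set())
        # Slide 20: the admin group size should be greater than 1.
        if len(members) < 2:
            findings.append(
                f"{role}: only {len(members)} member(s), single person of failure"
            )
        # Slide 21: the person on call needs the access required to fix things.
        if on_call not in members:
            findings.append(f"{role}: on-call person {on_call} lacks access")
    return findings


# Hypothetical data for illustration only.
example_groups = {
    "prod-admins": {"alice"},
    "db-admins": {"alice", "bob"},
}

for finding in audit_access(example_groups, on_call="bob",
                            required_roles=["prod-admins", "db-admins"]):
    print(finding)
```

Running a check like this on a schedule, or before each on-call handover, is one way to notice a single-admin situation before it evolves "organically".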