Fail Fast, Fail Often


Software projects were historically managed on a bet-the-farm model. They succeeded or they failed. And when they failed (as big software projects often did), the consequences were typically dire, not only for organizations as a whole but also for many of the individuals involved. Today, by contrast, many software development projects have evolved toward a much more incremental, iterative, and experimental process that takes cues from the open source model, which excuses (and even rewards) certain types of failure.

In this session, we’ll discuss how failure can be turned into a positive. This includes the organizational dynamics associated with tolerating uncertain outcomes, the need to define acceptable failure parameters, and the technical means by which experimentation can be automated in ways that amplify the positive while minimizing the effect of negative outcomes.

Published in: Technology

  1. 1. FAIL FAST, FAIL OFTEN Gordon Haff @ghaff, Technology Evangelist William Henry @ipbabble, DevOps Strategy Lead 13 July 2016
  2. 2. FAILURE
  3. 3. FAILURE
  4. 4. FAILURE
  5. 5. ALSO FAILURE
  8. 8. DON’T FAIL
  9. 9. DON’T FAIL
  10. 10. FAIL WELL
  11. 11. Experiment by Peter Skillman, former VP of design at Palm
  12. 12. WHAT HE LEARNED • Kindergarteners do not spend 15 minutes in a bunch of status transactions trying to figure out who is going to be CEO of Spaghetti Corporation. • They don’t sit around talking about the problem. They just start building to determine what works and what doesn’t.
  14. 14. FIVE PRINCIPLES: THE RIGHT • scope • approach • workflow • incentives • culture
  15. 15. THE RIGHT SCOPE Constrain the impact of failure • Enable experimentation • Stop cascading of failures • Make deployments incremental, frequent, and routine events • Generally decouple activities and decisions from each other • Small, autonomous, bounded context services
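
One common way to "stop cascading of failures" between services is a circuit breaker. The slides do not name this pattern, so the following is only a minimal sketch under that assumption; `CircuitBreaker`, `CircuitOpenError`, and the thresholds are hypothetical names and values chosen for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without reaching the dependency."""


class CircuitBreaker:
    """Hypothetical helper: stop calling a dependency that keeps failing."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self.clock = clock                # injectable so tests need not sleep
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast locally instead of piling load on a sick service.
                raise CircuitOpenError("dependency is failing; call skipped")
            # Cool-down elapsed: let one trial call through. A single
            # failure of that trial call reopens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_profile, user_id)` with a hypothetical `fetch_profile`, and treat `CircuitOpenError` as a signal to degrade gracefully rather than retry.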
  16. 16. SMALL • “Two pizza teams” • Well-defined functional units • Organized around business capabilities (Conway's Law)
  17. 17. AUTONOMOUS • Implementation changes can happen independently of other services • Data and functionality exposed only through service calls over the network • Designed to be externalizable • No back-doors
  18. 18. THE RIGHT APPROACH Continuously experiment, iterate, and improve • It’s about the process • Identify mistakes early • Establish safety nets • Fail and move on
  19. 19. THE PROCESS Involves people and communication • The most effective processes have continuous communication (think scrums and kanban) • Allows for collaboration that can identify failures before they happen • Allows for feedback to continuously improve and cultivate growth • Provides transparency
  20. 20. DEV LESSONS: BREAKING CODE VIOLENTLY Build in violent failures to highlight issues • C/C++ lessons: • Sanity check using assertions • Invariant checks • If ever I’m here in the code and these conditions aren’t met, then I have no business being here. Something is wrong and I should fail violently. • Involves tracing through the failure
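
The slide draws its lessons from C/C++, but the same idea of sanity checks and invariant checks carries over to other languages. Below is a minimal Python sketch; the `Account` class and its rules are hypothetical and exist only to show a check that fails loudly at the point where the conditions are not met. It raises `AssertionError` explicitly because plain `assert` statements are skipped when Python runs with `-O`.

```python
class Account:
    """Hypothetical example type; not taken from the talk."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self) -> None:
        # Invariant: the balance must never be negative.
        if self.balance < 0:
            raise AssertionError(f"invariant violated: balance={self.balance}")

    def withdraw(self, amount: int) -> None:
        # Sanity check on the input: callers must pass a positive amount.
        if amount <= 0:
            raise AssertionError(f"bad amount: {amount}")
        self.balance -= amount
        # Fail right here, with a traceback pointing at the faulty call,
        # instead of letting a bad state spread through the program.
        self._check_invariant()
```

Calling `Account(10).withdraw(100)` stops immediately with a traceback, which is the "tracing through the failure" step the slide mentions.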
  21. 21. AUTOMATED REGRESSION TESTING • As products and services evolved, we discovered that maintaining tests and incrementally adding new ones became valuable • These tests were/are most often based on experienced failures and bugs • Scripts were developed to run nightly builds against various developer changes to test for regression • Testing tools evolved, both proprietary and open source
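
A regression test based on an experienced bug might look like the sketch below. The function `parse_version` and both "past bugs" are invented for illustration; the point is only that each test pins down one failure that has already happened so that a nightly run catches its return.

```python
import unittest


def parse_version(text: str) -> tuple:
    """Hypothetical function under test: turn '1.10.2' into (1, 10, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


class VersionRegressionTests(unittest.TestCase):
    def test_numeric_not_lexical_ordering(self):
        # Imagined past bug: versions were compared as strings,
        # so '1.10' sorted before '1.9'.
        self.assertGreater(parse_version("1.10"), parse_version("1.9"))

    def test_surrounding_whitespace(self):
        # Imagined past bug: a trailing newline from a file read
        # broke parsing.
        self.assertEqual(parse_version("2.0\n"), (2, 0))
```

Saved as a test module, this can be run from a nightly script with `python -m unittest`, which exits with a non-zero status if any test fails.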
  22. 22. OPS LESSONS: CHAOS MONKEY Test robustness of recovery using failure • Platform should provide uninterrupted services to the customer • Therefore: • Should always recover in an acceptable amount of time • We should have random failures to ensure that changes have not regressed or caused new recovery problems
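
The idea of injecting random failures and then checking recovery can be illustrated with a toy simulation. This is not the Chaos Monkey tool itself: `Cluster`, `chaos_round`, and `run_chaos` are hypothetical names, the "platform" is an in-memory set of instance ids, and recovery time is not modeled, only whether recovery happens.

```python
import random


class Cluster:
    """Toy stand-in for a platform that keeps a desired number of instances."""

    def __init__(self, size: int) -> None:
        self.desired = size
        self.alive = set(range(size))
        self._next_id = size

    def kill(self, instance: int) -> None:
        self.alive.discard(instance)

    def reconcile(self) -> None:
        # Recovery loop: start replacements until the desired count is met.
        while len(self.alive) < self.desired:
            self.alive.add(self._next_id)
            self._next_id += 1

    def healthy(self) -> bool:
        return len(self.alive) == self.desired


def chaos_round(cluster: Cluster, rng: random.Random) -> int:
    """Kill one random instance, let the cluster recover, return the victim."""
    victim = rng.choice(sorted(cluster.alive))
    cluster.kill(victim)
    cluster.reconcile()
    return victim


def run_chaos(rounds: int, seed: int = 0) -> bool:
    """Return True if the cluster was healthy after every injected failure."""
    rng = random.Random(seed)  # seeded so a failing run can be replayed
    cluster = Cluster(size=3)
    for _ in range(rounds):
        chaos_round(cluster, rng)
        if not cluster.healthy():
            return False  # recovery has regressed
    return True
```

In a real environment the kill and health check would act on actual infrastructure, and the check would also enforce the acceptable recovery time the slide calls for.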
  23. 23. THE RIGHT WORKFLOW Repeatably automate for consistency • Goal is repeatable automation • Toyota’s yellow cord • Initially pipelines may be very different • Different tools • Traditional vs. “cloud native” • It’s a journey • Consolidation evolves naturally
  24. 24. DESIRABLE ENTERPRISE CI/CD WORKFLOW myRepo Project Repo CI Commit Push Pass/Fail Local Test Build Repo CD Release Repo Monitor Build Test Review/ Appr Deliver Deploy 3rd Party
  25. 25. CI/CD PIPELINE TOOLSET CI/CD Workflow UI gerrit
  26. 26. OPS LESSONS: RED/GREEN Configuration as code has built in failure Continuous Integration / Continuous Deployment Image & Package & Metadata Repository src repo Dev./Build QA Production in OHC Events
  27. 27. THE RIGHT INCENTIVES Align rewards and behavior with desirable outcomes • Incentives (advancement, money, recognition) need to reward trust, cooperation, and innovation • Peer reward systems are also valuable • Individuals have control over their own success • But people still have responsibility for their actions
  28. 28. THE RIGHT CULTURE Build systems and organizations that allow for failing well • Transparency • Even good decisions can have bad outcomes • Innovation is inherently risky • Cut losses (avoid sunk cost fallacy) • This is why open source is so successful!
  29. 29.
  30. 30. BUT CULTURE ISN’T SOMETHING YOU JUST CHANGE • Lack of agreed-to model of what “right” culture looks like • Different organizations require different behaviors • Culture change is difficult to measure and quantify • Culture is very hard to impose • Culture is an output, not an input
  31. 31. CULTURE IS: • emergent • pervasive • the keystone
  32. 32. THANK YOU
  33. 33. CREDITS Tacoma Narrows Bridge: Barney Elliott; The Camera Shop - Screenshot taken from 16MM Kodachrome motion picture film by Barney Elliott. Time cover: Time, Inc. Wipeout, Flickr/CC: Marshmallow challenge: Linux Collaboration Summit: Linux Foundation. Two pizzas: Flickr/CC Frog: Kathy CC/Flickr Square peg Flickr/CC: