Building a Successful Organization By Mastering Failure

25,535 views

The Etsy organization has grown significantly over the last five years. As a company grows, more thought must be put into the techniques it uses to communicate about and deal with failures. This talk will cover several techniques that have helped foster a Just Culture, one in which an effort is made to balance both safety and accountability.

Published in: Business, Engineering, Technology

  1. Building A Successful Organization By Mastering Failure John Goulah (@johngoulah) Etsy
  2. Marketplace • $1.93B Annual GMS 2014 • 1.4M active sellers • 20M+ active buyers • 30% international GMS • 57%+ mobile visits
  3. Infrastructure • over 5500 MySQL databases • 750K graphite metrics/min • 1.3GB logs written/min • 50M - 75M gearman jobs / day • 30-50 deploys / day
  4. Company • Headquartered in Brooklyn • Over 700 employees • 7 offices around the world • 80+ dogs / 80+ cats
  5. Values
  6. Learning Org a company that facilitates the learning of its members and continuously transforms itself
  7. Five Disciplines
  8. Systems Thinking process of understanding how people, structure, and processes influence one another within a larger system
  9. Personal Mastery an individual holds great importance in a learning organization
  10. Mental Models the assumptions held by individuals and organizations
  11. Shared Vision creates a common identity that provides focus and energy for learning
  12. Team Learning the problem solving capacity of the organization is improved through better access to knowledge and expertise
  13. Learning About Failure • architecture reviews • operability reviews • blameless post mortems
  14. failure and success come from the same source
  15. context
  16. can study the system at any time
  17. inflection points • architecture reviews • early feedback and discussion • operability reviews • held before launching • blameless post mortems • held after a failure
  18. Architecture Reviews
  19. Architecture Reviews understand the costs and benefits of a proposed solution, and discuss alternatives
  20. Etsy Tech Axioms • we use a small number of well known tools • all technology decisions come with trade-offs • with new technology, many of those trade-offs are unknown • we’re growing. things change
  21. with new technology, many of those trade-offs are unknown
  22. Departures a departure is when new technologies or patterns are introduced that deviate from the current known methods of operating the system and maintaining the software
  23. How do I know I need an architecture review? when there is a perceived departure from current technology choices or patterns
  24. How early do you hold them? early enough to be able to bail out or make major course corrections
  25. Who should come? • the people presenting the change • key stakeholders (sr. engineers, or arch review working group) • everyone else that wants to learn about the proposed changes to the system
  26. Architecture Review Meeting Format
  27. Preparation • a proposal is written in a shared document and circulated • comments are added, discussed, and potentially resolved in advance • initial questions for the meeting are collected in a tool such as google moderator
  28. Some General Questions • Do we understand the costs of this departure? • Have we asked hard questions about trade-offs? • What will this prohibit us from doing in the future?
  29. Some General Questions (cont) • Are we impacting visibility, measurability, debuggability and other operability concerns? • Are we impacting testability, security, translatability, performance and other product quality concerns? • Does it make sense?
  30. The Arch Review • proposal is presented to the group • discuss questions and concerns • decide if we are moving forward or need further discussion
  31. you're saying my project might not move forward?
  32. Why might this end a project? • we learned through this discussion that an alternative is better • we find goals overlap with other projects that are in progress • we discover that it isn't worth the costs now that we have a better idea what they are
  33. At the end we should have • detailed notes from the conversation • documented agreement on the tricky components • a compilation of learnings and questions • a decision on whether to keep going with the project, stop and rethink, or gather more information
  34. Operability Reviews
  35. Operability Reviews understand how the system could break, how we will know, and how we will react
  36. When do we do operability reviews? • after architecture reviews in the product lifecycle, generally right before launch • when we need to gain increased confidence for launch due to the technology, product, or communication choices being risky • if there's a chance you'd surprise teams that operate the software
  37. Who comes to the operability review? representatives from: • Product • Development • Operations • Community/Support • QA
  38. Some Questions • Has the feature been tested enough to deploy to production? • Does everyone know when it will go live, and who will push the feature? • Is there communication about the feature ready to go out with the feature? • Is it possible to turn up this feature on a percentage basis, dark launch, or gameday it?
  39. Some Questions (cont) • Does the launch involve any new production infrastructure? • If so, are those pieces in monitoring or metrics collection? • If so, is there a deployment pipeline in place? • If so, is there a development environment set up to make it work in dev? • If so, are there tests that can be and are run on CI?
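
One of the operability questions above asks whether a feature can be turned up on a percentage basis. The deck does not show how Etsy implements this, so what follows is only a minimal sketch of one common approach, deterministic hash bucketing. The function name `in_rollout`, the feature name `new_checkout`, and the 10,000-bucket granularity are all hypothetical choices made for illustration.

```python
import hashlib


def in_rollout(feature: str, user_id: str, percent: float) -> bool:
    """Return True if this user falls inside the first `percent` of buckets.

    Hashing feature + user keeps each user's answer stable between requests
    and makes ramp-ups monotonic: a user enabled at 5% stays enabled at 25%.
    (Assumed design for illustration, not taken from the talk.)
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # buckets 0..9999
    return bucket < percent * 100


if __name__ == "__main__":
    # Hypothetical user ids, used only to show the ramp-up behaviour.
    users = [f"user-{n}" for n in range(10_000)]
    for percent in (0, 1, 25, 100):
        enabled = sum(in_rollout("new_checkout", u, percent) for u in users)
        print(f"{percent:>3}% setting -> {enabled} of {len(users)} users enabled")
```

Including the feature name in the hash means different features get different user subsets, so the same users are not always the first to see every launch.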
  40. Contingency Checklist
  41. Contingency Checklist a list of things that could possibly go "wrong" with a new feature, and what we could do about it
  42. Issue What could possibly go wrong with the feature launched in production?
  43. Likelihood What is the likelihood of each item going wrong?
  44. Comments Any comments about the item?
  45. Impact This is just a measure of how impactful this will be if it does actually turn out to be a concern.
  46. Engineering What do we do to mitigate the issue with the item (e.g. can we gracefully degrade)?
  47. Onsite Messaging What is the messaging to the user in the forums, blog, and social media if this needs graceful degradation?
  48. PR Is PR needed for the contingency (e.g. a larger-scale failure)?
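
The contingency checklist columns described in slides 42 to 48 (Issue, Likelihood, Comments, Impact, Engineering, Onsite Messaging, PR) map naturally onto a small record type. The sketch below is one hypothetical way to hold such a checklist in code. The low/medium/high scale, the `by_priority` ordering, and the example rows are assumptions for illustration. The slides do not say how likelihood or impact are scored.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Level(IntEnum):
    """Assumed three-step scale; the slides do not define one."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Contingency:
    issue: str              # what could go wrong in production
    likelihood: Level       # how likely it is to go wrong
    impact: Level           # how much it matters if it does
    engineering: str        # mitigation, e.g. graceful degradation
    onsite_messaging: str   # what users are told (forums, blog, social media)
    pr_needed: bool = False # whether PR is needed for a larger-scale failure
    comments: str = ""


def by_priority(items: List[Contingency]) -> List[Contingency]:
    """Sort so the likeliest, most impactful items are discussed first.

    Ranking by likelihood * impact is an assumption, not from the talk.
    """
    return sorted(items, key=lambda c: (c.likelihood * c.impact, c.impact), reverse=True)


if __name__ == "__main__":
    # Hypothetical rows for an imaginary feature launch.
    checklist = [
        Contingency("search index falls behind", Level.MEDIUM, Level.LOW,
                    "serve slightly stale results", "none needed"),
        Contingency("new service overloads the database", Level.LOW, Level.HIGH,
                    "disable the feature via config", "status page notice",
                    pr_needed=True),
    ]
    for item in by_priority(checklist):
        print(f"{item.issue}: likelihood={item.likelihood.name}, impact={item.impact.name}")
```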
  49. Blameless Post Mortems
  50. What is a post mortem? a postmortem is a facilitated meeting during which people involved in, interested in, or close to an accident or incident debrief together on how we think the event came about
  51. What does it cover? • walking through a timeline of events • learning how things are expected to work "normally", adding the context of everyone’s perspective • exploring what we might do to improve things for the future
  52. Local Rationality we want to know how it made sense for someone to do what they did at the time
  53. searching for second stories instead of human error • asking "why" leads to "who is responsible" • asking "how" leads to "what"
  54. Avoiding Human Error Human error points directly to individuals in a complex system. But, in complex systems, system behaviour is driven fundamentally by the goals of the system and the system structure. People just provide the flexibility to make it work.
  55. Avoiding Human Error (cont) Human error implies deviation from "normal" or "ideal", but in complex situations and tasks there is often no normal or ideal that can be precisely and exactly described; many variable, interconnected touchpoints influence the decisions that are made
  56. Recognizing Human Error • be aware of other terms for it: slip, lapse, distraction, mistake, deviation, carelessness, malpractice, recklessness, violation, misjudgement, etc • don’t point to individuals when you really want to understand the system itself and the work • how do you feel when something goes wrong? • is it to find who did it / who screwed up, or to find how it happened?
  57. Other Things to Avoid
  58. Root Cause • it leads to a simplistic and linear explanation of how events transpired • linear mental models of causality don’t capture what is needed to improve the safety of a system • ignores the complexity of an event, which is what should be explored if we are going to learn • leads directly to blaming things on human error
  59. Nietzschean anxiety when situations appear both threatening and ambiguous we seem to demand a clear causal agency; because if we cannot establish this agency then the "problem" is potentially irresolvable
  60. Hindsight Bias inclination, after an event has occurred, to see the event as having been predictable, despite there having been little or no objective basis for predicting it
  61. Counterfactuals the human tendency to create possible alternatives to life events that have already occurred; something that is contrary to what actually happened
  62. Morgue https://github.com/etsy/morgue
  63. Post Mortem Meeting Format
  64. Meeting Format • Timeline • Discussion • Remediation Items
  65. Timeline • a rough timeline scaffolding is required • talk about facts that were known at the time, even if hindsight reveals misunderstandings in what we knew • look out for knowledge that some people were aware of, that others were not, and dig into that • no judgement about actions or knowledge (counterfactuals) • tell people to hold that thought if they jump to remediation items at this point
  66. Timeline (cont) • continually ask "What are we missing?" until those involved feel it's complete • continually ask "Does everyone agree this is the order in which events took place?" • make sure to include important times for events that happened (alerts, discoveries) • reach a consensus on the timeline and move on to the discussion
  67. Discussion • When an action or decision was taken in the timeline, ask the person: "Think back to what you knew at the time. Why did that action make sense to you then?" • Did we clean up anything after we were stable, and how long did it take? • Was there any troubleshooting fatigue?
  68. Discussion (cont) • Did we do a good job with communication (site status, support, forums, etc)? • Were all tools on hand and working, ready to use when we needed them during the issue? Were there tools we would have liked to have? • Did we have enough metrics visibility to diagnose the issue? • Was there collaborative and thoughtful communication during the issue?
  69. Remediation • Remediation items should have tickets associated with them to follow up on • There can be further post meeting discussion on these but tasks should not linger
  70. Remediation questions • What things could we do to prevent this exact thing from happening in the future? • What things could we do to make troubleshooting similar incidents in the future easier?
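
The meeting format in slides 64 to 70 (a timeline of facts, then discussion, then remediation items that each carry a ticket) can be captured in a small record. The sketch below is a hypothetical structure for illustration only and is not the data model of the Morgue tool linked in slide 62. All class names, field names, and example values are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class TimelineEvent:
    at: datetime
    what: str             # a fact known at the time, stated without judgement
    kind: str = "action"  # assumed labels, e.g. "alert", "discovery", "action"


@dataclass
class RemediationItem:
    summary: str
    ticket: Optional[str] = None  # follow-up ticket id, once one is filed


@dataclass
class PostMortem:
    title: str
    timeline: List[TimelineEvent] = field(default_factory=list)
    remediation: List[RemediationItem] = field(default_factory=list)

    def ordered_timeline(self) -> List[TimelineEvent]:
        """Events in the order they happened, for the group to confirm."""
        return sorted(self.timeline, key=lambda e: e.at)

    def untracked(self) -> List[RemediationItem]:
        """Remediation items that still lack a ticket and risk lingering."""
        return [r for r in self.remediation if not r.ticket]


if __name__ == "__main__":
    # Hypothetical incident, used only to exercise the structure.
    pm = PostMortem("checkout latency spike")
    pm.timeline.append(TimelineEvent(datetime(2015, 1, 1, 14, 12), "config change rolled back"))
    pm.timeline.append(TimelineEvent(datetime(2015, 1, 1, 14, 5), "latency alert fired", kind="alert"))
    pm.remediation.append(RemediationItem("add dashboard for queue depth", ticket="OPS-1"))
    pm.remediation.append(RemediationItem("document rollback steps"))
    for event in pm.ordered_timeline():
        print(event.at.isoformat(), event.kind, event.what)
    print("needs a ticket:", [r.summary for r in pm.untracked()])
```

Keeping `untracked` as an explicit query reflects the slide's point that remediation tasks should not linger without a ticket to follow up on.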
  71. In Summary
  72. We Can Learn Before and After Failure
  73. Before • Architecture reviews for new technology • Operability reviews to gain launch confidence
  74. After • Postmortems are done soon after a failure • avoid human error, counterfactuals, hindsight bias, and root cause
  75. Questions? John Goulah (@johngoulah) Etsy
