Off-Hours Critical Issue Escalation

What do you do when a critical bug crops up in your product...after 5pm? Or on the weekend? Do you panic? Call everyone in the company? Or do you have a plan in place?

This is UserVoice's recently implemented escalation plan. It's not perfect, and it will undoubtedly change, but hopefully it can be a useful template for you.

Published in: Business, Technology

  1. 1. Critical Issue Escalation: Our Process. Evan Hamilton, Head of Community, UserVoice
  2. 2. OMG EVERYTHING IS BROKEN
  7. 7. Why do we need a process here?
      • Our customers need a working product (or they’ll leave)
      • When things go wrong without a plan, chaos ensues
      • We don’t work 24/7
      • We don’t want to wake everyone up every time there is an issue
  10. 10. So what is a critical issue?
      Work hours:
      • Interrupting core functionality (Ex: settings not saving consistently)
      • OR losing/corrupting data
      • OR serious consequences (Ex: loss of a major account)
      • AND can be reproduced (or has been reported by enough people that it must be happening)
      Off hours:
      • Blocking core functionality (Ex: can’t access feature)
      • AND affecting multiple people
      • OR losing/corrupting data
      • AND can be reproduced (or has been reported by enough people that it must be happening)
  11. 11. Step 0: Spot the Issue
  14. 14. Ticket Queue: Support team monitors all day, and at least twice each evening. If any of the team will be unavailable for an extended period of time, they’ll deputize someone from Sales or Community.
      Social Media: Community team monitors all day, and at least twice each evening. If any of the team is unavailable for an extended period of time, they’ll potentially deputize someone from Support.
      The Rest of the Team: They may not be on the Customer Team, but if they see a critical issue, it’s their responsibility to report it.
  15. 15. Step 1: Create a Trello bug. Issue history FTW. Ad-hoc communication FTL.
  16. 16. Step 2: Contact a Developer
  20. 20. Contacting the developer:
      • Choose the relevant product area (Systems, Front-End, or Code) & contact the dev at the top of that list.
      • Office hours? Ping them in HipChat.
      • Off hours? Call them, don’t text or chat or email.
      • If they don’t respond to 2 pings within 10 minutes, move down the list.
  21. 21. Dev Escalation List
      • System dev: Kevin
      • App devs: Jonathan, Mark, Austin, Joey, Raimo, Rich
      • Interface devs: Joshua, John, Brad, Rich
      For System Issues (site is down/slow, emails don’t work): contact system + app dev
      For Interface Issues (the interface looks broken, won’t work, etc): contact interface + app dev
      For all other issues: contact app dev
  24. 24. Devs: did you get the call? Then:
      • Respond affirmatively to the person who contacted you
      • Join the Engineering room on HipChat and let others know someone is working on it
  25. 25. NO additional customer team members should be communicating with the dev solving the problem, only the one who first reported it. More voices confuse and distract.
  26. 26. Step 3: Inform the Customer Team
  28. 28. Email the whole customer team so they know about the issue (and that you’re working with the devs).
      Is it work hours? Also @all everyone in the Support room on HipChat.
  29. 29. Step 4: Is it super-critical?
  34. 34. Ask Developer (before they fix the bug):
      • Roughly how many people might this be affecting?
      • Roughly what issues might this be causing?
      • (If they’re too busy fixing it, consider calling in a second dev)
      • Customer team: it’s your job to ensure this happens
  35. 35. How do I know if it’s Super-Critical?
      • Does this affect more than 20% of accounts?
      • Is this a very frustrating or visible bug (vs just an annoyance)?
      *This is somewhat arbitrary.
  36. 36. If Super-Critical:
      • Call the Head of Support & Head of Community
      • Community Department should tweet about the issue (make sure to reschedule any other tweets: “check out our blog” would be an unfortunate tweet during an outage)
      • Leave a maximum of 30 minutes between any public messages about critical bugs and 15 minutes between public messages about downtime
      • DO NOT suggest a timeframe (it may change)
      • DO NOT talk about the cause (you may be wrong)
      • Going to require a long fix? Publish a blog post & Facebook status too
  37. 37. Step 5: Respond to Issues
  40. 40. Who answers what?
      Work hours? Community handles social media, Support handles tickets.
      Off hours? Support handles both (but call in backup if needed).
      Regardless, make sure you’re in the Support room in HipChat so you can be communicating with the team.
      *This is somewhat arbitrary.
  41. 41. Step 6: Solve and Verify
  44. 44. Solving and verifying:
      • Dev should fix the issue (duh).
      • Dev should verify that the fix will stick (may require calling in a second dev)
      • Customer Team member should also verify that issues are resolved
  45. 45. Step 7: Report Damage and Close the Loop (the 7 questions)
  46. 46. Dev should answer these questions for the Customer Team member:
      1. What did our customers experience? (Please be explicit: don’t just say what was broken, explain the experience our customers would have had when trying to accomplish this task.)
      2. How many/which customers were affected?
      3. When did this issue start? When was it resolved?
      4. What caused it?
      5. What are we doing to avoid it in the future?
      6. What are the chances that there will be related issues in the short-term future?
      7. What was the damage (data, accounts, etc)?
  51. 51. The Loop-Closing:
      1. Entire Engineering Team and Customer Team should be sent these answers so they’re clued in
      2. Customer Team member should follow up with all customers who reported the issue*
      3. If mass communication occurred, publish announcement of the fix to those channels
      *And give the one who reported it a discount for their next month of billing!
  55. 55. Post-issue Communication: should we blog about it?
      • The litmus test: would I be angry if I experienced this and then heard nothing? Then blog.
      • (If it only affected a small # of accounts, email them)
      • If extremely severe, consider reimbursements as well.
  56. 56. Hooray, we’ve saved the day!
  57. 57. Evan Hamilton, @evanhamilton, evan@uservoice.com. More content at http://community.uservoice.com
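
The criteria on slides 10 and 35 read like boolean rules, so they can be sketched in code. The Python below is only an illustration, not UserVoice's tooling. The `Issue` fields and both function names are hypothetical. The slides do not say how the AND/OR bullets group, so this assumes one reading: any impact condition, plus reproducibility. It also assumes that blocking counts as a stronger form of interrupting, and that either question on slide 35 is enough to make an issue super-critical.

```python
from dataclasses import dataclass


@dataclass
class Issue:
    # Hypothetical field names that mirror the wording on the slides.
    interrupts_core: bool = False          # e.g. settings not saving consistently
    blocks_core: bool = False              # e.g. can't access a feature at all
    affects_multiple_people: bool = False
    loses_or_corrupts_data: bool = False
    serious_consequences: bool = False     # e.g. loss of a major account
    reproducible: bool = False             # reproduced, or reported by enough people
    share_of_accounts: float = 0.0         # estimated fraction affected, 0.0 to 1.0
    very_visible: bool = False             # very frustrating or visible, not just an annoyance


def is_critical(issue: Issue, work_hours: bool) -> bool:
    """Assumed grouping of slide 10: (any impact condition) AND reproducible."""
    if work_hours:
        impact = (issue.interrupts_core
                  or issue.blocks_core  # assumption: blocking implies interrupting
                  or issue.loses_or_corrupts_data
                  or issue.serious_consequences)
    else:
        # Off hours the bar is higher: blocking and widespread, or data loss.
        impact = ((issue.blocks_core and issue.affects_multiple_people)
                  or issue.loses_or_corrupts_data)
    return impact and issue.reproducible


def is_super_critical(issue: Issue) -> bool:
    """Slide 35, assuming either question alone is sufficient."""
    return issue.share_of_accounts > 0.20 or issue.very_visible


flaky_settings = Issue(interrupts_core=True, reproducible=True)
print(is_critical(flaky_settings, work_hours=True))   # True
print(is_critical(flaky_settings, work_hours=False))  # False
```

The same issue is critical during work hours but not off hours, which matches the slides' intent of not waking people for problems that can wait until morning.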

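
The contact rules on slides 20 and 21 amount to walking an ordered list and giving each person a limited number of tries. Here is a minimal Python sketch of that walk. The developer names and list order come from slide 21; everything else is an assumption. In particular, `try_contact` is a hypothetical callback standing in for a HipChat ping or a phone call. It is assumed to wait a share of the 10-minute window and return `True` if the dev responds.

```python
from typing import Callable, Dict, List, Optional

# Ordered escalation lists from slide 21.
ESCALATION: Dict[str, List[str]] = {
    "system": ["Kevin"],
    "app": ["Jonathan", "Mark", "Austin", "Joey", "Raimo", "Rich"],
    "interface": ["Joshua", "John", "Brad", "Rich"],
}

# Which lists to work through for each kind of issue (slide 21).
LISTS_FOR_ISSUE: Dict[str, List[str]] = {
    "system": ["system", "app"],
    "interface": ["interface", "app"],
    "other": ["app"],
}

MAX_PINGS = 2  # slide 20: two pings within 10 minutes, then move down the list


def first_responder(devs: List[str],
                    try_contact: Callable[[str], bool]) -> Optional[str]:
    """Return the first dev who responds, or None if the list is exhausted."""
    for dev in devs:
        for _ in range(MAX_PINGS):
            if try_contact(dev):
                return dev
    return None


def escalate(issue_kind: str,
             try_contact: Callable[[str], bool]) -> Dict[str, Optional[str]]:
    """Find one responder per required list for this kind of issue."""
    return {
        list_name: first_responder(ESCALATION[list_name], try_contact)
        for list_name in LISTS_FOR_ISSUE[issue_kind]
    }
```

A `None` value in the result means nobody on that list answered. The slides do not say what happens then, so a real process would need a fallback.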