
How Constant Contact Moves 10X Faster with a Connected Incident Management Toolchain



Lucas Villeneuve, Systems / Mail Operations Engineer, Constant Contact

For Constant Contact (a subsidiary of Endurance), the second largest website hosting provider, delayed incident response can put its customers’ businesses at risk of losing revenue and leads. Learn how Constant Contact integrated xMatters into multiple ITOps and DevOps tools, including Nagios, New Relic, Splunk, and HipChat, to unify its incident management process across various teams. See how Constant Contact’s toolchain increases command center visibility, while automating alerting and escalation paging, to enable incident response that is 10x faster than before.

Published in: Technology


  1. How Constant Contact Tech Ops Moves 10x Faster with a Connected Incident Management Toolchain. Lucas Villeneuve | System Engineer
  2. Lucas Villeneuve, System Engineer, Constant Contact
  3. About Constant Contact: Awe-inspiring templates. Easy-to-use editor. Powerful list-building tools. Expert advice and live support. Constant Contact delivers everything you need to connect with customers, grow your business, and watch big-time results pour in, in real time. So pop some popcorn. It doesn't get more fun than that. Every day we challenge ourselves to find new ways to deliver on the promise of Endurance’s mission: help small businesses navigate the promise, power, and potential of the web. We make email marketing easier, faster, and, well, funner.
  4. About Constant Contact: My current team at Constant Contact consists of 8 people but is part of a larger Tech Ops team of 50. In total, there are approx. 3,500 Endurance IT team members spread out across the globe. (3,500 in IT; 50 in Tech Ops; 8 people in current team.)
  5. About Endurance • A Global Family of Brands, Built to (Em)power. • From website hosting and design to email marketing, our brands deliver the tools and support that small business owners need to fuel their online presence and reach customers everywhere.
  6. GoRuck Endurance Events: ALWAYS A TEAM EVENT, NEVER A RACE. Based on Special Forces training, your class is led from start to finish by a Special Forces Cadre. His job is to test your limits, push you beyond them, and build your class into a team. There are no cash prizes at the finish. All you earn is a 2x3 inch patch and the respect of everyone to your left and right.
  7. GoRuck
  8. Life Before xMatters
  9. We Need an Escalation Procedure
  10. Accountability • The process was well adopted within Ops, but there was a definite struggle among development teams. • There were no “on-call” schedules set in dev, which left open a lot of single points of failure. • Establishing a post mortem ceremony was key in ensuring issues did not happen again.
  11. CAPA: “Bad companies are destroyed by crisis, Good companies survive them, Great companies are improved by them.” - Andrew Grove, Intel. CAPA (Corrective and Preventative Actions) items were established to hold teams accountable and to step back and further locate our “weak” points.
  12. Why xMatters? • Choosing xMatters suited the maturity journey we were on. • Not only did xMatters allow us to create a customized two-way integration with Nagios, but xMatters also has product maturity. • The huge push for mobile functionality was fantastic and really well received across our organization. • Usability - the UI was simple and intuitive.
  13. Use Cases
  14. Monitoring & Incident Workflow
  15. Monitoring & Incident Workflow: DIYA. 1. Monitoring – Big Panda, Nagios, and New Relic push to the on-call group for inspection of the monitoring issue. 2. Our NOC – connects with xMatters and pushes to the on-call group for assignment and action (Jira). 3. Jira Case – the xMatters engage button creates actionable responses to create a Jira ticket with information. 4. Incident Communication / Escalation – Resolve command in HipChat room (based on incident number) to bring the transcript into the ticket and resolve. Escalation order: Ops team (1st), Dev team (2nd), SWAT team (3rd). 5. Post Mortem – We use xMatters to communicate status and reporting for a Post Mortem review of the incident. (Diagram labels: Big Panda, NOC, Assignment, TechOps Teams, Operations Team, Dev Team, SWAT Team, Resolved, Post Mortem.)
  16. Escalation Paging
  17. Use Case 2 – Escalation Paging
  18. Benefits and Metrics • Results: Nearly 50% reduction in production-impacting incidents, more so for complete customer outages. 2014: 68 incidents, 14 sev 1s. 2015: 102 incidents, 15 sev 1s. 2016: 102 incidents, 9 sev 1s. 2017: 54 incidents, 6 sev 1s. 2018: 57 incidents YTD, 2 sev 1s.
  19. Looking Ahead • Unification and visibility across entire company • Migration to New Relic for deeper insight • Integration into HipChat
  20. Future Use Case: Incident Communication with HipChat
  21. Thank You. How Constant Contact Tech Ops Moves 10x Faster with a Connected Incident Management Toolchain. Lucas Villeneuve | System Engineer
  22. Q & A
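
Slide 15 gives the escalation order as Ops team (1st), Dev team (2nd), SWAT team (3rd). The following is a minimal Python sketch of that kind of tiered paging, not the presenter's actual configuration. The names Incident, escalate, and the acknowledges callback are hypothetical, as are the incident number and summary. The sketch does not call xMatters, Nagios, or any other real API. It only illustrates the idea of paging each tier in order until one of them acknowledges.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Tier order taken from slide 15: Ops team (1st), Dev team (2nd), SWAT team (3rd).
ESCALATION_TIERS = ["Ops team", "Dev team", "SWAT team"]


@dataclass
class Incident:
    """Hypothetical incident record; field names are illustrative only."""
    number: int
    summary: str
    paged: List[str] = field(default_factory=list)  # tiers paged so far, in order
    acknowledged_by: Optional[str] = None


def escalate(
    incident: Incident,
    acknowledges: Callable[[str, Incident], bool],
    tiers: List[str] = ESCALATION_TIERS,
) -> Optional[str]:
    """Page each tier in order until one acknowledges.

    Returns the acknowledging tier, or None if every tier was paged
    without an acknowledgement.
    """
    for tier in tiers:
        incident.paged.append(tier)
        if acknowledges(tier, incident):
            incident.acknowledged_by = tier
            return tier
    return None


if __name__ == "__main__":
    # Hypothetical example: Ops does not acknowledge, Dev does.
    incident = Incident(number=1234, summary="Mail queue backlog")
    owner = escalate(incident, lambda tier, inc: tier == "Dev team")
    print(owner, incident.paged)  # Dev team ['Ops team', 'Dev team']
```

In a real toolchain the acknowledges callback would be replaced by whatever the paging system reports (for example, a response received within a timeout). The list of paged tiers is kept on the incident so a post mortem can see how far the escalation went.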