Leslie Rowe, Senior Director of Technology, Cox Automotive
Learn how Cox Automotive automates event notifications across twenty newly acquired business units, including service operations and network operations, from its toolchain while engaging the right people as fix agents. Leslie Rowe, Senior Director, Technology – Enterprise Operations Center, will discuss how Cox Automotive successfully integrates new businesses and minimizes the disruptions that result from the merger process. By optimizing communication and remediation processes, Cox Automotive reduces Enterprise Severity 1 and 2 incidents and streamlines critical operations, resulting in reduced IT costs, increased IT efficiency, and a better customer experience.
Cox Automotive: Delivering a 70% Improvement in MTTR for Major Incidents
1. Leslie Rowe - Senior Director of Technology, Cox Automotive
October 2018
Cox Automotive: Delivering a 70% Improvement in MTTR for Major Incidents
5. No Visibility at the Enterprise Level
Service Delivery Challenges Associated with Rapid Acquisitions
…our ITSM needs to support Cox Automotive’s transition to a “One Product Platform” and implement “transparent and consistent KPIs.” Currently, multiple processes, tools, and organizational alignment exist across Cox Automotive.
• No Visibility at the Enterprise Level
• Inconsistent Support Response Time / SLAs
• Disparate Service Delivery Focus
6. Enterprise Service Delivery Alignment Inconsistent
[Diagram labels: Cox Automotive Service Management, Canada, RTS, DDS]
Service Management is inconsistent across business units, with various levels of adoption, including:
● People
● Processes
● Tools
● Continuous Improvement
7. Focus on Major Incident Management
MONITORING/EVENT MANAGEMENT
INCIDENT MANAGEMENT
PROBLEM MANAGEMENT
CHANGE MANAGEMENT
REPORTING & CONTINUOUS IMPROVEMENT
Goal: Leadership needs consistent, transparent visibility to systems across Cox Automotive
Objective: Build a scalable ITSM organization, providing visibility and first-tier support across the enterprise
Solution: Create a robust Enterprise Operations Center
9. Enterprise Operations Center Mission
Ensure the availability of company websites, applications and hardware critical to business and the success of Cox Automotive through monitoring, detection, major incident management, communications and problem management.
10. IT Service Management Offerings
MONITORING/EVENT MANAGEMENT
INCIDENT MANAGEMENT
PROBLEM MANAGEMENT
CHANGE MANAGEMENT
REPORTING & CONTINUOUS IMPROVEMENT
11. Focus on Major Incident Management
MONITORING/EVENT MANAGEMENT
INCIDENT MANAGEMENT
PROBLEM MANAGEMENT
CHANGE MANAGEMENT
REPORTING & CONTINUOUS IMPROVEMENT
Take Action:
• Standardize and optimize incident management
• Improve facilitation and technical skills of staff
• Differentiate Incident and Problem Management
• Rationalize Toolset
• Build a plan based on People, Processes, Tools, Continuous Improvement
Requirements:
• Reduce MTTR
• Improve availability
• Leverage existing people, process, tools (headcount neutral, costs flat)
12. I + R = $
o We must identify and remediate issues sooner, lowering overall MTTR and improving product availability and client experience
13. EOC – Build a Best Practice Functional Organization
Event Management/Tool Support
Service Management Architecture
24x7x365 Command Center
Event Response and Request Fulfillment (Level 0)
Incident Response (Level 1)
Product Support (Level 2)
Major Incident Management
Analytics/Continuous Improvement
Problem Management
Change Management
15. Barriers to Success
Organizational dynamics – change resistance – Cox Automotive vs business
Siloed IT support knowledge
Lack of Enterprise perspective
Process Inconsistency
ITSM is an acronym, not a philosophy
No plan for onboarding to services
Disparate toolsets and support structures
Fragmented contracts
Lack of consistent metrics
Perception of EOC Services to business
Market benefits and value-add
16. People - Optimize 24x7 Incident Management
Major Incident Management
Event Response and Request Fulfillment (Tier 0)
Incident Response (Tier 1)
Product Support (Tier 2)
Best Practices:
• Clarify roles by support tiers vs geographies
• Provide for incident handoff and depth in coverage
• Build in redundancy
• Establish and build skills in Major Incident Management
17. Process – Differentiate Incident From Problem Management
Observations:
• Incident and Problem Management functions are not providing expected results
• Incident and Problem Management are handled by the same EOC team members
• Having the same fix agents work continuously on both leads to inefficiencies, overtime, and a lack of economies of scale
• Incidents interrupt work on proactive problem management, contributing to a lack of focus on problems and prevention
Problem Management:
• Addresses recurring incidents
• Requires collaboration during business hours
• Facilitated by deep product knowledge
• Drives permanent resolutions
• Focuses on proactive prevention
Incident Management:
• Requires 24 x 7 support
• Requires deep troubleshooting skills
• Drives restoration of services
• Focuses on MTTD and MTTR
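As a rough illustration of the two metrics Incident Management focuses on, the sketch below computes MTTD and MTTR from a handful of incident records. The records, field order, and function names are hypothetical, and the sketch assumes MTTR is measured from the start of impact to restoration; some teams measure it from detection instead.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (impact started, detected, restored).
incidents = [
    (datetime(2018, 1, 5, 9, 0), datetime(2018, 1, 5, 9, 10), datetime(2018, 1, 5, 11, 0)),
    (datetime(2018, 2, 1, 14, 0), datetime(2018, 2, 1, 14, 20), datetime(2018, 2, 1, 15, 0)),
]


def minutes(delta):
    """Convert a timedelta to minutes."""
    return delta.total_seconds() / 60


def mttd(records):
    """Mean minutes from start of impact to detection."""
    return mean(minutes(detected - started) for started, detected, _ in records)


def mttr(records):
    """Mean minutes from start of impact to restoration (assumed definition)."""
    return mean(minutes(restored - started) for started, _, restored in records)


print(mttd(incidents), mttr(incidents))
```

With these made-up records, detection took 10 and 20 minutes and restoration took 120 and 60 minutes, so detecting sooner directly shortens the restoration figure as well.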
18. Process - Major Incident Management Communication Process
[Timeline diagram. Stage labels: Event Alert, Page to Bridge, Incident Status, Restoration, Restoration Analysis, Leadership Summary, Prevention. Time markers: 0 minutes; 20 minutes; 30-60 minutes, depending on severity; x Impacted Service Minutes; < 24 hours; < 48 hours; < 90 days.]
21. Tools - Automated Communications - Call to Bridge for Troubleshooting
• Communication to primary on-call engineer
• Escalation sequence
• Call logs
• Auditable
• Configurable by text, email, phone
• Synchronized with ServiceNow queues
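The escalation behavior listed above can be sketched in a few lines. This is only an illustration of the idea (contact the primary on-call engineer first, escalate in order, and keep an auditable call log); the class names, the `acknowledges` callback, and the contact data are all hypothetical and do not represent the actual notification tool or its API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Contact:
    name: str
    channel: str  # "text", "email", or "phone"


@dataclass
class PageResult:
    acknowledged_by: Optional[str]
    call_log: List[tuple] = field(default_factory=list)


def page_to_bridge(sequence: List[Contact],
                   acknowledges: Callable[[Contact], bool]) -> PageResult:
    """Contact each person in order until one acknowledges.

    `acknowledges` stands in for a real response from the person paged.
    Every attempt is recorded so the sequence can be audited later.
    """
    log = []
    for step, contact in enumerate(sequence, start=1):
        answered = acknowledges(contact)
        log.append((step, contact.name, contact.channel, answered))
        if answered:
            return PageResult(contact.name, log)
    return PageResult(None, log)


# Hypothetical escalation sequence: primary first, then backups.
sequence = [
    Contact("primary", "phone"),
    Contact("secondary", "text"),
    Contact("manager", "email"),
]
result = page_to_bridge(sequence, lambda c: c.name == "secondary")
print(result.acknowledged_by, result.call_log)
```

In this run the primary does not respond, so the page escalates to the secondary contact, and the log holds both attempts.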
22. Tools - Automated Communications - Status Messaging
Messaging intervals based on:
• Enterprise Severity
• Priority
Integrated with Slack channel and ServiceNow incident record:
• Dedicated Slack channels for Major Incidents ES1/P1
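A minimal sketch of severity-based messaging intervals follows. The interval values (30 minutes for ES1, 60 minutes for ES2) are assumptions chosen only to illustrate the idea; the slides do not give the actual schedule per severity, and the function and dictionary names are hypothetical.

```python
# Assumed intervals in minutes, keyed by Enterprise Severity.
# These numbers are illustrative, not the presenter's configuration.
STATUS_INTERVAL_MINUTES = {"ES1": 30, "ES2": 60}


def status_update_times(severity, incident_minutes):
    """Return the minute offsets at which status messages fall due
    while an incident of the given duration is open."""
    interval = STATUS_INTERVAL_MINUTES[severity]
    return list(range(interval, incident_minutes + 1, interval))


print(status_update_times("ES1", 128))  # [30, 60, 90, 120]
print(status_update_times("ES2", 128))  # [60, 120]
```

Keeping the schedule in one lookup table makes it easy to tune intervals per severity without touching the messaging logic.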
23. Continuous Improvement – Monthly Reporting
❑ The EOC reports internal (change, failure) and third-party degradations and outages.
❑ These are measured using Impacted Service Minutes (ISMs), classified by our Enterprise Severity Matrix (ESM) and Priority scales, respectively.
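To make the monthly roll-up concrete, here is a small sketch that totals Impacted Service Minutes by source and severity. The slides do not define how ISMs are calculated, so this assumes an ISM is simply a minute during which a service was degraded or down; the records and the `monthly_ism` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical monthly records: (source, severity, impacted minutes).
records = [
    ("internal", "ES1", 45),
    ("third_party", "ES2", 120),
    ("internal", "ES2", 30),
    ("internal", "ES1", 15),
]


def monthly_ism(rows):
    """Total impacted minutes per (source, severity) pair."""
    totals = defaultdict(int)
    for source, severity, impacted_minutes in rows:
        totals[(source, severity)] += impacted_minutes
    return dict(totals)


print(monthly_ism(records))
```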
24. EOC Services and Value Add
As Business Unit applications onboard to EOC Services, Cox Automotive’s overall MTTR has significantly decreased.
o By detecting outages sooner, Mean Time To Identify (MTTI) improvements help remediate degradations and outages sooner, lowering overall MTTR and improving product availability and client experience.
o MTTR reduced from 463 minutes to 128 minutes
o 70% reduction in MTTR in the first year
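The headline figure can be checked with simple arithmetic on the two MTTR values from the slide. The helper name below is hypothetical; the inputs are the 463-minute and 128-minute figures above.

```python
def percent_reduction(before, after):
    """Percentage decrease from `before` to `after`."""
    return (before - after) / before * 100


reduction = percent_reduction(463, 128)
print(f"{reduction:.1f}%")  # prints "72.4%"
```

The result, about 72%, is in line with the roughly 70% reduction quoted for the first year.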
27. Lessons Learned
Business
❑ Provide leadership with visibility through real-time alert/outage notifications
❑ Design a “Single Pane of Glass” of consolidated alerts across the enterprise
❑ Leverage collective staff resources for faster and more efficient remediation
❑ Drive Root Cause Analyses to help prevent future impacts
❑ Drive positive behavior and investment through reporting capabilities
Project
❑ Focus on businesses with the least alignment and the greatest appetite
❑ Focus on high priority applications (revenue-generating or customer-impacting)
❑ Build raving fans – the rest will follow