Continuous Monitoring and Faster Service Restoration (CM and FSR)

How do we quickly restore our services after an incident or outage?

The Problem: The larger portfolio of applications and services consists mostly of third-party and off-the-shelf solutions. Because these solutions were designed and deployed in widely different ways over time, stop-start procedures vary widely. Restarts often require multiple upstream and downstream dependencies to be met. Traditionally, much of the stop-start or restart of applications is conducted manually or in a semi-automated fashion (within an app), which requires an ops engineer to log in to multiple systems to restore full service of a given application. As a result, applications remain unavailable to the business for a prolonged time during a major incident.
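
To illustrate the dependency problem described above, here is a minimal sketch of deriving a safe start and stop order from declared dependencies. The component names and the DEPENDS_ON mapping are hypothetical and are not taken from any actual application; the ordering uses Python's standard-library graphlib module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical components (assumption, for illustration only).
# Each key maps to the components that must already be running
# before it can start, i.e. its upstream dependencies.
DEPENDS_ON = {
    "database": [],
    "message_queue": [],
    "app_server": ["database", "message_queue"],
    "web_frontend": ["app_server"],
}


def start_order(depends_on):
    """Return a start order in which each component follows its dependencies.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(depends_on).static_order())


def stop_order(depends_on):
    """Return a stop order: the reverse of the start order."""
    return list(reversed(start_order(depends_on)))


if __name__ == "__main__":
    print("start:", start_order(DEPENDS_ON))
    print("stop: ", stop_order(DEPENDS_ON))
```

Declaring dependencies as data, rather than encoding them in per-application runbooks, is one way a single restart procedure could be reused across differently built applications.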

Given the vast heterogeneity of the EBS portfolio of applications, it is important to provide a stable, consolidated solution for auto-restart. EBS App Ops has embarked on an initiative, EBS Faster Service Restoration (FSR), to improve restoration of its applications through automation. At this point the focus is on reducing time to recover, automating dependency management, and eliminating human error, rather than on self-healing (i.e. crawl, walk, run). The primary objectives are to achieve an RTO of under 15 minutes or to reduce the current time to restore by at least 80%. This standard framework needs to be adoptable across a variety of applications, be secure and compliant, and provide verification that capabilities are available, integrated into a dashboard.
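
As a worked illustration of the stated objective (an RTO under 15 minutes, or at least an 80% reduction in the current time to restore), the check below is a minimal sketch. The function name and the sample durations are hypothetical; only the two thresholds come from the text above.

```python
RTO_TARGET_MINUTES = 15   # from the stated objective: RTO < 15 mins
MIN_REDUCTION = 0.80      # from the stated objective: at least 80% reduction


def meets_objective(baseline_minutes, restored_minutes):
    """Return True if a restoration time satisfies either stated objective.

    baseline_minutes: current (pre-automation) time to restore; must be > 0.
    restored_minutes: time to restore with automation in place.
    """
    within_rto = restored_minutes < RTO_TARGET_MINUTES
    reduction = (baseline_minutes - restored_minutes) / baseline_minutes
    return within_rto or reduction >= MIN_REDUCTION


if __name__ == "__main__":
    # Hypothetical sample durations, for illustration only.
    print(meets_objective(120, 20))  # 20 min is over the RTO, but an ~83% reduction
    print(meets_objective(40, 14))   # under the 15-minute RTO
    print(meets_objective(60, 30))   # only a 50% reduction and over the RTO
```

Because the objective is phrased as an either/or, an application with a very long baseline can meet it through the 80% reduction even while its restored time remains above 15 minutes.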
