Cerner & AppDynamics
END TO END VISION!
• James Reyes – Cerner CTS CSM – Sr. Service Delivery Manager
• 11+ years accountable for Cerner's corporate IT operations (DevOps) for all enterprise applications, ensuring the availability and performance of these solutions for our internal and external clients.
Corporate Systems Management
• About the CTS CSM team
• 47 team members (31 US, 16 BLR, India)
• Service Delivery Managers
• Production Owner Architects
• Technology Architects
• System Engineers
• Interns – Cerner Scholars, MIC Interns, KCIT Apprentices
• Responsible for all Enterprise Application Operations (~150 Solutions)
• If you use it to do your job at Cerner, we probably manage it!
Cerner AppDynamics Maturity Journey
Level 1 – Initial: No defined process. Beginning of system. No process characterization. (CHAOS)
Level 2 – Managed: Process characterized for projects, often reactive. (REPEATABLE)
Level 3 – Defined: Process characterized for organizations, often reactive. (WORKPLAN BASED)
Level 4 – Quantitatively Managed: Processes measured and controlled. (METRICS BASED)
Level 5 – Optimized: Focus on process improvement. (PERFORMANCE ORIENTED)
“Managing the Software Process” by Watts Humphrey, 1989
‘Official’ Problem Statement
• Reproducing and isolating performance issues requires extensive effort and often results in substantive delays in mean time to resolution.
• The current volume of ticket submissions, and the effort required to address them, is untenable under current resource and/or toolset constraints. These constraints are having a significant impact on cost and on customer/user satisfaction.
But to be clear, let’s start with the User (‘Actual’ Problem Statement).
What is their expectation?
They expect the service to work, and when it does not??
UGGG!!!
What can we do about it??
Step 1 – Start with a Service Oriented Culture
We are in the service industry…
Collaboration between Support + Ops + Development = User Experience
This needs to be defined; no solution or tool can fix it
Cerner’s Culture enables Incident Response
- How can we get better?
Step 2 – Define an Operational Support Process
User reports a problem to the help desk
[Process diagram. Participants: IRC, Developers, Monitoring, CSM, User, Helpdesk. Stages shown: Service Req, Situation, Investigation, Resolution, Problem Mgmt.]
“Help!”
Help desk reproduces the issue, engages IRC
“Yup, that’s a problem!”
IRC initiates situation management, engages CSM
• Contact PO
• Alert xMatters on call group
CSM is engaged. Developers or alerting are also entry points
“Hey team, something doesn’t look right…”
“Paging CSM, Paging CSM...”
Situation management initiated by IRC
• IM or phone call initiated with team
• Status updates sent from xMatters
The team investigates
• Health check of solution
• Engage infrastructure or vendor
The team prevails! Issue resolved!
“Fixed!”
Problem Management (Root cause analysis, corrective actions)
• Investigate root cause (if still unknown)
• Identify and implement corrective actions to prevent further occurrences
• Submit issue to problem management
How can we improve this process?
Where are there constraints?
Thoughts?
Step 3 – Understand your User Experience
- Enable End User Experience Monitoring (previously URL up/down checks only)
- Why is this usually not first?
[Diagram: User calls Helpdesk: “Help!”]
Biggest Challenge is for Operations
• Performance Degradation: “The App is Slow”
• Incomplete Functionality: “This app’s function is no longer normal”
• Outage: “The App is down”
1-HELP and IRC receive these types of calls
AppDynamics End User Experience Monitoring
How are we implementing EUM?
F5 Application Acceleration + AppDynamics EUM Use Case
- Where should we accelerate?
- Value Statement on Performance gains or losses
Step 4 – Improve MTTR (Mean Time to Resolution)
- War Room! All Hands on Deck!
- EUM + APM can lead the way
- SMEs of Integrations (Integration Architect)
Investigation and Resolution:
• Health check of solution
• Engage infrastructure or vendor
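As a rough illustration of the metric this step targets, MTTR can be computed as the average of (resolved time minus opened time) across incidents. The sketch below is an assumption for illustration only: the function name and the sample timestamps are hypothetical and do not come from Cerner's ticketing data.

```python
from datetime import datetime, timedelta

def mean_time_to_resolution(incidents):
    """Average the open-to-resolved duration over (opened, resolved) pairs."""
    durations = [resolved - opened for opened, resolved in incidents]
    return sum(durations, timedelta()) / len(durations)

# Hypothetical incidents: one resolved in 1 hour, one in 3 hours.
sample_incidents = [
    (datetime(2015, 1, 5, 9, 0), datetime(2015, 1, 5, 10, 0)),
    (datetime(2015, 1, 6, 14, 0), datetime(2015, 1, 6, 17, 0)),
]
mttr = mean_time_to_resolution(sample_incidents)
print(mttr)
```

Tracking this number before and after enabling EUM and APM is one way to quantify whether the war-room effort is actually shrinking.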
First Attempt – Build Integrated Service Model Manually
[Diagram: manually built integrated service model. Components shown: ODS, ODSCRM Remedy, Solution Change, STG, Remedy, eService, ITWx, xMatters, CernerDotCom, DB, Routing, Apps, Attachments, WSI, UTNGen, SSPApp, IHO Remedy/Nav, KBA, Navigator, My.cerner, IP Factory, TTA, SSA DB, Swx SRM, SRApp, CR(Nav), MTA’s, MDM, CHD, Client, CHDI (External F5), PIVOT/IRC/Amb/SRViewer, Puma, mq, JIRA App, SSP Attachments (IIS), Content Store (IIS), ODS-SR. Flow labels: Create/Update, To Cerner, From Cerner.]
Next Attempt – How can we automate?
AppDynamics APM!
What can Support and Ops do?
AppDynamics APM Enables Visibility Into the Business Transaction = End to End
A business transaction represents a distinct unit of business logic enabled by your application environment; for example, on an ecommerce website, searching inventory or placing an order.
A business transaction is defined by an entry point and the interactions between the components that participate in implementing the transaction, such as databases, application servers, and messaging queues.
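To make the definition above concrete, here is a minimal sketch of a business transaction as an entry point plus the calls it makes to participating components. All class, field, and component names are hypothetical; this is not the AppDynamics data model or API, only an illustration of the idea.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Call:
    component: str      # e.g. "app-server", "database", "message-queue"
    duration_ms: float  # time spent in that component

@dataclass
class BusinessTransaction:
    name: str           # e.g. "Place order"
    entry_point: str    # where the transaction enters the application
    calls: List[Call] = field(default_factory=list)

    def total_ms(self) -> float:
        """End-to-end time: the sum of time spent in every component."""
        return sum(c.duration_ms for c in self.calls)

    def slowest(self) -> Call:
        """The component call that contributed the most time."""
        return max(self.calls, key=lambda c: c.duration_ms)

# Hypothetical ecommerce example
order = BusinessTransaction(
    name="Place order",
    entry_point="/checkout",
    calls=[Call("app-server", 40.0), Call("database", 120.0), Call("message-queue", 15.0)],
)
print(order.total_ms(), order.slowest().component)
```

Seeing the whole transaction this way is what lets Support and Ops point at the slow component instead of only knowing that "the app is slow".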
Soarian Monitoring Use Case
• Complicated and Highly Integrated Application
• Limited knowledge of the application flow
• Similar problems
• Multiple Tools
• Tickets NOT getting resolved
AppDynamics Deathstar
• Easy transition from 1st to 3rd level
• Application Flow automatically created
• Low overhead allows us to collect more information rather than tune it down
AppDynamics addresses these problems:
• Leveraging Smart Code Instrumentation to enable in-depth monitoring of production apps without making configuration changes
• Monitoring every transaction but intelligently capturing details of only the irregular transactions, making the platform scale to meet the demands of Cerner
• Knowing our performance in the context of auto-generated dynamic baselines
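The last two points can be sketched together: keep a rolling baseline of response times, watch every transaction, and keep detailed data only for the ones that deviate from the baseline. The sketch below is an assumption for illustration (a rolling mean plus a standard-deviation threshold); it is not AppDynamics' actual baselining algorithm, and the class name and parameters are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev

class DynamicBaseline:
    """Rolling baseline of response times; flags samples far above it."""

    def __init__(self, window=100, threshold_sigmas=3.0, min_samples=10):
        self.samples = deque(maxlen=window)   # most recent response times
        self.threshold_sigmas = threshold_sigmas
        self.min_samples = min_samples        # no verdict until this many samples

    def observe(self, response_ms):
        """Record a sample; return True if it is irregular versus prior samples."""
        irregular = False
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sigma = pstdev(self.samples)
            irregular = response_ms > mu + self.threshold_sigmas * sigma
        self.samples.append(response_ms)
        return irregular

baseline = DynamicBaseline(min_samples=5)
captured = []  # detailed snapshots are kept only for irregular transactions
for ms in [100, 102, 98, 101, 99, 100, 500]:
    if baseline.observe(ms):
        captured.append(ms)
print(captured)
```

Because the threshold moves with recent behavior, there is no static alert value to hand-tune per application.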
AppDynamics eases the pain:
• Automatically discovers application topology and interdependencies, and traces key business transactions based on production application behavior
• Lets us visualize and prioritize the performance of end-to-end business transactions, not just the health of the application and the infrastructure nodes
AppDynamics Enablement Recap
• Step 1 – Start with a Service Oriented Culture
• Step 2 – Define an Operational Support Process
• Step 3 – Understand your User Experience – EUM
• Step 4 – Improve MTTR (Mean Time to Resolution) – APM
AppDynamics EUM + APM can enable turning this [image] TO [image]
Questions??
Feel free to contact me – james.reyes@cerner.com

Cerner APM Journey with AppDynamics


Editor's Notes

  • #7 They expect the service to work, and when it doesn’t, this can happen. They don’t care what the vision or architecture design of the solution is, or even why it’s broken. They just know the service they typically expect is not there, and that is frustrating as a user.
  • #8 Believe it or not, we are all in the service industry; we are serving our clients. Cerner’s culture of making it personal and enabling collaboration has enabled this. This culture enables incident response, but we can always improve. Having 100 people on call is good, but it could be better. This guy is our goal.
  • #9 Start with Service Support: talk through establishing an incident response with support, move to problem management, which leads to change, config and release, and then move to Service Delivery for metrics. We need to establish the goals we are trying to hit.
  • #20 Do we have any IRC or Help Desk team members out there, or anyone who has worked direct support? How many times do you get these questions? “Well, the app is slow.” That kicks off a script of questions: OK, what part of the app is slow? What OS and browser are you running? Are multiple people in the same place having the same problem? Is the issue still happening? All these questions are trying to get more and more diagnostic information, and inevitably we page the SME when needed, who then pages whoever they need, who then starts gathering more information. What if, off the bat, you could tell what was slow and why it was slow? Instead of going down that crazy flow of slowness, which Bill and David will review in their Soarian use case, they know where the problem is, and that Help Desk person is not just triage but knows as much as the DEV SME.
  • #27 Bye-bye stack traces and log monitors. How about real time? Currently we are always going after the fact with those alarms. Changing the pattern for SEs: Ops becomes the SME, fixing issues that the users have become immune to. System engineers in the house? You can now see the code that is being passed in real time and how it’s performing. Ops becomes the SME.
  • #29 #1 – Go over colors. #2 – Flow of the data. #3 – Cross-application view (sched.); also point out SLPA. #4 – Different view representations.