AppDynamics Training Session

DevOps Master Certified
AppDynamics Training Session
NOT FOR DISTRIBUTION © www.codvatechlabs.com

Agenda
• Role of Observability/Monitoring in DevOps/SRE
• AppDynamics Architecture
• AppDynamics – On Prem Vs SaaS
• Architecture walkthrough - PHP Application
• AppDynamics UI Walkthrough
• Hands On - Java Agent and Machine Agent installation
• Live Troubleshooting Session
• Q/A

Challenges with Traditional NOC
High Alert Noise
✓ No alert prioritization (every alert is converted directly into an incident)
✓ High volume of incidents due to lack of event prioritization
Underutilized NOC
✓ NOC engineers only escalate and follow up on alerts (purely L1)
✓ No technical input during high-severity incidents (P1 outages), resulting in high MTTR (Mean Time To Resolution)
✓ 80% of the work involves manually monitoring alerts and watching graphs on screen
Problem Management
✓ Mindset is focused on alert resolution rather than problem management
✓ Lack of RCA & CAPA (Corrective Action & Preventive Action) practice for repetitive high-severity incidents
Scalability Issue
✓ Unable to scale rapidly during infra expansion because of multiple manual processes
✓ High chance of missing monitoring coverage due to manual processes & lack of a feedback system
SLA Issues
✓ Service Level Agreements (SLAs) are not business-aligned & the focus is only on availability of infrastructure
✓ Lack of SLIs (Service Level Indicators) & SLOs (Service Level Objectives), resulting in inefficient SLA tracking
✓ SLIs, rather than SLAs, are the best way to ensure availability & performance
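The relationship between an SLI and an SLO can be illustrated with a small calculation. This is a minimal sketch, assuming a request-based availability SLI (good requests divided by total requests) and a hypothetical 99.9% SLO; the function name, field names and sample numbers are illustrative and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class SloReport:
    sli: float           # measured fraction of good requests
    slo: float           # target fraction of good requests
    error_budget: float  # number of bad requests the SLO allows in the window
    budget_used: float   # fraction of the error budget already consumed
    slo_met: bool


def evaluate_slo(good_requests: int, total_requests: int, slo: float) -> SloReport:
    """Compare a request-based availability SLI against an SLO target."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    sli = good_requests / total_requests
    allowed_bad = (1.0 - slo) * total_requests
    actual_bad = total_requests - good_requests
    # With a 100% SLO there is no budget, so any failure exhausts it.
    budget_used = actual_bad / allowed_bad if allowed_bad > 0 else float("inf")
    return SloReport(sli, slo, allowed_bad, budget_used, sli >= slo)


# Hypothetical sample window: 1,000,000 requests, 800 of them failed.
report = evaluate_slo(good_requests=999_200, total_requests=1_000_000, slo=0.999)
print(f"SLI={report.sli:.4%} met={report.slo_met} budget used={report.budget_used:.0%}")
```

Tracking the consumed error budget alongside the SLI gives an earlier warning than checking the SLA only after it has been breached.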

SRE Roadmap
Stages: Collect Data | Correlate and Triage | Notify & Act | Identify Trends | Predict
SRE Golden Signals (Alerting, Troubleshooting, Tuning & Capacity Planning)
Monitoring, Auditing, Troubleshooting & Security (Compute | Storage | Network | Application)
Start Monitoring CIs
Work closely toward 100% monitoring coverage using continuous monitoring (immutable Infrastructure as Code)
Monitoring Data Source
▪ SolarWinds (Compute, Storage & Network)
▪ Dynatrace (APM)
▪ Synthetic Monitoring
Design & implement CMDB (Single Source of truth) for entire infrastructure
Event Management
▪ Design & implement an AIOps-based layer which will collect data (metrics/events) from multiple data sources & present it in a single pane of glass
▪ Design & build service models
▪ Build event correlation (topology/stream) to reduce alert noise
▪ Monitoring tools consolidation
Incident Management
▪ Integration of monitoring events with ITSM ticketing
▪ Robust automated alert notification (PagerDuty | AlarmPoint)
▪ Define SLIs, SLOs & SLMs
▪ Data available during production outage
Trends & Anomalies
▪ Capacity Planning
▪ Cost Recommendations
▪ Continuous compliance (detect deviations from a “golden baseline”)
▪ Release-to-release benchmarks
▪ Toil – Automate repetitive tasks
Problem Management
▪ Publish Top N noise-maker CIs
▪ Post-mortem culture using Problem Management (learning from failure)
▪ Implement custom Self Healing for IT infrastructure & services
▪ Publish SLI, SLO & SLM reports
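The event correlation and "Top N noise makers" items in the roadmap can be sketched in a few lines. This is a simplified illustration, not the topology- or stream-based correlation an AIOps layer would perform: it assumes alerts are grouped only by CI and by a fixed time window. The `Alert` type, the 300-second window and the sample data are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Alert:
    ci: str         # configuration item that raised the alert
    check: str      # name of the failing check
    timestamp: int  # seconds since epoch


def correlate(alerts: List[Alert], window_s: int = 300) -> List[List[Alert]]:
    """Group alerts from the same CI that arrive within window_s of the
    previous alert in that CI's most recent group."""
    groups: List[List[Alert]] = []
    latest: Dict[str, List[Alert]] = {}
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        group = latest.get(alert.ci)
        if group and alert.timestamp - group[-1].timestamp <= window_s:
            group.append(alert)
        else:
            group = [alert]
            groups.append(group)
            latest[alert.ci] = group
    return groups


def top_noise_makers(alerts: List[Alert], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n CIs that raised the most alerts."""
    return Counter(a.ci for a in alerts).most_common(n)


# Hypothetical sample alerts.
alerts = [
    Alert("db01", "cpu", 0),
    Alert("db01", "disk", 120),
    Alert("web01", "http", 130),
    Alert("db01", "cpu", 1000),
]
incidents = correlate(alerts)
print(f"{len(alerts)} alerts -> {len(incidents)} incidents")
print(top_noise_makers(alerts, n=1))
```

Each group would become one incident in the ITSM tool instead of one incident per alert, which addresses the alert-noise challenge described earlier.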

Site Reliability Engineering - Landscape
SRE/DevOps Team Structure
SRE Level (L1): Improve MTTD
▪ Virtual team for live 24x7 monitoring (availability & performance)
▪ Automated alert escalation to L2 NOC Support team (P1 | P2 | P3 incidents)
▪ Tracking of escalated alerts until alert resolution
▪ Engage Incident Management in case of P1 & P2 incidents
▪ Engage NOC Dev team in case of missed monitoring opportunities
▪ Perform scheduled health check-ups
▪ Daily scheduled reports (Availability | Performance | Outage, etc.)
▪ Other BAU activities
SRE Level (L2): Improve MTTR
▪ Provide L2 analysis for all incidents
▪ Escalate open incidents to L3/Product SMEs
▪ Analyse & fix monitoring alerts
▪ Runbook - Step-by-step guide for resolving an incident
▪ Incident Response Report
▪ Post-mortem reports (RCA and tasks to be performed to avoid future outages)
▪ Engage NOC Dev team for repetitive tasks
Note: This team will have L2/SMEs from the OS, App, DB, Middleware & Network domains
SRE (Tools & Automation SMEs): Improve MTBF
▪ Monitor every possible metric in the environment
▪ Design & configure a robust monitoring system (Continuous Monitoring)
▪ Work on new monitoring opportunities
▪ Automate Runbook (Self-Healing)
▪ Toil – Automate repetitive tasks (shift from manual to automated approach)
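The three metrics targeted by these teams can be computed from incident records. This is a minimal sketch under stated assumptions: MTTD is measured from fault start to detection, MTTR from fault start to resolution, and MTBF as the average gap between one incident's resolution and the next incident's start. Teams define these intervals differently, and the `Incident` type and sample values are hypothetical.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class Incident:
    started: float   # when the fault began (minutes on a shared clock)
    detected: float  # when monitoring raised the alert
    resolved: float  # when service was restored


def mttd(incidents: List[Incident]) -> float:
    """Mean Time To Detect: fault start to detection."""
    return mean(i.detected - i.started for i in incidents)


def mttr(incidents: List[Incident]) -> float:
    """Mean Time To Resolution: fault start to resolution (assumed definition)."""
    return mean(i.resolved - i.started for i in incidents)


def mtbf(incidents: List[Incident]) -> float:
    """Mean Time Between Failures: average healthy gap between incidents."""
    ordered = sorted(incidents, key=lambda i: i.started)
    gaps = [nxt.started - cur.resolved for cur, nxt in zip(ordered, ordered[1:])]
    if not gaps:
        raise ValueError("need at least two incidents to compute MTBF")
    return mean(gaps)


# Hypothetical sample incidents, in minutes.
history = [
    Incident(started=0, detected=10, resolved=60),
    Incident(started=500, detected=505, resolved=530),
    Incident(started=1000, detected=1020, resolved=1090),
]
print(f"MTTD={mttd(history):.1f} MTTR={mttr(history):.1f} MTBF={mtbf(history):.1f}")
```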

AppDynamics Architecture

AppDynamics Architecture – On Prem Vs SaaS Platform
• AppDynamics On Prem Ref Architecture
• AppDynamics SaaS Ref Architecture

AppDynamics UI Walkthrough
• Application Flow Map
• Transaction Score Card
• Business Transactions
• Transaction Snapshots
• Errors and Exceptions
• Dashboards
• Alerting
• Reports

Hands On Session:
• Set up and configure the Java agent for a Java application
• Set up and configure the Machine agent for OS monitoring
• Let's troubleshoot a live application issue
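As a rough illustration of the Java agent setup, the helper below assembles the JVM arguments that attach the agent to an application at startup. It is a sketch, not the session's actual procedure: the paths, controller host, application, tier and node names are hypothetical placeholders. The account name and access key, which the agent also needs, are left out here on the assumption that they are set in the agent's controller-info.xml. Verify the property names against the agent documentation for your version.

```python
import shlex
from typing import List


def java_agent_args(agent_jar: str, controller_host: str, controller_port: int,
                    application: str, tier: str, node: str,
                    ssl: bool = True) -> List[str]:
    """Build JVM arguments that load the AppDynamics Java agent and identify
    the application, tier and node to the Controller."""
    return [
        f"-javaagent:{agent_jar}",
        f"-Dappdynamics.controller.hostName={controller_host}",
        f"-Dappdynamics.controller.port={controller_port}",
        f"-Dappdynamics.controller.ssl.enabled={'true' if ssl else 'false'}",
        f"-Dappdynamics.agent.applicationName={application}",
        f"-Dappdynamics.agent.tierName={tier}",
        f"-Dappdynamics.agent.nodeName={node}",
    ]


# All values below are hypothetical placeholders.
args = java_agent_args(
    agent_jar="/opt/appdynamics/javaagent/javaagent.jar",
    controller_host="controller.example.com",
    controller_port=443,
    application="demo-shop",
    tier="web",
    node="web-node-1",
)
command = ["java", *args, "-jar", "app.jar"]
print(shlex.join(command))
```

The printed command can be pasted into the application's start script; the `-javaagent` argument has to come before `-jar` so that the JVM treats it as a JVM option rather than an application argument.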

DevOps Master Certified
Q/A
Feel free to reach out to us with any queries.
✓ Website : https://www.codvatechlabs.com
✓ Email Id : learn@codvatechlabs.com
Ref :
https://docs.appdynamics.com/21.7/en
