3. Challenges with Traditional NOC
High Alert Noise
✓ No alert prioritization (all alerts are converted directly into incidents)
✓ High volume of incidents due to lack of event prioritization
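The missing prioritization step can be as simple as a severity filter in front of the ticketing system, so only high-severity alerts become incidents. A minimal sketch (the severity names and threshold are illustrative assumptions, not from a specific tool):

```python
# Alert-prioritization sketch: only alerts at or above a severity threshold
# become incidents; the rest are kept as noise for trend analysis.
# Severity names and the default threshold are illustrative assumptions.
SEVERITY_RANK = {"info": 0, "warning": 1, "minor": 2, "major": 3, "critical": 4}

def triage(alerts, threshold="major"):
    """Split raw alerts into incident candidates and noise."""
    cutoff = SEVERITY_RANK[threshold]
    incidents = [a for a in alerts if SEVERITY_RANK[a["severity"]] >= cutoff]
    noise = [a for a in alerts if SEVERITY_RANK[a["severity"]] < cutoff]
    return incidents, noise

alerts = [
    {"ci": "db01", "severity": "critical"},
    {"ci": "web03", "severity": "info"},
    {"ci": "app02", "severity": "major"},
]
incidents, noise = triage(alerts)
```

In practice the threshold would differ per service class, but even this single gate stops every informational alert from becoming a ticket.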
Underutilized NOC
✓ NOC engineers play only an alert escalation & follow-up role (purely L1)
✓ No technical input during high-severity incidents (P1 outages), resulting in high MTTR (Mean Time To Resolution)
✓ 80% of the work involves manually monitoring alerts and watching rows of graphs on screen
Problem Management
✓ Mindset is focused only on alert resolution rather than problem management
✓ Lack of RCA & CAPA (Corrective Action & Preventive Action) practice for repetitive high-severity incidents
Scalability Issues
✓ Unable to scale rapidly during infrastructure expansion due to multiple manual processes
✓ High chance of missed monitoring coverage due to manual processes & lack of a feedback loop
SLA Issues
✓ Service Level Agreements (SLAs) are not business-aligned; the focus is only on infrastructure availability
✓ Lack of SLIs (Service Level Indicators) & SLOs (Service Level Objectives), resulting in inefficient SLA tracking
✓ SLIs are the best way to ensure availability & performance, rather than tracking the SLA alone
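The SLI/SLO relationship above can be made concrete: an availability SLI is the ratio of good events to total events, the SLO is the target for that ratio, and the gap between them is the error budget. A minimal sketch (the event counts and the 99.9% target are illustrative assumptions):

```python
# SLI/SLO sketch: availability SLI = good events / total events, compared
# against an SLO target; the unspent error budget drives SLA tracking.
# All figures here are illustrative assumptions.
def availability_sli(good_events, total_events):
    return good_events / total_events

def error_budget_remaining(sli, slo_target):
    """Fraction of the error budget still unspent (negative = overspent)."""
    allowed_failure = 1.0 - slo_target
    actual_failure = 1.0 - sli
    return (allowed_failure - actual_failure) / allowed_failure

sli = availability_sli(998_500, 1_000_000)   # 99.85% measured availability
budget = error_budget_remaining(sli, 0.999)  # against a 99.9% SLO
```

Here the measured SLI (99.85%) misses the 99.9% SLO, so the budget goes negative: a concrete, trackable signal, unlike an SLA stated only as "the infrastructure was up".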
4. SRE Roadmap
Roadmap stages: Collect Data → Correlate and Triage → Identify Trends → Predict, Notify & Act
SRE Golden Signals (Alerting, Troubleshooting, Tuning & Capacity Planning)
Monitoring, Auditing, Troubleshooting & Security (Compute | Storage | Network | Application)
Start Monitoring CIs
Work closely toward 100% monitoring coverage using continuous monitoring (immutable Infrastructure as Code)
Monitoring Data Sources
▪ SolarWinds (Compute, Storage & Network)
▪ Dynatrace (APM)
▪ Synthetic Monitoring
Design & implement a CMDB (single source of truth) for the entire infrastructure
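With the CMDB as the single source of truth, monitoring coverage stops being a guess: the coverage gap is simply the set difference between the CMDB inventory and what the monitoring tools actually report on. A sketch (CI names are hypothetical):

```python
# Coverage-gap sketch: diff the CMDB CI inventory (single source of truth)
# against the set of CIs the monitoring tools actually report on.
# CI names are hypothetical.
def coverage_report(cmdb_cis, monitored_cis):
    cmdb, monitored = set(cmdb_cis), set(monitored_cis)
    gaps = sorted(cmdb - monitored)    # CIs with no monitoring at all
    stale = sorted(monitored - cmdb)   # monitored hosts missing from the CMDB
    pct = 100.0 * len(cmdb & monitored) / len(cmdb)
    return {"coverage_pct": pct, "gaps": gaps, "stale": stale}

report = coverage_report(
    cmdb_cis=["db01", "web01", "web02", "app01"],
    monitored_cis=["db01", "web01", "app01", "legacy09"],
)
```

Run continuously, this is the feedback loop the traditional NOC lacked: every infra expansion either raises coverage toward 100% or immediately surfaces in the gap list.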
Trends & Anomalies
▪ Capacity planning
▪ Cost recommendations
▪ Continuous compliance (detect deviations from a “golden baseline”)
▪ Release-to-release benchmarks
▪ Toil reduction – automate repetitive tasks
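Continuous compliance against a golden baseline reduces, in the simplest case, to diffing each CI's current configuration snapshot against the approved baseline and flagging the keys that drifted. A minimal sketch (the configuration keys and values are illustrative assumptions):

```python
# Continuous-compliance sketch: flag settings whose current value deviates
# from the "golden baseline" snapshot. Keys and values are illustrative.
def detect_deviations(baseline, current):
    deviations = {}
    for key, expected in baseline.items():
        actual = current.get(key)
        if actual != expected:
            deviations[key] = {"expected": expected, "actual": actual}
    return deviations

golden = {"ntp": "enabled", "selinux": "enforcing", "ssh_root_login": "no"}
snapshot = {"ntp": "enabled", "selinux": "permissive", "ssh_root_login": "no"}
drift = detect_deviations(golden, snapshot)
```

Each flagged deviation can feed either an automated remediation (restoring the baseline) or a compliance report, rather than waiting for the drift to cause an incident.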
Problem Management
▪ Publish Top N noise-maker CIs
▪ Post-mortem culture using problem management (learning from failure)
▪ Implement custom self-healing for IT infrastructure & services
▪ Publish SLI, SLO & SLM reports
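The "Top N noise makers" report is a straightforward aggregation over the alert history: count alerts per CI and publish the worst offenders as problem-management candidates. A sketch (the alert data is illustrative):

```python
# "Top N noise makers" sketch: count alerts per CI and surface the worst
# offenders as problem-management candidates. Alert data is illustrative.
from collections import Counter

def top_noise_makers(alerts, n=3):
    return Counter(a["ci"] for a in alerts).most_common(n)

alerts = [{"ci": c} for c in
          ["db01", "db01", "web02", "db01", "web02", "app05"]]
top = top_noise_makers(alerts, n=2)
```

Published weekly, this list turns the "alert resolution only" mindset around: the CIs at the top are exactly where RCA & CAPA effort pays off most.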
Event Management
▪ Design & implement an AIOps-based layer that collects data (metrics/events) from multiple data sources & presents it in a single pane of glass
▪ Design & build service models
▪ Build event correlation (topology/stream) to reduce alert noise
▪ Consolidate monitoring tools
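A stream-correlation layer in its simplest form folds alerts that share a CI and check, arriving within a short time window, into one correlated event, so the single pane of glass shows one flapping database instead of dozens of duplicates. A sketch (the window size and alert fields are assumptions):

```python
# Stream-correlation sketch: alerts sharing (ci, check) within a time window
# collapse into one correlated event, reducing alert noise before display.
# The 300-second window and the alert fields are assumptions.
def correlate(alerts, window_s=300):
    """Group alerts by (ci, check); alerts within window_s of a group's
    first alert are folded into it as duplicates."""
    groups = {}
    for a in sorted(alerts, key=lambda a: a["ts"]):
        key = (a["ci"], a["check"])
        g = groups.get(key)
        if g and a["ts"] - g["first_ts"] <= window_s:
            g["count"] += 1
        else:
            groups[key] = {"ci": a["ci"], "check": a["check"],
                           "first_ts": a["ts"], "count": 1}
    return list(groups.values())

raw = [
    {"ci": "db01", "check": "cpu", "ts": 0},
    {"ci": "db01", "check": "cpu", "ts": 60},
    {"ci": "db01", "check": "cpu", "ts": 120},
    {"ci": "web02", "check": "disk", "ts": 90},
]
events = correlate(raw)
```

Topology-based correlation extends the same idea by grouping on the service model (e.g. all alerts behind a failed switch) rather than on identical keys.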
Incident Management
▪ Integrate monitoring events with ITSM ticketing
▪ Robust automated alert notification (PagerDuty | AlarmPoint)
▪ Define SLIs, SLOs & SLMs
▪ Make monitoring data available during production outages
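The monitoring-to-ITSM integration is, at its core, a mapping from a monitoring event to an incident payload, with ticket priority derived from event severity. A sketch (the field names and priority mapping are hypothetical, not a specific ITSM product's API):

```python
# Event-to-ticket sketch: map a monitoring event onto an ITSM incident
# payload, deriving ticket priority from event severity. Field names and
# the priority mapping are hypothetical, not any specific product's API.
PRIORITY_MAP = {"critical": "P1", "major": "P2", "minor": "P3"}

def to_incident(event):
    return {
        "short_description": f'{event["ci"]}: {event["summary"]}',
        "priority": PRIORITY_MAP.get(event["severity"], "P4"),
        "ci": event["ci"],
        "source": "monitoring",
    }

ticket = to_incident(
    {"ci": "db01", "summary": "replication lag high", "severity": "critical"}
)
```

The same payload can then fan out to the notification layer (PagerDuty, AlarmPoint) so paging and ticketing stay consistent with each other.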
5. SRE Team Structure: SRE Level 1 | SRE Level 2 | SRE (Tools & Automation SMEs)
Improve MTTD (SRE Level 1)
▪ Virtual team for live 24x7 monitoring (availability & performance)
▪ Automated alert escalation to the L2 NOC support team (P1 | P2 | P3 incidents)
▪ Track escalated alerts until alert resolution
▪ Engage Incident Management for P1 & P2 incidents
▪ Engage the NOC Dev team for missed monitoring opportunities
▪ Perform scheduled health check-ups
▪ Daily scheduled reports (availability | performance | outage etc.)
▪ Other BAU activities
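Tracking escalated alerts until resolution implies an escalation policy: an alert that sits unacknowledged past a timeout moves up a tier. A minimal sketch (the tier names and the 15-minute timeout are illustrative assumptions):

```python
# Escalation-tracking sketch: alerts unacknowledged past a timeout escalate
# one tier per interval. Tier names and the timeout are assumptions.
TIERS = ["L1", "L2", "L3/Product SME"]

def escalation_tier(age_minutes, ack_timeout=15):
    """Return the tier that should own an alert left unacknowledged
    for age_minutes, escalating one tier per timeout interval."""
    tier = min(age_minutes // ack_timeout, len(TIERS) - 1)
    return TIERS[tier]
```

Real escalation tools (e.g. PagerDuty escalation policies) implement this per schedule and per service; the point is that the policy is declarative, not a manual follow-up task.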
Improve MTTR (SRE Level 2)
▪ Provide L2 analysis for all incidents
▪ Escalate open incidents to L3/Product SMEs
▪ Analyse & fix monitoring alerts
▪ Runbooks - step-by-step guides for resolving an incident
▪ Incident response reports
▪ Post-mortem reports (RCA and tasks to be performed to avoid future outages)
▪ Engage the NOC Dev team for repetitive tasks
Note: this team has L2 engineers/SMEs from the OS, App, DB, Middleware & Network domains
Improve MTBF (SRE Tools & Automation SMEs)
▪ Monitor every possible metric in the environment
▪ Design & configure a robust monitoring system (continuous monitoring)
▪ Work on new monitoring opportunities
▪ Automate runbooks (self-healing)
▪ Toil reduction - shift repetitive tasks from a manual to an automated approach
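Automating a runbook into self-healing means treating it as an ordered list of check/remediate pairs: each failing check triggers its remediation and is re-checked, and anything still failing escalates to a human. A sketch (the checks and remediations here are stand-in functions, not real probes):

```python
# Self-healing sketch: a runbook as ordered (check, remediate) steps; each
# failing check triggers its remediation and is re-checked, and anything
# still failing is marked for escalation. Steps here are stand-ins.
def run_runbook(steps):
    """steps: list of (name, check_fn, remediate_fn). Returns per-step status."""
    results = {}
    for name, check, remediate in steps:
        if check():
            results[name] = "ok"
            continue
        remediate()
        results[name] = "healed" if check() else "escalate"
    return results

state = {"service_up": False}
steps = [
    ("disk_space", lambda: True, lambda: None),
    ("service_up", lambda: state["service_up"],
     lambda: state.update(service_up=True)),
]
status = run_runbook(steps)
```

The same structure works whether the remediation is a service restart, a disk cleanup, or a failover; the key design choice is that every automated step re-verifies before declaring success.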
Site Reliability Engineering - Landscape
SRE/DevOps Team Structure