Expedia Lessons from the Trenches: Managing AppDynamics at Scale
Bhadri Govindarajan, Sr. Application Engineer
Vishal Singla, SDE
Bio
•  Vishal Singla
–  SDE, Ecommerce Platform
–  Over 8 Years of Product Development Experience
•  Bhadri Govindarajan
–  Sr. Application Engineer, Ecommerce Platform
–  Over 12 Years of Experience with Infrastructure and Application Management
Copyright © 2015 AppDynamics. All rights reserved. 2
Agenda
•  Background
•  AppDynamics REST API
–  To Automate AppDynamics Transaction Snapshots
–  For Health Rule Automation
Your Ultimate Travel Companion
Behind The Book Button
•  Complex Distributed System
•  Multiple Datacenters
•  Several Teams
•  Different Technologies, Different Platforms
–  Web Services
–  Messaging Solutions
–  SQL, NoSQL Solutions
Your Ultimate Travel Companion
Behind The Book Button
•  One Booking
–  37 Web Services
–  56 Databases
–  11 Queues
–  3 Caching Systems
AUTOMATE APPDYNAMICS
TRANSACTION SNAPSHOT
Vishal Singla
Wait Begins
•  INC67095
•  ORD-7234
•  PAY-4325
•  SEARCH-4444
•  …..
Problem(s)
•  Dissatisfied customer
•  Time for a ticket to reach and identify the right owners
–  Troubleshooting time
Possible Fixes & Limitations
•  Grow team(s)
–  Neither Scalable nor Efficient
–  $$$$$
•  Link transactions across systems with a common identifier
–  Difficult to maintain a contract across 100+ application interactions
–  $$$
•  Application Performance Management Solution
–  $$
Solution Evaluation Criteria
“If you can’t explain it simply, you
don’t understand it well enough.”
- Albert Einstein
Things we were looking for
–  Ability to deep-dive into applications for troubleshooting
–  Easy to deploy
–  Visual Representation
Transaction Snapshot
•  Visual representation of the code paths
•  Depicts a set of diagnostic data taken at a certain point in time
•  Code-level visibility for troubleshooting problems in the environment
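The slides do not show how snapshots were retrieved, so the following is only a minimal sketch of one way to look up snapshots for a ticket identifier through the controller's REST API. It assumes the `request-snapshots` endpoint and its `data-collector-*` filter parameters as described in the AppDynamics REST documentation; verify them against your controller version. The controller host, the application name `Booking App`, and the data collector name `orderId` are hypothetical, and the code only builds the URL without sending a request.

```python
from urllib.parse import quote, urlencode


def snapshot_query_url(controller, application, collector_name,
                       collector_value, duration_mins=60):
    """Build a request-snapshots URL filtered by a data collector value.

    The endpoint path and parameter names are assumptions taken from the
    AppDynamics REST API documentation, not from the talk itself.
    """
    params = {
        "time-range-type": "BEFORE_NOW",
        "duration-in-mins": duration_mins,
        "data-collector-type": "Business Data",
        "data-collector-name": collector_name,
        "data-collector-value": collector_value,
        "need-props": "true",
        "output": "JSON",
    }
    # Application names may contain spaces, so percent-encode the path part.
    path = "/controller/rest/applications/{}/request-snapshots".format(
        quote(application, safe=""))
    return controller.rstrip("/") + path + "?" + urlencode(params)


# Hypothetical controller, application, and collector name.
url = snapshot_query_url("https://controller.example.com", "Booking App",
                         "orderId", "ORD-7234")
print(url)
```

An authenticated HTTP GET against such a URL would be expected to return the matching snapshots as JSON, which a ticketing integration could then attach to the incident.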
Design & Workflow
What went Wrong?
HEALTH RULE AUTOMATION
Bhadri Govindarajan
Typical Operational Problems
•  Server Crashed
•  Service Hung, Not Responding, Slow
•  Latency, Performance Problems
•  Errors, Exceptions
•  Garbage Collection Issues
•  Disk IO, Space Issues
•  Traffic Patterns
Unhappy Customers
Financial Loss $$$
Operational Efficiency ==> Customer Delight
•  Effective Monitors & Alerts
–  With Troubleshooting Guides
–  Eliminate Noise
•  Be Predictive
•  Detect Early, Restore Quickly
•  Auto Recover
Solution – Year Before Last Year
A Year Later
•  Time Consuming
•  Human Errors
•  Standards Drop
•  Inconsistent
•  Work Load Increase
•  Priority Changes
•  Audit Failures
Solution - Health Rule Automation
•  Templatized Approach
–  Template Health Rule in the Controller
–  Integrated with JIRA for Change Management
–  Jenkins Jobs
•  Workflow: Download Template → Replace Tier → Upload → Create Policy
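The "replace tier" step of the workflow (download template, replace tier, upload, create policy) can be sketched as a small text transform. This is an illustration under stated assumptions, not the presenters' Jenkins job: the XML below is a hypothetical, heavily simplified stand-in for a health rule exported from the controller, and the `__TIER__` placeholder convention and the tier names are invented for the example. Downloading the template and uploading the rendered rules would go through the controller and are not shown.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Assumed placeholder convention; a real exported health rule has many
# more elements than this hypothetical template.
TIER_PLACEHOLDER = "__TIER__"
TEMPLATE = """<health-rules>
  <health-rule>
    <name>__TIER__ - Response Time</name>
    <tier>__TIER__</tier>
  </health-rule>
</health-rules>"""


def render_for_tier(template, tier):
    """Return the template with every placeholder replaced by the tier name."""
    rendered = template.replace(TIER_PLACEHOLDER, escape(tier))
    # Parse once so a malformed result fails here rather than at upload time.
    ET.fromstring(rendered)
    return rendered


# Hypothetical tier names; one rendered document per tier.
tiers = ["checkout-service", "payment-service"]
rules = {tier: render_for_tier(TEMPLATE, tier) for tier in tiers}
print(rules["checkout-service"])
```

Keeping one template and rendering it per tier is what makes the resulting rules uniform; each rendered document would then be uploaded and wired to a policy in the later workflow steps.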
Health Rule Automation – Jenkins Job
Design & Workflow
Health Rule Automation
Gains
•  Created over 600 Health Rules for more than 70 Applications in 2 hours
•  Standardized Configurations
•  Decreased Overall MTTR
•  Reduced Cost of Maintaining Monitors and Alerts
•  Enable/Disable Alerts programmatically
–  During Deployments/Maintenance
–  Reduce Noise
Key Takeaways
•  Visualize Problem
•  Rapid Troubleshooting
•  Easy To Implement
•  Scalable Solution
–  Infrastructure Monitoring
–  Database Monitoring
–  Backend Monitoring
–  Business Transaction Monitoring
Useful Links
Configuring data collectors
•  https://docs.appdynamics.com/display/PRO14S/Configure+Data+Collectors
Configuring transaction snapshots
•  https://docs.appdynamics.com/display/PRO14S/Configure+Transaction+Snapshots
AppDynamics REST API documentation
•  https://docs.appdynamics.com/display/PRO14S/Use+the+AppDynamics+REST+API
Questions
Thank You

AppSphere 15 - Expedia Lessons from the Trenches: Managing AppDynamics at Scale
