Monitoring Your AWS Cloud Infrastructure


The presentation gives an overview of why and how to track and monitor your cloud infrastructure. It lists the different types of cloud monitoring, from the underlying infrastructure all the way up the application stack, and names relevant tools that can support monitoring of online cloud applications.

Published in: Technology

  1. Monitoring your cloud-hosted app
     18/07/2012, Andreas Chatzakis (@achatzakis on Twitter), AWS Usergroup Greece
  2. whoami
     Andreas Chatzakis (@achatzakis)
       • CTO & co-founder /
       • High traffic Greek Real Estate portal
       • Software delivery team management
       • IT Operations
       • Co-founder of AWS Usergroup Greece
  3. Why monitoring
     You need monitoring to act proactively on, or react to, availability & performance risks and issues:
       • Detect problems before (many) users are aware
       • Alerts and notifications at 3 AM
       • Be informed of issues you wouldn't be able to recreate
       • Collect data to discover the root cause of an incident ...and automate the response for next time
       • Statistics and KPIs to track service quality trends
       • Visibility to prioritize optimization efforts
       • Make sense of a large quantity of logs and data
  4. Monitoring in the cloud
     Principles are not that different from traditional infrastructure, but...
       • Cloud allows us to build highly dynamic setups
       • More data
       • Our tools need to adapt
       • Ephemeral resources require a centralized approach
       • Need aggregation based on server role
       • Cloud promises agility
       • Only possible when the cost of failure is low
       • Being able to spot issues in a more automated manner is key
       • The rise of devops
       • Developers need visibility to understand how their code affects costs and impacts availability
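
The "aggregation based on server role" point on slide 4 could be sketched roughly as follows. This is a minimal illustration, not a reference to any specific tool: the sample format (`host`, `role`, `cpu_pct` keys) and the function name `aggregate_by_role` are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List


def aggregate_by_role(samples: Iterable[dict]) -> Dict[str, dict]:
    """Group per-host CPU samples by server role (hypothetical sample format)."""
    by_role: Dict[str, List[float]] = defaultdict(list)
    for sample in samples:
        # Hosts come and go, so we key on the role rather than the hostname.
        by_role[sample["role"]].append(sample["cpu_pct"])
    return {
        role: {
            "hosts": len(values),
            "avg_cpu_pct": mean(values),
            "max_cpu_pct": max(values),
        }
        for role, values in by_role.items()
    }


if __name__ == "__main__":
    demo = [
        {"host": "web-1", "role": "web", "cpu_pct": 40.0},
        {"host": "web-2", "role": "web", "cpu_pct": 60.0},
        {"host": "db-1", "role": "db", "cpu_pct": 90.0},
    ]
    print(aggregate_by_role(demo))
```

Keying on role means an instance that is replaced by autoscaling still contributes to the same series.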
  5. Types of monitoring
     There is a variety of monitoring tools that complement each other:
       • External checks (is my app still up?)
       • Server monitoring (CPU, RAM, IO...)
       • Systems monitoring (MySQL, Apache, etc. metrics)
       • Process monitoring (restart crashed services)
       • Application monitoring (bottlenecks in the code)
       • End user monitoring (client side performance)
       • Log aggregation & analysis (centralized storage)
       • Cloud Analytics (do I make the most out of AWS?)
  6. Deployment models
     Consider the deployment model of each monitoring solution:
       • Agent vs Agent-less
       • SaaS vs DIY on own computing instances
       • Consider a different AZ or provider
       • Least privilege principle (e.g. read-only access for the agent)
  7. Pricing models
     Different pricing models are offered by the various solutions:
       • Freeware
       • Per host
       • Per host-hour
       • Per user
       • Per alert
       • Per stored GB
  8. External tests
     External tests detect failure & alert you so that you can react:
       • Treat your app as a black box
       • Periodic check from a bot
       • Define the expected response (a specific string)
       • Tests from different geographies
       • Report on average response time, latency etc.
       • Alert via email, SMS, phone
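
A black-box check of the kind described on slide 8 might look something like the sketch below. It uses only the Python standard library; the names `check` and `CheckResult`, the 2-second latency limit, and the pluggable `fetcher` parameter are assumptions made for illustration, not part of the presentation.

```python
import time
import urllib.request
from typing import Callable, NamedTuple, Tuple


class CheckResult(NamedTuple):
    ok: bool
    status: int
    elapsed_s: float
    reason: str


def fetch(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch a URL and return (HTTP status, body text)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


def check(
    url: str,
    expected: str,
    fetcher: Callable[[str], Tuple[int, str]] = fetch,
    max_elapsed_s: float = 2.0,  # assumed latency budget
) -> CheckResult:
    """Treat the app as a black box: request it and inspect only the response."""
    start = time.monotonic()
    try:
        status, body = fetcher(url)
    except Exception as exc:  # DNS failure, timeout, HTTP error, ...
        return CheckResult(False, 0, time.monotonic() - start, f"request failed: {exc}")
    elapsed = time.monotonic() - start
    if status != 200:
        return CheckResult(False, status, elapsed, f"unexpected status {status}")
    if expected not in body:
        return CheckResult(False, status, elapsed, "expected string missing")
    if elapsed > max_elapsed_s:
        return CheckResult(False, status, elapsed, "response too slow")
    return CheckResult(True, status, elapsed, "ok")
```

A scheduler (cron, for instance) would call `check` periodically and send an alert whenever `ok` is false. Running the same script from several regions approximates the "different geographies" point.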
  9. Server & Systems monitoring
     Server monitoring collects data from the OS and systems:
       • Server metrics (CPU, Load Average, RAM, IO activity)
       • System metrics (Apache status, MySQL connections...)
       • Typically works via an agent or remote access
       • Can point towards the root cause
       • But can't trace issues to specific parts of your code
       • Helps with capacity planning and scaling decisions
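
As a rough idea of what an agent from slide 9 collects, here is a small standard-library sketch. The metric names, the thresholds, and the functions `collect_metrics` and `breaches` are illustrative assumptions. `os.getloadavg` is only available on Unix-like systems.

```python
import os
import shutil
import time
from typing import Dict, List


def collect_metrics(path: str = "/") -> Dict[str, float]:
    """Sample a few OS-level metrics (Unix only, because of os.getloadavg)."""
    load1, _load5, _load15 = os.getloadavg()
    cores = os.cpu_count() or 1
    disk = shutil.disk_usage(path)
    return {
        "timestamp": time.time(),
        "load1_per_core": load1 / cores,
        "disk_used_pct": 100.0 * disk.used / disk.total,
    }


def breaches(metrics: Dict[str, float], thresholds: Dict[str, float]) -> List[str]:
    """Return the names of metrics that exceed their (assumed) thresholds."""
    return [
        name
        for name, limit in thresholds.items()
        if metrics.get(name, 0.0) > limit
    ]


if __name__ == "__main__":
    sample = collect_metrics()
    print(sample, breaches(sample, {"load1_per_core": 1.0, "disk_used_pct": 90.0}))
```

Storing these samples over time is what supports the capacity planning and scaling decisions mentioned on the slide.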
  10. Process monitoring
      Processes die or misbehave... Monitor their health and automate the response:
       • Tools that check critical processes
       • Restart crashed processes ...or those using too many resources
       • Can configure complex scenarios
       • Beware of false positives
       • Beware of recurring restarts
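
The "beware of recurring restarts" warning on slide 10 can be illustrated with a restart budget: restart a crashed process, but give up if it keeps crashing within a short window. This is a simplified sketch under assumed names (`RestartBudget`, `supervise`) and assumed limits (3 restarts per 60 seconds). It is not how any particular process monitor works.

```python
import subprocess
import time
from collections import deque
from typing import Deque, List


class RestartBudget:
    """Allow at most max_restarts within a sliding window of window_s seconds."""

    def __init__(self, max_restarts: int = 3, window_s: float = 60.0) -> None:
        self.max_restarts = max_restarts
        self.window_s = window_s
        self._stamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Forget restarts that fell out of the window.
        while self._stamps and now - self._stamps[0] > self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_restarts:
            return False
        self._stamps.append(now)
        return True


def supervise(cmd: List[str], budget: RestartBudget) -> int:
    """Run cmd, restarting it on non-zero exit until the budget is exhausted."""
    while True:
        code = subprocess.Popen(cmd).wait()
        if code == 0:
            return 0  # clean exit, nothing to restart
        if not budget.allow(time.monotonic()):
            raise RuntimeError(f"{cmd[0]} keeps crashing (exit {code}); giving up")
```

When the budget is exhausted, raising an error (and alerting a human) is usually safer than restarting forever, since a restart loop can hide the underlying fault.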
  11. Application monitoring
      A flight recorder for your code helps you fix real issues.
       • It is often hard to recreate a production issue.
       • Plugs into your app servers & tracks execution
       • Code tracing
       • Captures errors, input variables and debugging info
       • Records performance metrics
       • Time spent on DB, Cache, external services
       • Overhead of specific classes or methods
       • Slow queries
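
Application monitoring products from slide 11 instrument code automatically. The toy decorator below only hints at the idea by recording call counts, errors, and time spent per function. `traced` and `STATS` are hypothetical names for this sketch and do not correspond to any real agent's API.

```python
import functools
import time
from collections import defaultdict
from typing import Any, Callable, Dict

# Per-function counters, keyed by function name (illustrative in-memory store).
STATS: Dict[str, Dict[str, float]] = defaultdict(
    lambda: {"calls": 0, "errors": 0, "total_s": 0.0}
)


def traced(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record how often func is called, how often it fails, and time spent."""

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        entry = STATS[func.__name__]
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            entry["errors"] += 1
            raise  # never swallow the application's own error
        finally:
            entry["calls"] += 1
            entry["total_s"] += time.perf_counter() - start

    return wrapper
```

Wrapping a database helper with `@traced` would then expose the "time spent on DB" figure as `STATS["helper_name"]["total_s"]`.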
  12. End user monitoring
      Get real data about the experience of your app's users:
       • It works for you. Does it work for them?
       • Servers running ok. What about that 3rd party widget?
       • Typically collects actual end user data via JavaScript
       • Capture performance issues faced by user segments
       • OS / browser / addons
       • Network connection speed
       • Geographical location
       • First time visit vs warm browser cache
  13. Log aggregators
      Centralized storage of logs for cloud setups with ephemeral instances:
       • Logs are sent over to a centralized repository
       • Logs persist after a server has been decommissioned
       • Logs are captured, stored, archived & recycled
       • Logs are indexed and analyzed
       • Preconfigured analyzers for known apps
       • Free text analyzers for less known apps
       • Alerts based on specific patterns, frequencies
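
The last point on slide 13, alerting on patterns and frequencies, could be approximated like this once logs are in one place. The pattern names, regular expressions, and limits are made up for the example, as are the function names `count_patterns` and `alerts`.

```python
import re
from collections import Counter
from typing import Dict, Iterable, List


def count_patterns(lines: Iterable[str], patterns: Dict[str, str]) -> Counter:
    """Count how many log lines match each named regular expression."""
    compiled = {name: re.compile(rx) for name, rx in patterns.items()}
    counts: Counter = Counter()
    for line in lines:
        for name, rx in compiled.items():
            if rx.search(line):
                counts[name] += 1
    return counts


def alerts(counts: Counter, limits: Dict[str, int]) -> List[str]:
    """Return pattern names whose match count reached the (assumed) limit."""
    return sorted(name for name, limit in limits.items() if counts[name] >= limit)


if __name__ == "__main__":
    log = [
        "GET /search 200",
        "GET /listing/42 500",
        "ERROR db connection refused",
        "GET /listing/43 500",
    ]
    found = count_patterns(log, {"http_5xx": r"\s5\d\d$", "db_error": r"ERROR db"})
    print(alerts(found, {"http_5xx": 2, "db_error": 5}))
```

In practice the counts would be computed per time window, so that a burst of errors triggers an alert while an occasional one does not.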
  14. Swiss Army knives
      The future might belong to holistic monitoring solutions:
       • Monitoring at multiple levels
       • Correlating data can be a godsend for devops
       • Cloud management tools might move to integrate or provide such functionality
  15. A common pitfall
      While it does have its uses, you should not rely on custom application logging:
       • Typically inconsistent logging that is added reactively
       • Developer bias and limited understanding of operational issues: you log only what you anticipate will go wrong
       • Increased code maintenance costs and risks
       • Can hurt performance if you are not careful
       • Instead, use a proper monitoring toolset and let developers focus on building new functionality
  16. Cloud Analytics
      Combine traditional monitoring with Newvem's Analytics and make the most of the cloud:
       • Powerful analytics of cloud usage data
       • Reveal security & availability issues in your cloud infra
       • Get actionable insights
       • Identify opportunities for cost reductions
       • Spot overloaded resources requiring vertical or horizontal scaling
       • Visibility and confidence that you are making the most of the cloud
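
To make the "cost reductions" and "overloaded resources" points on slide 16 concrete, the sketch below classifies resources by average CPU utilisation. It is a generic illustration and does not reflect how any analytics product works: the 10% and 80% thresholds, the input format, and the name `classify_utilization` are all assumptions.

```python
from typing import Dict, List


def classify_utilization(
    avg_cpu_pct: Dict[str, float],
    low: float = 10.0,   # assumed: below this, a candidate for downsizing
    high: float = 80.0,  # assumed: above this, a candidate for scaling
) -> Dict[str, List[str]]:
    """Split resource IDs into underutilized, overloaded and ok groups."""
    report: Dict[str, List[str]] = {"underutilized": [], "overloaded": [], "ok": []}
    for resource_id, cpu in sorted(avg_cpu_pct.items()):
        if cpu < low:
            report["underutilized"].append(resource_id)
        elif cpu > high:
            report["overloaded"].append(resource_id)
        else:
            report["ok"].append(resource_id)
    return report


if __name__ == "__main__":
    print(classify_utilization({"i-a": 3.0, "i-b": 92.5, "i-c": 45.0}))
```

Underutilized resources point to possible savings, while overloaded ones are candidates for vertical or horizontal scaling.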
  18. Questions?