2. My Intro
● Sysadmin since 2002; started my career at a typical startup, a web-hosting company with daily
firefighting
● Worked at JP Morgan, Sabre Inc., etc.
● Started JBUG-Bangalore (JBoss User Group) in 2010
● AWS user since 2011, and the romance continues
3. Start with basics
What is CloudWatch
1) Repository of metrics
● For monitoring AWS services, CloudWatch could be a starting point
● AWS services such as Auto Scaling rely on CloudWatch to execute
● Each data point has a timestamp
● Raw data in and statistics out (We will revisit this)
2) Alarm
● Trigger Autoscaling Policies
● SNS Notification
8. Some vocabulary
● Namespace- Container for metrics
● Metrics- Basically data points, such as CPUUtilization, DiskReadBytes,
DiskWriteBytes, NetworkIn, NetworkOut
○ We could use https://www.npmjs.com/package/cloudwatch-agent to push custom metrics
● Statistics- Metric data aggregated over a period of time (min, max, sum, avg,
etc.)
● Units- Measurement units (for example seconds, bytes, percent)
● Periods- Length of time over which a statistic is aggregated, default is 60 sec
● Alarms
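To make "raw data in, statistics out" concrete, here is a minimal sketch in plain Python. It does not call CloudWatch; it only mimics how raw data points could be grouped into periods and reduced to statistics. The function name `aggregate` and the sample readings are hypothetical, and timestamps are assumed to be epoch seconds.

```python
from collections import defaultdict


def aggregate(datapoints, period=60):
    """Group (epoch_seconds, value) pairs into fixed periods and compute statistics."""
    buckets = defaultdict(list)
    for timestamp, value in datapoints:
        # Align each timestamp to the start of the period it falls in.
        start = timestamp // period * period
        buckets[start].append(value)

    stats = {}
    for start in sorted(buckets):
        values = buckets[start]
        stats[start] = {
            "SampleCount": len(values),
            "Minimum": min(values),
            "Maximum": max(values),
            "Sum": sum(values),
            "Average": sum(values) / len(values),
        }
    return stats


# Hypothetical CPUUtilization readings (percent), two per 60-second period.
readings = [(0, 40.0), (30, 60.0), (60, 80.0), (90, 90.0)]
per_minute = aggregate(readings, period=60)
```

With a 60-second period the four raw readings collapse into two sets of statistics, one per minute. Choosing a longer period (say 300) would collapse them into a single set.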
11. Alarm
Example:
● Average CPU utilization is more than 70% for 7 minutes
● Here CPU utilization is the metric
● Average is the statistic
● Threshold is 70%
● Evaluation time is 7 minutes
CloudWatch can also do AWS billing alarms (say, at 20 USD for a personal account)
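The CPU example above could be expressed as parameters for boto3's `put_metric_alarm`. This is a sketch under assumptions: the alarm name, the instance ID and the helper names are hypothetical, "7 minutes" is read as seven consecutive 60-second periods, and a 60-second period for EC2 assumes detailed monitoring is enabled. The call to AWS is kept in a separate function so the parameters can be inspected without credentials.

```python
def cpu_alarm_params(instance_id, threshold=70.0, minutes=7):
    """Build put_metric_alarm parameters for 'average CPU > threshold for N minutes'."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",          # the metric
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",                  # the statistic
        "Period": 60,                            # seconds per evaluated period
        "EvaluationPeriods": minutes,            # 7 x 60 s = 7 minutes
        "Threshold": threshold,                  # the 70% threshold
        "ComparisonOperator": "GreaterThanThreshold",
    }


def create_alarm(params):
    """Send the alarm definition to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the sketch loads without boto3 installed

    boto3.client("cloudwatch").put_metric_alarm(**params)


# Hypothetical instance ID, for illustration only.
params = cpu_alarm_params("i-0123456789abcdef0")
```

An SNS topic ARN could be added under `AlarmActions` to get the notification mentioned earlier.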
12. CloudWatch Automation Options
● Mainly used for custom metrics
● Push the metric to CloudWatch and it will store and graph
● Could be done using
○ AWS SDKs and libraries (Boto for Python or Fog for Ruby)
○ RESTful API
○ Tools (Ansible does it)
○ AWS CLI (Easiest)
13. AWS CLI example
● app_data.json
[
{
"MetricName": "Tasks",
"Value": 100,
"Unit": "Count"
}
]
● aws cloudwatch put-metric-data --namespace "SomeName" --metric-data file://app_data.json
● Any of the above approaches could help to embed monitoring into the app
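For comparison, the same push could be done from inside a Python app with boto3's `put_metric_data`. A rough sketch, assuming boto3 is installed and credentials are configured; the helper names `build_request` and `push` are made up for illustration, and the JSON is the same content as app_data.json above.

```python
import json

# Same content as app_data.json in the CLI example.
APP_DATA = '[{"MetricName": "Tasks", "Value": 100, "Unit": "Count"}]'


def build_request(namespace, metric_json):
    """Turn the JSON metric list into keyword arguments for put_metric_data."""
    return {"Namespace": namespace, "MetricData": json.loads(metric_json)}


def push(request):
    """Send the metric to CloudWatch (needs boto3 and AWS credentials)."""
    import boto3  # imported here so the sketch loads without boto3 installed

    boto3.client("cloudwatch").put_metric_data(**request)


request = build_request("SomeName", APP_DATA)
# push(request)  # uncomment to actually send the data point
```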
15. Pros
● Natively available from Amazon
● Single Console
● API provides great potential for customization
● Ingest API allows for custom metric integration
● Allows for SNS notification and automated actions
● Has PagerDuty integration
16. Limitations
● Low retention (two weeks)
● Can’t play with dashboards
● 1-minute interval is as good as it gets
● Maybe some more that I am not aware of :(
17. What about APM?
● Please don’t compare with New Relic or AppDynamics
● Get some stats
● Method Execution Time
● Method Execution Count
● Exceptions
● Error Count
● Gather using AspectJ
● Publish to CloudWatch and visualize
● One of the largest travel tech companies uses this method, minus CloudWatch
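The slide suggests AspectJ for gathering these stats in a Java app. As a stand-in, here is the same idea sketched with a Python decorator, which plays the role an around advice would play in AspectJ. All names (`monitored`, `STATS`, `to_metric_data`, `divide`) are hypothetical, and the output is only shaped like a CloudWatch `MetricData` list; publishing it is left out.

```python
import time
from collections import defaultdict
from functools import wraps

# Per-method counters: execution count, error count, total execution time.
STATS = defaultdict(lambda: {"count": 0, "errors": 0, "total_seconds": 0.0})


def monitored(func):
    """Record execution count, time and exceptions for the wrapped function."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        entry = STATS[func.__name__]
        entry["count"] += 1
        started = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            entry["errors"] += 1
            raise  # the caller still sees the original exception
        finally:
            entry["total_seconds"] += time.perf_counter() - started
    return wrapper


def to_metric_data(stats):
    """Convert collected stats into a list shaped like CloudWatch MetricData."""
    data = []
    for name, entry in stats.items():
        dims = [{"Name": "Method", "Value": name}]
        average = entry["total_seconds"] / entry["count"] if entry["count"] else 0.0
        data.append({"MetricName": "ExecutionCount", "Dimensions": dims,
                     "Value": entry["count"], "Unit": "Count"})
        data.append({"MetricName": "ErrorCount", "Dimensions": dims,
                     "Value": entry["errors"], "Unit": "Count"})
        data.append({"MetricName": "ExecutionTime", "Dimensions": dims,
                     "Value": average, "Unit": "Seconds"})
    return data


@monitored
def divide(a, b):
    """Hypothetical business method used only to exercise the decorator."""
    return a / b
```

The list returned by `to_metric_data(STATS)` could then be pushed to a custom namespace on a timer, using any of the automation options listed earlier.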
18. Add more meat to CloudWatch- Librato
● http://blog.librato.com/posts/make-cloudwatch-awsome
● Store data for one year
● Better UI