This document discusses several hidden features in CloudWatch for debugging serverless applications, including Logs Insights for powerful log querying, Metrics Insights for SQL queries on metrics, and X-Ray for distributed tracing. It also warns that CloudWatch can be expensive for logging and recommends only logging errors or limiting retention. Third-party services are suggested for specialized serverless observability needs.
CloudWatch hidden features for debugging serverless apps
1. CloudWatch hidden features for debugging serverless applications
Marko (ServerlessLife)
Full-stack Software Developer | AWS Certified Professional |
Serverless Specialist
Blog: www.serverlesslife.com
Email: marko@serverlesslife.com
Twitter: @ServerlessL
LinkedIn: https://www.linkedin.com/in/marko-serverlesslife/
2. CloudWatch
Set of AWS services for observability.
Observability
Umbrella term for:
● Logging
● Metrics
● Tracing
3. Why is it important?
Complex serverless applications look like this:
Source: https://dzone.com/articles/observability-driven-development-for-serverless
5. CloudWatch vs. 3rd party
+ quite powerful
+ it is already there, no issues with sharing the data
- user / developer experience 👎👎👎👎👎
3rd party tools/services:
Try to use services specialized for serverless: Lumigo, Epsagon, Thundra
7. ⚠Warning: CloudWatch is expensive 💸💸💸💸💸
CloudWatch is usually more expensive than Lambda.
Solutions:
● Do not log too much.
● Limit retention.
● Use sampling.
● Log request and other details only in case of an error.
● Enable detailed logging only when needed.
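The sampling and "details only on error" ideas above can be sketched roughly as follows. `createLogger`, its options and the 1% default sample rate are hypothetical names and values chosen for illustration; they are not part of any AWS SDK.

```javascript
// Hypothetical helper (an assumption for illustration, not an AWS API).
// Debug entries are written only for a sampled fraction of invocations;
// for all other invocations they are buffered and flushed only on error.
function createLogger({ sampleRate = 0.01, random = Math.random, sink = console.log } = {}) {
  // Decide once per logger (e.g. once per invocation) whether to log details.
  const sampled = random() < sampleRate;
  const buffer = [];

  return {
    debug(message, details = {}) {
      const entry = JSON.stringify({ level: 'DEBUG', message, ...details });
      if (sampled) {
        sink(entry);
      } else {
        buffer.push(entry); // kept in memory only, nothing is sent to CloudWatch
      }
    },
    error(message, details = {}) {
      // Flush buffered details first so the failure can be debugged.
      while (buffer.length > 0) sink(buffer.shift());
      sink(JSON.stringify({ level: 'ERROR', message, ...details }));
    },
  };
}
```

Creating one logger per invocation keeps the buffer small and makes the sampling decision apply to the whole request.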
8. Logs Insights
● New log data takes about 1 minute to show up in query results.
● Query language is extremely powerful.
● You can use regular expressions and parse the message.
● You can store the query (also with IaC).
● There are already prepared queries.
9. Logs Insights sample
Errors in Lambda:
fields @timestamp, @message
| filter @message like /ERROR/
| sort @timestamp desc
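The same query can also be run programmatically. The sketch below only builds the request parameters for the CloudWatch Logs StartQuery operation (which expects `startTime` and `endTime` in epoch seconds); the function name `buildErrorQueryParams`, the limit of 50 and the log group name in the usage note are assumptions for illustration.

```javascript
// Builds parameters for the CloudWatch Logs StartQuery operation.
// The helper name and the limit of 50 are illustrative assumptions.
function buildErrorQueryParams(logGroupName, minutesBack, now = Date.now()) {
  const endTime = Math.floor(now / 1000); // StartQuery uses epoch seconds
  return {
    logGroupName,
    startTime: endTime - minutesBack * 60,
    endTime,
    limit: 50,
    queryString: [
      'fields @timestamp, @message',
      '| filter @message like /ERROR/',
      '| sort @timestamp desc',
    ].join('\n'),
  };
}
```

For example, `buildErrorQueryParams('/aws/lambda/my-function', 60)` (a hypothetical log group) describes a query over the last hour. The result of StartQuery is a query ID that is then polled with GetQueryResults.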
12. Logs Insights sample
Average duration, max duration, min duration, P99 duration and request count:
filter @type = "REPORT"
| stats avg(@duration), max(@duration), min(@duration), pct(@duration, 99),
count(@duration) by bin(5m)
Credit: https://dev.to/aws-heroes/10-cloudwatch-logs-insights-examples-for-serverless-applications-4293
13. Logs Insights sample
Number of exceptions per 5-minute interval:
filter @message like /ERROR/
| stats count(*) as exceptionCount by bin(5m)
| sort exceptionCount desc
14. Logs Insights sample
Count the number of cold starts, the average init duration and the maximum init duration per Lambda function:
filter @type="REPORT"
| fields @memorySize / 1000000 as memorySize
| filter @message like /(?i)(Init Duration)/
| parse @message /^REPORT.*Init Duration: (?<initDuration>.*) ms.*/
| parse @log /^.*\/aws\/lambda\/(?<functionName>.*)/
| stats count() as coldStarts, avg(initDuration) as avgInitDuration,
max(initDuration) as maxInitDuration by functionName, memorySize
15. Logs Insights sample
Lambda cold start percentage over time
filter @type = "REPORT"
| stats
sum(strcontains(
@message,
"Init Duration"))
/ count(*)
* 100
as coldStartPercentage,
avg(@duration)
by bin(5m)
Credit: https://github.com/julianwood/serverless-cloudwatch-logs-insights-examples
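To make clear what the cold start queries compute, here is a small offline sketch of the same arithmetic over a list of REPORT lines: a line counts as a cold start when it contains "Init Duration". The function name `coldStartStats` is hypothetical, and the regular expression assumes the usual `Init Duration: <number> ms` layout of Lambda REPORT lines.

```javascript
// Offline illustration of the cold start arithmetic (hypothetical helper).
function coldStartStats(reportLines) {
  const initDurations = reportLines
    .map((line) => /Init Duration: ([\d.]+) ms/.exec(line))
    .filter((match) => match !== null)
    .map((match) => Number(match[1]));
  const coldStarts = initDurations.length;
  return {
    coldStarts,
    // Same formula as the query: cold starts / all invocations * 100.
    coldStartPercentage:
      reportLines.length === 0 ? 0 : (coldStarts / reportLines.length) * 100,
    maxInitDuration: coldStarts === 0 ? null : Math.max(...initDurations),
  };
}
```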
16. API Gateway
The Missing Guide to AWS API Gateway Access Logs
https://www.alexdebrie.com/posts/api-gateway-access-logs/
17. Metrics
Data is kept for 15 months.
Math expressions
Metrics Insights (SQL Query) - only last 3h
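As a rough sketch, a Metrics Insights query can be sent through the GetMetricData operation by placing the SQL text in the `Expression` field of a metric data query. The helper name `buildMetricsInsightsParams`, the `Id` value and the 60-second period are assumptions for illustration; the 3-hour check simply encodes the limit noted above.

```javascript
const THREE_HOURS_MS = 3 * 60 * 60 * 1000;

// Builds GetMetricData parameters for a Metrics Insights (SQL) query.
// Helper name, Id and Period are illustrative assumptions.
function buildMetricsInsightsParams(startTime, endTime) {
  if (endTime - startTime > THREE_HOURS_MS) {
    throw new RangeError('Metrics Insights only covers the last 3 hours');
  }
  return {
    StartTime: startTime,
    EndTime: endTime,
    MetricDataQueries: [
      {
        Id: 'avgDuration', // must start with a lowercase letter
        Period: 60,
        Expression:
          'SELECT AVG(Duration) FROM SCHEMA("AWS/Lambda", FunctionName) GROUP BY FunctionName',
      },
    ],
  };
}
```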
18. Types of Metrics
● Built-in
● Custom
○ Synchronously
○ Asynchronously:
■ Log something ➡ create filter to transform to metric
■ Embedded Metric Format:
Log in specific format ➡ auto-transformed to metrics
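A minimal sketch of an Embedded Metric Format log line, assuming a single metric with one dimension. The `_aws` envelope with `Timestamp` and `CloudWatchMetrics` follows the EMF specification; the helper name `toEmbeddedMetric`, the `FunctionName` dimension and the example values in the usage note are assumptions.

```javascript
// Serializes one metric value as an Embedded Metric Format log line.
// The helper name and the choice of FunctionName as the only dimension
// are illustrative assumptions.
function toEmbeddedMetric({
  namespace,
  functionName,
  metricName,
  value,
  unit = 'Count',
  timestamp = Date.now(), // epoch milliseconds
}) {
  return JSON.stringify({
    _aws: {
      Timestamp: timestamp,
      CloudWatchMetrics: [
        {
          Namespace: namespace,
          Dimensions: [['FunctionName']],
          Metrics: [{ Name: metricName, Unit: unit }],
        },
      ],
    },
    // Dimension and metric values live at the top level of the log event.
    FunctionName: functionName,
    [metricName]: value,
  });
}
```

Inside a Lambda function, writing the line to standard output is enough, for example `console.log(toEmbeddedMetric({ namespace: 'MyApp', functionName: 'orders', metricName: 'OrdersCreated', value: 1 }))`.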
19. Alarms
What alerts should you have for serverless applications?
https://lumigo.io/blog/what-alerts-should-you-have-for-serverless-applications/
⚠ Alarm fatigue
A high number of false alarms becomes meaningless background noise.
Composite Alarms - go into ALARM state only if all conditions of the rule are met
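A composite alarm is defined by a rule expression over other alarms. The sketch below builds a rule that fires only when every listed alarm is in ALARM state; the helper name `allInAlarm` and the alarm names in the usage note are hypothetical.

```javascript
// Builds a composite alarm rule expression that requires ALL given
// alarms to be in ALARM state (hypothetical helper).
function allInAlarm(alarmNames) {
  if (alarmNames.length === 0) {
    throw new Error('at least one alarm name is required');
  }
  return alarmNames.map((name) => `ALARM("${name}")`).join(' AND ');
}
```

For example, `allInAlarm(['orders-errors', 'orders-latency'])` produces a string that can be used as the `AlarmRule` of a composite alarm. Requiring several conditions at once is one way to cut down on false alarms.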
20. Tracing with X-Ray
Stores trace data for the last 30 days.
How to enable:
● Enable on API Gateway & Lambda
● Add:
const AWSXRay = require('aws-xray-sdk-core')
const AWS = AWSXRay.captureAWS(require('aws-sdk'))
● Additional configuration for SQS,
Use subsegments, annotations and metadata to add additional information.
Additional uses for subsegments:
● Lambda doesn’t allow us to add custom annotations and metadata to its root segment.
This can be resolved by a custom subsegment.
● When calling a 3rd party service.
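A minimal sketch of wrapping a call in a custom subsegment. The helper `traceCall` is hypothetical; it takes the parent segment as an argument (in a Lambda function this would typically come from `AWSXRay.getSegment()`), so the snippet itself does not depend on the SDK being installed. `addNewSubsegment`, `addAnnotation`, `addMetadata`, `addError` and `close` are the X-Ray SDK methods it relies on.

```javascript
// Runs `fn` inside a custom X-Ray subsegment (hypothetical helper).
// `parentSegment` is expected to behave like an X-Ray segment.
async function traceCall(parentSegment, name, { annotations = {}, metadata = {} }, fn) {
  const subsegment = parentSegment.addNewSubsegment(name);
  try {
    // Annotations are indexed and can be used to filter traces.
    for (const [key, value] of Object.entries(annotations)) {
      subsegment.addAnnotation(key, value);
    }
    // Metadata is stored with the trace but is not indexed.
    for (const [key, value] of Object.entries(metadata)) {
      subsegment.addMetadata(key, value);
    }
    return await fn();
  } catch (err) {
    subsegment.addError(err);
    throw err;
  } finally {
    subsegment.close(); // always close, otherwise the subsegment is not sent
  }
}
```

This pattern covers both uses listed above: attaching annotations and metadata that cannot go on Lambda's root segment, and timing a call to a 3rd party service.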
22. Insights (do not confuse with Logs Insights)
● Lambda Insights
● Contributor Insights ➡ DynamoDB
● …
23. Dashboard
● Automatic dashboards
● Sharing
● Custom Widgets
Create a Lambda that returns static HTML. JavaScript is not allowed. It can be made dynamic
by using special tags.
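A minimal sketch of such a Lambda, assuming the widget passes a `title` and a list of `items` as parameters; both parameter names are hypothetical, as are the helper names. The function returns plain HTML as a string and escapes the input.

```javascript
// Escapes text so it can be embedded safely in the returned HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Renders the widget body. `event.title` and `event.items` are
// hypothetical widget parameters used only for this illustration.
function renderWidget(event) {
  const title = escapeHtml(event.title || 'Status');
  const items = (event.items || [])
    .map((item) => `<li>${escapeHtml(item)}</li>`)
    .join('');
  return `<h2>${title}</h2><ul>${items}</ul>`;
}

// Lambda entry point: export this function as the handler.
async function handler(event) {
  return renderWidget(event);
}
```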