SRV210 Improving Microservice and Serverless Observability with Monitoring Data

Hundreds of microservices, millions of AWS Lambda invocations, and dozens of global regions: the way we design, build, and operate cloud infrastructure and applications is increasingly distributed and composed of ephemeral components. From experience, we know that a key to success with these systems is the ability to understand them using data. While there is considerable knowledge about how to use metrics and logs to analyze and troubleshoot traditional applications and infrastructure, emerging technologies like serverless functions and orchestrated containers require a new observability approach. This is especially true when trying to understand the relationship between new services, like an IoT or mobile backend, and legacy systems.

Presented at AWS re:Invent 2017 by Clay Smith, Developer Advocate at New Relic.


  1. © 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Improving Microservice and Serverless Observability. CLAY SMITH, NEW RELIC, @SMITHCLAY
  2. What’s Observability? A measure of how well we can understand a system from the work it does. “I know how long all the methods in this service take to execute.”
  3. What’s Instrumentation? Instrumentation: measuring events in software using code (a type of white-box monitoring). “This method took 25 ms to execute.”
  4. Agenda 1. System architectures of past, present, and future 2. Collecting the right data to understand modern architectures 3. Observability requirements for modern architectures 4. Case study: AWS Lambda observability 5. Q&A with New Relic customer
  5. How Did You Monitor Apps in 1967? 1. People in lab coats looking at blinking lights. 2. ‘Autotest’ (IBM System/360) • Status print-outs at different points during program execution • Main storage print-out in the event of failure (!) • ‘Automatic patch card inclusion’ (?) Source: IBM System/360 Programmer’s Basic Operating System Programmer’s Guide (September 1967). Attribution: Bundesarchiv, B 145 Bild-F038812-0014 / Schaack, Lothar / CC-BY-SA 3.0
  6. Good News: We Don’t Have to Wear Lab Coats Anymore 1. People in jeans and hoodies looking at screens. 2. Various types of machine data from different sources • Infrastructure • Backend Apps and Services • … Mobile, Browser, IoT, Edge, etc. Attribution: Flickr / Heisenberg Media/8408215473 / CC-BY-SA 3.0
  7. Software Architecture Continues to Change
  8. It’s Globally Distributed in Multiple Regions
  9. And Compute Is Getting Physically Closer with Edge Computing and IoT
  10. The Architecture Is Also Extremely Dynamic. Docker container lifespan in minutes (1-100), New Relic, April 2017
  11. More New Relic Customers Run Complex, Distributed Systems. New Relic Service Map of Reference Telco Architecture
  12. Good Data Can Help with the Technical Shift to New Systems • Improved debugging and troubleshooting • Designs validated with data • Reduced defects, more issues caught proactively • Improved feature velocity
  13. Good Data Can Help with the Cultural Shift to New Systems • Builds transparency across teams • Shared understanding of complex components • Decisions not (entirely) driven or explained by ‘gut feelings’ or guessing • Freedom to experiment • Blameless culture • ‘Context not control’
  14. Instrumentation Increases Observability
  15. But... How Do We Make Microservices and Serverless Functions Observable?
  16. #1: Observable Systems Should Emit Events: Metrics, Logs, and Traces • Logs: “The database won’t start after the update.” • Metrics: “Our application is 35% slower than last week after this configuration change.” • Traces: “What are the dependencies for this service?” New Relic Provides (*via Partner Integrations)
  17. #2: All Components (Not Just Critical Services!) Should Be Instrumented. Diagram labels: Browser, Mobile, Server; (Virtual) Hardware and Managed Services; Host Operating System and Containers; Application; Amazon EC2 Instance
  18. #3: Instrumentation Should Not Be Opt-in, Manual, or ‘Hard to Do’. Diagram labels: On-Premises Web Server On-Premises Relational Data Synthetic customers Customers Public Cloud Micro Services API Browser Apps Mobile NoSQL Data Store
  19. AWS Lambda Case Study
  20. Which Monitoring Batteries Are Included? Amazon CloudWatch Metrics, Amazon CloudWatch Logs
  21. AWS Lambda: Key Metrics 1. Invocations 2. Errors 3. Dead Letter Errors 4. Duration 5. Throttles 6. Iterator Age (stream-based invocations only) http://docs.aws.amazon.com/lambda/latest/dg/monitoring-functions-metrics.html
  22. What Else Provides AWS Lambda Observability? AWS X-Ray: request tracing for many AWS-managed services.
  23. AWS X-Ray Trace: Example. A “cold start” trace in AWS X-Ray. Annotations in red.
  24. Warm Start in an X-Ray Trace. Note that the function executes almost immediately after the service receives the request.
  25. Traces in Aggregate Show Interesting Trends
  26. Serverless Architecture for Aggregating Traces
  27. What Does the Data Show in Insights?
  28. A-ha Moment: It Was Under-provisioned with Memory! Memory: 768 MB vs. Memory: 1152 MB
  29. Lessons Learned • Instrument for observability: “What are the internal Lambda service latencies for my function?” • Find the right balance of metrics, logs, and traces for a given system: “Over 24 hours, what’s the distribution of function duration for my function?” • Use analytics to diagnose: “Are cold starts significant? What other factors are at play?”
  30. Q&A with Marcus Irven, Scripps Network: Serverless Architectures in Production
  31. THANK YOU! CLAY SMITH, NEW RELIC. TWITTER: @SMITHCLAY
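
As a rough illustration of two ideas from the slides (not code from the talk), the Python sketch below combines slide 3 (“This method took 25 ms to execute”) with the slide 29 question about the distribution of function duration. It wraps a function so that every call emits a duration event, then summarizes a set of durations. The names `timed`, `EVENTS`, `summarize`, and `handler` are hypothetical, and the in-memory list stands in for whatever monitoring backend a real system would send events to.

```python
import functools
import math
import statistics
import time

# Hypothetical in-memory event sink (an assumption for this sketch);
# a real system would ship events to a monitoring backend instead.
EVENTS = []


def timed(func):
    """Record one duration event per call of the wrapped function."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether the call returned or raised, so failures are timed too.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            EVENTS.append({"name": func.__name__, "duration_ms": elapsed_ms})
    return wrapper


def summarize(durations_ms):
    """Summarize a non-empty list of durations: count, median, p95, max."""
    ordered = sorted(durations_ms)
    # Nearest-rank 95th percentile: smallest value with at least 95% of
    # observations at or below it.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "count": len(ordered),
        "p50": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


@timed
def handler(payload):
    """Stand-in for a service method or function handler."""
    return {"echo": payload}
```

Calling `handler("ping")` a few times and then `summarize([e["duration_ms"] for e in EVENTS])` gives a small duration distribution. With a fixed input such as `[10, 20, 30, 40, 1000]`, the median is 30 while the p95 and max are both 1000, which is the kind of gap a single slow invocation (for example a cold start) can produce and that an average alone would blur.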
