Ten Years of Failing Microservices

As much as cloud-native applications and microservices help us be more productive and resilient and grow to unprecedented scales, they also bring an entirely new class of challenges. Let’s explore how the challenge of debugging applications has changed in a highly distributed world.

From: https://www.dashcon.io/agenda/ten-years-of-failing-microservices/

Published in: Software

  1. 1. Ten Years of Failing Microservices Phil Calçado - @pcalcado Meetup/WeWork
  2. 2. What is a microservice?
  3. 3. Monolith
  4. 4. Entry Point Service A Service B
  5. 5. Entry Point Service A Service B Service C Service D Service E
  6. 6. Microservices are highly-distributed application architectures.
  7. 7. Debugging monolithic applications
  8. 8. Monolith The bug you are looking for is here
  9. 9. Debugging highly-distributed applications
  10. 10. Entry Point Service A Service B Service C Service D Service E 🤔 🤔 🤔 🤔 🤔🤔
  11. 11. Entry Point Service A Service B Service C Service D Service E 🤔 🤔 🤔 🤔 🤔🤔 Most bugs you’ll find in microservices are still isolated in a single service.
  12. 12. But which one?
  13. 13. "
  14. 14. Some tools to find out which service to look at when debugging
  15. 15. Staging environments
  16. 16. “I know, we need a staging environment” “We need to fix the staging environment” “Staging is broken, let’s build another staging” “I need you to use the first staging because the new staging isn’t ready” “Don’t use staging, it’s all broken” “I know, we need a staging environment”
  22. 22. Useful tool: Request Tracing
  23. 23. How to correlate logs across services?
  24. 24. Entry Point Service A Service B Service C Service D Service E req1 req1-right1 req1-left1 req1-left2 req1-right1-left1 req1-right1-right1
  25. 25. Add span to your log lines
      2018-01-11 18:01:02.122 UTC - INFO - req1-right1-right1 - User [14523] deleted by user [56432]
      2018-01-11 18:01:02.132 UTC - INFO - req1-right1-right1 - User [12] made admin
      2018-01-11 18:01:03.002 UTC - INFO - req1-right1-right1 - User [3522] deleted by user [56432]
      2018-01-11 18:01:03.341 UTC - INFO - req1-right1-right1 - User [14523] created via Facebook
      2018-01-11 18:01:03.176 UTC - INFO - req1-right1-right1 - User [5643] deleted by user [1]
      2018-01-11 18:01:04.265 UTC - INFO - req1-right1-right1 - User [4577] deleted by user [7544]
      2018-01-11 18:01:04.531 UTC - INFO - req1-right1-right1 - User [3245] deleted by user [34]
      2018-01-11 18:01:06.001 UTC - INFO - req1-right1-right1 - User [14523] deleted by user [56432]
  26. 26. Search your logs by span
      $ uncompress log | grep req1-right1
      2018-01-11 18:01:02.122 UTC - INFO - req1-right1-right1 - User [14523] deleted by user [56432]
      2018-01-11 18:01:02.132 UTC - INFO - req2-right1-right1 - User [12] made admin
      2018-01-11 18:01:03.002 UTC - INFO - req1-right1-right1 - User [3522] deleted by user [56432]
      2018-01-11 18:01:04.143 UTC - ERROR - req1-right1-left1 - Failed to delete picture [3522.jpg] from CDN
      2018-01-11 18:01:06.001 UTC - INFO - req1-right1-right1 - User [14523] deleted by user [56432]
  27. 27. Useful tool: Telemetry
  28. 28. As the number of services or users grows, verbose logging becomes expensive
  29. 29. As you grow: • Verbose logging becomes expensive • There’s so much going on you’ll be lost
  30. 30. The first place to look at should be the telemetry dashboards
  31. 31. Entry Point Service A Service B Service C Service D Service E Automatically create a dashboard for each service
  32. 32. Useful tool: Request/Response Capturing
  33. 33. Diagram labels: Service, Request, Response, Storage, Queryable Interface
  34. 34. I was really against this idea.
  35. 35. Diagram labels: Service, Request, Response, Storage, Queryable Interface. Caption: Service owns the storage
  36. 36. Diagram labels: Service, Request, Response, Storage, Queryable Interface. Caption: Network owns the storage
  37. 37. Microservices’ dirty little secrets
  38. 38. ~1/10 of all my engineering teams was dedicated to building tooling
  39. 39. Polyglot programming has died
  40. 40. You end up with vendor lock-in with your own tools
  41. 41. …but in 2018 things aren’t nearly as bad anymore.
  42. 42. “Higher-level networking”
  43. 43. Serverless?
  44. 44. Do we have time for questions?
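
The request-tracing slides (24 to 26) show span IDs that grow by one suffix per hop, log lines that carry the span, and a search by span. The following is a minimal Python sketch of that idea, not the speaker's implementation: the helper names `child_span`, `format_log`, and `lines_for_span` are hypothetical, and the log layout is copied from the slides on the assumption that fields are separated by " - ".

```python
from datetime import datetime, timezone


def child_span(parent: str, branch: str, index: int) -> str:
    """Derive a child span ID by appending a branch label to the parent's ID.

    Hypothetical helper: the slides only show the resulting IDs
    (req1 -> req1-right1 -> req1-right1-left1), not how they are built.
    """
    return f"{parent}-{branch}{index}"


def format_log(when: datetime, level: str, span: str, message: str) -> str:
    """Render one log line in the layout shown on the slides, span included.

    Assumes `when` is already expressed in UTC; the "UTC" label is fixed text.
    """
    stamp = when.strftime("%Y-%m-%d %H:%M:%S.") + f"{when.microsecond // 1000:03d}"
    return f"{stamp} UTC - {level} - {span} - {message}"


def lines_for_span(lines: list[str], span: str) -> list[str]:
    """Keep lines logged under `span` or any span derived from it."""
    matches = []
    for line in lines:
        fields = line.split(" - ", 3)  # timestamp, level, span, message
        if len(fields) != 4:
            continue  # not in the expected layout; skip it
        logged_span = fields[2]
        if logged_span == span or logged_span.startswith(span + "-"):
            matches.append(line)
    return matches


if __name__ == "__main__":
    root = "req1"
    right = child_span(root, "right", 1)   # call made by the entry point
    leaf = child_span(right, "left", 1)    # call made one hop further down
    when = datetime(2018, 1, 11, 18, 1, 4, 143000, tzinfo=timezone.utc)
    log = [
        format_log(when, "ERROR", leaf, "Failed to delete picture [3522.jpg] from CDN"),
        format_log(when, "INFO", "req2-right1-right1", "User [12] made admin"),
    ]
    for line in lines_for_span(log, right):
        print(line)
```

Unlike a plain `grep req1-right1`, the filter compares only the span field, so a message that happens to mention a span ID, or a different request such as `req10`, is not matched by accident.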