Architectures That Scale Deep - Regaining Control in Deep Systems

Video and slides synchronized, mp3 and slide download available at URL https://bit.ly/2FWc5Sk.

Ben Sigelman talks about "Deep Systems" and their common properties, and reintroduces the fundamentals of control theory from the 1960s, including the original conceptualizations of observability and controllability. He uses examples from Google and other companies to illustrate how deep systems have damaged people's ability to observe software, and what needs to be done to regain control. Filmed at qconsf.com.

Ben Sigelman is a co-founder and the CEO at LightStep, a co-creator of Dapper (Google’s distributed tracing system), and co-creator of the OpenTracing and OpenTelemetry projects (both part of the CNCF). His work and interests gravitate towards observability, especially where microservices, high transaction volumes, and large engineering organizations are involved.

Published in: Technology
  1. 1. Ben Sigelman (@el_bhs, bhs@lightstep.com) Co-founder & CEO: LightStep Co-creator: OpenTracing, OpenTelemetry, Google Dapper, Google Monarch Architectures that Scale Deep: Regaining Control in Deep Systems QCon SF, November 2019
  2. 2. InfoQ.com: News & Community Site • Over 1,000,000 software developers, architects and CTOs read the site worldwide every month • 250,000 senior developers subscribe to our weekly newsletter • Published in 4 languages (English, Chinese, Japanese and Brazilian Portuguese) • Post content from our QCon conferences • 2 dedicated podcast channels: The InfoQ Podcast, with a focus on Architecture and The Engineering Culture Podcast, with a focus on building • 96 deep dives on innovative topics packed as downloadable emags and minibooks • Over 40 new content items per week Watch the video with slide synchronization on InfoQ.com! https://www.infoq.com/presentations/properties-deep-systems/
  3. 3. Purpose of QCon - to empower software development by facilitating the spread of knowledge and innovation Strategy - practitioner-driven conference designed for YOU: influencers of change and innovation in your teams - speakers and topics driving the evolution and innovation - connecting and catalyzing the influencers and innovators Highlights - attended by more than 12,000 delegates since 2007 - held in 9 cities worldwide Presented at QCon San Francisco www.qconsf.com
  4. 4. Part I Scaling, and Deep Systems
  5. 5. What is scale, anyway?
  6. 6. Scaling wide
  7. 7. Scaling wide
  8. 8. Scaling wide
  9. 9. Scaling wide
  10. 10. Scaling wide
  11. 11. Scaling deep
  12. 12. Scaling deep
  13. 13. Scaling deep
  14. 14. Scaling deep
  15. 15. Scaling deep
  16. 16. How does this look for software?
  17. 17. Software: Scaling wide
  18. 18. Software: Scaling deep
  19. 19. How do real-world systems look?
  20. 20. Microservices at scale aren’t just wide systems, they’re deep systems
  21. 21. Deep Systems: Architectures with ≥ 4 layers of independently operated services (including external/cloud dependencies)
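The definition on slide 21 can be turned into a simple check on a service call graph. The following is a minimal sketch, not something from the talk: the service names and the `CALLS` mapping are invented for illustration, every node is assumed to be an independently operated service, and the graph is assumed to be acyclic.

```python
# Hypothetical call graph: each service maps to the services it calls.
# All names are invented for illustration; none come from the talk.
CALLS = {
    "web-frontend": ["api-gateway"],
    "api-gateway": ["checkout", "search"],
    "checkout": ["payments", "inventory"],
    "search": ["inventory"],
    "payments": ["cloud-kms"],          # external/cloud dependency
    "inventory": ["cloud-database"],    # external/cloud dependency
    "cloud-kms": [],
    "cloud-database": [],
}

DEEP_THRESHOLD = 4  # the ">= 4 layers" figure from the slide


def depth(service, calls=CALLS):
    """Count the layers on the longest call chain starting at `service`.

    Assumes the call graph has no cycles.
    """
    downstream = calls.get(service, [])
    if not downstream:
        return 1
    return 1 + max(depth(d, calls) for d in downstream)


def is_deep(entrypoint, calls=CALLS):
    """True if the system reachable from `entrypoint` meets the slide's threshold."""
    return depth(entrypoint, calls) >= DEEP_THRESHOLD
```

In this made-up graph the longest chain from `web-frontend` passes through five services, so it counts as deep, while the subsystem rooted at `payments` has only two layers.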
  22. 22. What do deep systems sound like?
  23. 23. “Don’t deploy on Fridays” What do deep systems sound like?
  24. 24. “Where’s Chris?! I’m dealing with a P0 and they’re the only one who knows how to debug this.” What do deep systems sound like?
  25. 25. “It can’t be our fault, our dashboard says we’re healthy” What do deep systems sound like?
  26. 26. “Kafka is on fire” What do deep systems sound like?
  27. 27. “I need 100% availability from your team. One hundred percent.” What do deep systems sound like?
  28. 28. “I didn’t know I depended on that region” What do deep systems sound like?
  29. 29. “That was on a dashboard but I can’t find it” What do deep systems sound like?
  30. 30. Lots of challenges: - People-management - Security - Multi-tenancy - “Big-customer” success - Performance - Observability What do deep systems sound like?
  31. 31. Part II Control Theory: TL;DR Edition
  32. 32. Why do we care so much about observability, anyway?
  33. 33. A System Inputs Outputs … and its state vector,
  34. 34. Inputs A System Outputs … and its state vector, Observability How well can you infer internal state using only the outputs?
  35. 35. Outputs A System Inputs … and its state vector, Controllability How well can you control internal state using only the inputs?
  36. 36. Controllability is the dual of Observability
  37. 37. Controllability is the dual of Observability
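The slides state the duality without formulas. For reference, the standard textbook formulation for a linear time-invariant system is sketched below; it is assumed here that this is the setting the talk alludes to, and the symbols $A$, $B$, $C$, $D$, $x$, $u$, $y$ are the usual state-space notation rather than anything shown on the slides.

```latex
\begin{aligned}
\dot{x}(t) &= A\,x(t) + B\,u(t), \qquad y(t) = C\,x(t) + D\,u(t), \qquad x(t) \in \mathbb{R}^{n} \\
\mathcal{C}(A,B) &= \begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix},
  \qquad \text{controllable} \iff \operatorname{rank}\,\mathcal{C}(A,B) = n \\
\mathcal{O}(A,C) &= \begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix},
  \qquad \text{observable} \iff \operatorname{rank}\,\mathcal{O}(A,C) = n \\
\mathcal{O}(A,C)^{\mathsf{T}} &= \mathcal{C}\!\left(A^{\mathsf{T}}, C^{\mathsf{T}}\right)
\end{aligned}
```

The last line is the duality: the pair $(A, C)$ is observable exactly when the transposed pair $(A^{\mathsf{T}}, C^{\mathsf{T}})$ is controllable, because transposing a matrix does not change its rank.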
  38. 38. Part III What Deep Systems Mean for Observability
  39. 39. Architectural evolution (chart with axes "# of services" and "developers per service"; regions labeled "Pure Monoliths" and "Deep Systems")
  40. 40. Stress (n): responsibility without control. (Diagram labels: what you can control; what you are responsible for)
  41. 41. Observability: Shrink This Gap
  42. 42. Mental models A System
  43. 43. Managing Deep Systems Services must have SLOs (“Service Level Objectives”: latency, errors, etc) For effective service management, only three things matter: 0. Releasing service functionality 1. Gradually improving SLOs 2. Rapidly restoring SLOs In a deep system, we must control the entire “triangle” to maintain our SLOs
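To make the SLO idea on slide 43 concrete, here is a minimal sketch of checking one window of requests against a latency and error-rate objective. The `SLO` and `Request` types, the field names, and the nearest-rank percentile method are assumptions made for illustration; the talk does not prescribe a particular representation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """Hypothetical SLO shape: a latency percentile target plus an error budget."""
    latency_percentile: float  # e.g. 99.0 for p99
    latency_target_ms: float
    max_error_rate: float      # allowed fraction of failed requests


@dataclass(frozen=True)
class Request:
    latency_ms: float
    ok: bool


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def evaluate(slo, requests):
    """Compare one non-empty window of requests against the SLO."""
    latency = percentile([r.latency_ms for r in requests], slo.latency_percentile)
    error_rate = sum(1 for r in requests if not r.ok) / len(requests)
    return {
        "latency_ms": latency,
        "error_rate": error_rate,
        "latency_ok": latency <= slo.latency_target_ms,
        "errors_ok": error_rate <= slo.max_error_rate,
    }
```

A check like this only says whether an objective is currently met. "Rapidly restoring SLOs" still requires knowing which layer of the system caused the violation.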
  44. 44. Controllability == Observability. There’s that word again…
  45. 45. Observability: “The Conventional Wisdom” Observing microservices is hard Google and Facebook solved this (right???) They used Metrics, Logging, and Distributed Tracing… … So we should, too.
  46. 46. 3 Pillars, 3 Experiences Metrics Logs Traces
  47. 47. Three Pillars? Two giant pipes… Logs, Metrics. Without Traces: Cognitive Load ≈ O(depth²)
  48. 48. Three Pillars? Two giant pipes… Logs, Metrics
  49. 49. Two giant pipes… Logs, Metrics. Without Traces: Cognitive Load ≈ O(depth²)
  50. 50. Traces
  51. 51. Traces provide Context
  52. 52. Traces provide Context And context rules out invalid hypotheses
  53. 53. Two giant pipes and a filter Logs Metrics Context (from traces)
  54. 54. Context (from traces) Context reduces cognitive load With Traces: Cognitive Load ≈ O(depth) Relevant Metrics Relevant Logs
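The "two giant pipes and a filter" picture can be sketched in a few lines: take a slow trace, note which services it touched, and keep only the log lines that belong to it. This is an illustrative sketch under stated assumptions, not how any particular tracing product works. The `Span` and `LogLine` types, the sample data, and the premise that every log line carries a `trace_id` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    trace_id: str
    service: str
    duration_ms: float


@dataclass(frozen=True)
class LogLine:
    service: str
    trace_id: str  # assumption: log lines are tagged with the trace they belong to
    message: str


# Invented sample data for illustration.
SPANS = [
    Span("t1", "api-gateway", 480.0),
    Span("t1", "checkout", 450.0),
    Span("t1", "payments", 420.0),
    Span("t2", "api-gateway", 30.0),
    Span("t2", "search", 20.0),
]
LOGS = [
    LogLine("payments", "t1", "retrying card processor"),
    LogLine("search", "t2", "cache hit"),
    LogLine("checkout", "t1", "waiting on payments"),
    LogLine("inventory", "t3", "reindex started"),
]


def slowest_trace_id(spans):
    """Trace id of the single longest span, used here as a stand-in for 'the slow request'."""
    return max(spans, key=lambda s: s.duration_ms).trace_id


def relevant_logs(logs, spans, trace_id):
    """Keep only log lines for this trace, from services that appear on it."""
    services_on_trace = {s.service for s in spans if s.trace_id == trace_id}
    return [
        line for line in logs
        if line.trace_id == trace_id and line.service in services_on_trace
    ]
```

With the sample data, the slow trace is `t1`, and the filter keeps the `payments` and `checkout` lines while dropping the unrelated `search` and `inventory` output. That is the sense in which context rules out invalid hypotheses.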
  55. 55. Observability: Shrink This Gap
  56. 56. Let’s Review
  57. 57. Microservices don’t just scale wide, they scale deep Recognize deep systems
  58. 58. Stress (n): responsibility without control. (Diagram labels: what you can control; what you are responsible for)
  59. 59. “Controllability” (of SLOs) depends on observability
  60. 60. … and traces are not sprinkles “The Three Pillars of Observability” is a lousy metaphor
  61. 61. Tracing can reduce cognitive load from O(depth²) to O(depth)
  62. 62. Tracing is the backbone of simple observability in deep systems
  63. 63. Thank You Feedback always welcome: twitter → @el_bhs the emails → bhs@lightstep.com Play with LightStep, for free, anytime: (no email address required!) lightstep.com/play
