
Alon Fliess: APM – What Is It, and Why Do I Need It? - Architecture Next 20

So, you have a mature development process, and you also embrace DevOps. Your development team uses agile methodology. You use Git, and you have a continuous dev, test, and deployment process. But do you sleep well at night? Do you know that your services are up and running? That there are no availability, performance, and stability problems? Do you know if your customers are happy? The answer to all of those questions is precisely what APM systems provide.

Application Performance Monitoring systems have become the IDE of Site Reliability Engineers (SREs) and, in fact, of the entire DevOps team, including the Dev part. In this session, you will get to know the essence of APM systems: the good, the bad, and a vision of their future.

Published in: Software


  1. Observability & More. Alon Fliess, Chief Architect. alonf@codevalue.net, @alon_fliess, http://alonfliess.me, http://codevalue.net
  2. Cloudflare blames ‘bad software’ deployment for today’s outage
  3. About Me
       • Alon Fliess:
       • Chief Software Architect & Co-Founder at OzCode & CodeValue
       • More than 30 years of hands-on experience
       • Microsoft Regional Director & Microsoft Azure MVP
       • Spend most of my time in project analysis, architecture, design
       • Code at night
  4. Azure Israel
       • https://www.meetup.com/AzureIsrael
  5. Agenda
       • DevOps, the true story
       • Microservice Architecture, the complexity shift
       • Ops & Monitoring
       • Site Reliability Engineers
       • Developers & Observability
       • Business (marketing, sales, management) and observability
       • Application Performance Monitoring
       • How does it work?
       • Distributed Tracing
       • Production problem solving
  6. The Essence of DevOps
       • Better Software, Faster! When Development and Operations Synergize
       • Covers the *entire* Application Lifecycle
  7. Microservice Architecture == Complexity Shift
  8. Ops
       • Vital Signs: Heartbeat, Blood Pressure, Temperature
  9. What Do Site Reliability Engineers (SREs) Want?
  10. What Do Developers Want?
  11. What Do Marketing & Sales Teams Want?
  12. What is Observability? (Twitter 2013)
  13. Gartner Critical Capabilities for APM (May 2019): Business Analysis, Anomaly Detection, IT Operations, DevOps Release, Application Support, Application Development, Application Owner, Use Cases
  14. (no transcript text)
  15. APM Players: Dynatrace, AppDynamics (Cisco), Datadog, Splunk, Broadcom (CA Technologies), New Relic, Riverbed, IBM, Instana, Oracle, Tingyun, SolarWinds, ManageEngine, Micro Focus
  16. How Does Monitoring & Tracing Work?
       • Operating Systems: APM system tracking agent installed on the machine (CPU, Memory, I/O, Network)
       • Code: Tracing; Instrumentation (Manual, Auto); Runtime data collection
  17. Instrumentation – Original Pseudo Code
      Function AddToBasket(var productId, var quantity)
          if (quantity < 0)
              return false
          var product = Dal.GetProductById(productId)
          BasketService.Add(product, quantity)
          return true
  18. Instrumentation – Add Logging on Errors
      Function AddToBasket(var productId, var quantity)
          if (quantity < 0)
              Log("Error: Negative quantity value")
              return false
          var product = Dal.GetProductById(productId)
          BasketService.Add(product, quantity)
          return true
  19. Instrumentation – Add Metrics of Usage and Errors
      Function AddToBasket(var productId, var quantity)
          metrics.Count("AddToBasket", 1)
          if (quantity < 0)
              Log("Error: Negative quantity value")
              metrics.Count("AddToBasketFailure", 1)
              return false
          var product = Dal.GetProductById(productId)
          BasketService.Add(product, quantity)
          return true
  20. Instrumentation – Measure Latency
      Function AddToBasket(var productId, var quantity)
          metrics.Count("AddToBasket", 1)
          start = time()
          if (quantity < 0)
              Log("Error: Negative quantity value")
              metrics.Count("AddToBasketFailure", 1)
              return false
          var product = Dal.GetProductById(productId)
          BasketService.Add(product, quantity)
          metrics.Measure("AddToBasket", time() - start)
          return true
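
The counting, logging, and timing steps from the slides above can be sketched in Python roughly as follows. This is a minimal illustration, not the speaker's implementation: the `Metrics` class, the `PRODUCTS` dictionary (standing in for `Dal.GetProductById`), and the `basket` list (standing in for `BasketService`) are all hypothetical. One deliberate difference from the slide: the timing is recorded in a `finally` block, so the failure path is measured too.

```python
import logging
import time
from collections import defaultdict

logger = logging.getLogger("basket")


class Metrics:
    """Hypothetical in-memory metrics sink; a real APM client would ship these out."""

    def __init__(self):
        self.counters = defaultdict(int)
        self.timings = defaultdict(list)

    def count(self, name, value=1):
        self.counters[name] += value

    def measure(self, name, seconds):
        self.timings[name].append(seconds)


metrics = Metrics()
PRODUCTS = {1: "keyboard"}  # stand-in for Dal.GetProductById
basket = []                 # stand-in for BasketService


def add_to_basket(product_id, quantity):
    metrics.count("AddToBasket")
    start = time.perf_counter()
    try:
        if quantity < 0:
            logger.error("Negative quantity value")
            metrics.count("AddToBasketFailure")
            return False
        product = PRODUCTS[product_id]
        basket.append((product, quantity))
        return True
    finally:
        # Runs on every exit path, so failed calls are timed as well.
        metrics.measure("AddToBasket", time.perf_counter() - start)
```

Calling `add_to_basket(1, 2)` and then `add_to_basket(1, -1)` leaves two timing samples and one failure count in `metrics`.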
  21. Instrumentation – Measure Latency Everywhere
      Function AddToBasket(var productId, var quantity)
          metrics.Count("AddToBasket", 1)
          start = time()
          if (quantity < 0)
              Log("Error: Negative quantity value")
              metrics.Count("AddToBasketFailure", 1)
              return false
          var product = Dal.GetProductById(productId)
          metrics.Measure("AddToBasket_GetProductById", time() - start)
          BasketService.Add(product, quantity)
          metrics.Measure("AddToBasket", time() - start)
          return true
  22. Instrumentation – Add Debugging Information
      Function AddToBasket(var productId, var quantity)
          debug.AddParameters("AddToBasket", [["ProductId", productId], ["quantity", quantity]])
          metrics.Count("AddToBasket", 1)
          start = time()
          if (quantity < 0)
              Log("Error: Negative quantity value")
              metrics.Count("AddToBasketFailure", 1)
              debug.AddError("AddToBasket", GetErrorData())
              return false
          var product = Dal.GetProductById(productId)
          debug.AddValue("AddToBasket", [["product", product]])
          metrics.Measure("AddToBasket_GetProductById", time() - start)
          BasketService.Add(product, quantity)
          metrics.Measure("AddToBasket", time() - start)
          return true
  23. Instrumentation – Original vs. Instrumented Code
      Function AddToBasket(var productId, var quantity)
          debug.AddParameters("AddToBasket", [["ProductId", productId], ["quantity", quantity]])
          metrics.Count("AddToBasket", 1)
          start = time()
          if (quantity < 0)
              Log("Error: Negative quantity value")
              metrics.Count("AddToBasketFailure", 1)
              debug.AddError("AddToBasket", GetErrorData())
              return false
          var product = Dal.GetProductById(productId)
          debug.AddValue("AddToBasket", [["product", product]])
          metrics.Measure("AddToBasket_GetProductById", time() - start)
          BasketService.Add(product, quantity)
          metrics.Measure("AddToBasket", time() - start)
          return true
  24. Instrumentation and Tracing Automation
       • Aspect Oriented Approach
           • Communication level instrumentation
           • Pipeline interception – technology dependent
           • Resource performance counters – DB statistics for example
       • Code Instrumentation
           • Manual – deploy a package and call it
           • Automatic – bytecode instrumentation libraries and tools
       • Distributed Tracing
           • Passing call context between services
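
One way to picture the aspect-oriented idea is a Python decorator that wraps a function and records call counts, failures, and latency without touching the function body. This is only a sketch under stated assumptions: `instrumented`, `counters`, and `latencies` are hypothetical names, and a decorator is applied in source code, so it is closer to the manual end of the spectrum than to the bytecode instrumentation that APM agents perform automatically.

```python
import functools
import time
from collections import defaultdict

counters = defaultdict(int)    # hypothetical in-memory metric stores
latencies = defaultdict(list)


def instrumented(func):
    """Wrap func so calls, failures, and latency are recorded outside its body."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        name = func.__name__
        counters[name] += 1
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            counters[name + "_failure"] += 1
            raise
        finally:
            latencies[name].append(time.perf_counter() - start)

    return wrapper


@instrumented
def add_to_basket(product_id, quantity):
    # Business logic only; no metrics or timing code in the body.
    if quantity < 0:
        raise ValueError("Negative quantity value")
    return True
```

The business function stays as short as the original pseudocode on slide 17, while the cross-cutting concerns live in one reusable wrapper.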
  25. Distributed Tracing (diagram labels: Id:123, Application, Service A, Service B, Span, Span, Span)
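
The mechanism behind distributed tracing, a shared trace id passed from one service to the next with each hop recorded as a span, can be sketched as follows. The header names `x-trace-id` and `x-parent-span-id` are made up for this illustration (they are not a standard), `collected_spans` stands in for a tracing backend, and a direct function call stands in for the network request between services.

```python
import uuid

collected_spans = []  # hypothetical stand-in for a tracing backend


def start_span(name, headers):
    """Create a span that joins the trace named in headers, or starts a new one."""
    span = {
        "trace_id": headers.get("x-trace-id") or uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": headers.get("x-parent-span-id"),
        "name": name,
    }
    collected_spans.append(span)
    return span


def outgoing_headers(span):
    """Headers the caller attaches so the next service can continue the trace."""
    return {"x-trace-id": span["trace_id"], "x-parent-span-id": span["span_id"]}


def service_b(headers):
    start_span("ServiceB.handle", headers)
    return "ok"


def service_a(headers):
    span = start_span("ServiceA.handle", headers)
    # A direct call stands in for an HTTP request carrying the headers.
    return service_b(outgoing_headers(span))
```

After `service_a({"x-trace-id": "123"})`, both spans carry trace id `123`, and the Service B span names the Service A span as its parent, which is what lets a backend reassemble the call tree.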
  26. Instrumentation – Call Context
      Function AddToBasket(var productId, var quantity, var context)
          debug.AddParameters(context, "AddToBasket", [["ProductId", productId], ["quantity", quantity]])
          metrics.Count(context, "AddToBasket", 1)
          start = time()
          if (quantity < 0)
              Log(context, "Error: Negative quantity value")
              metrics.Count(context, "AddToBasketFailure", 1)
              debug.AddError(context, "AddToBasket", GetErrorData())
              return false
          var product = Dal.GetProductById(context, productId)
          debug.AddValue(context, "AddToBasket", [["product", product]])
          metrics.Measure(context, "AddToBasket_GetProductById", time() - start)
          BasketService.Add(context, product, quantity)
          metrics.Measure(context, "AddToBasket", time() - start)
          return true
      Context: Call Id, URL, HTTP Method, DB Host, User Info, Timing Info
  27. Instrumentation – Using Span
      Function AddToBasket(var productId, var quantity, var context)
          span = trace.BeginSpan(context, {"AddToBasket", productId, quantity})
          if (quantity < 0)
              span.Error("Negative quantity value")
              span.End()
              return false
          var product = Dal.GetProductById(context, productId)
          span.AddValue("product", product)
          BasketService.Add(context, product, quantity)
          span.End()
          return true
      Span: Call Id, URL, HTTP Method, DB Host, User Info, Timing Info
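
A span bundles the name, attributes, error state, and timing of one operation, so a single object replaces the separate debug, metrics, and log calls. The sketch below is a hypothetical, minimal span type rather than the API of any real tracing library; `Span`, `begin_span`, and `finished_spans` are illustrative names. Using a context manager guarantees the span is ended on every exit path, including early returns and exceptions.

```python
import time
from contextlib import contextmanager

finished_spans = []  # hypothetical stand-in for a span exporter


class Span:
    def __init__(self, name, trace_id, attributes):
        self.name = name
        self.trace_id = trace_id
        self.attributes = dict(attributes)
        self.error = None
        self.start = time.perf_counter()
        self.duration = None

    def add_value(self, key, value):
        self.attributes[key] = value

    def set_error(self, message):
        self.error = message

    def end(self):
        self.duration = time.perf_counter() - self.start
        finished_spans.append(self)


@contextmanager
def begin_span(trace_id, name, **attributes):
    span = Span(name, trace_id, attributes)
    try:
        yield span
    except Exception as exc:
        span.set_error(repr(exc))
        raise
    finally:
        span.end()  # always runs, even on early return


def add_to_basket(trace_id, product_id, quantity):
    with begin_span(trace_id, "AddToBasket",
                    product_id=product_id, quantity=quantity) as span:
        if quantity < 0:
            span.set_error("Negative quantity value")
            return False
        span.add_value("product", f"product-{product_id}")
        return True
```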
  28. OpenTracing & OpenCensus
  29. What Do SREs & Developers Want – From Each Other?
  30. New Relic APM Dashboard
  31. APM Error Analysis – Not Enough Information
       • Error Rate
       • Request information
       • Stack trace
       • APM systems can assist in health monitoring and fault first aid
  32. Production Problem Solving Challenges (each shown on the slide as a 10kg weight)
       • Can’t mess with data
       • No Debugging tools
       • Code is optimized
       • Older source code version
       • Can’t impact performance
       • Data must stay in a secure env.
       • Data is private and contains PII
       • Very hard to reproduce the bug
  33. Problem Solving With APM
  34. Production Problem Solving Platforms
       • OzCode
       • OverOps
       • Rookout
       • Application Insights
  35. Problem Solving With a Production Debugger
  36. OzCode Production Debugger
  37. Summary
  38. Q&A
  39. Alon Fliess, Chief Architect. alonf@codevalue.net, @alon_fliess, http://alonfliess.me, http://codevalue.net
