Advertisement
Advertisement

More Related Content

Slideshows for you(20)

Similar to What is going on? Application Diagnostics on Azure - Copenhagen .NET User Group(20)

Advertisement

More from Maarten Balliauw(20)

Advertisement

What is going on? Application Diagnostics on Azure - Copenhagen .NET User Group

  1. What is going on? Application Diagnostics on Azure Maarten Balliauw @maartenballiauw
  2. Who am I? Maarten Balliauw Antwerp, Belgium Developer Advocate, JetBrains Founder, MyGet AZUG Focus on web ASP.NET MVC, Azure, SignalR, ... Former MVP Azure & ASPInsider Big passion: Azure http://blog.maartenballiauw.be @maartenballiauw
  3. Agenda Logging Why? Logging sucks! The need for semantic/structured logging Application Insights SDK, Azure portal Application Insights Analytics Find the needle in the haystack! Build, measure, improve
  4. Logging
  5. Why logging? Troubleshooting – Did a problem occur? Where? Our code? The machine? Performance / Cost – Did we make the application slower? Improvement – Can a problem be detected or avoided? Trends – Need more storage? Need more servers? Customer Experience – Are people happy? Or seeing error after error? Business Decisions – Can we increase sales based on log data / telemetry?
  6. So here’s what we typically do... System.Diagnostics.Trace.TraceInformation( "Something happened"); System.Diagnostics.Trace.TraceWarning( "Error! " + ex.Message); Does this help troubleshooting? Improve the application? Analyze trends?
  7. And here’s a typical log... App.exe [12:13:03:985] Information: 0 : Customer address updated. App.exe [12:13:04:011] Error: 8 : System.NullReferenceException occurred. Value can not be null. App.exe [12:13:04:567] Information: 0 : Machine policy value 'Debug' is 0 App.exe [12:13:04:569] Verbose: 9 : ******* RunEngine ******* Action: ******* CommandLine: ********** App.exe [12:13:04:578] Information: 9 : Entered CheckResult() App.exe [12:13:04:689] Debug: 0 : Created. Req=0, Ret=8, Font: Req=null Does this help troubleshooting? Improve the application? Analyze trends?
  8. No.
  9. Log files suck. Stupid string data Unless using ETW / proper System.Diagnostics listener Typical log file has no “context” Typical log file has no correlation How to get data off our machines? How to report/analyze?
  10. Log files suck. Stupid string data Typical log file has no “context” Process/thread Machine name User performing the action Data specific to the event being logged Typical log file has no correlation How to get data off our machines? How to report/analyze?
  11. Log files suck. Stupid string data Typical log file has no “context” Typical log file has no correlation, e.g.: What was the server load when this was logged? How much memory was consumed at the time? Which web request invoked this action? How to get data off our machines? How to report/analyze?
  12. Log files suck. Stupid string data Typical log file has no “context” Typical log file has no correlation How to get data off our machines? How to report/analyze?
  13. Log files suck. Stupid string data Typical log file has no “context” Typical log file has no correlation How to get data off our machines? How to report/analyze? Log processing and analysis – log, store, process, query, proactive analysis Logstash, ElasticSearch, ... – cumbersome to setup
  14. What if we had (better) contextual information?
  15. Structured/Semantic logging “Of or relating to meaning, especially meaning in language” – the dictionary Windows Event Log Event Tracing for Windows (ETW) Semantic Logging Application Blog (PnP SLAB) Serilog …
  16. Event Source XML representation containing contextual information application-specific data elements Can be read by humans (just a message, like any log) Can be read by machines (XML, baby!) add “meaning” – we know what the data points can be allow automated aggregaton / trend analysis Fast! (not very developer friendly)
  17. Using EventSource DEMO
  18. Event Source Super awesome It has structured logging and context Can collect all kinds of data combined (e.g. HTTP + own event source) Hard to maintain Versioning Ceremony One method for every single thing we want to log (or at least an event id) Requires thinking about the events to log Lots of data
  19. Serilog - https://serilog.net/ Logging library like any other, mostly Sinks https://github.com/serilog/serilog/wiki/Provided-Sinks Console, file, event log Database ElasticSearch AppInsights Structured data as a first-class concept Enrichers, filters, …
  20. Structured data in Serilog Simple scalar values (bool, string, int, …) Collections and dictionaries Full objects Rendered by a sink to, for example, a string Can be stored by a sink to provide additional dimensions on data
  21. Serilog - Basics DEMO
  22. Enrichment of data Enrich log entries with additional context Machine name User name Process / thread information ASP.NET client IP / hostname ASP.NET user agent … Using Serilog enrichers (when applicable) In custom ASP.NET middleware
  23. But still... How to get the data off our machine(s)? How to analyze/aggregate/predict? Which customer had this Exception and why? Did our sales go up after we made this page faster? What button on the front-end crashes this microservice? Are # Exceptions increasing? Do I need special infrastructure to store and process logs?
  24. The cloud to the rescue? (Azure) Azure Insights Azure platform’s core monitoring and alerting services (standard Metrics, Alerts, Autoscale, Email Notifications and Audit Logs) Application Insights Application monitoring & logging for your applications (web, mobile, desktop, …) in cloud or other server Log Analytics (OMS) Log Analytics, part of the Operations Management Suite (OMS) ingests data from Azure servers and provides rich ‘global’ view & advanced log search based alerting capabilities
  25. Application Insights
  26. Application Insights Azure Service + library/SDK Solves “where to store”, “how to ship”, “how to analyze” Enriches data Correlates data (e.g. client + server + dependency/DB/...) Telemetry! Allows structured logging Allows rich querying, alerting Works for many, many platforms and languages Web, Windows, Xamarin, any application type really .NET, Java, JavaScript, Objective-C, PHP, Python, Ruby, ...
  27. Getting Started DEMO
  28. Do I have to run on Azure? No. REST API to send data to Various tools and SDK’s to collect data specific to language/platform Status Monitor on non-Azure machines
  29. Data collected Performance Counters Requests (both server/client side) Traces (more later) Exceptions Dependencies Custom Metrics & Events That is a lot of data that can be correlated!
  30. Different Views
  31. Application Insights portal Dashboard Basic metrics (# Alerts, Response times, Exceptions, Failed Requests, ...) Gateway to all of the below Application map Smart Detection / Alerts Live Metrics Stream Search, Availability, Failures, Performance, ... Metrics (and events) Analytics
  32. Application Map DEMO
  33. Smart Detection / Alerts DEMO
  34. Live Metrics Stream DEMO
  35. Search, Availability, Failures, Performance, ... DEMO
  36. Metrics and Events Send custom events and metrics Name of event/metric + a value Events – help find how the application is used and can be optimized User resized column in a grid User logged in User clicked “Share” No more stock ... Metrics – help alerting / finding missing functionality Purchase occurred Search returned zero results # Support calls ...
  37. Metrics and Events DEMO
  38. So what about logging... Configure Serilog to use Application Insights as a sink Custom data becomes available (*) So we can correlate our own data with performance, requests, traces, exceptions, depenencies, ... But how to do this? And how to consume? (*) Make sure the entire object is in the log message, no tracking of non-printed data
  39. Logging DEMO
  40. Application Insights Analytics
  41. Application Insights Analytics Former MS-internal tool “Kusto” Near-realtime log ingestion and analysis Lets you run custom queries requests | where timestamp > ago(24h) | summarize count() by client_CountryOrRegion | top 10 by count_ | render piechart
  42. Queries over logs! Input dataset traces customEvents pageViews requests dependencies exceptions availabilityResults customMetrics performanceCounters browserTimings | Operators where count project join limit order | Renderer table chart (bar, pie, time, area, scatter) https://docs.microsoft.com/en-us/azure/application-insights/app-insights-analytics-reference
  43. Application Insights Analytics DEMO
  44. Not covered in this talk VS plugin for Application Insights Run some analysis inside VS Continuous export Export data from Application Insights continuously if you want to analyze in other tools Retain data longer than 7 days Join with other data (preview) https://docs.microsoft.com/en-us/azure/application-insights/app-insights- analytics-using#import-data Join logging and metrics with other data sources, query with AI Analytics
  45. Conclusion
  46. But... Why? Seriously, why are you in this session...
  47. But... Why? Troubleshooting – Did a problem occur? Where? Our code? The machine? Performance / Cost – Did we make the application slower? Improvement – Can a problem be detected or avoided? Trends – Need more storage? Need more servers? Customer Experience – Are people happy? Or seeing error after error? Business Decisions – Can we increase sales based on log data/telemetry?
  48. Structured logging Enrich technical logs with business data Allows correlating when things go wrong Allows making business decisions when using a good data analysis tool on top
  49. Application Insights Is that good analysis tool! (for web, mobile, desktop!) Solves Collecting various sources Ingesting various sources Enrichment with structured logging (e.g. Serilog) Helps analyzing all that data Default views, metrics, alerts, monitoring Custom querying – Application Insights Analytics Build, measure, improve! Both technical and business – there isn’t always a divide between both...
  50. Thank you! Need training, coaching, mentoring, performance analysis? Hire me! https://blog.maartenballiauw.be/hire-me.html http://blog.maartenballiauw.be @maartenballiauw

Editor's Notes

  1. https://pixabay.com/p-960806/
  2. Highlight which customer? Awesome, stack trace for Exception, “Debug is 0” great! RunEngine entry very useful. What if we want to aggregate “Req” / “Ret”?
  3. More structured/contextual data! Yay!
  4. Open Demo-EventSource Look at PerformBusinessLogic – some actions happening there, logging some items Look at CRMSystemEventSource In itself, pretty straightforward – attributes to inform the even tracing infrastructure about what something is and does One method per event we want to trace – easy to use BUT – how to version this? Do we really need to add a new method, decorate it with an attribute, ... For every single thing we want to log? And then invoke that method when needed? Cumbersome...
  5. Open Demo-Serilog Look at NuGet packages installed – Serilog and some sinks Browse NuGet for more Serilog packages and see millions Setting up Serilog – check Main method and explain what happens, show multiple sinks, show enrichers Show business logic again Still has log statements, no way around that Logging is now structured – we provide a message which can contain some info about the data we are logging But for example when catching the Exception, we can write a simple message and log the actual Exception with it Run the application, see colored console output, see a file has been created Explore the collapsed region: we can write to other sinks e.g. Windows Event Log Or attach an observable – explore the code and mention that even data that is not rendered by a sink is still present with the message, so if we could ship this to a system we could search/track/measure Exceptions, Customer objects, ... Based on property name
  6. https://pixabay.com/p-960806/
  7. Open portal.azure.com Show how to create Application Insights service Open service, click “Getting Started” Explain there are a few entry points to collect all data Client-side (JS) Server-side (or mobile) – different data points from inside app, from inside server, Exceptions, ... Custom metrics and events Open Demo_WebApp Show AppInsights package reference added, show Startup.cs where to configure (3 simple “Use...”) Show _Layout.cshtml, run app, show snippet Back to portal – data! (overview) Response times (collected from server and perf counters) Page view load time (correlated from client-side JS) # requests # failures Click server response time, see more detail, grouping, aggregation We can search data, correlate other data, ...
  8. https://pixabay.com/en/cat-animal-eyes-grey-view-views-351926/
  9. Open portal.azure.com MyGet AppInsights Application Map Show Availability tests test the server-side Show client-side uses server-side Click around, explain what we can see here
  10. Open portal.azure.com MyGet AppInsights Smart Detection We see nothing! Which is good. Smart Detection tries to be proactive in detecting things that are “not normal” Look at settings for list of examples Alerts (open Alerts) Add alert, see on which metrics we can alert Show examples, show web tests feeding into alerts, ...
  11. Open portal.azure.com MyGet AppInsights Live Metrics Stream Connects to our running application and inspects the data that comes in real-time Metrics, health, server CPU and memory, ... We can filter per server as well
  12. Open portal.azure.com MyGet AppInsights Browse through the blades Availability – explain web tests, explain these can be simple or actual tests created in VS that have multiple steps and checks Failures – we can see diagrams of exceptions vs. other metrics, as well as Exceptions Open Exception, see the occurences of this specific one See Exception message, stack trace, ... Analyze trends / root cause: what happened 5 minutes before this event? Performance – what is our slowest page? Drill into details, find out why Also lets us improve then see if the page drops in the ranking, continuous improvement Servers – Some server details, quickly go through this Browser – Browser stats, verify client-side code, page render times, find slowest rendering pages, ... Usage – Correlate technical side with user-side Sessions, users, custom events (more later), ...
  13. Open portal.azure.com MyGet AppInsights In Usage tab, note these “feature names” Custom events! We track feature usage there to see if features are actually being used or not Switch to Demo_WebApp In HomeController -> About, show the way to track metrics In Contact.cshtml, show how to track metrics/events client-side as well Switch to Demo AppInsights, show the portal’s Metrics tab, show how we can filter/correlate
  14. Open Demo_WebApp Show the HomeController – still same code as before, this time using _logger from ASP.NET Core Mention using full object data to make sure the object is streamed to AppInsights later on Startup.cs Show Serilog configuration – log to AppInsights Show how it ties into ASP.NET Core logging infrastructure Run the application, make a few requests to /Home/About, /Home/Contact, /Home/About?throw, ... Switch to Demo appinsights, use Search Lots of data – verbosity should probably be tuned... Search for “invoice” – yay custom log data! Expand one of them, click “All available telemetry” -> correlates all traces we had for this specific one, correlating our trace with request, exception, ... Expand the Exception, see that we even have customer id and customer email with it – super useful to have it all tied together
  15. https://pixabay.com/p-960806/
  16. Open portal.azure.com Demo AppInsights Show some of the sample queries, show the editor, show the various rendering types (auto complete in there) Run a few sample queries and find custom dimensions All traces for our About page traces | where operation_Name == "GET /Home/About" | order by timestamp desc | limit 500 Number of invoices per customer (customMetrics!) customMetrics | extend customerId = tostring(customDimensions.['Customer.Id']) | project customerId, value | summarize avg(value) by customerId | render piechart MyGet AppInsights All requests from yesterday that have an Exception requests | where timestamp > ago(1d) | where success == "False" | join kind = leftouter ( exceptions | where timestamp > ago(1d) ) on operation_Id | summarize exceptionCount = count() by operation_Name, outerMessage | order by exceptionCount asc Are we being DDoS-ed by one specific client? requests | summarize count() by client_IP | top 10 by count_ | render piechart Explain concept of package sources – at one point we DDoS-ed ourselves (infinite loop) Found using AI Analytics:
Advertisement