What is going on? Application Diagnostics on Azure - Copenhagen .NET User Group

We all like building and deploying cloud applications. But what happens once that’s done? How do we know if our application behaves like we expect it to behave? Of course, logging! But how do we get that data off of our machines? How do we sift through a bunch of seemingly meaningless diagnostics? In this session, we’ll look at how we can keep track of our Azure application using structured logging, AppInsights and AppInsights analytics to make all that data more meaningful.


  1. 1. What is going on? Application Diagnostics on Azure Maarten Balliauw @maartenballiauw
  2. 2. Who am I? Maarten Balliauw Antwerp, Belgium Developer Advocate, JetBrains Founder, MyGet AZUG Focus on web ASP.NET MVC, Azure, SignalR, ... Former MVP Azure & ASPInsider Big passion: Azure http://blog.maartenballiauw.be @maartenballiauw
  3. 3. Agenda Logging Why? Logging sucks! The need for semantic/structured logging Application Insights SDK, Azure portal Application Insights Analytics Find the needle in the haystack! Build, measure, improve
  4. 4. Logging
  5. 5. Why logging? Troubleshooting – Did a problem occur? Where? Our code? The machine? Performance / Cost – Did we make the application slower? Improvement – Can a problem be detected or avoided? Trends – Need more storage? Need more servers? Customer Experience – Are people happy? Or seeing error after error? Business Decisions – Can we increase sales based on log data / telemetry?
  6. 6. So here’s what we typically do...
    System.Diagnostics.Trace.TraceInformation("Something happened");
    System.Diagnostics.Trace.TraceWarning("Error! " + ex.Message);
    Does this help troubleshooting? Improve the application? Analyze trends?
  7. 7. And here’s a typical log...
    App.exe [12:13:03:985] Information: 0 : Customer address updated.
    App.exe [12:13:04:011] Error: 8 : System.NullReferenceException occurred. Value can not be null.
    App.exe [12:13:04:567] Information: 0 : Machine policy value 'Debug' is 0
    App.exe [12:13:04:569] Verbose: 9 : ******* RunEngine ******* Action: ******* CommandLine: **********
    App.exe [12:13:04:578] Information: 9 : Entered CheckResult()
    App.exe [12:13:04:689] Debug: 0 : Created. Req=0, Ret=8, Font: Req=null
    Does this help troubleshooting? Improve the application? Analyze trends?
  8. 8. No.
  9. 9. Log files suck. Stupid string data Unless using ETW / proper System.Diagnostics listener Typical log file has no “context” Typical log file has no correlation How to get data off our machines? How to report/analyze?
  10. 10. Log files suck. Stupid string data Typical log file has no “context” Process/thread Machine name User performing the action Data specific to the event being logged Typical log file has no correlation How to get data off our machines? How to report/analyze?
  11. 11. Log files suck. Stupid string data Typical log file has no “context” Typical log file has no correlation, e.g.: What was the server load when this was logged? How much memory was consumed at the time? Which web request invoked this action? How to get data off our machines? How to report/analyze?
  12. 12. Log files suck. Stupid string data Typical log file has no “context” Typical log file has no correlation How to get data off our machines? How to report/analyze?
  13. 13. Log files suck. Stupid string data Typical log file has no “context” Typical log file has no correlation How to get data off our machines? How to report/analyze? Log processing and analysis – log, store, process, query, proactive analysis Logstash, ElasticSearch, ... – cumbersome to setup
  14. 14. What if we had (better) contextual information?
  15. 15. Structured/Semantic logging “Of or relating to meaning, especially meaning in language” – the dictionary Windows Event Log Event Tracing for Windows (ETW) Semantic Logging Application Block (PnP SLAB) Serilog …
  16. 16. Event Source XML representation containing contextual information application-specific data elements Can be read by humans (just a message, like any log) Can be read by machines (XML, baby!) add “meaning” – we know what the data points can be allow automated aggregation / trend analysis Fast! (not very developer friendly)
  17. 17. Using EventSource DEMO
  18. 18. Event Source Super awesome It has structured logging and context Can collect all kinds of data combined (e.g. HTTP + own event source) Hard to maintain Versioning Ceremony One method for every single thing we want to log (or at least an event id) Requires thinking about the events to log Lots of data
  19. 19. Serilog - https://serilog.net/ Logging library like any other, mostly Sinks https://github.com/serilog/serilog/wiki/Provided-Sinks Console, file, event log Database ElasticSearch AppInsights Structured data as a first-class concept Enrichers, filters, …
  20. 20. Structured data in Serilog Simple scalar values (bool, string, int, …) Collections and dictionaries Full objects Rendered by a sink to, for example, a string Can be stored by a sink to provide additional dimensions on data
  21. 21. Serilog - Basics DEMO
  22. 22. Enrichment of data Enrich log entries with additional context Machine name User name Process / thread information ASP.NET client IP / hostname ASP.NET user agent … Using Serilog enrichers (when applicable) In custom ASP.NET middleware
  23. 23. But still... How to get the data off our machine(s)? How to analyze/aggregate/predict? Which customer had this Exception and why? Did our sales go up after we made this page faster? What button on the front-end crashes this microservice? Are # Exceptions increasing? Do I need special infrastructure to store and process logs?
  24. 24. The cloud to the rescue? (Azure) Azure Insights Azure platform’s core monitoring and alerting services (standard Metrics, Alerts, Autoscale, Email Notifications and Audit Logs) Application Insights Application monitoring & logging for your applications (web, mobile, desktop, …) in cloud or other server Log Analytics (OMS) Log Analytics, part of the Operations Management Suite (OMS) ingests data from Azure servers and provides rich ‘global’ view & advanced log search based alerting capabilities
  25. 25. Application Insights
  26. 26. Application Insights Azure Service + library/SDK Solves “where to store”, “how to ship”, “how to analyze” Enriches data Correlates data (e.g. client + server + dependency/DB/...) Telemetry! Allows structured logging Allows rich querying, alerting Works for many, many platforms and languages Web, Windows, Xamarin, any application type really .NET, Java, JavaScript, Objective-C, PHP, Python, Ruby, ...
  27. 27. Getting Started DEMO
  28. 28. Do I have to run on Azure? No. REST API to send data to Various tools and SDKs to collect data specific to language/platform Status Monitor on non-Azure machines
  29. 29. Data collected Performance Counters Requests (both server/client side) Traces (more later) Exceptions Dependencies Custom Metrics & Events That is a lot of data that can be correlated!
  30. 30. Different Views
  31. 31. Application Insights portal Dashboard Basic metrics (# Alerts, Response times, Exceptions, Failed Requests, ...) Gateway to all of the below Application map Smart Detection / Alerts Live Metrics Stream Search, Availability, Failures, Performance, ... Metrics (and events) Analytics
  32. 32. Application Map DEMO
  33. 33. Smart Detection / Alerts DEMO
  34. 34. Live Metrics Stream DEMO
  35. 35. Search, Availability, Failures, Performance, ... DEMO
  36. 36. Metrics and Events Send custom events and metrics Name of event/metric + a value Events – help find how the application is used and can be optimized User resized column in a grid User logged in User clicked “Share” No more stock ... Metrics – help alerting / finding missing functionality Purchase occurred Search returned zero results # Support calls ...
  37. 37. Metrics and Events DEMO
  38. 38. So what about logging... Configure Serilog to use Application Insights as a sink Custom data becomes available (*) So we can correlate our own data with performance, requests, traces, exceptions, dependencies, ... But how to do this? And how to consume? (*) Make sure the entire object is in the log message, no tracking of non-printed data
  39. 39. Logging DEMO
  40. 40. Application Insights Analytics
  41. 41. Application Insights Analytics Former MS-internal tool “Kusto” Near-realtime log ingestion and analysis Lets you run custom queries requests | where timestamp > ago(24h) | summarize count() by client_CountryOrRegion | top 10 by count_ | render piechart
  42. 42. Queries over logs! Input dataset traces customEvents pageViews requests dependencies exceptions availabilityResults customMetrics performanceCounters browserTimings | Operators where count project join limit order | Renderer table chart (bar, pie, time, area, scatter) https://docs.microsoft.com/en-us/azure/application-insights/app-insights-analytics-reference
  43. 43. Application Insights Analytics DEMO
  44. 44. Not covered in this talk VS plugin for Application Insights Run some analysis inside VS Continuous export Export data from Application Insights continuously if you want to analyze in other tools Retain data longer than 7 days Join with other data (preview) https://docs.microsoft.com/en-us/azure/application-insights/app-insights-analytics-using#import-data Join logging and metrics with other data sources, query with AI Analytics
  45. 45. Conclusion
  46. 46. But... Why? Seriously, why are you in this session...
  47. 47. But... Why? Troubleshooting – Did a problem occur? Where? Our code? The machine? Performance / Cost – Did we make the application slower? Improvement – Can a problem be detected or avoided? Trends – Need more storage? Need more servers? Customer Experience – Are people happy? Or seeing error after error? Business Decisions – Can we increase sales based on log data/telemetry?
  48. 48. Structured logging Enrich technical logs with business data Allows correlating when things go wrong Allows making business decisions when using a good data analysis tool on top
  49. 49. Application Insights Is that good analysis tool! (for web, mobile, desktop!) Solves Collecting various sources Ingesting various sources Enrichment with structured logging (e.g. Serilog) Helps analyzing all that data Default views, metrics, alerts, monitoring Custom querying – Application Insights Analytics Build, measure, improve! Both technical and business – there isn’t always a divide between both...
  50. 50. Thank you! Need training, coaching, mentoring, performance analysis? Hire me! https://blog.maartenballiauw.be/hire-me.html http://blog.maartenballiauw.be @maartenballiauw
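
The Event Source slides (16 to 18) describe "one method for every single thing we want to log", with typed, named data points instead of a plain string. A minimal sketch of that idea follows, using only `System.Diagnostics.Tracing` from the base class library. The names `CrmEventSource`, `CollectingListener`, `Demo`, the event ids and the payload values are assumptions made for illustration; they are not taken from the talk's Demo-EventSource project.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;
using System.Linq;

// Hypothetical event source; the source name, event ids and messages are illustrative.
[EventSource(Name = "Demo-Crm")]
public sealed class CrmEventSource : EventSource
{
    public static readonly CrmEventSource Log = new CrmEventSource();

    private CrmEventSource() { }

    // One method per event: the parameters become named, typed payload fields.
    [Event(1, Level = EventLevel.Informational, Message = "Customer {0} moved to {1}")]
    public void CustomerAddressUpdated(int customerId, string city)
    {
        if (IsEnabled()) WriteEvent(1, customerId, city);
    }

    [Event(2, Level = EventLevel.Error, Message = "Invoice failed for customer {0}: {1}")]
    public void InvoiceFailed(int customerId, string reason)
    {
        if (IsEnabled()) WriteEvent(2, customerId, reason);
    }
}

// In-process listener that renders every received event as name=value pairs.
public sealed class CollectingListener : EventListener
{
    public List<string> Lines { get; } = new List<string>();

    protected override void OnEventWritten(EventWrittenEventArgs e)
    {
        // PayloadNames and Payload line up by position, so the data stays structured.
        var fields = e.PayloadNames.Zip(e.Payload, (name, value) => $"{name}={value}");
        Lines.Add($"{e.EventName} [{e.Level}] " + string.Join(", ", fields));
    }
}

public static class Demo
{
    public static List<string> Run()
    {
        using (var listener = new CollectingListener())
        {
            // Subscribe to our own source only, at every level.
            listener.EnableEvents(CrmEventSource.Log, EventLevel.Verbose);

            CrmEventSource.Log.CustomerAddressUpdated(42, "Antwerp");
            CrmEventSource.Log.InvoiceFailed(42, "Value can not be null");

            return listener.Lines;
        }
    }
}
```

Calling `Demo.Run()` from a `Main` method and printing the returned lines shows each event with its name, level and named fields, which is what makes aggregation on a field such as `customerId` possible. The sketch also shows the ceremony the slides complain about: every new kind of event needs another attributed method and a stable event id.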

Editor's Notes

  • https://pixabay.com/p-960806/
  • Highlight which customer? Awesome, stack trace for Exception, “Debug is 0” great! RunEngine entry very useful. What if we want to aggregate “Req” / “Ret”?
  • More structured/contextual data! Yay!
  • Open Demo-EventSource
    Look at PerformBusinessLogic – some actions happening there, logging some items
    Look at CRMSystemEventSource
    In itself, pretty straightforward – attributes to inform the event tracing infrastructure about what something is and does
    One method per event we want to trace – easy to use
    BUT – how to version this? Do we really need to add a new method, decorate it with an attribute, ...
    For every single thing we want to log? And then invoke that method when needed? Cumbersome...
  • Open Demo-Serilog
    Look at NuGet packages installed – Serilog and some sinks
    Browse NuGet for more Serilog packages and see millions
    Setting up Serilog – check Main method and explain what happens, show multiple sinks, show enrichers
    Show business logic again
    Still has log statements, no way around that
    Logging is now structured – we provide a message which can contain some info about the data we are logging
    But for example when catching the Exception, we can write a simple message and log the actual Exception with it
    Run the application, see colored console output, see a file has been created
    Explore the collapsed region: we can write to other sinks e.g. Windows Event Log
    Or attach an observable – explore the code and mention that even data that is not rendered by a sink is still present with the message, so if we could ship this to a system we could search/track/measure Exceptions, Customer objects, ... Based on property name
  • https://pixabay.com/p-960806/
  • Open portal.azure.com
    Show how to create Application Insights service
    Open service, click “Getting Started”
    Explain there are a few entry points to collect all data
    Client-side (JS)
    Server-side (or mobile) – different data points from inside app, from inside server, Exceptions, ...
    Custom metrics and events
    Open Demo_WebApp
    Show AppInsights package reference added, show Startup.cs where to configure (3 simple “Use...”)
    Show _Layout.cshtml, run app, show snippet
    Back to portal – data! (overview)
    Response times (collected from server and perf counters)
    Page view load time (correlated from client-side JS)
    # requests
    # failures
    Click server response time, see more detail, grouping, aggregation
    We can search data, correlate other data, ...

  • https://pixabay.com/en/cat-animal-eyes-grey-view-views-351926/
  • Open portal.azure.com
    MyGet AppInsights
    Application Map
    Show Availability tests test the server-side
    Show client-side uses server-side
    Click around, explain what we can see here
  • Open portal.azure.com
    MyGet AppInsights
    Smart Detection
    We see nothing! Which is good. Smart Detection tries to be proactive in detecting things that are “not normal”
    Look at settings for list of examples
    Alerts (open Alerts)
    Add alert, see on which metrics we can alert
    Show examples, show web tests feeding into alerts, ...

  • Open portal.azure.com
    MyGet AppInsights
    Live Metrics Stream
    Connects to our running application and inspects the data that comes in real-time
    Metrics, health, server CPU and memory, ...
    We can filter per server as well
  • Open portal.azure.com
    MyGet AppInsights
    Browse through the blades
    Availability – explain web tests, explain these can be simple or actual tests created in VS that have multiple steps and checks
    Failures – we can see diagrams of exceptions vs. other metrics, as well as Exceptions
    Open Exception, see the occurrences of this specific one
    See Exception message, stack trace, ...
    Analyze trends / root cause: what happened 5 minutes before this event?
    Performance – what is our slowest page? Drill into details, find out why
    Also lets us improve then see if the page drops in the ranking, continuous improvement
    Servers – Some server details, quickly go through this
    Browser – Browser stats, verify client-side code, page render times, find slowest rendering pages, ...
    Usage – Correlate technical side with user-side
    Sessions, users, custom events (more later), ...
  • Open portal.azure.com
    MyGet AppInsights
    In Usage tab, note these “feature names”
    Custom events! We track feature usage there to see if features are actually being used or not
    Switch to Demo_WebApp
    In HomeController -> About, show the way to track metrics
    In Contact.cshtml, show how to track metrics/events client-side as well
    Switch to Demo AppInsights, show the portal’s Metrics tab, show how we can filter/correlate
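
The Metrics and Events notes above mention tracking feature usage from the controller. A rough sketch of what such calls could look like with the Application Insights .NET SDK is shown here. It assumes the `Microsoft.ApplicationInsights` NuGet package is referenced; the class `UsageTelemetry`, the event and metric names, and the property keys are hypothetical and do not come from Demo_WebApp.

```csharp
using System.Collections.Generic;
using Microsoft.ApplicationInsights;
using Microsoft.ApplicationInsights.Extensibility;

// Hypothetical helper; names are illustrative, not taken from the demo project.
public static class UsageTelemetry
{
    // Assumption: an instrumentation key or connection string is configured
    // elsewhere (config file or environment); without one, nothing is sent.
    private static readonly TelemetryClient Client =
        new TelemetryClient(TelemetryConfiguration.CreateDefault());

    // Custom event: records that a feature was used, with a property to filter on.
    public static void ShareClicked(string feature)
    {
        Client.TrackEvent(
            "ShareClicked",
            new Dictionary<string, string> { ["Feature"] = feature });
    }

    // Custom metric: a name plus a value, here the number of search results.
    public static void SearchCompleted(string query, int resultCount)
    {
        Client.TrackMetric(
            "SearchResultCount",
            resultCount,
            new Dictionary<string, string> { ["Query"] = query });
    }
}
```

A controller action could call `UsageTelemetry.SearchCompleted(query, results.Count)` after running a search. The properties passed along should then appear as custom dimensions on the `customEvents` and `customMetrics` records, which is what the Analytics queries later in these notes filter on.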

  • Open Demo_WebApp
    Show the HomeController – still same code as before, this time using _logger from ASP.NET Core
    Mention using full object data to make sure the object is streamed to AppInsights later on
    Startup.cs
    Show Serilog configuration – log to AppInsights
    Show how it ties into ASP.NET Core logging infrastructure
    Run the application, make a few requests to /Home/About, /Home/Contact, /Home/About?throw, ...
    Switch to Demo appinsights, use Search
    Lots of data – verbosity should probably be tuned...
    Search for “invoice” – yay custom log data!
    Expand one of them, click “All available telemetry” -> correlates all traces we had for this specific one, correlating our trace with request, exception, ...
    Expand the Exception, see that we even have customer id and customer email with it – super useful to have it all tied together
  • https://pixabay.com/p-960806/
  • Open portal.azure.com
    Demo AppInsights
    Show some of the sample queries, show the editor, show the various rendering types (auto complete in there)
    Run a few sample queries and find custom dimensions
    All traces for our About page
    traces
    | where operation_Name == "GET /Home/About"
    | order by timestamp desc
    | limit 500

    Number of invoices per customer (customMetrics!)
    customMetrics
    | extend customerId = tostring(customDimensions.['Customer.Id'])
    | project customerId, value
    | summarize avg(value) by customerId | render piechart
    MyGet AppInsights
    All requests from yesterday that have an Exception
    requests
    | where timestamp > ago(1d)
    | where success == "False"
    | join kind = leftouter (
    exceptions
    | where timestamp > ago(1d)
    ) on operation_Id
    | summarize exceptionCount = count() by operation_Name, outerMessage
    | order by exceptionCount asc

    Are we being DDoS-ed by one specific client?
    requests | summarize count() by client_IP | top 10 by count_ | render piechart

    Explain concept of package sources – at one point we DDoS-ed ourselves (infinite loop)
    Found using AI Analytics:
