CodeFest 2014. Christopher Bennage — Semantic Logging. Avoiding the log chaos

Published in: Internet, Technology, Education
  • First, let’s acknowledge the painfulness of logging. We need logging because it helps us understand what is going on inside our systems. This is especially true as our systems become more sophisticated. It is almost impossible to understand what is going on in a distributed or message-based system without some kind of logging. Let’s also acknowledge that logging is not a particularly interesting topic either. Often, we are only concerned about it because we know that it will be important later. Adding logging to a system can feel like a distraction from getting the real work done. Since it is annoying to add logging, we often rush through it. The problem is compounded by the fact that we frequently don’t know how we are going to use the logs until we need to use them. The result is that our logs are often inconsistent, without structure, hard to consume, hard to automate, and overwhelming. Photo: http://www.flickr.com/photos/absent/2157057475/
  • “Semantic” logging is a way to address these problems. We use the word “semantic” to express that our logs have “meaning”. You might also call this approach “structured” logging, or perhaps “strongly typed” logging. Even though this talk is about the Semantic Logging Application Block (SLAB), what I am really interested in is changing the way you think about logging. Creating a structured or semantic log is essentially a different way of thinking about logging. The primary idea is that we deliberately include structure in the entries we log from the very beginning, acknowledging that we cannot predict exactly how our logs will be consumed. The entries in our log will have more metadata describing what was logged. It also means that we use logging techniques that encourage more discipline as we set up our logging. Our intention with SLAB is to help you “put the effort in the right place”. We want to remove as much friction as possible from taking a more structured and semantic approach to your logs.
  • Let’s take a look at a typical unstructured log. Here we have output from some sort of sorting algorithm. Even though I am calling it unstructured, we should recognize that it is not entirely without structure. We see things like event ids, log levels, et cetera. Nevertheless, it is not structured enough. What we really care about is usually the log message in each entry. The messages are strings that were intended to be read by a human. The developer who added this logging understood what the names in the entries meant. However, it is not clear what those names mean when we look at the log after the fact. The problem is that the relevant information is flattened into the message. Unless you understand how to parse the message, you will have difficulty understanding what it really means. Now, this is really a simple example; imagine a more complex and distributed system. The loss of context when reading the log messages becomes even more pronounced.
  • Now let’s take a look at the output from a structured log. Here we have some log entries stored in an Azure table. We still have a message that is intended for humans to read. However, we have also retained that same information in a more structured format. Notice the “5” in the log message. Now notice how that value also appears in the payload for the log entry. Likewise, in the name example you can see the approver name embedded in the human-readable message, but also broken out into a separate field. This allows us to query by the payload argument without any additional parsing of the human-readable message.
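To make the idea of querying by payload concrete, here is a minimal, language-neutral sketch in Python. It is not SLAB's API or the Azure table schema from the slide: the `LogEntry` type, the `query_by_payload` helper, and the field names `expense_id` and `approver` are all hypothetical, chosen only to mirror the "5" and the approver name mentioned above.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class LogEntry:
    """One structured log entry: a message template plus its payload."""
    event_id: int
    level: str
    message_template: str
    payload: Dict[str, Any] = field(default_factory=dict)

    @property
    def message(self) -> str:
        # The readable message is derived from the payload, never the reverse.
        return self.message_template.format(**self.payload)


def query_by_payload(entries: List[LogEntry], key: str, value: Any) -> List[LogEntry]:
    """Return entries whose payload has `key` equal to `value`."""
    return [e for e in entries if e.payload.get(key) == value]


# Hypothetical entries; the template and field names are assumptions.
entries = [
    LogEntry(1, "Informational", "Expense {expense_id} approved by {approver}",
             {"expense_id": 5, "approver": "alice"}),
    LogEntry(1, "Informational", "Expense {expense_id} approved by {approver}",
             {"expense_id": 7, "approver": "bob"}),
]

matches = query_by_payload(entries, "approver", "alice")
print(matches[0].message)
```

The query never touches the message string. The human-readable text and the queryable fields come from the same payload, so they cannot drift apart.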
  • Our team, patterns & practices, experienced a lot of this pain firsthand when we were building WASABI, the Auto Scaling Block for Azure. (I must confess that I was not part of that particular team.) We had a personal need to improve our logging. Now, we want to share the things that helped us. Specifically, we want to change the way developers think about logging. We want you to think about the consumption and purpose of the logging at the beginning. We allow the appropriate decisions to be made at the appropriate times by explicitly separating what to log, when to log it, and where the log is stored. Many logging frameworks already abstract away the idea of where the log is stored, so that’s not necessarily something new.
  • One thing that can be confusing about SLAB is understanding the technologies that are being used and how they relate to one another. Event Tracing for Windows, or ETW, is a service provided by the Windows platform. It is a general-purpose, high-speed tracing facility for Windows. It was first introduced all the way back in Windows 2000. It offers excellent performance, but most .NET developers are not familiar with it because it has historically been hard to use from managed code. In .NET 4.5, the EventSource class was introduced. It eases the authoring experience for tracing events. It is extensible; however, it only supports ETW out of the box. SLAB builds on top of EventSource and allows you to provide different destinations for events that are published with EventSource. This means that you can use SLAB without any knowledge of ETW.
  • Let’s take a more visual look at the technologies involved. First, you will begin by deriving from .NET’s EventSource class. This custom class is where you will define what can be logged for your system. (Remember that we are trying to separate the what, when, and where. The when is defined as the location in our application where we invoke the custom event source.) I haven’t mentioned it yet, but with SLAB you have the choice of processing the events either in-process or out-of-process. In the in-process case, we don’t really care about ETW at all. We set up something called an ObservableEventListener that takes events published from our custom event source and writes them to the sinks. In this case, the parts that are SLAB are the ObservableEventListener itself and the sinks. This may be a little bit confusing, because ObservableEventListener inherits from EventListener, which is part of .NET itself. In the out-of-process story, the custom event source in our primary application publishes the events to ETW. From ETW, we can access the events in a few different ways, such as the Event Log or PerfView. Generally, though, we have a secondary application running whose purpose is to receive the published events and write them to one or more sinks.
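The in-process flow described above (custom event source, then listener, then sinks) can be sketched in a few lines of Python. This is only an illustration of the separation of what, when, and where. It does not use EventSource, ObservableEventListener, or any SLAB sink, and the names `ExpenseEvents`, `Listener`, `log_to`, and `attach` are hypothetical.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Sink = Callable[[Event], None]


class Listener:
    """Receives published events and forwards each one to every attached sink."""

    def __init__(self) -> None:
        self._sinks: List[Sink] = []

    def log_to(self, sink: Sink) -> None:
        self._sinks.append(sink)

    def on_event(self, event: Event) -> None:
        for sink in self._sinks:
            sink(event)


class ExpenseEvents:
    """WHAT can be logged: one method per event, with typed arguments."""

    def __init__(self) -> None:
        self._listeners: List[Listener] = []

    def attach(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def expense_approved(self, expense_id: int, approver: str) -> None:
        self._publish({"event_id": 1, "name": "ExpenseApproved",
                       "payload": {"expense_id": expense_id, "approver": approver}})

    def _publish(self, event: Event) -> None:
        for listener in self._listeners:
            listener.on_event(event)


# WHERE: chosen at start-up, independent of the event definitions.
stored: List[Event] = []
listener = Listener()
listener.log_to(stored.append)   # stand-in for a table or file sink
listener.log_to(print)           # stand-in for a console sink

log = ExpenseEvents()
log.attach(listener)

# WHEN: the call site inside the application.
log.expense_approved(5, "alice")
```

The call site knows nothing about sinks, and the sinks know nothing about the application. Changing where the log is stored means changing only the start-up lines.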
  • Let’s talk more specifically about the features that SLAB offers. There are five high-level features to understand. Sinks are the destinations where log entries will ultimately be persisted. Formatters provide a pluggable way to format the log entries before they are persisted. The out-of-process service is a ready-to-go service for working with the events out-of-process. The Event Source Analyzer helps ensure that we have consistently and properly designed our event source. The Observable Event Listener allows for interesting filtering of events using Reactive Extensions (Rx).
  • In the box, we provide sinks for Azure Tables, SQL Database, flat file, rolling flat file, console, and Elasticsearch. The sink for Elasticsearch has just been released (March 28) as part of SLAB 1.1. We’re very excited about it because it is the first release since going open source, and we had significant community involvement. I should note that Windows Event Log is not supported by SLAB. However, it is now natively supported by EventSource.
  • The formatters are used for sinks such as flat file and console.
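As a rough illustration of what a pluggable formatter does, the Python sketch below renders the same event two ways: once as JSON for machines and once as plain text for people. The function names and the text layout are assumptions for this sketch and do not reproduce the output of SLAB's own formatters.

```python
import json
from typing import Any, Dict

Event = Dict[str, Any]


def format_json(event: Event) -> str:
    """Machine-friendly: one JSON object per entry, keys in a stable order."""
    return json.dumps(event, sort_keys=True)


def format_text(event: Event) -> str:
    """Human-friendly: a header line followed by one line per payload field."""
    lines = [f"EventId: {event['event_id']}  Level: {event['level']}"]
    lines += [f"  {key}: {value}" for key, value in event["payload"].items()]
    return "\n".join(lines)


# Hypothetical event; the field names are assumptions.
event = {"event_id": 1, "level": "Informational",
         "payload": {"expense_id": 5, "approver": "alice"}}

print(format_json(event))
print(format_text(event))
```

Because the formatter is just a function from event to string, a file or console sink can accept any of them without knowing which one it was given.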
  • In the box we provide both a Windows Service and a console application that can be used to monitor events out-of-process. In both cases, all of the sinks are supported. The hosts are configuration-driven and there is support for re-configuring on the fly. The events in the originating process are sent to ETW. The host is a dedicated process, independent of your application, that is used just to persist the events to different destinations. This has several benefits; the primary one is resiliency. There is increased fault tolerance in case your application crashes. In addition, there is no need to reference SLAB in the monitored application if you are only using out-of-process. Likewise, it removes some of the logging overhead from your primary application. The output from multiple monitored applications can be sent to a single out-of-process host. There are some downsides to this. It can be slightly more complicated to set up, and it is more difficult to work with at development time. We discuss the trade-offs in the documentation.
  • There are still some rough edges when using EventSource. For example, you have to specify the event id in both the attribute decorating the logging method and in the call to WriteEvent. Likewise, you have to be consistent in the order of the parameters passed to these two methods. Since this can be error-prone, SLAB provides the Event Source Analyzer. It allows you to inspect a specific instance of an event source for problems. Here you can see some sample code where we inspect the event source during a unit test.
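The kind of mistake described above, and the kind of check that catches it, can be sketched in Python. This is not how SLAB's Event Source Analyzer is implemented, and it does not use EventSource. The `event` decorator, `RecordingSource`, `inspect_source`, and the `ExpenseEvents` class with its two methods are hypothetical stand-ins for the attribute, the base class, the analyzer, and a custom event source.

```python
import inspect
from typing import Any, Callable, List, Tuple


def event(event_id: int) -> Callable:
    """Declare the event id for a logging method (stand-in for an attribute)."""
    def mark(method: Callable) -> Callable:
        method.declared_id = event_id
        return method
    return mark


class RecordingSource:
    """Base class whose write_event records what each method actually sends."""

    def __init__(self) -> None:
        self.written: List[Tuple[int, Tuple[Any, ...]]] = []

    def write_event(self, event_id: int, *args: Any) -> None:
        self.written.append((event_id, args))


class ExpenseEvents(RecordingSource):
    @event(1)
    def expense_approved(self, expense_id: int, approver: str) -> None:
        self.write_event(1, expense_id, approver)

    @event(2)
    def expense_rejected(self, expense_id: int, reason: str) -> None:
        # Two deliberate mistakes: wrong id, and arguments in the wrong order.
        self.write_event(3, reason, expense_id)


def inspect_source(source: RecordingSource) -> List[str]:
    """Call every declared event with probe values and report mismatches."""
    problems = []
    for name, method in inspect.getmembers(source, inspect.ismethod):
        declared = getattr(method, "declared_id", None)
        if declared is None:
            continue  # not an event method
        params = list(inspect.signature(method).parameters)
        probes = tuple(f"<{p}>" for p in params)  # unique marker per parameter
        source.written.clear()
        method(*probes)
        if not source.written:
            problems.append(f"{name}: did not write an event")
            continue
        written_id, written_args = source.written[-1]
        if written_id != declared:
            problems.append(f"{name}: declared id {declared}, wrote id {written_id}")
        if written_args != probes:
            problems.append(f"{name}: arguments written out of order")
    return problems


problems = inspect_source(ExpenseEvents())
print(problems)
```

Running a check like this inside a unit test turns a silent logging defect into a failing build.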
  • SLAB’s implementation of EventListener implements IObservable. Likewise, the included sinks implement IObserver. These interfaces are part of .NET itself and are the foundation of Reactive Extensions, also known as Rx. Since these interfaces are used, we can leverage Rx for filtering and transforming the event stream before we pass it along to a sink. I suspect that there are many of you who are not familiar with Rx. That’s okay. I should also state that there is no hard dependency on Rx unless you want to use it for processing the events. Let me do a quick demonstration of Rx and how it can work with SLAB for those who may not be familiar with it.
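One example of processing the stream before it reaches a sink is "flush on error": hold back low-level events and persist them only when an error shows up. The Python generator below is a minimal sketch of that idea under stated assumptions. It stands in for a chain of Rx operators, is not Rx or SLAB code, and the name `flush_on_error` and the event shape are hypothetical.

```python
from collections import deque
from typing import Any, Deque, Dict, Iterable, Iterator

Event = Dict[str, Any]


def flush_on_error(events: Iterable[Event], buffer_size: int) -> Iterator[Event]:
    """Hold back non-error events; when an error arrives, emit the most
    recent `buffer_size` held events followed by the error itself."""
    recent: Deque[Event] = deque(maxlen=buffer_size)
    for event in events:
        if event["level"] == "Error":
            yield from recent
            recent.clear()
            yield event
        else:
            recent.append(event)  # oldest entry is dropped once the buffer is full


stream = [
    {"level": "Verbose", "message": "step 1"},
    {"level": "Verbose", "message": "step 2"},
    {"level": "Verbose", "message": "step 3"},
    {"level": "Error", "message": "failed"},
    {"level": "Verbose", "message": "step 4"},
]

persisted = [e["message"] for e in flush_on_error(stream, buffer_size=2)]
print(persisted)
```

With a buffer of two, only the two events leading up to the error and the error itself reach the sink. Everything else is discarded, which keeps the log small while preserving the context that matters.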
  • Originally, I was going to share “what’s next” with SLAB. However, we just released the 1.1 update for SLAB (March 27). This new release includes support for ActivityIds. Activity Ids are a feature added to EventSource in .NET 4.5.1. They allow you to trace events across a logical transition. This is especially useful when using something like TPL (Task Parallel Library). I won’t go into details here; there’s more information about Activity Id support in the release notes. In addition to some performance improvements and bug fixes, we have also included an additional sink for Elasticsearch. If you are not familiar with Elasticsearch, it’s a data store with built-in full-text search and analysis capabilities. Again, more details are available in the release notes. I would also like to emphasize that this release was largely due to community contributions. All of the projects under the Enterprise Library umbrella are now open sourced under Apache 2.0, and we are welcoming community contributions. If you are interested in contributing, please check http://aka.ms/entlibopen.
  • If you are interested in SLAB, let me recommend the following. Evaluate SLAB and adopt it (search for “slab” in NuGet). Read the docs - aka.ms/slab. Practice the Hands-on Labs & Quickstarts - aka.ms/el6hols. Engage with us by providing feedback and/or submitting contributions - slab.codeplex.com. We are very interested in your feedback. Now that we are truly open source, you can be very direct with your feedback. You can send us a pull request!

    1. Christopher Bennage patterns & practices microsoft.com/practices Semantic Logging Avoiding the Log Chaos
    2. • No real structure • What’s in there? • Sheer number of files & types of logs is overwhelming • Hard to consume/automate • Subject to compatibility/inconsistencies. Logs are Frustrating
    3. Semantic?
    4. Unstructured Log An example 176 [main] INFO examples.Sort - Populating an array of 2 elements in reverse order. 225 [main] INFO examples.SortAlgo - Entered the sort method. 262 [main] DEBUG SortAlgo.OUTER i=1 - Outer loop. 276 [main] DEBUG SortAlgo.SWAP i=1 j=0 - Swapping intArray[0] = 1 and intArray[1] = 0 290 [main] DEBUG SortAlgo.OUTER i=0 - Outer loop. 304 [main] INFO SortAlgo.DUMP - Dump of integer array: 317 [main] INFO SortAlgo.DUMP - Element [0] = 0 331 [main] INFO SortAlgo.DUMP - Element [1] = 1 343 [main] INFO examples.Sort - The next log statement should be an error message. 346 [main] ERROR SortAlgo.DUMP - Tried to dump an uninitialized array. 467 [main] INFO examples.Sort - Exiting main method.
    5. Structured Log An example using Azure Query by payload argument
    6. • Logging cannot be just a checkmark of doing something. • You have to think about consumption and purpose. • Allow appropriate decisions to be made at appropriate time, explicitly separating: • WHAT to log • WHEN to log it • WHERE to log We are on a mission Changing the way people think about logging
    7. demo >> EventSource & SLAB in-process
    8. Technologies at Play Event Tracing for Windows (ETW) • Native to Windows platform • Great performance & OK diagnostic tooling • Historically hard to publish events EventSource class • Introduced in .NET Framework 4.5 • Meant to ease authoring experience • Extensible but supports ETW-only out of the box Semantic Logging Application Block (SLAB) • Provides several destinations for events published with EventSource • Does not require any knowledge in ETW • Additional tooling support for authoring events
    9. Technologies at Play
    10. • Sinks • Formatters • Out-Of-Process Service • Event Source Analyzer • Observable Event Listener Sinks Features of SLAB
    11. • Azure Tables • SQL Database • Flat file • Rolling flat file • Console • Elasticsearch Sinks Features of SLAB
    12. • JSON • XML • Natural (plain-text) Formatters Features of SLAB
    13. • Hosted as a Windows Service or console • All sinks are supported • Configuration-driven with support for re-configuration Benefits • Increased fault tolerance in case of application crash • Monitored application does not reference SLAB • Can monitor multiple processes from a single service. • Moves the logging overhead from the application to a separate process (but the overhead is still there!) Out of Process Service Features of SLAB
    14. // can be run in a unit test TestMethod public void EventSourceAnalyzer AExpenseEvents // will verify correctness of events // this example has inconsistent ID and order of parameters Event public void int string int this Event Source Analyzer Features of SLAB
    15. • Event listener is IObservable. • Event sinks are IObservers. • Can leverage Reactive Extensions (Rx) to filter, pre-process or transform the event stream before it’s persisted. Based on Observable Features of SLAB
    16. demo >> Flush on Error/Alarm Flood Throttle
    17. • Support for ActivityIds • Ability to capture events from source not publicly available • Sink for Elasticsearch • Performance improvements • Improved extensibility story • Minor bug fixes http://aka.ms/slab1_1 SLAB 1.1 What’s Coming Next Now
    18. • Evaluate SLAB and adopt it (search for “slab” in NuGet). • Read the docs - aka.ms/slab • Practice the Hands-on Labs & Quickstarts - aka.ms/el6hols • Engage with us by providing feedback and/or submitting contributions - slab.codeplex.com Call to Action
    19. • http://slab.codeplex.com • http://aka.ms/slab • http://entlib.codeplex.com Resources microsoft.com/practices • @bennage • dev.bennage.com
    20. Questions? Christopher Bennage patterns & practices Microsoft christopher.bennage@microsoft.com
