Logging is one of those things that everyone complains about but nobody dedicates time to. Of course, the first rule of logging is "do it". Without it, you have no visibility into system activity when an investigation is required. But the end goal is much more than this. Almost all applications require security audit logs for compliance; application logs for visibility across all cloud properties; and application tracing for tracking usage patterns and business intelligence. The latter is the magic sauce that helps businesses learn about their customers, and in some cases the data is for the customer. Without a strategy this can get very messy, fast. In this session Michele will discuss design patterns for a sound logging and audit strategy; considerations for security and compliance; the benefits of a NoSQL approach; and more.
7. Why do we log?
• Troubleshooting visibility
• Security audits, review, early detection
• Post incident forensics
• Track change history
• Insights into user activity
• Reporting and analysis
8. What to log?
• Event Logs
– Example: Application Events, Windows Logs, IIS Logs, Trace Output
• Audit Logs
– Example: Login Attempts, Unauthorized/Authorized Access, Password Resets
• Activity Logs
– Example: Session Trace, Purchase Flow, Report Generation, Feature Access
• History Logs
– Example: Change history for any critical system records
• Live Streaming / Analytics
16. Logs and Compliance
• Contain no user credentials
• No PII, PHI or identifiable user data
• Retention period (1 year is good baseline)
• A structured archival process
• Alert if log reaches capacity
• Authorized access
• Protections from modifications (write-only)
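One way to honor the "no credentials, no PII" rules above is to scrub every event before it reaches a logger. The following is a minimal Python sketch, not the approach shown in the talk; the key names in SENSITIVE_KEYS, the email pattern, and the scrub function are all illustrative assumptions.

```python
import re

# Hypothetical key names; adjust to the fields your own events carry.
SENSITIVE_KEYS = {"password", "token", "ssn", "email"}

# Rough email matcher, good enough for masking free-form messages.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def scrub(event: dict) -> dict:
    """Return a copy of the event with sensitive values masked."""
    clean = {}
    for key, value in event.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = scrub(value)  # handle nested payloads
        elif isinstance(value, str):
            clean[key] = EMAIL_PATTERN.sub("[EMAIL]", value)
        else:
            clean[key] = value
    return clean
```

Calling scrub at the single point where events enter the logging pipeline means individual developers do not have to remember the rule at every call site.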
18. Benefits of NoSQL
• Log details tend to evolve
– Schema-less storage is best
– Re-indexing may be necessary
• Co-location with mainline databases
– Adds complexity and overhead (potentially)
– Does not allow a separate “evolution” team around telemetry and analysis
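To illustrate why evolving log details favor schema-less storage, here is a small Python sketch in which a list of JSON strings stands in for a document collection. The store, write_log, and query names are hypothetical and do not correspond to any real document database API.

```python
import json

# Stand-in for a document collection: one JSON document per entry.
store = []


def write_log(doc: dict) -> None:
    """Persist a log document as-is; no schema is enforced."""
    store.append(json.dumps(doc))


def query(field: str, value) -> list:
    """Return documents that have the field set to value.

    Older documents that predate the field are simply skipped.
    """
    docs = (json.loads(raw) for raw in store)
    return [d for d in docs if field in d and d[field] == value]
```

A later release can start writing an extra field (for example a client type) without migrating the entries already stored, which is the point of the first bullet above.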
19. Audit Log Use Cases
• Every login attempt (success or failure)
• Excessive login attempts and lockouts
• Blocking/blacklisting users, IP addresses, access ports
• Every logout
• Every modification to user table, including permissions
• All configuration changes
• Attempts to access restricted resources, APIs from unexpected paths
• All access to PII / PHI in an individually identifiable way
20. Audit Log Fields
• Date/time of event
• Machine name/instance
• Process ID
• User ID (possibly encrypted) / Session ID
• Type of event
• Success or failure of the event (if applicable)
• Seriousness of the event violation (if applicable)
• Message (free form)
• Stack Trace (if applicable)
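The fields listed above map naturally onto a small record type. This is a Python sketch under assumed field names (the slide only names the concepts); timestamp, machine, and process ID are filled in automatically so callers cannot forget them.

```python
import json
import os
import socket
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AuditRecord:
    event_type: str
    message: str                       # free-form text
    user_id: Optional[str] = None      # possibly encrypted upstream
    session_id: Optional[str] = None
    success: Optional[bool] = None     # None when not applicable
    severity: Optional[str] = None     # None when not applicable
    stack_trace: Optional[str] = None  # None when not applicable
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    machine: str = field(default_factory=socket.gethostname)
    process_id: int = field(default_factory=os.getpid)

    def to_json(self) -> str:
        """Serialize the record as a single JSON document."""
        return json.dumps(asdict(self))
```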
22. History Logs
• Changes made to key tables
• Describes
– Who changed the record?
– From which application?
– Which fields changed?
• Need the ability to surface this to applications
– Sometimes to users
– Always to operations to solve problems
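The three questions above (who, from which application, which fields) can be answered by diffing the record before and after a change. A minimal Python sketch follows; the history_entry function and its key names are illustrative assumptions.

```python
from datetime import datetime, timezone


def history_entry(before: dict, after: dict, changed_by: str, application: str) -> dict:
    """Build a history log entry that lists only the fields that changed."""
    changes = {
        name: {"old": before.get(name), "new": after.get(name)}
        for name in sorted(set(before) | set(after))
        if before.get(name) != after.get(name)
    }
    return {
        "changed_by": changed_by,      # who changed the record
        "application": application,    # from which application
        "changed_at": datetime.now(timezone.utc).isoformat(),
        "changes": changes,            # which fields changed
    }
```

Because the entry is plain data, the same document can be shown to operations staff or, filtered, to end users.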
23. Implement a History Log Helper
[Diagram labels: HistoryLogger.Current.Write(); IHistoryLogger; HistoryLogger; History Logs (DocumentDB); Claims, Users, Orders, …]
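The slide names an IHistoryLogger contract and a HistoryLogger.Current access point but does not show their members. The Python sketch below is one possible shape: the write signature, the in-memory implementation, and the lower-case current attribute are all assumptions, not the code from the demo.

```python
from abc import ABC, abstractmethod


class IHistoryLogger(ABC):
    """Contract for history log writers (method signature is an assumption)."""

    @abstractmethod
    def write(self, table: str, record_id: str, changes: dict, changed_by: str) -> None:
        ...


class InMemoryHistoryLogger(IHistoryLogger):
    """Stand-in for a document-store-backed logger; keeps entries in a list."""

    def __init__(self) -> None:
        self.entries = []

    def write(self, table, record_id, changes, changed_by):
        self.entries.append(
            {"table": table, "id": record_id, "changes": changes, "changed_by": changed_by}
        )


class HistoryLogger:
    """Single access point, loosely mirroring HistoryLogger.Current on the slide."""

    current: IHistoryLogger = InMemoryHistoryLogger()
```

Swapping HistoryLogger.current for another IHistoryLogger (a test double, or one backed by a real store) leaves every call site unchanged.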
24. Wrap History in the DAL
[Diagram labels: OrdersDal, UsersDal, ContentDal; Relational DB (Orders, Claims, Users, Content); History Logs]
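Wrapping history in the data access layer means callers cannot update a record without the change being logged. Here is a Python sketch with a dict standing in for the relational Orders table and a list standing in for the history store; the OrdersDal method names are hypothetical.

```python
class OrdersDal:
    """Data access class that records history as a side effect of updates."""

    def __init__(self, db: dict, history: list) -> None:
        self._db = db            # stand-in for the relational Orders table
        self._history = history  # stand-in for the history log store

    def update(self, order_id: str, fields: dict, changed_by: str) -> None:
        row = self._db[order_id]
        changed = {
            name: {"old": row.get(name), "new": value}
            for name, value in fields.items()
            if row.get(name) != value
        }
        row.update(fields)
        if changed:  # skip no-op updates
            self._history.append(
                {"table": "Orders", "id": order_id,
                 "changed_by": changed_by, "changes": changed}
            )

    def history_for(self, order_id: str) -> list:
        """Answer 'what happened with my order?' from the history log."""
        return [h for h in self._history
                if h["table"] == "Orders" and h["id"] == order_id]
```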
26. What happened with my order?
[Diagram labels: OrdersDal, UsersDal, ContentDal; Relational DB (Orders, Claims, Users, Content); History Logs]
27. Activity Logs
• Not specific to code execution and troubleshooting, diagnostics
• Specific to the application, user activity
• COULD be informative to users as well
– History of recent activity in the site
– Reports they requested, downloads, other…
• Provides insights to the business regarding user activity, trends and patterns
– Non-critical analysis
32. Client and Server Logging
[Diagram labels: Client Apps, Web Browsers, Mobile Apps; Mobile API, Client API, Log API; Loggers; Event Logs, Audit Logs, Activity Logs, History Logs]
33. What can I queue?
[Diagram labels: Loggers; ETW; DocDB; Event Logs, Audit Logs, Activity Logs, History Logs]
35. Queued Logging
• Considerations
– Timestamps matter
– Correlation across nodes matters (to a point)
– Guaranteed exactly-once, in-order delivery doesn’t exist
– Async is good (mostly)
• That said
– Priority matters (hot, warm, default)
– Simplicity matters
– Throughput matters
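The considerations above can be sketched as an in-process queued logger: entries are timestamped when they are enqueued, carry a correlation ID, and are drained by a background thread in priority order. This is a Python illustration only; QueuedLogger, the PRIORITY mapping, and the sink callback are assumptions, and a production system would more likely use a durable queue.

```python
import itertools
import queue
import threading
import time
import uuid

# Lower number drains first; names follow the slide (hot, warm, default).
PRIORITY = {"hot": 0, "warm": 1, "default": 2}


class QueuedLogger:
    def __init__(self, sink) -> None:
        self._sink = sink                    # callable that persists one entry
        self._queue = queue.PriorityQueue()
        self._seq = itertools.count()        # tie-breaker: FIFO within a priority
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def log(self, message: str, priority: str = "default", correlation_id=None) -> None:
        """Enqueue and return immediately; the caller never waits on the sink."""
        entry = {
            "timestamp": time.time(),  # stamped at enqueue, not at write time
            "correlation_id": correlation_id or str(uuid.uuid4()),
            "message": message,
        }
        self._queue.put((PRIORITY[priority], next(self._seq), entry))

    def _drain(self) -> None:
        while True:
            _, _, entry = self._queue.get()
            try:
                self._sink(entry)
            finally:
                self._queue.task_done()

    def flush(self) -> None:
        """Block until everything queued so far has been written."""
        self._queue.join()
```

Because writes are asynchronous, the order in which entries reach the sink is not guaranteed to match the order of events; the enqueue timestamp and correlation ID are what make later reconstruction possible.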
37. Problem Statement
• We need immediate access to what the HECK is going on when there is a problem
• Sometimes I use (in order):
– Google Analytics
– Event Logs (Azure Website)
– Table Storage queries (STRIKE THAT, USELESS)
– Blob storage CSVs (good enough, not realtime)
Visibility into runtime behavior for troubleshooting or analysis
Early detection of security incidents, identification of potential threats
Forensic analysis to discover the cause of events, and ways to avoid them in future with software controls or other means
General business intelligence and analysis of user and system behavior
----- Meeting Notes (12/3/14 07:53) -----
it starts to look like a lot of work...
so, if I could impart one message up front it would be this
Add heavy lifting guy
----- Meeting Notes (12/3/14 07:53) -----
assume your devs are stupid and lazy
----- Meeting Notes (12/3/14 05:45) -----
with this we can litter our code with verbose logs
Example, migration to cloud, risky to add logs, risky not to have them
----- Meeting Notes (12/3/14 07:53) -----
DEMO 1 - show that logging code
Add heavy lifting guy
----- Meeting Notes (12/3/14 05:45) -----
so we have a wrapper class
it starts with basic event logging
sts, wrote to event log, etw trace, event source today
cloud, use what comes naturally
Without it, you have no visibility
If trying to “get it right” is preventing you from logging, you’re already in trouble
Just log, worry about improvements later
We don’t know how you do it
We don’t care how you do it
We do need to know where it goes (devops)
Add heavy lifting guy
The technical details will be platform dependent
Inheritance
Dependency injection, these are details
The point is, auditing is intentional; you call it out
It goes to a different place;
----- Meeting Notes (12/3/14 07:53) -----
DEMO 2 - show doc db classes, show results
add a field?
----- Meeting Notes (12/3/14 07:53) -----
these logs are only good if you actually review them
Add heavy lifting guy
----- Meeting Notes (12/3/14 07:59) -----
sql logs not helpful to surface to apps
helpful for forensics, not accessible to many
----- Meeting Notes (12/3/14 07:59) -----
DEMO 3 - ??? look at history? any object works?
Add heavy lifting guy
You are collecting logs
Now what, site is down, how do you know what’s up?
What kinds of exceptions are being thrown?
Where in the code are there uncaught exceptions tossing up the chain?
Are you catching and logging those?
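One way to make sure uncaught exceptions are at least logged is a process-wide hook of last resort. The Python sketch below uses the standard sys.excepthook and logging modules; the logger name "app" and the install_hook helper are assumptions for illustration.

```python
import logging
import sys

logger = logging.getLogger("app")  # hypothetical logger name


def log_uncaught(exc_type, exc_value, exc_tb) -> None:
    """Log an exception that nothing further down the chain handled."""
    logger.critical(
        "Uncaught %s: %s",
        exc_type.__name__,
        exc_value,
        exc_info=(exc_type, exc_value, exc_tb),  # attaches the stack trace
    )


def install_hook() -> None:
    """Route exceptions that reach the top of the main thread to the logger."""
    sys.excepthook = log_uncaught
```

The hook only covers exceptions that escape the main thread; it complements explicit catch-and-log at request boundaries rather than replacing it.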