O'Reilly Webinar Five Mistakes Log Analysis

Anton Chuvakin

O'Reilly Webinar on "Five Mistakes Log Analysis"

LogLogic Confidential Thursday, March 19, 2015
The Top Five Log
Analysis Mistakes
Dr Anton Chuvakin
Chief Logging Evangelist
LogLogic, Inc

Mitigating Risk. Automating Compliance.
Summary
1. System, Network and Security Logs
2. Why Look at Logs?
3. Brief Log Analysis Overview
4. Log Analysis Mistakes

Log Data Overview
What logs?
• Audit logs
• Transaction logs
• Intrusion logs
• Connection logs
• System performance records
• User activity logs
• Various alerts and other messages
From Where?
• Firewalls/intrusion prevention
• Routers/switches
• Intrusion detection
• Servers, desktops, mainframes
• Business applications
• Databases
• Anti-virus
• VPNs

Login? Logon? Log in?
<122> Mar 4 09:23:15 localhost sshd[27577]: Accepted password for kyle from ::ffff:192.168.138.35 port 2895 ssh2
<13> Fri Mar 17 14:29:38 2006 680 Security SYSTEM User Success Audit ENTERPRISE Account Logon Logon attempt by: MICROSOFT_AUTHENTICATION_PACKAGE_V1_0 Logon account: POWERUSER Source Workstation: ENTERPRISE Error Code: 0xC000006A 4574
<57> Dec 25 00:04:32:%SEC_LOGIN-5-LOGIN_SUCCESS:Login Success [user:yellowdog] [Source:10.4.2.11] [localport:23] at 20:55:40 UTC Fri Feb 28 2006
<18> Dec 17 15:45:57 10.14.93.7 ns5xp: NetScreen device_id=ns5xp system-warning-00515: Admin User netscreen has logged on via Telnet from 10.14.98.55:39073 (2002-12-17 15:50:53)
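
The four records above describe the same kind of event in four unrelated formats. One way to see what "normalizing" them might involve is the rough Python sketch below. The regular expressions, the `PATTERNS` table and the `normalize_login_event` helper are all hypothetical, written only against the sample lines on this slide; real devices emit many more variants, and the sketch does not interpret fields such as the Windows error code.

```python
import re

# Hypothetical patterns, derived only from the four sample lines on the slide.
# Each entry pairs a made-up source label with a regex that captures the
# account name ("user") and where the login came from ("src").
PATTERNS = [
    ("sshd", re.compile(
        r"sshd\[\d+\]: Accepted \w+ for (?P<user>\S+) from (?P<src>\S+)")),
    ("windows", re.compile(
        r"Logon account: (?P<user>\S+) Source Workstation: (?P<src>\S+)")),
    ("cisco", re.compile(
        r"LOGIN_SUCCESS.*\[user:(?P<user>[^\]]+)\] \[Source:(?P<src>[^\]]+)\]")),
    ("netscreen", re.compile(
        r"Admin User (?P<user>\S+) has logged on via \S+ from (?P<src>[\d.]+)")),
]


def normalize_login_event(line):
    """Return a common dict for a login-related line, or None if no pattern fits."""
    for source, pattern in PATTERNS:
        match = pattern.search(line)
        if match:
            return {
                "source": source,
                "user": match.group("user"),
                "src": match.group("src"),
            }
    return None
```

Lines that match none of the patterns come back as `None`, so they can be set aside for review rather than silently dropped.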

“Arrgh! Why Don’t We Just Ignore ’Em?”

Log Management Mandate and Regulations
Regulations Require LMI
• SOX
• GLBA
• FISMA
• JPA
• NIST 800-53
– Capture audit records
– Regularly review audit records for unusual activity and violations
– Automatically process audit records
– Protect audit information from unauthorized deletion
– Retain audit logs
Mandates Demand It
• PCI
• HIPAA
• SLAs
• PCI: Requirement 10 and beyond
– Logging and user activities tracking are critical
– Automate and secure audit trails for event reconstruction
– Review logs daily
– Retain audit trail history for at least one year
Controls Require it
• COBIT
• ISO
• ITIL
• COBIT 4
– Provide audit trail for root-cause analysis
– Use logging to detect unusual or abnormal activities
– Regularly review access, privileges, changes
– Verify backup completion
• ISO17799
– Maintain audit logs for system access and use, changes, faults, corrections, capacity demands
– Review the results of monitoring activities regularly and ensure the accuracy of logs
“Get fined, Get Sanctioned”
“Lose Customers, Reputation, Revenue or Job”
“Get fined, Go To Jail”

So, How Do People Do It?

Log Analysis Basics
• Manual
– ‘tail’, ‘more’, ‘grep’, ‘notepad’, etc.
• Filtering
– Positive and negative (“Artificial ignorance”)
• Summarization and reports
– “Top X of Y”
• Visualization
• Log indexing and searching
• Correlation
– Rule-based and other
• Log data mining
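
Two of the basics listed above, negative filtering (“Artificial ignorance”) and “Top X of Y” summarization, are small enough to sketch in a few lines of Python. The function names and the substring-based matching are assumptions made for illustration; this is not how any particular log tool does it.

```python
from collections import Counter


def artificial_ignorance(lines, known_boring):
    """Negative filtering: drop every line that contains a known-uninteresting
    substring, and keep whatever is left for a human to look at."""
    return [ln for ln in lines if not any(s in ln for s in known_boring)]


def top_x_of_y(values, x):
    """Summarization: the x most frequent values with their counts,
    for example the top source addresses among failed logins."""
    return Counter(values).most_common(x)
```

A typical use would be to filter first and summarize second: pass the surviving lines through a field extractor of your choice, then hand the extracted values to `top_x_of_y`.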

From Log Analysis to Log Management
• Threat protection and discovery
• Incident response
• Forensics, “e-discovery” and litigation support
• Regulatory compliance
• Internal policies and procedure compliance
• Internal and external audit support
• IT system and network troubleshooting
• IT performance management

Looks Complicated?! No Wonder People Make Mistakes …

Six Mistakes of Log Analysis and Log Management
0. Not logging at all.
1. Not looking at the logs
2. Storing logs for too short a time
3. Prioritizing the log records before collection
4. Ignoring the logs from applications
5. Only looking for “known bad” stuff

Mistake 0: Not Logging AT ALL …
… and its aggravated version: “… and not knowing that you don’t”
• No logging? -> well, no logs for incident response, audits, compliance
Got logs?
If your answer is ‘NO’, don’t listen further: run and enable logging right now!

Example: Oracle
• Defaults:
– minimum system logging
– minimum database server access
– no data access logging
• So, where is …
– data access audit
– schema and data change audit
– configuration change audit

Mistake 1: Not looking at logs
• Collection of logs has value!
• But review boosts the value 10-fold (numbers are estimates)
• More in-depth analysis boosts it a lot more!
• Two choices here …
– Review after an incident
– Ongoing review

Example Log Review Priorities
1. DMZ NIDS
2. DMZ firewall
3. DMZ servers with applications
4. Critical internal servers
5. Other servers
6. Select critical applications
7. Other applications

Mistake 2: Storing Logs For Too Short A Time
• You are saying you HAD logs? And how is that useful?
• The retention question is a hard one. Truly, nobody has the answer!
– Seven years? A year? 90 days? A week? Until the disk runs out?
• Common: 90 days online and up to 1-3 years “nearline” or offline

Also A Mistake: Storing Logs for TOO LONG?!
• Retention = storage + destruction
• Why DESTROY LOGS?
– Privacy regulations (mostly EU)
– Litigation risk management
– System resource utilization

Example Retention Strategy
Type + network + storage tier
• IDS + DMZ + online = 90 days
• Firewall + DMZ + online = 30 days
• Servers + internal + online = 90 days
• ALL + DMZ + archive = 3 years
• Critical + internal + archive = 5 years
• OTHER + internal + archive = 1 year
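
The “type + network + storage tier” rows above map naturally onto a lookup table. The Python sketch below is one possible encoding and makes several assumptions of its own: years are approximated as 365 days, keys are lowercased, and the ALL and OTHER rows are treated as fallbacks tried after an exact match. Since retention also means destruction, the sketch includes an expiry check; where no rule applies it keeps the log rather than deleting it.

```python
from datetime import date, timedelta

# Retention periods in days, keyed by (log type, network, storage tier).
# Values are taken from the slide; a year is approximated as 365 days.
RETENTION_DAYS = {
    ("ids", "dmz", "online"): 90,
    ("firewall", "dmz", "online"): 30,
    ("servers", "internal", "online"): 90,
    ("all", "dmz", "archive"): 3 * 365,
    ("critical", "internal", "archive"): 5 * 365,
    ("other", "internal", "archive"): 365,
}


def retention_days(log_type, network, tier):
    """Look up the retention period: exact row first, then the ALL row,
    then the OTHER row. Returns None when no row applies."""
    for key in ((log_type, network, tier),
                ("all", network, tier),
                ("other", network, tier)):
        if key in RETENTION_DAYS:
            return RETENTION_DAYS[key]
    return None


def is_expired(log_date, today, log_type, network, tier):
    """True when a log is older than its retention period and is due for
    destruction. Logs with no matching rule are never reported as expired."""
    days = retention_days(log_type, network, tier)
    if days is None:
        return False
    return today - log_date > timedelta(days=days)
```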

Mistake 3: Deciding What’s Relevant Before Collection
• How would you know what is …
– … Security-relevant
– … Compliance-relevant
– … or will solve the problem you’d have TOMORROW!?
• Also affects “forensic quality” of logs
• Prioritization Challenge – Got ESP? :-)
• “Simple” – just grab everything!

Example Common Logging Order
Log everything
Retain most everything
Analyze enough
Summarize and report on a subset
Look at some
Act in real-time on a few

Mistake 4: Ignoring Logs from Applications
• Firewall – Yes, Linux – Yes, Windows – Yes, NIDS and NIPS – Yes
but …
• Oracle - ?
• SAP - ?
• Your Application X – No
Log standards are coming: MITRE CEE!

Example: Jumbled Mess of Application Logs
|22:01:40|BTC| 7|000|DDIC | |LC2|Systemerror when executing external command DB6_DATA_COLLECTOR on gneisenau ()
|22:02:32|BTC| 7|000|DDIC | |R49|Communication error, CPIC return code 020, SAP return code 456
|22:02:32|BTC| 7|000|DDIC | |R5A|> Conversation ID: 38910614
|22:02:32|BTC| 7|000|DDIC | |R64|> CPI-C function: CMSEND(SAP)
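
Even a jumbled format like the one above can usually be split into fields before anyone tries to interpret it. The Python sketch below assumes the records are pipe-delimited with eight columns, as the sample suggests. The column names in `FIELDS` are guesses made for illustration and are not taken from any vendor documentation.

```python
# Guessed column names for the eight pipe-delimited fields in the sample;
# only the positions come from the slide, the labels are assumptions.
FIELDS = ("time", "process_type", "process_no", "client",
          "user", "tcode", "msg_id", "text")


def parse_app_log_line(line):
    """Split one pipe-delimited record into a dict of stripped fields.

    Returns None when the line does not have the expected shape, so that
    odd lines can be reviewed instead of being mis-parsed."""
    # Split at most len(FIELDS) times so a '|' inside the message text survives.
    parts = line.split("|", len(FIELDS))
    if len(parts) != len(FIELDS) + 1 or parts[0].strip():
        return None
    return {name: value.strip() for name, value in zip(FIELDS, parts[1:])}
```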

Mistake 5: Looking for only the bad stuff
• Correlation, filters, regex matching – oh, no! :-)
• Why such approaches?
– You have to know what you are looking for!
• Can we somehow just “see what we need to see”?
– Data mining technology can help
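
As a very rough illustration of looking without knowing what to look for, the Python sketch below masks the variable parts of each line and reports the line shapes that occur rarely. This is a simple frequency count standing in for real data mining, not the technology the slide refers to; the masking rules and the `max_count` threshold are arbitrary assumptions.

```python
import re
from collections import Counter


def template(line):
    """Mask variable parts so that lines with the same shape collapse together."""
    line = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<IP>", line)  # IPv4 addresses
    line = re.sub(r"\d+", "<N>", line)                           # remaining numbers
    return line


def rare_templates(lines, max_count=1):
    """Return the line shapes seen at most max_count times, in first-seen order.
    No list of 'known bad' patterns is needed: rarity alone selects them."""
    counts = Counter(template(ln) for ln in lines)
    return [shape for shape, n in counts.items() if n <= max_count]
```

Rare does not mean malicious, of course; the output is a shortlist for a person to review, not a verdict.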

Conclusions: Mistakes Summary
0. Not logging at all.
1. Not looking at the logs
2. Storing logs for too short a time
3. Prioritizing the log records before collection
4. Ignoring the logs from applications
5. Only looking for “known bad” stuff

Thanks for Attending the Presentation
Dr Anton Chuvakin, GCIH, GCFA
Chief Logging Evangelist
http://www.chuvakin.org
Coauthor of “Security Warrior” (O’Reilly, 2004) and “PCI Compliance” (Syngress, 2007)
See http://www.info-secure.org for my papers, books, reviews and other security resources related to logs. Book on logs is coming soon! Also see http://chuvakin.blogspot.com

Editor's Notes

“Six Mistakes of Log Management”
Anton Chuvakin, Ph.D., GCIA, GCIH, GCFA
Director of Product Management @ LogLogic, Inc
Top Log Mistakes
#1 Not logging at all
#2 Not looking at the logs
#3 Storing logs for too short a time
#4 Prioritizing the log records before collection
#5 Ignoring the logs from applications
#6 Only looking at what they know is bad
Since I wrote my log mistakes paper a few years ago, the domain of log analysis has changed a lot. Many factors affected it; among them are new regulatory compliance requirements, wider adoption of “best practice” and governance frameworks such as ISO, COBIT and ITIL, as well as new technologies with their log files. New standards, such as the NIST 800-92 guide, have been created. Thus, I am updating this article with newly committed mistakes as well as a new perspective on the old ones. Just like its predecessor, this article covers the typical mistakes organizations make while approaching the management of computer logs and other records produced by IT infrastructure components.
As digital technology continues to spread (“A web-enabled fridge, anybody? It’s just 8 (!) grand today, you know”) and computers start playing an even more important role in our lives (I do have a penchant for the obvious, don’t I?), the records that they produce, a.k.a. logs, start to play a bigger and bigger role. From firewalls and intrusion prevention systems to databases and enterprise applications to wireless access points and VOIP gateways, logs are being spewed forth at an ever-increasing pace. Both security and other IT components not only increase in number, but often come with more logging enabled out of the box. Examples of that trend include Linux systems as well as web servers that now ship with an increased level of logging. All those systems, both legacy and novel, are known to generate copious amounts of logs, audit trails, records and alerts that beg for constant attention.
Thus, many companies and government agencies are trying to set up repeatable log collection, centralization and analysis processes and tools. However, when planning and implementing a log collection and analysis infrastructure, organizations often discover that they are not realizing the full promise of such a system and, in fact, sometimes notice that efficiency is not gained but lost as a result. This often happens due to the following common log analysis mistakes.
We will start with the obvious one, which is unfortunately all too common even in this age of Sarbanes-Oxley and PCI. This mistake destroys all possible chances of benefiting from logs. It is mistake #1: not logging at all. The more exciting flavor of this mistake is “not logging and not even knowing it until it is too late.”
How can it be “too late”, some say? “It’s just logs!” Welcome to 2006! Not having “just logs” can lead to losing your income (PCI, which contains logging requirements, implies that violations might lead to your credit card processing privileges being cancelled by Visa or MasterCard, thus putting you out of business), your reputation (somebody stole a few credit card numbers from your database, but the media reported that all 40 million credit cards had been stolen since you were unable to prove otherwise) or even your freedom (see various Sarbanes-Oxley horror stories in the media).
Even better-prepared organizations fall for this one. Here is a recent example. Does your web server have logging enabled? Sure, it is a default option on both of the popular web servers: Apache and Microsoft IIS. Does your server operating system log messages? Sure, nobody cancelled /var/log/messages. But does your database? Oops! The default option in Oracle is to not do any audit logging. Maybe MS SQL fares better? Nope, same thing: you need to dig deep in the system to even start a moderate level of audit trail generation.
Thus, to avoid this mistake one sometimes needs to go beyond the defaults and make sure that the software and hardware deployed do have some level of logging enabled. In the case of Oracle, for example, it might boil down to making sure that the ‘audit_trail’ variable is set to ‘db’; for other systems it might be more complicated.
#2 Not looking at the logs is the second mistake. While making sure that logs do exist and then collecting and storing them is important, it is only a means to an end: knowing what is going on in your environment and being able to respond to it, as well as possibly predicting what will happen later. Thus, once the technology is in place and logs are collected, there needs to be a process of ongoing monitoring and review that hooks into actions and possible escalations, if needed. In addition, personnel reviewing or monitoring logs should have enough information to be able to determine what they really mean and what action, if any, is required.
It is worthwhile to note that some organizations take a half-step in the right direction: they only review logs (provided they didn’t commit the first mistake and actually have something to review) after a major incident (be it a compromise, an information leak or a mysterious server crash) and avoid ongoing monitoring and log review, often citing “the lack of resources”. This gives them the reactive benefit of log analysis, which is important, but fails to realize the proactive one: knowing when bad stuff is about to happen or become worse. For example, if you review logs, you might learn that the failover was activated on a firewall and, even though the connection stayed on, the incident is certainly worth looking into.
If you don’t, and your network connectivity goes away, you’d have to rely on your ever-helpful logs to investigate why *both* failover devices went down … In fact, looking at logs proactively helps organizations better realize the value of their existing network, security and system infrastructure.
It is also critical to stress that some types of organizations have to look at log files and audit trails due to regulatory pressure of some kind. For example, the US HIPAA regulation compels medical organizations to establish an audit record and analysis program (even though enforcement is notoriously lacking). In an even more extreme case, the PCI (Payment Card Industry) security standard has provisions for log collection as well as log monitoring and periodic review, highlighting the fact that collection of logs does not stand on its own.
#3 The third common mistake is storing logs for too short a time. This makes the security or IT operations team think they have all the logs needed for monitoring, investigation or troubleshooting, and then leads to the horrible realization after an incident that all the logs are gone due to a shortsighted retention policy. It often happens (especially in the case of insider attacks) that the incident is discovered a long time, sometimes many months, after the crime or abuse has been committed. One might save some money on storage hardware, but lose tenfold due to regulatory fines.
If low cost is critical, the solution is sometimes to split retention into two parts: shorter-term online storage (which costs more) and long-term offline storage (which is much cheaper). A better three-tier approach is also common and resolves some of the limitations of the previous one. In this case, shorter-term online storage is complemented by near-line storage where logs are still accessible and searchable.
The oldest and least relevant log records are offloaded to the third tier, such as tape or DVDs, where they can be stored inexpensively, but without any way to selectively access the needed logs. For example, one financial institution was storing logs online for 90 days, then in near-line searchable storage for 2 years and then on tape for up to 7 years or even more.
#4 The fourth mistake is related to log record prioritization. While people need a sense of priority to better organize their log analysis efforts, the common mistake nowadays is prioritizing the log records before collection. In fact, even some “best practice” documents recommend only collecting “the important stuff.” But what is important? This is where such guidance documents fall short by not specifying it in any useful form. While there are some approaches to the problem, all that I am aware of can lead to glaring holes in security posture or even undermine regulatory compliance efforts.
For example, many people would claim that network intrusion detection and prevention logs are inherently more important than, say, VPN concentrator logs. Well, it might be true in a world where external threats completely dominate insider abuse (i.e. not in this one). VPN logs, together with server and workstation logs, are what you would most likely need to conduct an internal investigation of an information leak or even a malware infection. Claims about the elevated importance of any other log type can be similarly disputed, which leads us to the painful realization that you do need to collect everything. But can you? Before you answer this, try to answer whether you can make the right call on which log is more important even before seeing it, and this problem will stop looking unsolvable. In fact, there are cost-effective solutions to achieve just that.
The mistake #5 is in ignoring the logs from applications, by only focusing on the perimeter and internal network devices and possibly also servers, but not going “higher up the stack” to look at the application logging. The realm of enterprise applications ranges from SAPs and PeopleSofts of the worlds to small homegrown applications, which nevertheless handle mission-critical processes for many enterprises. Legacy applications, running on mainframes and midrange systems, are out there as well, often running the core business processes as well. The availability and quality of logs differs wildly across the application, ranging from missing (the case for many home-grown applications) to extremely detailed and voluminous (the case for many mainframe applications). Lack of common logging standards and even of logging guidance for software developers lead to many challenges with application logs. Despite the challenges, one needs to make sure that the application logs are collected and made available for analysis as well as for longer term-retention. This can be accomplished by configuring your log management software to collect them and by establishing a log review policy, both for the on-incident review and periodic proactive log review. #6 Even the most advanced and mature organizations fall into the pitfall of the sixth error. It is sneaky and insidious, and can severely reduce the value of a log analysis project. It occurs when organization is only looking at what they know is bad in the logs. Indeed, a vast majority of open source and some commercial tools are set up to filter and look for bad log lines, attack signatures, critical events, etc. For example, “swatch” is a classic free log analysis tool that is powerful, but only at one thing: looking for defined bad things in log files. Moreover, when people talk about log analysis they usually mean sifting through logs looking for things of note. 
However, to fully realize the value of log data one has to take it to the next level to log mining: actually discovering things of interest in log files without having any preconceived notion of ‘what we need to find’. It sounds obvious - how can we be sure that we know of all the possible malicious behavior in advance – but it disregarded so often. Sometimes, it is suggested that it is simpler to just list all the known good things and then look for the rest. It sounds like a solution, but such task is not only onerous, but also thankless: it is usually even harder to list all the good things than it is to list all the bad things that might happen on a system or network. So many different things occur, malfunction or misbehave, that weeding out attack traces just by listing all the possibilities is not effective. A more intelligent approach is needed! Some of the data mining (also called “knowledge discovery in databases” or KDD) and visualization methods actually work on log data with great success. They allow organizations to look for real anomalies in log data, beyond ‘known bad’ and ‘known good’. To conclude, avoiding the above six mistakes we covered will take your log analysis program to a next level and enhance the value of the existing security and logging infrastructures. Dr Anton Chuvakin, GCIA, GCIH, GCFA (http://www.chuvakin.org) is a recognized security expert and book author. He currently works at LogLogic, where he is involved with defining and executing on a product vision and strategy, driving the product roadmap, conducting research as well as assisting key customers with their LogLogic implementations. He was previously a Security Strategist with a security information management company. He is an author of a book &quot;Security Warrior&quot; and a contributor to &quot;Know Your Enemy II&quot;, &quot;Information Security Management Handbook&quot;, &quot;Hacker&apos;s Challenge 3&quot; and the upcoming book on PCI. 
Anton also published numerous papers on a broad range of security subjects. In his spare time he maintains his security portal http://www.info-secure.org and several blogs.
Abstract: This presentation will cover operational security challenges that organizations face while deploying log and alert collection and analysis infrastructure, highlighting the top five most common mistakes organizations make in this process, including: not storing logs long enough to comply with government regulations and mandates, not preserving the forensic quality of the logs, and only looking for known "bad records." From there, the session will dive into how to avoid these, and other, mistakes. Additionally, it will provide tips and tricks for how government users can get the most value out of various log files generated by systems, applications and security devices.
I mentioned security data, events, etc. on the previous slides. But what am I really talking about? In other words, what do we LOG and MONITOR? What is called “security data” in this presentation consists of various audit records (left), generated by various devices and software (right). It should be noted that business applications also generate security data, for example by recording access decisions or generating messages indicative of exploitation attempts.
Notice that even though each of these examples is from a different source, all have the “fabulous five” data items: time, sending machine, sending process or program, severity, and message.
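As a rough illustration of those five items, here is a minimal Python sketch that pulls them out of the sshd example from the slide. It assumes the number in angle brackets is a standard syslog priority value (facility × 8 + severity) and that the line follows the classic BSD syslog layout; the regular expression and the function name `parse_syslog_line` are illustrative assumptions, and the other examples on the slide would each need their own pattern.

```python
import re

# Standard syslog severity names, indexed by severity code 0..7.
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# Assumed layout: <PRI> Mon D HH:MM:SS host process[pid]: message
LINE_RE = re.compile(
    r"^<(?P<pri>\d{1,3})>\s*"
    r"(?P<time>[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s"
    r"(?P<host>\S+)\s"
    r"(?P<process>[^:\[\s]+)(?:\[\d+\])?:\s"
    r"(?P<message>.*)$"
)


def parse_syslog_line(line):
    """Return the five common fields as a dict, or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    pri = int(match.group("pri"))
    return {
        "time": match.group("time"),
        "host": match.group("host"),
        "process": match.group("process"),
        "severity": SEVERITIES[pri % 8],  # low three bits of the priority value
        "message": match.group("message"),
    }


example = ("<122> Mar 4 09:23:15 localhost sshd[27577]: Accepted password for "
           "kyle from ::ffff:192.168.138.35 port 2895 ssh2")
record = parse_syslog_line(example)
```

Returning `None` for lines that do not match keeps unparsed records visible to the caller instead of silently dropping them.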
This slide summarizes the methods and techniques for making sense of logs that we will cover on the next few slides. They range from trivial perusing of logs all the way to machine intelligence and data mining.
Manual log analysis includes using such tools as ‘tail’, ‘more’, ‘notepad’, ‘vi’, etc. to look at log files and try to understand them.
Filtering logs is the next technique, with two variations: positive and negative (“artificial ignorance”). Positive filtering is trying to focus on the “bad” things that one needs to see, investigate and then act on: attacks, failures, etc. Negative filtering, which Marcus Ranum called “artificial ignorance” (as an opposite of artificial intelligence), is about looking for all the stuff you know is normal (looking for “good”), then throwing it away and focusing on what is left. The latter is often much more effective, since it is hard to know the “bad” logs up front.
Summarization and reports are the mainstay of log analysis: “Top Connection by Bandwidth”, “Top Attacks by Country”, “Top Users with Authentication Failures” and other reports based on logs are what most people think about when they think about “log analysis.”
Simple visualization: is this picture “…worth a thousand words?” The answer is “sometimes”, since many visualization methods actually confuse rather than clarify the data. Visualization techniques range from very simple (such as pie charts and bar charts) to maps, trees and other advanced visual methods.
Log searching is pretty much “googling” the logs: most people who have collected vast amounts of log data and now have to resort to ‘grep’ to analyze the logs know this method and the frustrations it brings.
Correlation is typically understood to be “rule-based” correlation (even though there are other “correlation” types). The best way to understand it is to note that there is “correlation” (small ‘c’, which typically means using rules to tie log messages together) and “Correlation” (big ‘C’, which means any method for looking at logs in relationship to each other).
Using data mining methods to look at logs is still on the drawing board. DM methods will (hopefully AND probably!) allow future log analysts to generate conclusions automatically from raw data and move even more of the log analysis tasks from human brains to the systems …
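The two filtering variations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the pattern lists `KNOWN_BAD` and `KNOWN_NORMAL` and the sample messages in the usage note are made up for the example, and a real deployment would maintain much longer, site-specific lists.

```python
import re

# Hypothetical pattern lists; real ones are site-specific and much longer.
KNOWN_BAD = [re.compile(r"Failed password for \S+")]
KNOWN_NORMAL = [
    re.compile(r"Accepted password for \S+"),
    re.compile(r"session (opened|closed) for user \S+"),
]


def positive_filter(lines, known_bad=KNOWN_BAD):
    """Positive filtering: keep only lines matching a known-bad pattern."""
    return [line for line in lines if any(p.search(line) for p in known_bad)]


def artificial_ignorance(lines, known_normal=KNOWN_NORMAL):
    """Negative filtering: discard known-normal lines and keep everything else."""
    return [line for line in lines if not any(p.search(line) for p in known_normal)]
```

Given one accepted login, one failed login and an unexpected kernel message, `positive_filter` returns only the failed login, while `artificial_ignorance` returns both the failed login and the kernel message that nobody thought to write a rule for.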
All those require quick and intelligent access to logs! SECURITY + OPS + COMPLIANCE
#1 The first mistake is not logging at all. The more exciting flavor of this mistake is: “not logging and not even knowing it until it is too late.” How can it be “too late”, some say? “It’s just logs!” Welcome to 2006! Not having “just logs” can lead to losing your income (PCI contains logging requirements, and violations might lead to your credit card processing privileges being cancelled by Visa or Mastercard, thus putting you out of business), your reputation (somebody stole a few credit card numbers from your database, but the media reported that all 40 million credit cards had been stolen, since you were unable to prove otherwise) or even your freedom (see various Sarbanes-Oxley horror stories in the media). Even better-prepared organizations fall for this one. Here is a recent example. Does your web server have logging enabled? Sure, it is a default option on both of the popular web servers: Apache and Microsoft IIS. Does your server operating system log messages? Sure, nobody cancelled /var/log/messages. But does your database? Oops! The default option in Oracle is to not do any audit logging. Maybe MS SQL fares better? Nope, same thing: you need to dig deep in the system to even start a moderate level of audit trail generation. Thus, to avoid this mistake one sometimes needs to go beyond the defaults and make sure that the software and hardware deployed do have some level of logging enabled. In the case of Oracle, for example, it might boil down to making sure that the ‘audit_trail’ initialization parameter is set to ‘db’; for other systems it might be more complicated.
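One way to catch the “not logging and not even knowing it” flavor is to compare the systems that are supposed to log against the systems that have actually been heard from recently. The Python sketch below is an assumption-laden illustration, not a method from the talk: the function `silent_sources`, the 24-hour threshold and the source names in the usage note are all hypothetical.

```python
from datetime import datetime, timedelta


def silent_sources(expected, last_seen, now, max_silence=timedelta(hours=24)):
    """Return expected log sources that never reported or have gone quiet.

    expected    -- iterable of source names that should be logging
    last_seen   -- dict mapping source name to the datetime of its latest record
    now         -- current time, passed in so the check is easy to test
    max_silence -- assumed threshold; tune per source in practice
    """
    silent = []
    for source in expected:
        seen = last_seen.get(source)
        if seen is None or now - seen > max_silence:
            silent.append(source)
    return silent
```

For example, with `expected = ["web01", "fw01", "oracle-db"]` and a `last_seen` dict that has a recent entry for `web01`, a week-old entry for `fw01` and no entry for `oracle-db`, the function returns `["fw01", "oracle-db"]`: one source that went quiet and one that never logged at all.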
#2 Not looking at the logs is the second mistake. While making sure that logs do exist and then collecting and storing them is important, it is only a means to an end: knowing what is going on in your environment and being able to respond to it, as well as possibly predicting what will happen later. Thus, once the technology is in place and logs are collected, there needs to be a process of ongoing monitoring and review that hooks into actions and possible escalations, if needed. In addition, personnel reviewing or monitoring logs should have enough information to be able to determine what the logs really mean and what action, if any, is required. It is worthwhile to note that some organizations take a half-step in the right direction: they review logs (provided they didn’t commit the first mistake and actually have something to review) only after a major incident (be it a compromise, an information leak or a mysterious server crash) and avoid ongoing monitoring and log review, often citing “the lack of resources”. This gives them the reactive benefit of log analysis, which is important, but fails to realize the proactive one: knowing when bad stuff is about to happen or become worse. For example, if you review logs, you might learn that the failover was activated on a firewall, and, even though the connection stayed on, the incident is certainly worth looking into. If you don’t, and your network connectivity goes away, you’d have to rely on your ever-helpful logs to investigate why *both* failover devices went down … In fact, looking at logs proactively helps organizations better realize the value of their existing network, security and system infrastructure. It is also critical to stress that some types of organizations have to look at log files and audit trails due to regulatory pressure of some kind. For example, the US HIPAA regulation compels medical organizations to establish an audit record and analysis program (even though enforcement action is notoriously lacking). In an even more extreme case, the PCI (Payment Card Industry) security standard has provisions for log collection as well as log monitoring and periodic review, highlighting the fact that collection of logs does not stand on its own.
They went from “high external risk” to inside and from servers to desktops
Mandatory 7 years is a myth!
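The three-tier retention scheme described for mistake #3 can be written down as a small age-based rule. The Python sketch below uses the financial institution's figures (90 days online, 2 years near-line, 7 years on tape) and makes two assumptions the text does not settle: the periods are treated as cumulative record age, and anything older than 7 years is labeled "expired" even though the text says tape retention may run longer. The function name `retention_tier` is hypothetical.

```python
from datetime import date

# Thresholds from the example in the text, treated as cumulative age (assumption).
ONLINE_DAYS = 90
NEARLINE_DAYS = 2 * 365
TAPE_DAYS = 7 * 365


def retention_tier(log_date, today):
    """Return the storage tier for a log record based on its age in days."""
    age_days = (today - log_date).days
    if age_days < 0:
        raise ValueError("log_date is in the future")
    if age_days <= ONLINE_DAYS:
        return "online"      # fast, searchable, most expensive
    if age_days <= NEARLINE_DAYS:
        return "near-line"   # still accessible and searchable
    if age_days <= TAPE_DAYS:
        return "tape"        # cheap, but no selective access
    return "expired"         # assumption: policy may keep tapes even longer
```

Keeping the thresholds as named constants makes it easy to adjust them per log type, since the right retention period depends on the regulation and the organization rather than on a single universal number.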
On mistake #4 (prioritizing log records before collection), the NIST 800-92 guide is confusing!
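For mistake #6, one very simple way to look beyond ‘known bad’ and ‘known good’ is to surface message types that occur rarely, without defining in advance what to look for. The Python sketch below is a frequency-based illustration only, not the data mining methods the talk refers to: it masks IP addresses and numbers to group messages into rough templates and reports messages whose template is seen at most `max_count` times. The function names and the masking rules are assumptions.

```python
import re
from collections import Counter

IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
NUM_RE = re.compile(r"\d+")


def template(message):
    """Mask variable parts (IPv4 addresses, numbers) to get a rough message type."""
    return NUM_RE.sub("<N>", IP_RE.sub("<IP>", message))


def rare_messages(messages, max_count=1):
    """Return messages whose template occurs at most max_count times."""
    counts = Counter(template(m) for m in messages)
    return [m for m in messages if counts[template(m)] <= max_count]
```

Fed three “Connection from … port …” messages with different addresses and one “Kernel module xyz loaded” message, `rare_messages` returns only the kernel message. Rare does not mean malicious, so the output is a starting point for review rather than an alert.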

Editor's Notes