Logs: Can’t Hate Them, Won’t Love Them: Brief Log Management Class by Anton Chuvakin
1. Logs: Can’t Hate Them, Won’t Love Them! Dr. Anton Chuvakin Security Warrior Consulting www.securitywarriorconsulting.com April 2010
2. What Is It? This is a short log analysis and log management class given by Dr. Anton Chuvakin of Security Warrior Consulting at the Project Honeynet Annual Event 2010 in Mexico City, Mexico www.chuvakin.org www.SecurityWarriorConsulting.com
20. Log Chaos II - Accept? messages:
Dec 16 17:28:49 10.14.93.7 ns5xp: NetScreen device_id=ns5xp system-notification-00257(traffic): start_time="2002-12-16 17:33:36" duration=5 policy_id=0 service=telnet proto=6 src zone=Trust dst zone=Untrust action=Permit sent=1170 rcvd=1500 src=10.14.94.221 dst=10.14.98.107 src_port=1384 dst_port=23 translated ip=10.14.93.7 port=1206
Apr 6 06:06:02 Checkpoint NGX SRC=Any, DEST=ANY, Accept=nosubstitute, Do Not Log, Install spyware, lie on your taxes, or better yet, don't pay them
Mar 6 06:06:02 winonasu-pix %PIX-6-302013: Built outbound TCP connection 315210 596 for outside:172.196.9.206/1214 (172.196.9.206/1214) to inside:199.17.151.103/1438 (199.17.151.103/1438)
21. SHOCK!!! … and that is BEFORE we even mention application logs!
22. Log Chaos Everywhere!
No standard format
No standard schema, no standard level of detail
No standard meaning
No taxonomy
No standard transport
No shared knowledge on what to log and how
No logging guidance for developers
No standard API / libraries for log production
23. Result?
%PIX|ASA-3-713185 Error: Username too long - connection aborted
%PIX|ASA-5-501101 User transitioning priv level
ERROR: transport error 202: send failed: Success
sles10sp1oes oesaudit: type=CWD msg=audit(09/27/07 22:09:45.683:318) : cwd=/home/user1
24. More results?
userenv[error] 1030 RCI-CORP\supx No description available
Aug 11 09:11:19 xx null pif ? exit! 0
Apr 23 23:03:08 support last message repeated 3 times
Apr 23 23:04:23 support last message repeated 5 times
Apr 23 23:05:38 support last message repeated 5 times
25. It DOES Suck! Well, it does… … but we need to analyze logs every time an incident occurs and in many other cases!
28. Log Analysis Basics: Summary
Manual
Filtering
Summarization and reports
Simple visualization
Log searching
Correlation
Log data mining
29. Log Analysis Basics: Manual
Manual log review: just fire up your trusty tail, more, notepad, vi, Event Viewer, etc. and hop to it!
Pros: easy; no tools required (neither build nor buy)
Cons: try it with a 10GB log file one day; boring as Hell!
30. See!?
Log for VMware Server, pid=2364, version=e.x.p, build=build-63231, option=BETA, section=2
[2007-12-03 14:57:00.931 'App' 4516 info] Current working directory: C:\Documents and Settings\All Users\Application Data\VMware\VMware Server
[2007-12-03 14:57:00.946 'BaseLibs' 4516 info] HOSTINFO: Seeing Intel CPU, numCoresPerCPU 2 numThreadsPerCore 1.
[2007-12-03 14:57:00.946 'BaseLibs' 4516 info] HOSTINFO: This machine has 1 physical CPUS, 2 total cores, and 2 logical CPUs.
[2007-12-03 14:57:00.946 'App' 4516 info] Trying blklistsvc
[2007-12-03 14:57:00.946 'App' 4516 info] Trying cimsvc
[2007-12-03 14:57:00.946 'App' 4516 info] Trying directorysvc
[2007-12-03 14:57:00.946 'App' 4516 info] Trying hostsvc
[2007-12-03 14:57:01.571 'NetworkProvider' 4516 info] Using netmap configuration file C:\Documents and Settings\All Users\Application Data\VMware\VMware Server\netmap.conf
[2007-12-03 14:57:01.587 'NetworkProvider' 4516 error] VNL_GetBriggeState call failed with status 1. Refreshing network information failed
[2007-12-03 14:57:03.165 'NetworkProvider' 4516 info] Active ftp is 1
[2007-12-03 14:57:03.165 'NetworkProvider' 4516 info] Allowanyoui is 0
[2007-12-03 14:57:03.165 'NetworkProvider' 4516 info] udptimeout is 30
[2007-12-03 14:57:03.337 'HostsvcPlugin' 4516 warning] No advanced options found
[2007-12-03 14:57:03.368 'Hostsvc::AutoStartManager' 4516 info] VM autostart configuration: C:\Documents and Settings\All Users\Application Data\VMware\VMware Server\hostd\vmAutoStart.xml
[2007-12-03 14:57:04.212 'Locale' 4516 info] Locale subsystem initialized from C:\Program Files\VMware\VMware Server\locale/ with default locale en.
[2007-12-03 14:57:04.212 'ResourcePool ha-root-pool' 4516 info] Resource pool instantiated
[2007-12-03 14:57:04.212 'ResourcePool ha-root-pool' 4516 info] Refresh interval: 60 seconds
[2007-12-03 14:57:04.212 'HostsvcPlugin' 4516 info] Plugin initialized
[2007-12-03 14:57:04.212 'App' 4516 info] Trying internalsvc
[2007-12-03 14:57:04.259 'App' 4516 info] Trying nfcsvc
[2007-12-03 14:57:04.305 'Nfc' 4516 info] Breakpoints disabled
[2007-12-03 14:57:04.321 'BaseLibs' 4516 info] Using system libcrypto, version 9070AF
[2007-12-03 14:57:06.399 'BaseLibs' 4516 info] [NFC DEBUG] Successfully loaded the diskLib library
[2007-12-03 14:57:06.415 'Nfc' 4516 info] Plugin initialized
[2007-12-03 14:57:06.415 'App' 4516 info] Trying partitionsvc
[2007-12-03 14:57:06.415 'App' 4516 info] Trying proxysvc
31. Log Analysis Basics: Filtering
Log filtering:
Just show me the bad stuff; here is the list (positive filtering)
Just ignore the good stuff; here is the list (negative filtering, or Artificial Ignorance)
Pros: easy result interpretation (see -> act); many tools available, or write your own
Cons: what about patterns beyond single messages? What about messages that are neither good nor bad, but interesting?
32. Example: How to grep Logs?
The easiest log analysis method (Linux/Unix):
# grep ailure /var/log/messages
Show interesting failure messages in the messages log (searching for "ailure" matches both "Failure" and "failure")
# grep -v uccess /var/log/messages
Show messages other than success messages in the messages log
# grep -vf LIST /var/log/messages
Show messages other than those matching the patterns listed in the file LIST
33. Log Analysis Basics: Summarization
Summarization and reports: Top X Users, Connections by IP, etc.
Pros: dramatically reduces the size of data; suitable for high-level reporting
Cons: loss of information by summarizing; which report to pick for a task?
34. Make A Summary
SELECT source, destination, proto, user, COUNT(*)
FROM log_table
WHERE user LIKE 'an%'
GROUP BY source, destination, proto, user
ORDER BY source DESC
P.S. Pray tell, how did those nasty logs end up in a nice database like that?
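One hedged answer to the P.S.: parse a few fields out of the raw lines and load them into a database yourself. The sketch below (the `src=`/`dst=`/`proto=`/`user=` field layout and all names are illustrative assumptions, not a standard log format) loads parsed lines into an in-memory SQLite table so the slide's GROUP BY query can run:

```python
# Minimal sketch: regex-parse "nasty" log lines into SQLite, then summarize.
# The log format regex and field names are assumptions for illustration.
import re
import sqlite3

LINE_RE = re.compile(
    r"src=(?P<source>\S+)\s+dst=(?P<destination>\S+)\s+"
    r"proto=(?P<proto>\S+)\s+user=(?P<user>\S+)"
)

def load_logs(lines):
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE log_table (source, destination, proto, user)")
    for line in lines:
        m = LINE_RE.search(line)
        if m:  # silently skipping unparsed lines; a real loader should count them
            db.execute(
                "INSERT INTO log_table VALUES (?, ?, ?, ?)",
                (m["source"], m["destination"], m["proto"], m["user"]),
            )
    return db

def top_summary(db):
    # Same query as on the slide
    return db.execute(
        "SELECT source, destination, proto, user, COUNT(*) "
        "FROM log_table WHERE user LIKE 'an%' "
        "GROUP BY source, destination, proto, user "
        "ORDER BY source DESC"
    ).fetchall()
```

In practice the parsing step is the hard part: every log source on the "log chaos" slides needs its own regex.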
35. Log Analysis Basics: Search
Googling logs: the user specifies a time period, a log source (or all), and an expression, and gets back the logs that match (regex vs. Boolean)
Pros: easy to understand; quick to do
Cons: what do you search for? Sometimes a LOT of data comes back
37. Log Analysis Basics: Correlation
Correlation: rule-based and other "correlation" algorithms
Pros: highly automated
Cons: needs rules written by experts; needs tuning for each site
38. Example Rule
<rule id="40112" level="12" timeframe="240">
  <if_group>authentication_success</if_group>
  <if_matched_group>authentication_failures</if_matched_group>
  <same_source_ip />
  <description>Multiple authentication failures followed by a success.</description>
</rule>
OSSEC rule shown; see OSSEC.net for details
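The logic this rule expresses can be sketched outside OSSEC as well: flag an authentication success that follows earlier failures from the same source IP within the rule's 240-second window. The event tuple format and the "at least two failures" threshold below are assumptions for illustration, not OSSEC internals:

```python
# Rough sketch of the correlation idea behind rule 40112 (not OSSEC itself).
from collections import defaultdict

WINDOW = 240  # seconds, matching the rule's timeframe

def correlate(events):
    """events: iterable of (ts, kind, src_ip) sorted by ts,
    kind in {"failure", "success"}. Returns (ts, ip) alerts."""
    failures = defaultdict(list)  # src_ip -> failure timestamps
    alerts = []
    for ts, kind, ip in events:
        if kind == "failure":
            failures[ip].append(ts)
        elif kind == "success":
            # keep only failures still inside the time window
            recent = [t for t in failures[ip] if ts - t <= WINDOW]
            if len(recent) >= 2:  # "multiple" failures: threshold is assumed
                alerts.append((ts, ip))
            failures[ip].clear()
    return alerts
```

This is the "needs rules written by experts" cost in miniature: the window, the threshold, and the grouping key all need per-site tuning.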
39. Log Analysis Basics: Data Mining Log mining Algorithms that extract meaning from raw data Pro Promises fully-automated analysis Con Still research-grade technology
40. Example: Ranum's NBS
Ranum's "nbs" (never before seen) - the simplest log mining tool. No knowledge about "bad" goes in -> insight comes out! Look Ma, NO RULES!
Use the tool to pick up anomalous messages from your log pool
See more at: http://www.slideshare.net/anton_chuvakin/log-mining-beyond-log-analysis
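A toy version of the never-before-seen idea (the real nbs tool keeps a fast on-disk database; the normalization rules below are illustrative assumptions): mask the variable parts of each line into a template and report only lines whose template has not been seen before.

```python
# Toy "never before seen" detector: no rules, no knowledge of "bad".
# The masking regexes are assumptions; real deployments need richer ones.
import re

def template(line):
    line = re.sub(r"\d+\.\d+\.\d+\.\d+", "<ip>", line)  # IPv4 addresses
    line = re.sub(r"\d+", "<n>", line)                   # other numbers
    return line

def never_before_seen(lines, seen=None):
    """Yield lines whose normalized template is new; update `seen` in place."""
    seen = set() if seen is None else seen
    for line in lines:
        t = template(line)
        if t not in seen:
            seen.add(t)
            yield line
```

Persisting `seen` between runs turns this into the long-lived baseline that makes the approach useful.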
41. Log Analysis Basics: Visualization
Visualization, from simple to 4D: a pie chart is worth a thousand words?
Pros: you just look at it and know what it means and what to do
Cons: you just look at it, and hmmm...
43. Log Analysis Basics: When Real time vs. historical analysis Do you always need real-time? What data cannot be analyzed in real-time? A day later vs. never question Historical analysis for deep insight
44. How To Start Using The Tools?
1. Collect logs. Tools: syslog-ng, standard syslog, etc.
2. Store logs. Tools: flat files, MySQL, etc.
3. Search logs. Tools: grep, Splunk, etc.
4. Correlate and alert. Tools: OSSEC, OSSIM, SEC, nbs, logwatch, etc.
45. Key Points to Remember Techniques review Tools review Any other tool suggestions? Start thinking buy vs. build
48. Log Analysis to Log Management (diagram)
Collect (files, syslog, other) -> Secure -> Store (immutable logs) -> Search, Report and Analytics -> Alert (SNMP, e-mail, etc.) -> Make conclusions -> Act
Humans still needed!
49. Log Management Challenges
Not enough data
Too much data
Diverse records
Time out of sync
False records
Duplicate data
Hard to get data
52. What is NOT Retention? A database that stores a few fields from each log A tape closet with log data tapes that were never verified – and lurking rats A syslog server that just spools logs into files
53. Retention Time Question
I have the answer! No, not really.
Regulations? Unambiguous: PCI - keep 'em for 1 year
Tiered retention strategy: online, near-line, offline/tape
54. Example: Retention Strategy
Type + network + storage tier:
IDS + DMZ + online = 90 days
Firewall + DMZ + online = 30 days
Servers + internal + online = 90 days
ALL + DMZ + archive = 3 years
Critical + internal + archive = 5 years
OTHER + internal + archive = 1 year
55. How to Create A Log Retention Strategy Assess applicable compliance requirements Look at risk posture and other needs Look at various log sources and their log volumes Review available storage options Decide on tiers
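Once the tiers are decided, the strategy can end up as something as simple as a lookup table. A minimal sketch using the day counts from the example retention strategy slide (the key format, the ALL/OTHER fallback order, and the default value are assumptions):

```python
# Tiered retention lookup mirroring the example slide.
# (type, network, tier) -> retention in days; fallback order is assumed.
RETENTION_DAYS = {
    ("ids", "dmz", "online"): 90,
    ("firewall", "dmz", "online"): 30,
    ("servers", "internal", "online"): 90,
    ("all", "dmz", "archive"): 3 * 365,
    ("critical", "internal", "archive"): 5 * 365,
    ("other", "internal", "archive"): 365,
}

def retention_days(log_type, network, tier):
    key = (log_type.lower(), network.lower(), tier.lower())
    # try the exact match, then the slide's ALL and OTHER catch-alls
    for k in (key, ("all",) + key[1:], ("other",) + key[1:]):
        if k in RETENTION_DAYS:
            return RETENTION_DAYS[k]
    return 90  # assumed default when nothing matches
```

Encoding the policy as data, not prose, makes it enforceable by the archiving jobs and auditable against the compliance requirements from step 1.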
57. Example: How to Deal with A Trillion Log Messages
How to manage a trillion (~1,000 billion) log messages? Hundreds of terabytes (about 1/2 of a petabyte) of data
Which tool to pick? "Sorry, buddy, you are writing your own code here!"
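The slide's numbers survive a back-of-the-envelope check; the sketch below assumes ~500 bytes per raw message and a ~200 MB/s sequential read rate (both figures are assumptions for illustration, not measurements):

```python
# Back-of-the-envelope arithmetic for a trillion log messages.
messages = 10**12          # ~1,000 billion messages
avg_bytes = 500            # assumed average raw message size
scan_rate = 200 * 10**6    # assumed sequential read rate, bytes/second

total_bytes = messages * avg_bytes
total_tb = total_bytes / 10**12          # terabytes (decimal)
scan_days = total_bytes / scan_rate / 86400

print(f"{total_tb:.0f} TB total; one full scan ~{scan_days:.0f} days on one disk")
```

Half a petabyte that takes weeks to scan sequentially is exactly why off-the-shelf tools give out and "you are writing your own code here."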
58. Key Points to Remember
What really is log retention?
Review log storage options to use (or to buy in a vendor tool)
Learn about storage challenges
60. Mistake 1: Not Logging AT ALL...
... and its aggravated version: "... and not knowing that you don't"
No logging? -> no logs for incident investigation and response, audits, C&A, control validation, compliance
Got logs? If your answer is 'NO', don't listen further: go and enable logging right now!
61. Example: Oracle Defaults
Minimum system logging; minimum database server access; no data access logging
So, where is...
data access audit?
schema and data change audit?
configuration change audit?
62. Mistake 2: Not looking at logs
Collection of logs has value! But review boosts the value 10-fold (numbers are estimates); more in-depth analysis boosts it a lot more!
Two choices here...
Review after an incident
Ongoing review
63. Example Log Review Priorities
DMZ NIDS
DMZ firewall
DMZ servers with applications
Critical internal servers
Other servers
Select critical applications
Other applications
64. Mistake 3: Storing logs for too short a time
You are saying you HAD logs? And how is that useful?
The retention question is a hard one. Truly, nobody has the answer! Seven years? A year? 90 days? A week? Until the disk runs out?
Common: 90 days online and up to 1-3 years near-line or offline
65. Also A Mistake: Storing Logs for TOO LONG?!
Retention = storage + access + destruction
Why DESTROY LOGS?
Privacy regulations
Litigation risk management
System resource utilization
66. Example Retention Strategy
Type + network + storage tier:
IDS + DMZ + online = 90 days
Firewall + DMZ + online = 30 days
Servers + internal + online = 90 days
ALL + DMZ + archive = 3 years
Critical + internal + archive = 5 years
OTHER + internal + archive = 1 year
67. Mistake 4: Deciding What’s Relevant Before Collection How would you know what is … … Security-relevant … Compliance-relevant … or will solve the problem you’d have TOMORROW!? The answer? Just grab everything!
68. Example Common Logging Order
Log everything
Retain most everything
Analyze enough
Summarize and report on a subset
Monitor some
Act on a few records
69. Mistake 5: Ignoring Logs from Applications
Firewall - Yes, Linux - Yes, Windows - Yes, NIDS - Yes, but...
Oracle - ? SAP - ? Your Application X - No?
70. Mistake 6: Looking for only the bad stuff Correlation, rules, regex matching What is in common? You have to know what you are looking for! Can we somehow just see what we need to see? Data mining technology can help
71. Example: Log Mining Techniques in Action Too many attack types from a single IP address Right next to known vulnerability scanners External IP address Conclusion: potentially dangerous attacker
72. Conclusions - Serious!
Logs are a tough beast to tackle
Thus, many people ignore them
And then bad things happen to them!
So, treat logs seriously and analyze them!
73. However… “The company’s server logs recorded only unsuccessful log-in attempts, not successful ones, frustrating a detailed analysis.”
74. Questions
Dr. Anton Chuvakin
Email: anton@chuvakin.org
Google Voice: 510-771-7106
Site: http://www.chuvakin.org
Blog: http://www.securitywarrior.org
LinkedIn: http://www.linkedin.com/in/chuvakin
Consulting: www.securitywarriorconsulting.com
Twitter: @anton_chuvakin
75. More on Anton
Book author: "Security Warrior", "PCI Compliance", "Information Security Management Handbook", "Know Your Enemy II", "Hacker's Challenge 3", etc.
Conference speaker: SANS, FIRST, GFIRST, ISSA, CSI, Interop, many, many others worldwide
Standard developer: CEE, CVSS, OVAL, etc.
Community role: SANS, Honeynet Project, WASC, CSI, ISSA, OSSTMM, InfraGard, others
Past roles: researcher, security analyst, strategist, evangelist, product manager, consultant
76. Security Warrior Consulting Services
Logging and log management policy:
Develop logging policies and processes, log review procedures, workflows and periodic tasks, as well as help architect those to solve organization problems
Plan and implement log management architecture to support your business cases; develop specific components such as log data collection, filtering, aggregation, retention, log source configuration, as well as reporting, review and validation
Customize industry "best practices" related to logging and log review to fit your environment; help link these practices to business services and regulations
Help integrate logging tools and processes into IT and business operations
Content development:
Develop correlation rules, reports and other content to make your SIEM and log management product more useful to you and more applicable to your risk profile and compliance needs
Create and refine policies, procedures and operational practices for logging and log management to satisfy requirements of PCI DSS, HIPAA, NERC, FISMA and other regulations
More at www.SecurityWarriorConsulting.com
Editor's Notes
The easiest log analysis method (Linux/Unix):
# grep ailure /var/log/messages
Look for interesting failure messages in the messages log. It makes sense to also look for "ailed." We drop the first letter to avoid worrying about case sensitivity. You can also switch grep to a case-insensitive mode by typing "grep -i" (for ignore case) instead.
# grep anton /var/log/messages
Look for particular user actions; this will definitely miss more than a few user actions, and so manual review of logs is needed. For example, some messages will not be marked with that user, such as when a user becomes "root" via the "sudo" command.
More examples:
grep sshd *.log (looks for all logs with the "sshd" string in them)
grep -i user messages (looks for "user", "USER", "User", etc. in "messages" files)
grep -v sendmail syslog (looks for all log lines without "sendmail" in them)
===
This slide reminds Unix people and teaches Windows people about the "grep" command that can be used to manually filter logs.
grep sshd *.log | process_ssh.sh
Filters all logs with the "sshd" string in them and sends them to another program
grep -i user messages | grep -v ailure
Filters for "user", "USER", "User" messages which are not failures
Using "grep" is an example of the positive filtering mentioned on the previous slide: trying to focus on the bad things that one needs to see, investigate, and then act on: attacks, failures, etc. The "-v" option showcases negative filtering.
So how easy is it to data mine with Splunk? In the above example I told Splunk I was interested in all log entries that contained the word "failed". This refreshed the screen and showed me 25 entries that matched this keyword. Looking through the list I noticed that one of the entries was for a failed logon attempt. At that point I clicked the "similar" hyperlink for the log entry, which produced the screen shown above. Note: it is showing us that we have ten failed logon attempts in the log file (four are not shown as they are off the bottom of the screen). So in less than 60 seconds I was able to identify all of the failed logon attempts for my network.
OSSEC rule shown
Marcus Ranum's "nbs" tool can be obtained at http://www.ranum.com/security/computer_security/code/index.html The description says: "Never Before Seen Anomaly detection driver. This utility creates a fast database of things that have been seen, and includes tools to print and update the database. Includes PDF documentation and walkthroughs." Use the tool to pick up anomalous messages from your log pool. One can also build the same using grep, awk and other shell tools: 'grep -v -f' can be used to look for log entries excluding ones stored in a file.
This slide shows one of the open source visualization tools, AfterGlow (which can be found at http://afterglow.sourceforge.net/ or at http://www.secviz.org/). The tool has been successfully used to visualize many types of log data.
Here we learn how to start using the tools we just discussed for taking control of your logs. Start by collecting logs; use syslog-ng or whatever syslog variant is available on your systems. To combine these with Windows logs, use Snare or LASSO, which convert Windows logs to syslog. Store logs in files (compressed or not) or in a database such as open source MySQL. To start peeking at logs, use search tools such as the free "grep" or "splunk" that we mentioned above. When ready to move to correlation and alerting, get OSSEC or other tools. At this point, you gain a degree of awareness of what is going on in your environment.