Mining Your Logs - Gaining Insight Through Visualization
In this two-part presentation we will explore log analysis and log visualization. We will look at the history of log analysis: where log analysis stands today, what tools are available to process logs, what is working today, and, more importantly, what is not working. What will the future bring? Do our current approaches hold up under future requirements? We will discuss a number of issues and try to figure out how we can address them.
By looking at various log analysis challenges, we will explore how visualization can help address a number of them; keeping in mind that log visualization is not just a science, but also an art. We will apply a security lens to look at a number of use-cases in the area of security visualization. From there we will discuss what else is needed in the area of visualization, where the challenges lie, and where we should continue putting our research and development efforts.


Transcript

  • 1. Mining Your Logs - Gaining Insight Through Visualization
      Raffael Marty - @zrlram
      Google TechTalk, March 2011
  • 2. Raffael Marty
      • Founder @
      • Chief Security Strategist and Product Manager @ Splunk
      • Manager Solutions @ ArcSight
      • Intrusion Detection Research @ IBM Research
      • IT Security Consultant @ PriceWaterhouse Coopers
      Applied Security Visualization. Publisher: Addison Wesley (August, 2008). ISBN: 0321510100
      Logging as a Service © by Raffael Marty
  • 3. Agenda
      • Log Analysis
      • History
      • Log Architectures
      • What’s Working and What’s Not?
      • Future Needs
      • Data Visualization
      • Visualization Concepts
      • Security Visualization Use-Cases
  • 4. Log Analysis
      10.0.20.9 - - [22/Mar/2011:10:00:52 +0000] "GET /admin/customer/customer/612/ HTTP/1.1" 200 2261 "https://logdog.loggly.org/admin/customer/customer/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27" TYhzVH8AAAEAAGOkBOQAAADA 655268
      2010-12-28T18:12:10.031+00:00 frontend2-raffy syslog-ng[19600]: syslog-ng starting up; version=3.1.1
      2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 PROTO=UDP SPT=17500 DPT=17500 LEN=160
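As a rough illustration of what working with such records involves, the sketch below pulls the key=value fields out of the firewall record on this slide. The sample string is a re-spaced copy of the slide text (the original spacing was lost in the transcript, so the token boundaries are an assumption), and `parse_kv` is a hypothetical helper name, not part of any tool mentioned in the talk.

```python
import re

# Re-spaced copy of the slide's firewall record; token boundaries are assumed.
LINE = (
    "blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 "
    "SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 "
    "PROTO=UDP SPT=17500 DPT=17500 LEN=160"
)

# A key is a run of word characters; a value is any run of non-space
# characters and may be empty (as in "OUT=").
KV_PATTERN = re.compile(r"(\w+)=(\S*)")


def parse_kv(line):
    """Return a dict of key=value fields found in a log line."""
    fields = {}
    for key, value in KV_PATTERN.findall(line):
        # LEN appears twice in the record; keep the first occurrence.
        fields.setdefault(key, value)
    return fields


fields = parse_kv(LINE)
print(fields["SRC"], "->", fields["DST"], fields["PROTO"], fields["DPT"])
```

Even this small example has to decide what to do with the repeated LEN key and the empty OUT value, which hints at why format-specific parsing gets laborious across many sources.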
  • 5. History
      • 1980 Eric Allman develops syslogd(8)
      • 1996 Intellitactics
      • 1997 Tivoli Risk Manager developed by IBM Research in Zurich (later Zurich Correlation Engine, ZCE)
      • 1999 - 2010 A number of log management / SIEM players enter the market (software, appliances)
      • 2000 ArcSight - 2010 sold for $1.65bn to HP
      • 2009 Loggly (logging as a service)
  • 6. History - The Other View
      • Network management (SNMP)
      • IDS false positive reduction
      • Security monitoring (multiple data sources)
      • Unification of NOC and SOC (failed?)
      • Application monitoring (moving up the stack)
          - original tools failed due to architectural constraints
          - new approaches have been presented
  • 7. Log Management Today: Where are you?
  • 8. Log Management Today (axis label: less tools)
      • DIY: grep, Perl, SQL
      • MapReduce: Open source
      • Log Management: Open source, Commercial
      • CEP and SIEM: Open source, Commercial
      • Advanced Analytics: Not log specific!
  • 9. Open Source Tools (this list is likely incomplete!)
      • graylog2
      • logstash
      • swatch
      • tenshi
      • logwatch
      • OSSEC
      • snare
      • lasso
      • lire
      • LogSurfer
      • SEC
      • LogHound
      • slct
      • log2timeline
      • logzilla
      • OSSIM
      • MS Logparser
      • Sguil
      • Octopussy
      • Sagan
  • 10. Commercial Tools (this list is likely incomplete!)
      pixlcloud | Visualization in the Cloud © PixlCloud LLC 2011
  • 11. Log Architectures
  • 12. Log Mgmt Architecture
      • Collection:
          - syslog
          - OPSEC
          - SDEE
          - netflow
          - database
      • Processing:
          - indexing
          - context storage
          - clustering
      • Storage:
          - on board
          - external storage array
          - clusters
  • 13. Log Mgmt Architecture
      • Collection:
          - syslog
          - OPSEC
          - SDEE
          - netflow
          - database
      • Processing:
          - indexing
          - context storage
          - clustering
      • Data Access:
          - free-text search
          - field-based search
          - tagging schemas
      Diagram labels: raw; normalized or raw
  • 14. Agents and Connectors
      • piece of code to transport logs to a central location
      • features
          - batch
          - compress
          - encrypt
          - sign
          - fail-over
      • often additional features:
          - parse
          - normalize
          - aggregate
          - enrichment (context)
      • special protocols:
          - OPSEC, SDEE
          - Windows
      • file-based collection
      • database collection
  • 15. SIEM Architecture
      Diagram labels: raw; normalized; asset context; identity context; ...; context / tagging; RDBMS
  • 16. SIEM Architecture
      • RDBMS schema
          - Fixed number and type of fields
          - New data sources with new fields?
              ‣ overloading
      • RDBMS clusters are expensive and scale poorly
      • Need a parser for every data source
      • Slow historical data queries
      • Hard to configure database efficiently
          - because of different use-cases
  • 17. SIEM Architecture Benefits
      • Parsed data enables
          - real-time correlation
          - real-time statistics
          - data augmentation (context) close to source
      • Unified data access language
          - over a fixed set of fields
      • Real-time dashboards
  • 18. Search vs. SIEM
      • Full-text indexing
      • Parsing at search time
      Example search: denied
          • use index to find occurrences of ‘denied’
      Example search: user=rmarty
          • use index to find ALL occurrences of ‘rmarty’
          • apply parser to results
          • remove results where user is not rmarty
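The two example searches on this slide can be sketched with a toy inverted index. This is only a minimal illustration under stated assumptions: the log lines are invented for the example, and `build_index`, `search_term`, and `search_field` are hypothetical names rather than the API of any search product.

```python
import re
from collections import defaultdict

# Invented sample lines, for illustration only.
LOGS = [
    "sshd: access denied user=rmarty src=10.0.0.5",
    "sshd: login accepted user=rmarty src=10.0.0.5",
    "ftpd: denied user=guest note=rmarty-was-here",
]


def build_index(logs):
    """Full-text index: map each lowercase token to the set of line numbers."""
    index = defaultdict(set)
    for number, line in enumerate(logs):
        for token in re.findall(r"[A-Za-z0-9_.]+", line.lower()):
            index[token].add(number)
    return index


def search_term(index, term):
    """Free-text search: answered by an index lookup alone."""
    return sorted(index.get(term.lower(), set()))


def search_field(index, logs, field, value):
    """Field search: index lookup first, then parse only the candidates."""
    candidates = search_term(index, value)  # ALL occurrences of the value
    pattern = re.compile(rf"\b{re.escape(field)}=(\S+)")
    hits = []
    for number in candidates:
        match = pattern.search(logs[number])  # parser applied at search time
        if match and match.group(1) == value:  # drop lines where field differs
            hits.append(number)
    return hits


index = build_index(LOGS)
print(search_term(index, "denied"))
print(search_field(index, LOGS, "user", "rmarty"))
```

The third line mentions rmarty outside the user field, so the index returns it as a candidate and the search-time parser then filters it out. No parser has to exist at collection time.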
  • 19. New SIEM - Hybrid Models
      • Use parsers for known data sources
      • Collect everything else
      • Index all data and use index for search
      • Correlate parsed data
  • 20. Categorization and Tagging
      • How do you find all failed logins across any data source?
            security:538 OR “sshd authentication failure” OR “sshd failed password” OR ...
      • Does not scale
          - for new data sources
          - for new events of existing sources
      • Define a ‘taxonomy’ for all events
      • Map events into taxonomy
            id -> object, action, status
  • 21. Content Creation
      • Rules, dashboards, reports, searches can use taxonomy:
            object=authentication AND action=login AND status=success
      • All failures related to files:
            object=file AND status=failure
      • Mixing with other fields:
            action=login AND user=rmarty
      • Approach scales well
      • Huge effort to build and maintain mappings
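The taxonomy idea from the last two slides can be sketched as a lookup table that maps source-specific messages onto (object, action, status), so that one query covers every source. The signature strings, the table contents, and the names `categorize` and `matches` are assumptions made for this example, not a real product's schema.

```python
# Assumed mapping table: message signature -> taxonomy fields.
# In practice, building and maintaining this table is the expensive part.
SIGNATURES = [
    ("sshd authentication failure",
     {"object": "authentication", "action": "login", "status": "failure"}),
    ("sshd failed password",
     {"object": "authentication", "action": "login", "status": "failure"}),
    ("sshd accepted password",
     {"object": "authentication", "action": "login", "status": "success"}),
    ("file open denied",
     {"object": "file", "action": "open", "status": "failure"}),
]


def categorize(message):
    """Return the taxonomy fields for a message, or None if it is unmapped."""
    text = message.lower()
    for signature, category in SIGNATURES:
        if signature in text:
            return category
    return None


def matches(message, **wanted):
    """True if the message's category has every requested field value."""
    category = categorize(message)
    if category is None:
        return False
    return all(category.get(key) == value for key, value in wanted.items())


events = [
    "sshd failed password for rmarty",
    "sshd accepted password for rmarty",
    "file open denied /etc/shadow",
]
failed_logins = [e for e in events
                 if matches(e, object="authentication", action="login", status="failure")]
file_failures = [e for e in events if matches(e, object="file", status="failure")]
print(failed_logins, file_failures)
```

Queries written against the taxonomy stay unchanged when a new source is added. Only the mapping table grows, which is where the slide's "huge effort" lands.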
  • 22. Logging as a Service (LaaS)
      • Economically advantageous
          - think about TCO
      • Pay as you go
      • Elastic infrastructure scales with your needs
      • No installation needed
      • No setup costs / time for logging solution
      • Open platform with RESTful APIs
  • 23. Loggly
      Diagram labels: Data Sources; Consumers; Loggly user interface; UI extensions; mobile-166; My syslog; Data collection; Proxies; API; Data access; Distributed indexing and processing; Indexers and Search Machines; Log Archive; Distributed data store
  • 24. Tool Usage
                              | DIY                | MR                      | Log Mgmt       | SIEM          | LaaS
      data sources            | known, only a few  | known, only a few       | unknown, many  | known, many   | -
      analysis use-cases      | known, one or a few| exploration, large-scale| unknown, many  | unknown, many | extend platform
      dynamic use-cases       | no                 | no                      | yes            | yes           | yes
      real-time correlation   | no                 | no                      | no             | yes           | extend platform
      cost                    | engineer, hardware, maintenance | engineers, hardware, maintenance | license, (hardware), maintenance | license, hardware, maintenance | subscription
      Should you rather do it yourself (DIY)?
  • 25. What is Working and What is not?
  • 26. What’s Working
      • Log collection
      • Log centralization
      • Alerting on a priori known patterns
      • Solving specific, known use-cases for sets of known data sources, e.g.,
          - monitoring privileged access to financial servers
          - generating compliance reports
          - security forensics
  • 27. What’s Not Working
      • Log formats are all over and not documented
            Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576
      • No logging guidelines / developer education
      • Parsing is broken
          - based on regexes
          - numerous mistakes
          - doesn’t scale
  • 28. What’s Not Working
      • Normalization is broken:
          - IP to hostnames (when to do DNS lookup)
          - usernames (rmarty vs. ram vs. raffy)
      • Categorization / Taxonomy
          - doesn’t scale
          - is always out of date
          - is buggy
          - expensive
      • Prioritization has no working formula
      • Anomaly detection is voodoo!
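The username problem on this slide can be made concrete with a small sketch. It assumes an alias table is available; the table contents and the name `normalize_user` are hypothetical, and keeping such a table complete and current is exactly the part that tends to break.

```python
# Assumed alias table: account name as seen in a log -> canonical identity.
ALIASES = {
    "ram": "rmarty",
    "raffy": "rmarty",
}


def normalize_user(name):
    """Map a raw username to its canonical form; unknown names pass through."""
    key = name.strip().lower()
    return ALIASES.get(key, key)


seen = ["rmarty", "RAM", " raffy ", "guest"]
canonical = sorted({normalize_user(name) for name in seen})
print(canonical)
```

A name missing from the table silently stays a separate identity, so per-user statistics and correlation rules are undercounted without any error being raised.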
  • 29. What Does It Mean?
      • We don’t understand our data
      • Security Operations Center (SOC) monitors all corporate data sources. Analysts
          - don’t know all the applications
          - don’t know all the setups
          - don’t know what log records are ‘normal’ behavior
      --> Need tools to enable log owners to work with their data
  • 30. Future Needs
  • 31. We Need Better Tools
      • We will have more and more data and need to deal with larger amounts of data
          - SIEM needs to support new distributed, scalable data management technologies
      • More and more application layer data
          - How are we going to deal with all the parsing / entity extraction?
          - We need logging standards and guidelines
      • How do we help analysts understand the data?
          - What is important and what is not?
          - Mapping problems to business process, business risk!
  • 32. Data Visualization
  • 33. Data/Log Visualization
      • Exploration and Discovery
      • Answer Questions
      • Communicate Information
      • Support Decisions
  • 34. Security Visualization
      • We are nowhere!
      • Visualization is an afterthought
      • Sec Viz dichotomy
      • Tools are lacking fundamental capabilities
      • Users don’t understand data, how can they understand visuals?
  • 35. Visualization Concepts
  • 36. The Analysis Approach
      Overview first, Zoom, Details on demand
      Principle by Ben Shneiderman
  • 37. Simultaneous Views
  • 38. Dynamic Coloring
  • 39. Linked Views
  • 40. Legible / Usable Graphs: Reducing non data ink!
  • 41. Choosing the Right Chart
  • 42. Ode to the Pie
  • 43. Careful With Interpretations
  • 44. SecViz Examples
  • 45. (no slide text)
  • 46. (no slide text)
  • 47. (no slide text)
  • 48. Situational Awareness
      • Treemap
      • Protovis.JS
      • Size: Amount
      • Brightness: Variance
      • Color: Sensor
      • Shows: Scans - bright spots
      • Thanks to Chris Horsley
  • 49. (no slide text)
  • 50. Firewall Treemap
  • 51. Firewall Log
      Diagram labels: Port; Source IP; Destination IP
  • 52. IDS Sig Tuning - Treemap
      • Hierarchy: Source, Destination, Signature
      • Number of Events
      • Color: Priority
      • Size: Number of alerts
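The data preparation behind such a treemap can be sketched as a nested aggregation: alerts are grouped by source, then destination, then signature, and each leaf carries a size (number of alerts) and a priority for coloring. The alert records and field names are invented for the example, and treating a lower number as the more severe priority is an assumption. Rendering is left to whatever treemap library is in use.

```python
# Invented IDS alerts for illustration only; field names are assumptions.
ALERTS = [
    {"src": "10.0.0.1", "dst": "10.0.0.9", "sig": "portscan", "priority": 2},
    {"src": "10.0.0.1", "dst": "10.0.0.9", "sig": "portscan", "priority": 2},
    {"src": "10.0.0.1", "dst": "10.0.0.7", "sig": "sql-injection", "priority": 1},
    {"src": "10.0.0.2", "dst": "10.0.0.9", "sig": "portscan", "priority": 3},
]


def build_hierarchy(alerts):
    """Nest alerts as source -> destination -> signature -> {size, priority}."""
    tree = {}
    for alert in alerts:
        by_dst = tree.setdefault(alert["src"], {})
        by_sig = by_dst.setdefault(alert["dst"], {})
        leaf = by_sig.setdefault(
            alert["sig"], {"size": 0, "priority": alert["priority"]}
        )
        leaf["size"] += 1  # box size: number of alerts
        # Box color: most severe priority seen (assumes lower = more severe).
        leaf["priority"] = min(leaf["priority"], alert["priority"])
    return tree


def leaves(tree):
    """Yield (source, destination, signature, size, priority) per leaf box."""
    for src, by_dst in tree.items():
        for dst, by_sig in by_dst.items():
            for sig, leaf in by_sig.items():
                yield (src, dst, sig, leaf["size"], leaf["priority"])


tree = build_hierarchy(ALERTS)
for row in sorted(leaves(tree), key=lambda r: -r[3]):
    print(row)
```

The largest leaves are the noisiest source, destination, and signature combinations, which makes them natural first candidates when tuning signatures.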
  • 53. Vulnerability Data by Host
  • 54. Visualization Future
      • A solution to entity extraction
      • Dynamic and interactive displays
      • Computer aided intelligence / visualization
          - Computer supported exploration
          - Highly interactive
      • Expert system that captures domain knowledge
          - Collaborative
  • 55. http://secviz.org
      Share, discuss, challenge, and learn about security visualization.
      • List: secviz.org/mailinglist
      • Twitter: @secviz
  • 56. about.me/raffy
