"Grand Challenges" of Log Management


Anton's "Grand Challenges" of Log Management defines a few critical problems with managing logs and log analysis.

Published in: Technology, Business
  • Talk points: Involvement invention separate team Past skills and experiences Thought about it!

    1. “Grand Challenges” of Log Management
         • Dr Anton Chuvakin, Chief Logging Evangelist, LogLogic, Inc
         • Mitigating Risk. Automating Compliance.
    2. Who is Anton?
         • Dr. Anton Chuvakin from LogLogic “is probably the number one authority on system logging in the world”
         • SANS Institute (2008) (http://www.sans.edu/resources/securitylab/loglogic_chuvakin.php)
    3. Outline
         • Log Management Intro
         • Innovation: BIG vs Small
             • Step away from small, tactical “issues” for a second …
         • “Grand Challenges” of Log Management
         • How can you help?!
    4. Why “Grand Challenges”?
         • BIG, unsolved log management problems that cause major pain!
         • Problems that people tried to solve – and FAILED!
         • From collection to decision-making based on logs; from compliance to security and operations; from today’s log sources to the future – there are challenges everywhere!
    5. GC1 – Secure and Reliable Log Collection
         • Challenge
             • To collect logs securely, reliably AND without heavy management overhead or complex access requirements
         • Why a grand challenge?
             • Agents vs remote grabbing vs stream: all suck. Security and reliability cost major management overhead
         • Current approaches?
             • Agents + remote grab (administrator access) + stream (syslog)
         • Why still a challenge?
             • All approaches have critical drawbacks
    6. GC2 – Log Parsing and Regexes
         • Challenge
             • To turn logs into information, one needs to parse them; to parse them, one [typically] needs expert-created regular expressions (regexes)
         • Why a grand challenge?
             • Every log type requires hand-writing a set of regexes
         • Current approaches?
             • UIs, “semi-auto”/assisted regex creators, limited auto-extraction, choosing not to parse, etc
         • Why still a challenge?
             • Despite all the tools, a log expert must still create the rules
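To make "hand-writing a set of regexes" concrete, here is a minimal Python sketch of one such rule. The sample line (in the common OpenSSH "Failed password" style), the pattern and the field names are assumptions made for illustration and are not from the talk. Note that the rule covers exactly one message type; every other message and every other log source needs its own.

```python
import re

# Hypothetical sample line in the common OpenSSH "Failed password" style.
SAMPLE = ("Oct 11 22:14:15 host1 sshd[4721]: "
          "Failed password for root from 10.0.0.5 port 52144 ssh2")

# One hand-written rule for one message type; field names are illustrative.
FAILED_LOGIN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) sshd\[(?P<pid>\d+)\]: "
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<src_ip>\S+) port (?P<src_port>\d+)"
)


def parse_failed_login(line):
    """Return a dict of extracted fields, or None if the rule does not match."""
    match = FAILED_LOGIN.match(line)
    return match.groupdict() if match else None


fields = parse_failed_login(SAMPLE)
```

A line the rule was not written for (for example an "Accepted password" message) simply returns `None`, which is the gap a log expert has to close by writing yet another rule.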
    7. GC3 – Fast Ad Hoc Summarization
         • Challenge
             • Everybody wants reports on this and that NOW! They rarely know what ‘this’ and ‘that’ are
         • Why a grand challenge?
             • It’s either fast or ad hoc, not both. Users want both!
         • Current approaches?
             • Database tuning; custom indices; “non-RDBMS with a little bit of RDBMS”
         • Why still a challenge?
             • None of them really works, or they require expert tuning; pick the lesser evil
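The "fast or ad hoc, not both" trade-off can be seen in a small Python sketch. The records and field names below are hypothetical, and the function is only an illustration of the ad hoc side: it answers any grouping question, but only by scanning every record on every call.

```python
from collections import Counter

# Hypothetical already-parsed records; field names are illustrative.
RECORDS = [
    {"host": "web1", "user": "alice", "action": "login", "status": "ok"},
    {"host": "web1", "user": "bob", "action": "login", "status": "fail"},
    {"host": "db1", "user": "bob", "action": "login", "status": "fail"},
    {"host": "db1", "user": "alice", "action": "query", "status": "ok"},
]


def summarize(records, *group_by, **where):
    """Count records per combination of group_by fields, after filtering.

    Every call scans all records, so the question can be anything (ad hoc),
    but the cost grows with log volume (not fast).
    """
    counts = Counter()
    for rec in records:
        if all(rec.get(k) == v for k, v in where.items()):
            counts[tuple(rec.get(f) for f in group_by)] += 1
    return counts


failed_by_user = summarize(RECORDS, "user", status="fail")
by_host_action = summarize(RECORDS, "host", "action")
```

Precomputing counts for a fixed set of fields would make those particular reports fast, at the price of no longer answering questions nobody anticipated.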
    8. GC4 – Automated Meaning Extraction
         • Challenge
             • Automatically analyze logs and gain useful information, across domains (security, ops, compliance)
         • Why a grand challenge?
             • Log analysis is heavily manual, interpretative, and domain- and system-specific
         • Current approaches?
             • Rule-based, summarization, filtering, minimal anomaly detection
         • Why still a challenge?
             • “Log analysis is an art, not science” -> no automation
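As an illustration of what "minimal anomaly detection" can look like, the Python sketch below flags time buckets whose event count is far above the average. The threshold rule (mean plus `k` standard deviations), the value of `k` and the hourly counts are all assumptions chosen for the example, not a method from the talk.

```python
from statistics import mean, pstdev


def unusual_buckets(counts, k=2.0):
    """Return indices of buckets whose count exceeds mean + k * std deviation."""
    if len(counts) < 2:
        return []
    threshold = mean(counts) + k * pstdev(counts)
    return [i for i, c in enumerate(counts) if c > threshold]


# Hypothetical failed-login counts per hour.
HOURLY_FAILED_LOGINS = [10, 12, 11, 9, 10, 95, 11]
flagged = unusual_buckets(HOURLY_FAILED_LOGINS)
```

The sketch reports that one hour looks unusual, but not whether it was an attack, a misconfigured script or a password change. That interpretation step is the part that remains manual.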
    9. GC5 – Scalable Data Presentation
         • Challenge
             • How to present massive volumes of log data to users to help them solve their problems across domains
         • Why a grand challenge?
             • Tables, pie charts, graphs all leave much to be desired; they lose information and don’t scale
         • Current approaches?
             • Table, pie/bar chart, graphs, “advanced” visualization
         • Why still a challenge?
             • No effective method has been invented yet
    10. GC6 – “Fuzzy” Search
         • Challenge
             • How to find the “right” log message(s) without knowing exactly what to look for?
         • Why a grand challenge?
             • Many uses of logs require searching, but users often don’t know what to look for
         • Current approaches?
             • Trying keywords + wildcards + refining the search as we go
         • Why still a challenge?
             • No method to incorporate uncertainty in search has been found yet
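One narrow kind of uncertainty, misspelled or half-remembered keywords, can be sketched in Python with the standard library's `difflib.SequenceMatcher`. The log lines, the scoring rule and the 0.7 cutoff are assumptions for illustration. This handles approximate spelling only, which is far short of the challenge described on this slide.

```python
import re
from difflib import SequenceMatcher

# Hypothetical log lines for illustration.
LOG_LINES = [
    "sshd[4721]: Failed password for root from 10.0.0.5 port 52144 ssh2",
    "sshd[4722]: Accepted publickey for alice from 10.0.0.7 port 52311 ssh2",
    "kernel: Out of memory: Kill process 1234 (java)",
    "su: authentication failure; logname=bob uid=1000",
]


def fuzzy_score(query, line):
    """Average, over query words, of the best similarity to any word in the line."""
    words = re.findall(r"[a-z0-9]+", line.lower())
    terms = re.findall(r"[a-z0-9]+", query.lower())
    if not words or not terms:
        return 0.0
    best = [max(SequenceMatcher(None, t, w).ratio() for w in words) for t in terms]
    return sum(best) / len(best)


def fuzzy_search(query, lines, cutoff=0.7):
    """Return (score, line) pairs at or above cutoff, best match first."""
    scored = [(fuzzy_score(query, line), line) for line in lines]
    return sorted((s for s in scored if s[0] >= cutoff), reverse=True)


# Both query words are misspelled on purpose.
hits = fuzzy_search("faild pasword", LOG_LINES)
```

The user still has to guess that the relevant message talks about passwords at all; the sketch tolerates typos, not a missing idea of what to search for.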
    11. Call to Action!
         • Simple
         • Pick a challenge and solve it!!!
         • Come to discuss!!
         • Act on ideas!
         • Send ideas!
         • Explore!
    12. Thank You!
         • Anton Chuvakin, Ph.D. www.chuvakin.org
         • Chief Logging Evangelist
         • LogLogic, Inc www.loglogic.com
         • See www.info-secure.org for my papers, books, reviews and other security and logging resources.
         • Subscribe to my blog at www.securitywarrior.org
    13. Further Reading / Blog Posts
         • “Ideal Log Management Tool” (http://chuvakin.blogspot.com/2007/11/ideal-log-management-tool.html)
         • “Future Problems -1” (http://chuvakin.blogspot.com/2008/06/ideal-tool-to-solve-real-problems-of.html)
         • “Future Problems -2” (http://chuvakin.blogspot.com/2008/08/ideal-tool-to-solve-real-problems-of.html)
    14. Other Candidate Challenges
    15. GCC1 – “Bad” Logs
         • Challenge
             • Many logs just don’t have the right information in them; correlated logs; logs with missing info; multi-file logs, etc
         • Why a grand challenge?
             • Making sense of logs is hard if key information is missing
         • Current approaches?
             • Ignore the problem, or try to manually enrich the information from other sources
         • Why still a challenge?
             • The above doesn’t help; a deeper change in how logging is done is probably needed
    16. GCC2 – Log Chaos
         • Challenge
             • Logs come in a dizzying variety of formats, via different ports, and they look different – how do we understand them?
         • Why a grand challenge?
             • Lack of log standards makes log analysis an unreliable and complex art
         • Current approaches?
             • Take logs one by one; write regexes or index
         • Why still a challenge?
             • No log standard has been created yet; CEE is working on it! (http://cee.mitre.org)
    17. GCC3 – Unified Log Storage Data Model
         • Challenge
             • Logs come in a dizzying variety of formats; some can be parsed, some can’t – how to store them for quick and smart access (not “slow and painful”)?
         • Why a grand challenge?
             • Logs are just too different
         • Current approaches?
             • RDBMS (one vs many) vs flat files vs custom vs …
         • Why still a challenge?
             • Despite the effort, many limitations persist
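One possible hybrid between the RDBMS and flat-file approaches is sketched below in Python, using an in-memory SQLite database. The table layout, column names and helper functions are assumptions for illustration only and not a design from the talk: the raw line is always kept, and parsed fields are stored alongside it as JSON when a parser exists.

```python
import json
import sqlite3

# Illustrative schema: raw text is mandatory, parsed fields are optional.
SCHEMA = """
CREATE TABLE logs (
    received_at TEXT NOT NULL,  -- ISO 8601 time the collector saw the line
    source      TEXT NOT NULL,  -- host or device that sent it
    raw         TEXT NOT NULL,  -- original line, always kept
    fields      TEXT            -- JSON of parsed fields, NULL if unparsed
)
"""


def open_store():
    """Create an empty in-memory log store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def store(conn, received_at, source, raw, fields=None):
    """Insert one log line, with its parsed fields if any."""
    encoded = json.dumps(fields) if fields is not None else None
    conn.execute("INSERT INTO logs VALUES (?, ?, ?, ?)",
                 (received_at, source, raw, encoded))


def find(conn, **wanted):
    """Raw lines whose parsed fields contain all wanted key/value pairs."""
    rows = conn.execute(
        "SELECT raw, fields FROM logs WHERE fields IS NOT NULL ORDER BY received_at"
    )
    return [raw for raw, encoded in rows
            if all(json.loads(encoded).get(k) == v for k, v in wanted.items())]


def grep(conn, text):
    """Substring search over raw lines; works for parsed and unparsed logs."""
    rows = conn.execute(
        "SELECT raw FROM logs WHERE instr(raw, ?) > 0 ORDER BY received_at", (text,)
    )
    return [raw for (raw,) in rows]


conn = open_store()
store(conn, "2008-10-11T22:14:15", "host1",
      "sshd[4721]: Failed password for root from 10.0.0.5",
      {"user": "root", "src_ip": "10.0.0.5"})
store(conn, "2008-10-11T22:14:16", "fw1",
      "vendor-specific message mentioning root that no rule parses")
```

The sketch also shows the limitation the slide points at: field queries decode JSON row by row and text search scans every line, so neither path is "quick and smart" at real log volumes.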