Anton Chuvakin on Security Data Centralization


My old preso on data centralization; I might not agree with everything I said back then ...

Published in: Technology

    1. Centralizing Centralization, v. 0.2, October 2003. Anton Chuvakin, Ph.D., GCIA, GCIH, Senior Security Analyst
    2. Outline
        • Security data centralization overview
        • Value of centralization
        • Single device type and cross-device centralization
        • Normalization, pros and cons
        • Categorization, security event types and standards
        • Correlation, types and methods
        • Why do you have to do it?
    3. Terms
        • Message – some system indication that an event has occurred
        • Log or audit record – recorded message related to the event
        • Log file – collection of the above records
        • Alert – a message usually sent to notify an operator
        • Device – a source of security-relevant logs
    4. Centralization
        • Centralized security controls:
        • Cheaper to manage
        • Easier to audit
        • Save money on staff
        • Reduce training costs
    5. Security Data Overview
        • What data?
        • Audit logs
        • Transaction logs
        • Intrusion logs
        • Connection logs
        • System performance records
        • User activity logs
        • Various alerts
        • From where?
        • Firewalls/intrusion prevention
        • Routers/switches
        • Intrusion detection
        • Hosts
        • Business applications (databases, servers)
        • Anti-virus
        • VPNs
    6. Centralized Data
        • Why centralize security data?
        • Accessibility
            • All audit records in one place
        • Cross-device searchability and analysis
            • Categorization
            • Correlation
        • De-duplication / volume reduction
        • Reduced response time
        • Increase in the efficiency of existing security point solutions
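The "de-duplication / volume reduction" benefit can be illustrated with a minimal Python sketch. The event dictionaries, the field names, and the choice of key fields are assumptions made for this illustration, not something the slides specify: events that agree on all key fields are collapsed into one record carrying a count.

```python
def deduplicate(events, key_fields=("device", "event_type", "src_ip", "dst_ip")):
    """Collapse events that share the same key fields into one record with a count."""
    merged = {}
    for event in events:
        # Hypothetical key: two events count as duplicates if all key fields agree.
        key = tuple(event.get(field) for field in key_fields)
        if key in merged:
            merged[key]["count"] += 1
        else:
            # Keep the first occurrence and start counting repeats.
            merged[key] = {**event, "count": 1}
    return list(merged.values())


sample_events = [
    {"device": "fw1", "event_type": "deny", "src_ip": "10.0.0.5", "dst_ip": "192.168.1.10"},
    {"device": "fw1", "event_type": "deny", "src_ip": "10.0.0.5", "dst_ip": "192.168.1.10"},
    {"device": "fw1", "event_type": "deny", "src_ip": "10.0.0.7", "dst_ip": "192.168.1.10"},
]
reduced = deduplicate(sample_events)
```

Which fields define a duplicate is a design choice: a narrow key reduces volume more aggressively, at the risk of hiding events that differ in detail.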
    7. Requirements
        • What do you need to start?
        • Collect the data
        • Convert to common format
        • Reduce in size, if possible
        • Transport securely to a central location
        • Process in real-time
        • Alert on threats
        • Store securely
        • Report on trends
    8. Challenges
        • Need to overcome these:
        • Too much data
        • Not enough data
        • Diverse records
        • False alarms
        • Duplicate data
        • Hard to get data
        • Chain of custody concerns
    9. Case Made! So...
        • ...everybody builds a “central console” for their stuff. Results: central consoles for ...
        • Each firewall vendor
        • Multiple firewalls and routers
        • Each IDS type
        • IDS and vulnerability scanners
        • Routers and network management data
        • Ad infinitum...
    10. Case I: NIDS Console
        • Relatively small number of devices
        • False alarms
        • False positives
        • Large volume (needs tuning)
        • Many of the recorded events require response
        • The data sometimes need to be viewed (and responded to) in near real-time
    11. Example: ACID Console
        • Collects events from multiple Snort NIDS sensors
        • Uses relational DBMS to store data
        • Retains (and can show) full packet payload
        • Web front-end
        • Advanced search queries
            • Search across sensors
            • Search by any packet field/combination
        • Data graphing
        • No real-time tools
    12. Case II: Desktop Protection
        • Such as personal firewall/IDS, anti-virus, host IPS
        • Characteristics:
        • Huge number of devices
        • Low volume from each device
        • Needs status monitoring (disabled by the user?)
        • Data might be transmitted over a slow WAN link
        • Rarely looked at
        • Requires cross-device correlation for meaningful analysis
    13. Diverse devices
        • Moving from a single type of data source to heterogeneous sources
        • Volume is getting even higher
        • Data diversity problem arises
            • Binary and text logs
            • Undocumented formats
            • Free-form logs
            • Same events described differently
            • Different levels of detail in collected data
        • How to analyze?
    14. Normalization Defined
        • Solution: normalization, i.e., converting recorded events to a common format or schema (often XML)
        • How to normalize?
        • Look at common fields in security event records
            • Source IP, port, protocol
            • Event type
            • Device instance
            • Severity
        • Create a data model to cover all these and more
        • Map the original event fields to the new general schema
    15. Normalization Example
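The original example on this slide did not survive in the transcript. As a stand-in, here is a minimal Python sketch of the mapping step described on the previous slide. Both raw log formats, the device names, and the severity mapping are invented for illustration and do not correspond to any real product; the common schema uses the fields the slide lists (source IP, port, protocol, event type, device instance, severity).

```python
import re

# Hypothetical raw records from two device types (formats invented for this sketch).
FIREWALL_LINE = "2003-10-01 12:00:01 fw1 DENY tcp 10.0.0.5:4321 -> 192.168.1.10:80"
IDS_LINE = "ids2|2003-10-01T12:00:02|WEB-IIS cmd.exe access|src=10.0.0.5|dst=192.168.1.10|prio=1"

FIREWALL_RE = re.compile(
    r"(?P<date>\S+) (?P<time>\S+) (?P<device>\S+) (?P<action>\S+) (?P<proto>\S+) "
    r"(?P<src_ip>[\d.]+):(?P<src_port>\d+) -> (?P<dst_ip>[\d.]+):(?P<dst_port>\d+)"
)


def normalize_firewall(line):
    """Map the hypothetical firewall format onto the common schema."""
    m = FIREWALL_RE.match(line)
    if m is None:
        raise ValueError("unrecognized firewall record")
    return {
        "timestamp": f"{m['date']}T{m['time']}",
        "device": m["device"],
        "event_type": f"firewall_{m['action'].lower()}",
        "protocol": m["proto"],
        "src_ip": m["src_ip"],
        "src_port": int(m["src_port"]),
        "dst_ip": m["dst_ip"],
        # Assumed severity mapping: denied traffic rates higher than allowed traffic.
        "severity": "medium" if m["action"] == "DENY" else "low",
    }


def normalize_ids(line):
    """Map the hypothetical pipe-delimited IDS format onto the same schema."""
    device, timestamp, signature, *pairs = line.split("|")
    fields = dict(pair.split("=", 1) for pair in pairs)
    return {
        "timestamp": timestamp,
        "device": device,
        "event_type": signature,
        "protocol": None,  # this source does not report it: differing level of detail
        "src_ip": fields["src"],
        "src_port": None,
        "dst_ip": fields["dst"],
        "severity": {"1": "high", "2": "medium"}.get(fields.get("prio"), "low"),
    }


fw_event = normalize_firewall(FIREWALL_LINE)
ids_event = normalize_ids(IDS_LINE)
```

After normalization both records expose the same keys, so they can be stored, searched, and compared together. The `None` values show where one source simply carries less detail than the other.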
    16. Normalization
        • Advantages:
        • Store – known storage requirements
        • Analyze (correlate, categorize) and search – same attributes
        • Prioritize – uniform severity
        • Present/visualize – common reports
        • Challenges:
        • Data loss
            • What if something does not fit the model?
        • Overhead
            • Too much of a good thing?
        • Over-normalization
            • What if it's not really the same?
        • Mapping incompatibilities
            • Is this more of a source or a destination after all?
    17. Categorization
        • Data format is the same, but what about the content?
        • MSBlaster, Nimda, CodeRed -> Malware
        • Statd Attack, SSH Exploit -> Unix Exploits
        • UDP Bomb, Boink, Smurf -> Legacy DoS
        • Select Categories
        • Malware
        • Attacks and Exploits
        • Vulnerable Software
        • System Failures
        • AAA
        • Change Management
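The event-to-category mappings on this slide can be sketched as a simple lookup table in Python. The table below contains only the mappings the slide gives; the function names and the "Uncategorized" fallback are assumptions added for illustration.

```python
from collections import Counter

# Mappings taken from the slide; a real deployment would need a far larger table.
CATEGORY_BY_EVENT = {
    "MSBlaster": "Malware",
    "Nimda": "Malware",
    "CodeRed": "Malware",
    "Statd Attack": "Unix Exploits",
    "SSH Exploit": "Unix Exploits",
    "UDP Bomb": "Legacy DoS",
    "Boink": "Legacy DoS",
    "Smurf": "Legacy DoS",
}


def categorize(event_name, default="Uncategorized"):
    """Return the category for an event name, or a fallback when it is unknown."""
    return CATEGORY_BY_EVENT.get(event_name, default)


def count_by_category(event_names):
    """Summarize a stream of event names as per-category counts for reporting."""
    return Counter(categorize(name) for name in event_names)


summary = count_by_category(["Nimda", "Smurf", "Boink", "Some New Worm"])
```

The fallback bucket matters in practice: events no table covers are exactly the ones that make a categorization effort incomplete.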
    18. Value of Categorization
        • Value of Categorization
        • Adds intelligence to event data collection
        • Enhances high-level reporting
        • Provides understanding of the detected threat types and supplies the context for their interpretation
        • Challenges with Categorization
        • No universal standard
        • (but work in progress!)
        • Too much variety in data makes every categorization effort incomplete
        • Every security vendor is trying to create its own scheme (yuck!)
    19. Correlation
        • Defined:
        • General: “establishing or finding relationships between entities”
        • Security: “improving threat identification and assessment by looking not only at individual events, but at their sets, bound by some common parameter ('related')”
        • Correlation is enabled by centralization, normalization and also enhanced by categorization.
    20. Correlation Types
        • Rule-based
            • Uses pre-existing knowledge of the attack (the rule) and is able to define what has been detected in precise terms
        • Statistical
            • Relies upon the knowledge of normal activities, which has been accumulated over time to detect the deviations
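The two correlation types can be contrasted with a minimal Python sketch. The rule (several login failures from one source within a short window), the event fields, the thresholds, and the three-sigma cutoff are all assumptions chosen for illustration; the slides do not prescribe any of them.

```python
from collections import defaultdict
from statistics import mean, stdev


def rule_based(events, event_type="login_failure", threshold=3, window=60):
    """Rule: flag sources with `threshold` or more matching events within `window` seconds."""
    times_by_src = defaultdict(list)
    for event in events:
        if event["event_type"] == event_type:
            times_by_src[event["src_ip"]].append(event["time"])
    flagged = set()
    for src, times in times_by_src.items():
        times.sort()
        # Slide a window of `threshold` consecutive events over the sorted times.
        for i in range(len(times) - threshold + 1):
            if times[i + threshold - 1] - times[i] <= window:
                flagged.add(src)
                break
    return flagged


def statistical(baseline_counts, current_count, k=3.0):
    """Flag a count that deviates more than k standard deviations from the baseline mean."""
    mu = mean(baseline_counts)
    sigma = stdev(baseline_counts)
    if sigma == 0:
        return current_count != mu
    return abs(current_count - mu) > k * sigma


events = [
    {"event_type": "login_failure", "src_ip": "10.0.0.5", "time": 0},
    {"event_type": "login_failure", "src_ip": "10.0.0.5", "time": 10},
    {"event_type": "login_failure", "src_ip": "10.0.0.5", "time": 20},
    {"event_type": "login_failure", "src_ip": "10.0.0.9", "time": 0},
    {"event_type": "login_failure", "src_ip": "10.0.0.9", "time": 500},
    {"event_type": "login_failure", "src_ip": "10.0.0.9", "time": 1000},
]
suspects = rule_based(events)
hourly_baseline = [100, 110, 90, 105, 95]  # hypothetical events-per-hour history
spike = statistical(hourly_baseline, 400)
```

The rule names precisely what it found (a burst of failures from one source), while the statistical check only reports that something is unusual relative to the accumulated baseline.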
    21. Policy, Vulnerability and Incident Management
        • Other data (information, knowledge)
        • Knowledge
            • Security policies and procedures
            • Industry and organization security guidelines
        • Vulnerability and asset data
            • Scans
            • Asset attributes
        • Security Incidents
            • Centralized incident handling and reporting
    22. Conclusion
        • Centralization of security data is crucial for...
        • Large organizations
            • Can't do security without it!
        • Small/medium companies
            • Needed to succeed with little security staff
    23. Thanks for Viewing the Presentation
        • Anton Chuvakin, Ph.D., GCIA, GCIH, GCFA
        • Author of “Security Warrior” (O’Reilly) –
        • Read my blog at http://
        • Book on logs is coming soon!
        • See for my papers, books, reviews and other security resources related to logs