Mining Your Logs
Gaining Insight Through Visualization




          Raffael Marty - @zrlram
               Google TechTalk March 2011
Raffael Marty
• Founder @
• Chief Security Strategist and Product Manager @ Splunk
• Manager Solutions @ ArcSight
• Intrusion Detection Research @ IBM Research
• IT Security Consultant @ PricewaterhouseCoopers


          Applied Security Visualization
               Publisher: Addison Wesley (August, 2008)
                           ISBN: 0321510100




       Logging as a Service                               2   © by Raffael Marty
Agenda

• Log Analysis
• History
• Log Architectures
• What’s Working and What’s Not?
• Future Needs
• Data Visualization
• Visualization Concepts
• Security Visualization Use-Cases

Log Analysis
10.0.20.9 - - [22/Mar/2011:10:00:52 +0000] "GET /admin/customer/customer/612/
HTTP/1.1" 200 2261 "https://logdog.loggly.org/admin/customer/customer/"
"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/
533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27"
TYhzVH8AAAEAAGOkBOQAAADA 655268

2010-12-28T18:12:10.031+00:00 frontend2-raffy
syslog-ng[19600]: syslog-ng starting up;
version='3.1.1'

2011-01-10T21:27:04.820+00:00 frontend2-raffy
kernel: : [ 664.107313] blocked inbound
IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:
6a:a3:08:00 SRC=10.0.20.109 DST=10.0.20.255
LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126
PROTO=UDP SPT=17500 DPT=17500 LEN=160
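The firewall record above carries most of its information as KEY=value tokens. A minimal sketch of extracting them follows; the names `parse_kv` and `KV_PATTERN` are illustrative, and the record is the slide's sample joined onto one line. It handles only this token style and is not a general log parser.

```python
import re

# Sample kernel/iptables record from the slide, joined onto one line.
RECORD = (
    "2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] "
    "blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 "
    "SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 "
    "PROTO=UDP SPT=17500 DPT=17500 LEN=160"
)

# Upper-case KEY=value tokens; the value may be empty (e.g. "OUT=").
KV_PATTERN = re.compile(r"\b([A-Z]+)=(\S*)")


def parse_kv(record):
    """Return the first value seen for each KEY=value token in the record."""
    fields = {}
    for key, value in KV_PATTERN.findall(record):
        # LEN appears twice; keep the first occurrence and ignore the second.
        fields.setdefault(key, value)
    return fields


fields = parse_kv(RECORD)
print(fields["SRC"], fields["DST"], fields["PROTO"], fields["DPT"])
```

The duplicate LEN key shows why even a simple format needs source-specific decisions: a generic key=value parser has to pick one of the two values.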



History
• 1980 Eric Allman develops syslogd(8)
• 1996 Intellitactics
• 1997 Tivoli Risk Manager developed by IBM Research
  in Zurich (later Zurich Correlation Engine, ZCE)
• 1999 - 2010 A number of log management / SIEM
  players enter the market (software, appliances)
• 2000 ArcSight - 2010 sold for $1.65bn to HP
• 2009 Loggly (logging as a service)
History - The Other View
• Network management (SNMP)
• IDS false positive reduction
• Security monitoring (multiple data sources)
• Unification of NOC and SOC (failed?)
• Application monitoring (moving up the stack)
 - original tools failed due to architectural constraints
 - new approaches have been presented


Log Management Today




Where are you?
Log Management Today




DIY
 • grep
 • Perl
 • SQL
Log Management
 • Open source
 • Commercial
MapReduce
 • Open source
CEP and SIEM
 • Open source
 • Commercial
Advanced Analytics (fewer tools)
 • Not log specific!


Open Source Tools
• graylog2                  • lire           • MS Logparser
• logstash                  • LogSurfer      • Sguil
• swatch                    • SEC            • Octopussy
• tenshi                    • LogHound       • Sagan
• logwatch                  • slct
• OSSEC                     • log2timeline
• snare                     • logzilla
• lasso                     • OSSIM

this list is likely incomplete!

Commercial Tools




                                              this list is likely incomplete!

pixlcloud | Visualization in the Cloud   10        © PixlCloud LLC 2011
Log Architectures

Log Mgmt Architecture
                                             Storage:
                                             - on board
                                             - external storage array
                                             - clusters




Collection:              Processing:
- syslog                 - indexing
- OPSEC                  - context storage
- SDEE                   - clustering
- netflow
- database

Log Mgmt Architecture
                  raw
                               normalized
                                 or raw




Collection:              Processing:         Data Access:
- syslog                 - indexing          - free-text search
- OPSEC                  - context storage   - field-based search
- SDEE                   - clustering        - tagging schemas
- netflow
- database

Agents and Connectors
  • piece of code to transport logs to a central location
  • features
    - batch
    - compress
    - encrypt
    - sign
    - fail-over
  • often additional features:
    - parse
    - normalize
    - aggregate
    - enrichment (context)
  • special protocols:
    - OPSEC, SDEE
    - Windows
  • file-based collection
  • database collection
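The batch, compress, and sign features listed above can be sketched with the Python standard library. This is an illustration under assumptions, not any product's agent: `pack_batch`, `unpack_batch`, and `SHARED_KEY` are hypothetical names, signing is done with HMAC-SHA256, and transport encryption and fail-over are left out.

```python
import gzip
import hashlib
import hmac

SHARED_KEY = b"example-key"  # hypothetical; a real agent would load a secret securely


def pack_batch(lines, key=SHARED_KEY):
    """Batch log lines, gzip them, and attach an HMAC-SHA256 signature."""
    payload = gzip.compress("\n".join(lines).encode("utf-8"))
    signature = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature, "count": len(lines)}


def unpack_batch(batch, key=SHARED_KEY):
    """Verify the signature, then decompress back into individual log lines."""
    expected = hmac.new(key, batch["payload"], hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, batch["signature"]):
        raise ValueError("signature mismatch")
    return gzip.decompress(batch["payload"]).decode("utf-8").split("\n")


batch = pack_batch(["sshd: failed password", "kernel: blocked inbound"])
print(batch["count"], unpack_batch(batch))
```

Signing the compressed payload rather than the raw lines lets the receiver reject a tampered batch before spending time on decompression.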




SIEM Architecture
                                               asset context
                raw
                          normalized             identity
                                                 context

                                                ...
                       context / tagging

                                       RDBMS




SIEM Architecture
• RDBMS schema
 - Fixed number and type of fields
 - New data sources with new fields?
  ‣   overloading
• RDBMS clusters are expensive and scale poorly
• Need a parser for every data source
• Slow historical data queries
• Hard to configure database efficiently
 - because of different use-cases
SIEM Architecture Benefits
• Parsed data enables
 - real-time correlation
 - real-time statistics
 - data augmentation (context) close to source

• Unified data access language
 - over a fixed set of fields

• Real-time dashboards

Search vs. SIEM
• Full-text indexing
• Parsing at search time

     Example search: denied
     • use index to find occurrences of ‘denied’

     Example search: user=rmarty
     • use index to find ALL occurrences of ‘rmarty’
     • apply parser to results
     • remove results where user is not rmarty
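The three steps of the user=rmarty search can be sketched as follows. The events, `build_index`, and `search_field` are hypothetical illustrations of parsing at search time, not the implementation of any particular search product.

```python
import re
from collections import defaultdict

# Hypothetical sample events; the second and third both contain "rmarty".
EVENTS = [
    "sshd: accepted password for user=jdoe from 10.0.20.9",
    "sshd: failed password for user=rmarty from 10.0.20.109",
    "app: user=admin changed owner of /home/rmarty/notes.txt",
]


def build_index(events):
    """Full-text inverted index: token -> set of event positions."""
    index = defaultdict(set)
    for pos, event in enumerate(events):
        for token in re.findall(r"\w+", event.lower()):
            index[token].add(pos)
    return index


def search_field(events, index, field, value):
    """Field search at search time: index lookup, then parse and filter."""
    # Step 1: use the index to find ALL occurrences of the value.
    candidates = sorted(index.get(value.lower(), set()))
    pattern = re.compile(rf"\b{re.escape(field)}=(\S+)")
    hits = []
    for pos in candidates:
        # Step 2: apply the parser to each candidate result.
        match = pattern.search(events[pos])
        # Step 3: remove results where the field does not hold the value.
        if match and match.group(1) == value:
            hits.append(events[pos])
    return hits


index = build_index(EVENTS)
print(search_field(EVENTS, index, "user", "rmarty"))
```

The third event is found by the index because the path contains "rmarty", and is then dropped because its user field is admin. That extra work per candidate is the cost of not parsing at collection time.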

New SIEM - Hybrid Models
• Use parsers for known data sources
• Collect everything else
• Index all data and use index for search
• Correlate parsed data




Categorization and Tagging
• How do you find all failed logins across any data source?
 security:538 OR “sshd authentication failure” OR “sshd failed password” OR ...

• Does not scale
 - for new data sources
 - for new events of existing sources

• Define a ‘taxonomy’ for all events
• Map events into taxonomy
 id -> object, action, status
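A minimal sketch of such a mapping, assuming events are identified by a program name plus a message phrase. The `Category` type, the `TAXONOMY` entries, and `categorize` are hypothetical and cover only a few sshd messages.

```python
from typing import NamedTuple, Optional


class Category(NamedTuple):
    object: str
    action: str
    status: str


# Hypothetical mapping table: (program, message phrase) -> taxonomy entry.
# Every new data source or event type needs another row here.
TAXONOMY = {
    ("sshd", "authentication failure"): Category("authentication", "login", "failure"),
    ("sshd", "failed password"): Category("authentication", "login", "failure"),
    ("sshd", "accepted password"): Category("authentication", "login", "success"),
}


def categorize(message: str) -> Optional[Category]:
    """Return the first taxonomy entry whose program and phrase both appear."""
    text = message.lower()
    for (program, phrase), category in TAXONOMY.items():
        if program in text and phrase in text:
            return category
    return None  # unmapped event: the taxonomy is out of date for this source


print(categorize("sshd[4242]: Failed password for rmarty from 10.0.20.109 port 22 ssh2"))
```

Once events carry these three fields, a single condition such as status=failure replaces the growing OR list of source-specific strings.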
Content Creation
• Rules, dashboards, reports, searches can use taxonomy:
 object=authentication AND action=login AND status=success

• All failures related to files:
 object=file AND status=failure

• Mixing with other fields:
 action=login AND user=rmarty

• Approach scales well
• Huge effort to build and maintain mappings
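The queries above can be evaluated against categorized events with very little code. The sketch below assumes events are already dictionaries of fields and supports only AND between field=value terms; `parse_query`, `matches`, and the sample events are hypothetical.

```python
def parse_query(query):
    """Parse 'field=value AND field=value' into a dict (AND only, no quoting)."""
    terms = {}
    for term in query.split(" AND "):
        field, _, value = term.strip().partition("=")
        terms[field] = value
    return terms


def matches(event, query):
    """True if every field=value term of the query holds for the event."""
    return all(event.get(field) == value for field, value in parse_query(query).items())


# Hypothetical events that already carry taxonomy fields plus a parsed user.
EVENTS = [
    {"object": "authentication", "action": "login", "status": "success", "user": "rmarty"},
    {"object": "authentication", "action": "login", "status": "failure", "user": "jdoe"},
    {"object": "file", "action": "delete", "status": "failure", "user": "rmarty"},
]

for query in (
    "object=authentication AND action=login AND status=success",
    "object=file AND status=failure",
    "action=login AND user=rmarty",
):
    print(query, "->", [e["user"] for e in EVENTS if matches(e, query)])
```

The query code never mentions a data source. All source-specific knowledge sits in the mappings, which is where the maintenance effort goes.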

Logging as a Service (LaaS)
• Economically advantageous - think about TCO
• Pay as you go
• Elastic infrastructure scales with your needs
• No installation needed
• No setup costs / time for logging solution
• Open platform with RESTful APIs


Loggly
[Architecture diagram, layers from top to bottom]
Data Sources | Consumers: Loggly user interface, UI extensions
Proxies | API: data collection, data access
Indexers and Search Machines: distributed indexing and processing
Log Archive: distributed data store
Tool Usage
                DIY             MR              Log Mgmt        SIEM            LaaS

data sources    known,          known,          unknown,        known,          -
                only a few      only a few      many            many

analysis        known,          exploration,    unknown,        unknown,        extend
use-cases       one or a few    large-scale     many            many            platform

dynamic         no              no              yes             yes             yes
use-cases

real-time       no              no              no              yes             extend
correlation                                                                     platform

cost            engineer,       engineers,      license,        license,        subscription
                hardware,       hardware,       (hardware),     hardware,
                maintenance     maintenance     maintenance     maintenance

Should you rather do it yourself (DIY)?

What is Working and
What is not?
What’s Working
• Log collection
• Log centralization
• Alerting on a priori known patterns
• Solving specific, known use-cases for sets of
  known data sources, e.g.,
 - monitoring privileged access to financial servers
 - generating compliance reports
 - security forensics

What’s Not Working
• Log formats are all over the place and undocumented
 Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576

• No logging guidelines / developer education
• Parsing is broken
 - based on regexes
 - numerous mistakes
 - doesn’t scale



What’s Not Working
• Normalization is broken:
 - IP to hostnames (when to do DNS lookup)
 - usernames (rmarty vs. ram vs. raffy)

• Categorization / Taxonomy
 - doesn’t scale
 - is buggy
 - is always out of date
 - expensive

• Prioritization has no working formula
• Anomaly detection is voodoo!
What Does It Mean?
• We don’t understand our data
• Security Operations Center (SOC) monitors all
  corporate data sources. Analysts
 - don’t know all the applications
 - don’t know all the setups
 - don’t know what log records are ‘normal’ behavior

         --> Need tools to enable log owners to work
                       with their data

Future Needs

We Need Better Tools
• Data volumes keep growing, and we need to be able to handle them
 - SIEM needs to support new distributed, scalable data management
  technologies
• More and more application layer data
 - How are we going to deal with all the parsing / entity extraction?
 - We need logging standards and guidelines

• How do we help analysts understand the data?
 - What is important and what is not?
 - Mapping problems to business process, business risk!

Data Visualization

Data/Log Visualization
     • Exploration and Discovery


     • Answer Questions


     • Communicate Information


     • Support Decisions
Security Visualization
• We are nowhere!
• Visualization is an afterthought
• Sec Viz dichotomy
• Tools are lacking fundamental capabilities
• Users don’t understand data, how can
  they understand visuals?


Visualization
Concepts
The Analysis Approach
Overview first -> Zoom -> Details on demand




                                  Principle by Ben Shneiderman

Simultaneous Views




Dynamic Coloring




Linked Views




Legible / Usable Graphs




             Reducing non-data ink!
Choosing the Right Chart




Ode to the Pie




Careful With Interpretations




SecViz Examples


Situational Awareness
• Treemap
• Protovis.JS
• Size: Amount
• Brightness: Variance
• Color: Sensor
• Shows: Scans -
  bright spots


• Thanks to Chris Horsley

Firewall Treemap




Firewall Log
      Port                Source IP   Destination IP




IDS Sig Tuning - Treemap
                            Hierarchy:
                              Source
                              Destination
                              Signature
                              Number of Events
                            Color: Priority
                            Size: Number of alerts
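The hierarchy and encodings listed above describe the data structure a treemap is drawn from. A sketch of building it from raw alerts follows; the alert tuples and `build_treemap` are hypothetical, and rendering is left to a treemap library.

```python
from collections import defaultdict

# Hypothetical IDS alerts: (source, destination, signature, priority).
ALERTS = [
    ("10.0.20.109", "10.0.20.9", "portscan", 2),
    ("10.0.20.109", "10.0.20.9", "portscan", 2),
    ("10.0.20.109", "10.0.20.9", "sql-injection", 1),
    ("10.0.20.7", "10.0.20.9", "portscan", 2),
]


def build_treemap(alerts):
    """Nest alerts as source -> destination -> signature leaf.

    Each leaf carries the two visual encodings from the slide:
    size = number of alerts, color = priority.
    """
    tree = defaultdict(lambda: defaultdict(dict))
    for src, dst, sig, priority in alerts:
        leaf = tree[src][dst].setdefault(sig, {"size": 0, "color": priority})
        leaf["size"] += 1
    return tree


tree = build_treemap(ALERTS)
print(tree["10.0.20.109"]["10.0.20.9"])
```

A large leaf with a low-severity color is a candidate for signature tuning: many alerts from one source and destination pair that rarely matter.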




Vulnerability Data by Host




Visualization Future
• A solution to entity extraction
• Dynamic and interactive displays
• Computer aided intelligence / visualization
 - Computer supported exploration
 - Highly interactive

• Expert system that captures domain knowledge
 - Collaborative


http://secviz.org
        Share, discuss, challenge, and learn about security
                           visualization.
• List: secviz.org/mailinglist
• Twitter: @secviz




about.me/raffy
                 56

More Related Content

What's hot

Big Data and Machine Learning with FIWARE
Big Data and Machine Learning with FIWAREBig Data and Machine Learning with FIWARE
Big Data and Machine Learning with FIWAREFernando Lopez Aguilar
 
Hermes: Free the Data! Distributed Computing with MongoDB
Hermes: Free the Data! Distributed Computing with MongoDBHermes: Free the Data! Distributed Computing with MongoDB
Hermes: Free the Data! Distributed Computing with MongoDBMongoDB
 
IoT Discovery tutorial
IoT Discovery tutorialIoT Discovery tutorial
IoT Discovery tutorialTarek Elsaleh
 
Leveraging DNS to Surface Attacker Activity
Leveraging DNS to Surface Attacker ActivityLeveraging DNS to Surface Attacker Activity
Leveraging DNS to Surface Attacker ActivitySqrrl
 
Security From The Big Data and Analytics Perspective
Security From The Big Data and Analytics PerspectiveSecurity From The Big Data and Analytics Perspective
Security From The Big Data and Analytics PerspectiveAll Things Open
 
What's Next for Google's BigTable
What's Next for Google's BigTableWhat's Next for Google's BigTable
What's Next for Google's BigTableSqrrl
 
Our journey with druid - from initial research to full production scale
Our journey with druid - from initial research to full production scaleOur journey with druid - from initial research to full production scale
Our journey with druid - from initial research to full production scaleItai Yaffe
 
IoT Discovery GE: An Introduction
IoT Discovery GE: An IntroductionIoT Discovery GE: An Introduction
IoT Discovery GE: An IntroductionTarek Elsaleh
 
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksSearching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksLucidworks
 
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidDataWorks Summit
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Demi Ben-Ari
 
Deep Learning in Security—An Empirical Example in User and Entity Behavior An...
Deep Learning in Security—An Empirical Example in User and Entity Behavior An...Deep Learning in Security—An Empirical Example in User and Entity Behavior An...
Deep Learning in Security—An Empirical Example in User and Entity Behavior An...Databricks
 
Splunk, SIEMs, and Big Data - The Undercroft - November 2019
Splunk, SIEMs, and Big Data - The Undercroft - November 2019Splunk, SIEMs, and Big Data - The Undercroft - November 2019
Splunk, SIEMs, and Big Data - The Undercroft - November 2019Jonathan Singer
 
Splunking configfiles 20211208_daniel_wilson
Splunking configfiles 20211208_daniel_wilsonSplunking configfiles 20211208_daniel_wilson
Splunking configfiles 20211208_daniel_wilsonBecky Burwell
 
Apache Eagle: eBay构建开源分布式实时预警引擎实践
Apache Eagle: eBay构建开源分布式实时预警引擎实践Apache Eagle: eBay构建开源分布式实时预警引擎实践
Apache Eagle: eBay构建开源分布式实时预警引擎实践Hao Chen
 
Perfecting Your Streaming Skills with Spark and Real World IoT Data
Perfecting Your Streaming Skills with Spark and Real World IoT DataPerfecting Your Streaming Skills with Spark and Real World IoT Data
Perfecting Your Streaming Skills with Spark and Real World IoT DataAdaryl "Bob" Wakefield, MBA
 

What's hot (20)

2015 moloch recipes
2015 moloch recipes2015 moloch recipes
2015 moloch recipes
 
Big Data and Machine Learning with FIWARE
Big Data and Machine Learning with FIWAREBig Data and Machine Learning with FIWARE
Big Data and Machine Learning with FIWARE
 
Hermes: Free the Data! Distributed Computing with MongoDB
Hermes: Free the Data! Distributed Computing with MongoDBHermes: Free the Data! Distributed Computing with MongoDB
Hermes: Free the Data! Distributed Computing with MongoDB
 
Log analysis with elastic stack
Log analysis with elastic stackLog analysis with elastic stack
Log analysis with elastic stack
 
Workshop slides
Workshop slidesWorkshop slides
Workshop slides
 
IoT Discovery tutorial
IoT Discovery tutorialIoT Discovery tutorial
IoT Discovery tutorial
 
Leveraging DNS to Surface Attacker Activity
Leveraging DNS to Surface Attacker ActivityLeveraging DNS to Surface Attacker Activity
Leveraging DNS to Surface Attacker Activity
 
Security From The Big Data and Analytics Perspective
Security From The Big Data and Analytics PerspectiveSecurity From The Big Data and Analytics Perspective
Security From The Big Data and Analytics Perspective
 
What's Next for Google's BigTable
What's Next for Google's BigTableWhat's Next for Google's BigTable
What's Next for Google's BigTable
 
Our journey with druid - from initial research to full production scale
Our journey with druid - from initial research to full production scaleOur journey with druid - from initial research to full production scale
Our journey with druid - from initial research to full production scale
 
IoT Discovery GE: An Introduction
IoT Discovery GE: An IntroductionIoT Discovery GE: An Introduction
IoT Discovery GE: An Introduction
 
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, LucidworksSearching for Better Code: Presented by Grant Ingersoll, Lucidworks
Searching for Better Code: Presented by Grant Ingersoll, Lucidworks
 
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
 
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
Monitoring Big Data Systems Done "The Simple Way" - Codemotion Berlin 2017
 
PyData Boston 2013
PyData Boston 2013PyData Boston 2013
PyData Boston 2013
 
Deep Learning in Security—An Empirical Example in User and Entity Behavior An...
Deep Learning in Security—An Empirical Example in User and Entity Behavior An...Deep Learning in Security—An Empirical Example in User and Entity Behavior An...
Deep Learning in Security—An Empirical Example in User and Entity Behavior An...
 
Splunk, SIEMs, and Big Data - The Undercroft - November 2019
Splunk, SIEMs, and Big Data - The Undercroft - November 2019Splunk, SIEMs, and Big Data - The Undercroft - November 2019
Splunk, SIEMs, and Big Data - The Undercroft - November 2019
 
Splunking configfiles 20211208_daniel_wilson
Splunking configfiles 20211208_daniel_wilsonSplunking configfiles 20211208_daniel_wilson
Splunking configfiles 20211208_daniel_wilson
 
Apache Eagle: eBay构建开源分布式实时预警引擎实践
Apache Eagle: eBay构建开源分布式实时预警引擎实践Apache Eagle: eBay构建开源分布式实时预警引擎实践
Apache Eagle: eBay构建开源分布式实时预警引擎实践
 
Perfecting Your Streaming Skills with Spark and Real World IoT Data
Perfecting Your Streaming Skills with Spark and Real World IoT DataPerfecting Your Streaming Skills with Spark and Real World IoT Data
Perfecting Your Streaming Skills with Spark and Real World IoT Data
 

Viewers also liked

A10 Thunder Convergent Firewall (CFW)
A10 Thunder Convergent Firewall (CFW)A10 Thunder Convergent Firewall (CFW)
A10 Thunder Convergent Firewall (CFW)A10 Networks
 
An Introduction to Plotting in Perl using PDL::Graphics::PLplot
An Introduction to Plotting in Perl using PDL::Graphics::PLplotAn Introduction to Plotting in Perl using PDL::Graphics::PLplot
An Introduction to Plotting in Perl using PDL::Graphics::PLplotdcmertens
 
кодирование информации
кодирование информациикодирование информации
кодирование информацииbulb314
 
IBM - SIEM 2.0: Информационая безопасность с открытыми глазами
IBM - SIEM 2.0: Информационая безопасность с открытыми глазамиIBM - SIEM 2.0: Информационая безопасность с открытыми глазами
IBM - SIEM 2.0: Информационая безопасность с открытыми глазамиExpolink
 
Информационная безопасность. Лекция 2.
Информационная безопасность. Лекция 2.Информационная безопасность. Лекция 2.
Информационная безопасность. Лекция 2.Александр Лысяк
 
Siem
SiemSiem
Siemcnpo
 
Troubleshooting common oslo.messaging and RabbitMQ issues
Troubleshooting common oslo.messaging and RabbitMQ issuesTroubleshooting common oslo.messaging and RabbitMQ issues
Troubleshooting common oslo.messaging and RabbitMQ issuesMichael Klishin
 
Log Mining: Beyond Log Analysis
Log Mining: Beyond Log AnalysisLog Mining: Beyond Log Analysis
Log Mining: Beyond Log AnalysisAnton Chuvakin
 
Openstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaS
Openstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaSOpenstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaS
Openstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaSSadique Puthen
 
Multi tier-app-network-topology-neutron-final
Multi tier-app-network-topology-neutron-finalMulti tier-app-network-topology-neutron-final
Multi tier-app-network-topology-neutron-finalSadique Puthen
 
Информационная безопасность: Вводная лекция
Информационная безопасность: Вводная лекцияИнформационная безопасность: Вводная лекция
Информационная безопасность: Вводная лекцияMax Kornev
 
How to Troubleshoot OpenStack Without Losing Sleep
How to Troubleshoot OpenStack Without Losing SleepHow to Troubleshoot OpenStack Without Losing Sleep
How to Troubleshoot OpenStack Without Losing SleepSadique Puthen
 
Final Year Projects Computer Science (Information security) -2015
Final Year Projects Computer Science (Information security) -2015Final Year Projects Computer Science (Information security) -2015
Final Year Projects Computer Science (Information security) -2015Syed Ubaid Ali Jafri
 

Viewers also liked (17)

A10 Thunder Convergent Firewall (CFW)
A10 Thunder Convergent Firewall (CFW)A10 Thunder Convergent Firewall (CFW)
A10 Thunder Convergent Firewall (CFW)
 
An Introduction to Plotting in Perl using PDL::Graphics::PLplot
An Introduction to Plotting in Perl using PDL::Graphics::PLplotAn Introduction to Plotting in Perl using PDL::Graphics::PLplot
An Introduction to Plotting in Perl using PDL::Graphics::PLplot
 
Log Data Mining
Log Data MiningLog Data Mining
Log Data Mining
 
кодирование информации
кодирование информациикодирование информации
кодирование информации
 
Web based remote monitoring systems
Web based remote monitoring systemsWeb based remote monitoring systems
Web based remote monitoring systems
 
IBM - SIEM 2.0: Информационая безопасность с открытыми глазами
IBM - SIEM 2.0: Информационая безопасность с открытыми глазамиIBM - SIEM 2.0: Информационая безопасность с открытыми глазами
IBM - SIEM 2.0: Информационая безопасность с открытыми глазами
 
3 nlp
3 nlp3 nlp
3 nlp
 
Информационная безопасность. Лекция 2.
Информационная безопасность. Лекция 2.Информационная безопасность. Лекция 2.
Информационная безопасность. Лекция 2.
 
Siem
SiemSiem
Siem
 
Troubleshooting common oslo.messaging and RabbitMQ issues
Troubleshooting common oslo.messaging and RabbitMQ issuesTroubleshooting common oslo.messaging and RabbitMQ issues
Troubleshooting common oslo.messaging and RabbitMQ issues
 
Log Mining: Beyond Log Analysis
Log Mining: Beyond Log AnalysisLog Mining: Beyond Log Analysis
Log Mining: Beyond Log Analysis
 
Openstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaS
Openstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaSOpenstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaS
Openstack on Fedora, Fedora on Openstack: An Introduction to cloud IaaS
 
Multi tier-app-network-topology-neutron-final
Multi tier-app-network-topology-neutron-finalMulti tier-app-network-topology-neutron-final
Multi tier-app-network-topology-neutron-final
 
Информационная безопасность: Вводная лекция
Информационная безопасность: Вводная лекцияИнформационная безопасность: Вводная лекция
Информационная безопасность: Вводная лекция
 
How to Troubleshoot OpenStack Without Losing Sleep
How to Troubleshoot OpenStack Without Losing SleepHow to Troubleshoot OpenStack Without Losing Sleep
How to Troubleshoot OpenStack Without Losing Sleep
 
Final Year Projects Computer Science (Information security) -2015
Final Year Projects Computer Science (Information security) -2015Final Year Projects Computer Science (Information security) -2015
Final Year Projects Computer Science (Information security) -2015
 
charlottediaz+CV
charlottediaz+CVcharlottediaz+CV
charlottediaz+CV
 

Similar to Mining Your Logs - Gaining Insight Through Visualization

Splunk as a_big_data_platform_for_developers_spring_one2gx
Splunk as a_big_data_platform_for_developers_spring_one2gxSplunk as a_big_data_platform_for_developers_spring_one2gx
Splunk as a_big_data_platform_for_developers_spring_one2gxDamien Dallimore
 
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)
A Big Data Lake Based on Spark for BBVA Bank-(Oscar Mendez, STRATIO)Spark Summit
 
Facebook architecture
Facebook architectureFacebook architecture
Facebook architecturedrewz lin
 
Qcon 090408233824-phpapp01
Qcon 090408233824-phpapp01Qcon 090408233824-phpapp01
Qcon 090408233824-phpapp01jgregory1234
 
Facebook的架构
Facebook的架构Facebook的架构
Facebook的架构yiditushe
 
Facebook architecture
Facebook architectureFacebook architecture
Facebook architecturemysqlops
 
KNIME Software Overview
Mining Your Logs - Gaining Insight Through Visualization

  • 1. Mining Your Logs Gaining Insight Through Visualization Raffael Marty - @zrlram Google TechTalk March 2011
  • 2. Raffael Marty • Founder @ • Chief Security Strategist and Product Manager @ Splunk • Manager Solutions @ ArcSight • Intrusion Detection Research @ IBM Research • IT Security Consultant @ PriceWaterhouse Coopers Applied Security Visualization Publisher: Addison Wesley (August, 2008) ISBN: 0321510100 Logging as a Service 2 © by Raffael Marty
  • 3. Agenda • Log Analysis • History • Log Architectures • What’s Working and What’s Not? • Future Needs • Data Visualization • Visualization Concepts • Security Visualization Use-Cases
  • 4. Log Analysis
      10.0.20.9 - - [22/Mar/2011:10:00:52 +0000] "GET /admin/customer/customer/612/ HTTP/1.1" 200 2261 "https://logdog.loggly.org/admin/customer/customer/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_6; en-us) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27" TYhzVH8AAAEAAGOkBOQAAADA 655268
      2010-12-28T18:12:10.031+00:00 frontend2-raffy syslog-ng[19600]: syslog-ng starting up; version='3.1.1'
      2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 PROTO=UDP SPT=17500 DPT=17500 LEN=160
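The firewall record above is mostly KEY=VALUE pairs, which makes it one of the easier formats to pick apart. A minimal sketch in Python, assuming the wrapped MAC field is joined back into one token; the function name `parse_firewall_fields` is hypothetical and not part of the talk:

```python
import re

# Firewall record from the slide. Joining the wrapped MAC field into one
# token is an assumption about the original, unwrapped log line.
LINE = (
    "2011-01-10T21:27:04.820+00:00 frontend2-raffy kernel: : [ 664.107313] "
    "blocked inbound IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:d8:30:62:5f:6a:a3:08:00 "
    "SRC=10.0.20.109 DST=10.0.20.255 LEN=180 TOS=0x00 PREC=0x00 TTL=64 ID=126 "
    "PROTO=UDP SPT=17500 DPT=17500 LEN=160"
)

# An upper-case key, an equals sign, and a possibly empty value (e.g. "OUT=").
KV = re.compile(r"\b([A-Z]+)=(\S*)")


def parse_firewall_fields(line):
    """Return the KEY=VALUE pairs of a line as a dict.

    A key that occurs more than once (LEN appears twice in this record)
    is collected into a list so that no value is silently dropped.
    """
    fields = {}
    for key, value in KV.findall(line):
        if key not in fields:
            fields[key] = value
        elif isinstance(fields[key], list):
            fields[key].append(value)
        else:
            fields[key] = [fields[key], value]
    return fields


FIELDS = parse_firewall_fields(LINE)
```

Even this small record shows why generic parsing is fragile: `OUT=` is empty and `LEN` is repeated, so a naive split on spaces and equals signs would lose information.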
  • 5. History • 1980 Eric Allman develops syslogd(8) • 1996 Intellitactics • 1997 Tivoli Risk Manager developed by IBM Research in Zurich (later Zurich Correlation Engine, ZCE) • 1999 - 2010 A number of log management / SIEM players enter the market (software, appliances) • 2000 ArcSight - 2010 sold for $1.65bn to HP • 2009 Loggly (logging as a service)
  • 6. History - The Other View • Network management (SNMP) • IDS false positive reduction • Security monitoring (multiple data sources) • Unification of NOC and SOC (failed?) • Application monitoring (moving up the stack) - original tools failed due to architectural constraints - new approaches have been presented
  • 7. Log Management Today: Where are you?
  • 8. Log Management Today less tools • DIY: grep, Perl • Log Management: open source, commercial • CEP and SIEM: open source, commercial • Advanced Analytics: not log specific!, SQL MapReduce, open source
  • 9. Open Source Tools • graylog2 • lire • MS Logparser • logstash • LogSurfer • Sguil • swatch • SEC • Octopussy • tenshi • LogHound • Sagan • logwatch • slct • OSSEC • log2timeline • snare • logzilla • lasso • OSSIM (this list is likely incomplete!)
  • 10. Commercial Tools this list is likely incomplete! pixlcloud | Visualization in the Cloud 10 © PixlCloud LLC 2011
  • 12. Log Mgmt Architecture • Collection: syslog, OPSEC, SDEE, netflow • Processing: indexing, context storage, clustering, database • Storage: on board, external storage array, clusters
  • 13. Log Mgmt Architecture (flow labels: raw; normalized or raw) • Collection: syslog, OPSEC, SDEE, netflow • Processing: indexing, context storage, clustering, database • Data Access: free-text search, field-based search, tagging schemas
  • 14. Agents and Connectors • piece of code to transport logs to a central location • features: batch, compress, encrypt, sign, fail-over • often additional features: parse, normalize, aggregate, enrichment (context) • special protocols: OPSEC, SDEE, Windows • file-based collection • database collection
  • 15. SIEM Architecture (diagram labels: raw, normalized, asset context, identity context, ..., context / tagging, RDBMS)
  • 16. SIEM Architecture • RDBMS schema - Fixed number and type of fields - New data sources with new fields? ‣ overloading • RDBMS clusters are expensive and scale poorly • Need a parser for every data source • Slow historical data queries • Hard to configure database efficiently - because of different use-cases
  • 17. SIEM Architecture Benefits • Parsed data enables - real-time correlation - real-time statistics - data augmentation (context) close to source • Unified data access language - over a fixed set of fields • Real-time dashboards
  • 18. Search vs. SIEM • Full-text indexing - Example search: denied - use index to find ALL occurrences of ‘denied’ • Parsing at search time - Example search: user=rmarty - use index to find occurrences of ‘rmarty’ - apply parser to results - remove results where user is not rmarty
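The two searches on this slide can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular product works: the sample events, the tokenizer, and the function names (`build_index`, `search_term`, `search_field`) are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample events; they are illustrative, not taken from the talk.
EVENTS = [
    "sshd: access denied user=rmarty src=10.0.20.9",
    "sshd: login accepted user=rmarty src=10.0.20.9",
    "app: user=admin changed password of rmarty",
    "fw: denied inbound packet src=10.0.20.109",
]

TOKEN = re.compile(r"[A-Za-z0-9_.]+")


def build_index(events):
    """Full-text index: token -> set of ids of the events containing it."""
    index = defaultdict(set)
    for event_id, text in enumerate(events):
        for token in TOKEN.findall(text.lower()):
            index[token].add(event_id)
    return index


def search_term(index, term):
    """Free-text search: the index alone answers the query."""
    return sorted(index.get(term.lower(), set()))


def search_field(index, events, field, value):
    """Field search: index lookup on the value, then parse and filter."""
    candidates = search_term(index, value)
    pattern = re.compile(r"\b" + re.escape(field) + r"=(\S+)")
    hits = []
    for event_id in candidates:
        # The parser runs at search time, only on the candidate events.
        match = pattern.search(events[event_id])
        if match and match.group(1) == value:
            hits.append(event_id)
    return hits


INDEX = build_index(EVENTS)
DENIED = search_term(INDEX, "denied")
RMARTY = search_field(INDEX, EVENTS, "user", "rmarty")
```

In this sketch the third event mentions `rmarty` but carries `user=admin`, so the index returns it as a candidate and the search-time parser removes it again.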
  • 19. New SIEM - Hybrid Models • Use parsers for known data sources • Collect everything else • Index all data and use index for search • Correlate parsed data
  • 20. Categorization and Tagging • How do you find all failed logins across any data source? security:538 OR “sshd authentication failure” OR “sshd failed password” OR ... • Does not scale - for new data sources - for new events of existing sources • Define a ‘taxonomy’ for all events: id -> object, action, status • Map events into taxonomy
  • 21. Content Creation • Rules, dashboards, reports, searches can use taxonomy: object=authentication AND action=login AND status=success • All failures related to files: object=file AND status=failure • Mixing with other fields: action=login AND user=rmarty • Approach scales well • Huge effort to build and maintain mappings
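The id -> (object, action, status) mapping and the taxonomy queries from these two slides can be sketched as follows. The two sshd identifiers come from the slide's failed-login search; every other identifier, the sample events, and the helper names `categorize` and `matches` are hypothetical.

```python
# Mapping from source-specific event identifiers to a shared taxonomy of
# (object, action, status). The category assignments are illustrative
# assumptions, not a published taxonomy.
TAXONOMY = {
    "sshd authentication failure": ("authentication", "login", "failure"),
    "sshd failed password": ("authentication", "login", "failure"),
    "sshd accepted password": ("authentication", "login", "success"),  # hypothetical id
    "file open denied": ("file", "open", "failure"),  # hypothetical id
}


def categorize(event):
    """Return a copy of the event with object/action/status attached."""
    obj, action, status = TAXONOMY.get(event["id"], (None, None, None))
    return {**event, "object": obj, "action": action, "status": status}


def matches(event, **criteria):
    """AND query: every criterion must equal the event's field."""
    return all(event.get(key) == value for key, value in criteria.items())


# Hypothetical raw events from different sources.
RAW = [
    {"id": "sshd failed password", "user": "rmarty"},
    {"id": "sshd accepted password", "user": "rmarty"},
    {"id": "file open denied", "user": "ram"},
    {"id": "sshd authentication failure", "user": "raffy"},
]
EVENTS = [categorize(event) for event in RAW]

# object=authentication AND action=login AND status=failure
FAILED_LOGINS = [
    e for e in EVENTS
    if matches(e, object="authentication", action="login", status="failure")
]
# object=file AND status=failure
FILE_FAILURES = [e for e in EVENTS if matches(e, object="file", status="failure")]
# action=login AND user=rmarty
RMARTY_LOGINS = [e for e in EVENTS if matches(e, action="login", user="rmarty")]
```

The queries stay the same when a new data source is added; only `TAXONOMY` grows, which is where the maintenance effort noted on the slide ends up.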
  • 22. Logging as a Service (LaaS) • Economically advantageous - think about TCO • Pay as you go • Elastic infrastructure scales with your needs • No installation needed • No setup costs / time for logging solution • Open platform with RESTful APIs
  • 23. Loggly (architecture diagram labels: Data Sources; Consumers; Loggly user interface; UI extensions; mobile-166; My syslog; Data collection; Proxies; API; Data access; Distributed Indexers and Search Machines; indexing and processing; Log Archive; Distributed data store)
  • 24. Tool Usage
                            | DIY | MR | Log Mgmt | SIEM | LaaS
      data sources          | known, only a few | known, only a few | unknown, many | known, many | -
      analysis use-cases    | known, one or a few | exploration, large-scale | unknown, many | unknown, many | extend platform
      dynamic use-cases     | no | no | yes | yes | yes
      real-time correlation | no | no | no | yes | extend platform
      cost                  | engineer, hardware, maintenance | engineers, hardware, maintenance | license, (hardware), maintenance | license, hardware, maintenance | subscription
      Should you rather do it yourself (DIY)?
  • 25. What is Working and What is not?
  • 26. What’s Working • Log collection • Log centralization • Alerting on a priori known patterns • Solving specific, known use-cases for sets of known data sources, e.g., - monitoring privileged access to financial servers - generating compliance reports - security forensics
  • 27. What’s Not Working • Log formats are all over and not documented Mar 16 08:09:58 kernel: [ 0.000000] Normal 1048576 -> 1048576 • No logging guidelines / developer education • Parsing is broken - based on regexes - numerous mistakes - doesn’t scale
  • 28. What’s Not Working • Normalization is broken: - IP to hostnames (when to do DNS lookup) - usernames (rmarty vs. ram vs. raffy) • Categorization / Taxonomy - doesn’t scale - is always out of date - is buggy - expensive • Prioritization has no working formula • Anomaly detection is voodoo!
  • 29. What Does It Mean? • We don’t understand our data • Security Operations Center (SOC) monitors all corporate data sources. Analysts - don’t know all the applications - don’t know all the setups - don’t know what log records are ‘normal’ behavior --> Need tools to enable log owners to work with their data
  • 31. We Need Better Tools • We will have more and more data and need to deal with larger amounts of data - SIEM needs to support new distributed, scalable data management technologies • More and more application layer data - How are we going to deal with all the parsing / entity extraction? - We need logging standards and guidelines • How do we help analysts understand the data? - What is important and what is not? - Mapping problems to business process, business risk!
  • 33. Data/Log Visualization • Exploration and Discovery • Answer Questions • Communicate Information • Support Decisions
  • 34. Security Visualization • We are nowhere! • Visualization is an afterthought • Sec Viz dichotomy • Tools are lacking fundamental capabilities • Users don’t understand data, how can they understand visuals?
  • 36. The Analysis Approach • Overview first • Zoom • Details on demand (principle by Ben Shneiderman)
  • 37. Simultaneous Views
  • 38. Dynamic Coloring
  • 39. Linked Views
  • 40. Legible / Usable Graphs: Reducing non-data ink!
  • 41. Choosing the Right Chart
  • 42. Ode to the Pie
  • 43. Careful With Interpretations
  • 48. Situational Awareness • Treemap • Protovis.JS • Size: Amount • Brightness: Variance • Color: Sensor • Shows: Scans - bright spots • Thanks to Chris Horsley
  • 50. Firewall Treemap
  • 51. Firewall Log (labels: Port, Source IP, Destination IP)
  • 52. IDS Sig Tuning - Treemap • Hierarchy: Source, Destination, Signature, Number of Events • Color: Priority • Size: Number of alerts
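The data preparation behind such a treemap can be sketched as a small aggregation step. This is only an illustration: the alert tuples and signature names are hypothetical, the hierarchy is assumed to be Source > Destination > Signature, and a lower priority number is assumed to mean a more severe alert.

```python
# Hypothetical IDS alerts: (source, destination, signature, priority).
ALERTS = [
    ("10.0.20.109", "10.0.20.9", "SCAN example", 2),
    ("10.0.20.109", "10.0.20.9", "SCAN example", 2),
    ("10.0.20.109", "10.0.20.9", "SCAN example", 2),
    ("10.0.20.109", "10.0.20.9", "WEB probe example", 1),
    ("10.0.20.7", "10.0.20.9", "SCAN example", 2),
]


def aggregate(alerts):
    """Collapse alerts into leaves keyed by (source, destination, signature).

    Each leaf carries the two visual attributes from the slide:
    'size' is the number of alerts and 'priority' drives the color.
    """
    leaves = {}
    for source, destination, signature, priority in alerts:
        key = (source, destination, signature)
        leaf = leaves.setdefault(key, {"size": 0, "priority": priority})
        leaf["size"] += 1
        # Assumption: the most severe (lowest) priority wins for a leaf.
        leaf["priority"] = min(leaf["priority"], priority)
    return leaves


def nest(leaves):
    """Arrange the leaves as a Source > Destination > Signature tree."""
    tree = {}
    for (source, destination, signature), leaf in leaves.items():
        tree.setdefault(source, {}).setdefault(destination, {})[signature] = leaf
    return tree


TREE = nest(aggregate(ALERTS))
```

A nested structure like `TREE` is the usual input shape for a treemap layout; large rectangles then point at noisy source, destination, and signature combinations that are candidates for tuning.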
  • 53. Vulnerability Data by Host
  • 54. Visualization Future • A solution to entity extraction • Dynamic and interactive displays • Computer aided intelligence / visualization - Computer supported exploration - Highly interactive • Expert system that captures domain knowledge - Collaborative
  • 55. http://secviz.org Share, discuss, challenge, and learn about security visualization. • List: secviz.org/mailinglist • Twitter: @secviz