Visualization in the Age of Big Data

The extent and impact of recent security breaches show that current security approaches are just not working. But what can we do to protect our business? We have long advocated monitoring as a way to detect the subtle, advanced attacks that still make it through our defenses. However, products have failed to deliver on this promise.
Current solutions scale neither in data volume nor in analytical insight. In this presentation we will explore what security monitoring is. Specifically, we are going to explore the question of how to visualize a billion log records. A number of security visualization examples will illustrate some of the challenges of big data visualization. They will also show how data mining and user experience design help us get a handle on those challenges, enabling us to gain deep insight for a number of security use-cases.

Published in: Data & Analytics, Internet

  1. 1. Visualization in the Age of Big Data - Raffael Marty, CEO - HoneyNet Project Workshop, Stavanger, Norway, May 2015
  2. 2. Security. Analytics. Insight. Seems Like Cyber Security Is Not Working • How Compromises Are Detected • Attackers in networks before detection: 229 days • Average time to resolve a cyber attack: 27 days • Sources: Mandiant M-Trends Report, 2014 Threat Report
  3. 3. Monitoring To The Rescue - breaches can be detected (early), or even prevented, if we looked at the data
  4. 4. Interactive Visualization
  5. 5. I am Raffy - I do Viz! IBM Research
  6. 6. Overview • Security Landscape • What is Going Wrong? • A New Approach • Security Analytics • Big Data Lake • Visualization • Challenges • Data Discovery and Exploration • Examples
  7. 7. Monitoring (diagram labels) • Tools: Scoring, Behavior, Log Mgmt, Threat Feeds, Context, Ticket, IR, False Positive, Manual Triage, Sandboxes, … • Data Sources: Firewall, IPS, Proxy, AV, Endpoint, … • SIEM
  8. 8. We Are Monitoring - What is Going Wrong? Defense Has Been Relying On Past Knowledge
     • Products / Tools
       • Firewall - Blocks traffic based on pre-defined rules
       • Web Application Firewall - Monitors for signs of known malicious activity in Web traffic
       • Intrusion Prevention System - Looks for ‘signs’ of known attacks in traffic and protocol violations
       • Anti Virus - Looks for ‘signs’ of known attacks on the end system
       • Malware Sandbox - Runs new binaries and monitors their behavior for malicious signs
       • Security Information Management - Uses pre-defined rules to correlate signs from different data streams to augment intelligence
       • Vulnerability Scanning - Searches for known vulnerabilities and vulnerable software
     • Rely on pattern matching and signature-based knowledge from the past
     • Reactive -> always behind
     • Unknown and new threats -> won’t be detected
     • ‘Imperfect’ patterns and rules -> cause a lot of false positives
  9. 9. Security Analytics
  10. 10. A New Approach - We Need Analysts in the Loop! (not better algorithms)
      ENABLE analysts to leverage their knowledge effectively and efficiently
      • scalability - big data based, extensible platform
      • visualization - interactive exploration of billions of events
      • knowledge - capture from experts; leverage machines to guide; automate where possible; enable collaboration
  11. 11. Use-Cases Enabled Through Analytics
      • Intercept attacks (APT) early in the kill chain
      • Detecting intrusions
      • Detecting data leaks
      • Network-based anomaly detection
      • Threat Intelligence
      • Attack surface analysis
      • Speed up forensic investigations and incident response
      • Insider threat detection
      • User behavior monitoring
      • Privilege abuse
      • Fraud detection
      • Compliance
      • Continuous monitoring
      • Risk quantification and metrics
      • Business improvements
      • Spending justification for security
      • Spending optimization (esp. cloud)
      Screenshot labels: Data Stores, Analytics, Forensics, Models, Admin; a list of source --> destination IP flows; Anomalies; Decomposition (Data, Seasonal, Trend, Anomaly); Details
      Captions: Find Intruders and ‘New Attacks’ • Resolve Incidents Quicker • Communicate Findings
  12. 12. Analytics Platform - How It’s Done • Visualization in the center • Not relying on past knowledge • Analytics to support, not alert
      Diagram labels: Rules; Patterns; Scoring; context; data; Security Big Data Lake; Explore & Hunt; Visual Forensics; Behavior; Anomaly Detection; Alert Triage; Visualization; Analytics
  13. 13. Visualization
  14. 14. Visualization To … • Present / Communicate • Discover / Explore
  15. 15. Unknown Unknowns - Visualization Is Central: "There are 1000 ways for someone to steal information. If we knew how, we could have prevented it. Visualization helps find that one way." - CISO UBS Switzerland
  16. 16. Visualization Example (Unknown Unknowns) - PixlCloud is a visual analytics platform for cyber security. This example shows a heatmap of behavior over time, in this case activity per user. We can see that ‘vincent’ is visually different from all of the other users: he shows up very lightly over the entire time period. This seems to be something to look into. We were able to find this purely visually, without a deeper understanding of the data.
  17. 17. Why Visualization? The stats ... the data ... (http://en.wikipedia.org/wiki/Anscombe%27s_quartet)
  18. 18. Why Visualization? (http://en.wikipedia.org/wiki/Anscombe%27s_quartet) Human analyst: • pattern detection • remembers context • fantastic intuition • can predict
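
The point of the two Anscombe slides can be checked numerically. The sketch below uses the published quartet values (as listed on the linked Wikipedia page) and only the Python standard library; the helper names `pearson` and `summary` are illustrative. It shows that the four datasets share the same rounded summary statistics, which is why only a plot reveals how different they are.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet: datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / spread


def summary(xs, ys):
    """The kind of 'stats' a report would show, rounded to two decimals."""
    return {
        "mean_x": round(mean(xs), 2),
        "mean_y": round(mean(ys), 2),
        "var_x": round(variance(xs), 2),
        "r": round(pearson(xs, ys), 2),
    }


SUMMARIES = {name: summary(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, stats in SUMMARIES.items():
    print(name, stats)
```

All four rows print the same numbers, although a scatter plot of each dataset looks completely different.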
  19. 19. Visualization Challenges
      • Access to data
      • Parsed data and data context
      • Data architecture for central data access and fast queries
      • Application of data mining (how?, what?, scalable, …)
      • Visualization tools that support:
      • Complex visual types (||-coordinates, treemaps, heat maps, link graphs)
      • Linked views
      • Data mining (clustering, …)
      • Visual analytics workflow
  20. 20. Enablement - Data Layer Requirements. Access paradigms for a backend:
      • Analytical queries - mainly for visual interaction
      • Accessing large amounts of data in aggregated ways
      • Support for intelligent caching (reduce slow re-query of data)
      • Statistics - answering frequent ‘aggregation’ queries very fast
      • Ad-hoc search
      • Raw data retrieval
      • Context - deal with data context for time-series data
      Note: No mention of HADOOP!
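
Two of the access paradigms on the data layer slide, cached aggregation queries and raw data retrieval, can be sketched in a few lines. This is a toy in-memory stand-in, not the backend the talk describes: the `EVENTS` tuple, its field layout, and the function names are assumptions made for illustration.

```python
from collections import Counter
from functools import lru_cache

# Hypothetical parsed events: (epoch_seconds, source_ip, destination_port).
EVENTS = (
    (0, "10.0.0.1", 53),
    (30, "10.0.0.2", 80),
    (65, "10.0.0.1", 53),
    (130, "10.0.0.3", 445),
    (140, "10.0.0.1", 53),
)


@lru_cache(maxsize=128)
def counts_per_bucket(bucket_seconds):
    """Aggregated access: event counts per fixed-width time bucket.

    The result is cached per bucket size, so a visual front end that asks
    for the same aggregation again does not trigger a slow re-query.
    """
    counts = Counter(ts // bucket_seconds for ts, _src, _port in EVENTS)
    return tuple(sorted(counts.items()))


def raw_events(start, end):
    """Raw data retrieval for the half-open time window [start, end)."""
    return [event for event in EVENTS if start <= event[0] < end]


print(counts_per_bucket(60))   # aggregation used to draw an overview
print(raw_events(60, 135))     # raw records behind part of that overview
```

A real data layer would push both functions down into the data store; the sketch only shows the shape of the two query types a visualization front end needs.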
  21. 21. Big Data Lake
  22. 22. The Big Data Lake - Prevent Re-Collection?
      • One central location to store all cyber security data
      • “Data collected only once and third party software leveraging it”
      • Scalability and interoperability
      • Hard problems:
        • Parsing: can you re-parse?
        • Data store capabilities (search, analytics, distributed processing, etc.)
        • Access to data: SQL (even in Hadoop context), how can products access the data?
  23. 23. The Security Data Lake - Federated Data Access. Diagram labels: SIEM dispatcher, SIEM connector, SIEM console, Prod A, AD / LDAP, HR, …, IDS, FW, Prod B, DBs, Data Lake, SNMP. Many many challenges!
  24. 24. Data Lake Version 0.5a. Diagram labels: SIEM; columnar or search engine or log management; processing; SIEM connector; raw logs; SIEM console; SQL or search interface; processing; filtering; HDFS; lake. Current solutions (log mgmt / SIEM): not open, don’t scale
  25. 25. Data Discovery & Exploration
  26. 26. Visualize Me Lots (>1TB) of Data
  27. 27. Information Visualization Mantra (principle by Ben Shneiderman): Overview • Zoom / Filter • Details on Demand
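
The three steps of the mantra map naturally onto three kinds of queries against log data. The sketch below is only an illustration of that mapping; the `RECORDS` list, its field names, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical parsed log records; field names are illustrative only.
RECORDS = [
    {"hour": 9, "src": "10.0.0.1", "port": 53, "bytes": 120},
    {"hour": 9, "src": "10.0.0.2", "port": 80, "bytes": 900},
    {"hour": 10, "src": "10.0.0.1", "port": 53, "bytes": 110},
    {"hour": 10, "src": "10.0.0.3", "port": 445, "bytes": 4000},
    {"hour": 10, "src": "10.0.0.3", "port": 445, "bytes": 4200},
]


def overview(records):
    """Overview first: one aggregate number per hour instead of every record."""
    return Counter(r["hour"] for r in records)


def zoom_filter(records, **criteria):
    """Zoom and filter: keep records that match every field=value criterion."""
    return [r for r in records if all(r[k] == v for k, v in criteria.items())]


def details(records, limit=10):
    """Details on demand: hand back raw records, capped to a readable number."""
    return records[:limit]


print(overview(RECORDS))
subset = zoom_filter(RECORDS, hour=10, port=445)
print(details(subset, limit=1))
```

With more than a terabyte of data, only the first step touches everything, and it returns aggregates; the raw records are fetched last and only for the small selected subset.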
  28. 28. SecViz Examples
  29. 29. Add Context - additional information about objects, such as: • machine: roles, criticality, location, owner, … • user: roles, office location, … Diagram labels: source, destination, machine and user context, machine role, user role
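
Adding context amounts to joining each event with lookup tables about machines and users. A minimal sketch follows, assuming hypothetical context tables (`MACHINE_CONTEXT`, `USER_CONTEXT`) and made-up addresses; in practice this information would come from an asset inventory, AD / LDAP, or HR data.

```python
# Hypothetical context tables; every value here is invented for illustration.
MACHINE_CONTEXT = {
    "10.0.0.5": {"role": "mail server", "criticality": "high", "owner": "it-ops"},
    "10.0.1.20": {"role": "workstation", "criticality": "low", "owner": "alice"},
}
USER_CONTEXT = {
    "alice": {"role": "sales", "office": "Zurich"},
}
UNKNOWN = {"role": "unknown"}


def enrich(flow):
    """Return a copy of a flow record with machine and user context attached."""
    enriched = dict(flow)
    enriched["src_machine"] = MACHINE_CONTEXT.get(flow["src"], UNKNOWN)
    enriched["dst_machine"] = MACHINE_CONTEXT.get(flow["dst"], UNKNOWN)
    enriched["user_ctx"] = USER_CONTEXT.get(flow.get("user"), UNKNOWN)
    return enriched


flow = {"src": "10.0.1.20", "dst": "10.0.0.5", "user": "alice"}
print(enrich(flow))
```

Once the roles are attached, a traffic graph can be colored or grouped by machine role and user role rather than by raw IP address.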
  30. 30. Traffic Flow Analysis With Context
  31. 31. An Analytical Example - Monitor Password Resets. Chart annotations: threshold; outliers have different magnitudes
  32. 32. Approximate Curve. Chart annotations: fitting a curve; distance to curve
  33. 33. Data Mining Applied • Holt-Winters is exponential smoothing • Lets you define thresholds for alerting! Chart annotations: hard to define alert threshold; better threshold
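
The idea on the last three slides, fit a curve and alert on the distance to it instead of on a fixed threshold, can be sketched with simple exponential smoothing. Note the simplification: full Holt-Winters adds trend and seasonal components, which this sketch leaves out. The series `RESETS`, the parameters `alpha`, `band`, `warmup`, and the deviation floor of 1.0 are all assumptions chosen for illustration.

```python
def smoothed_alerts(series, alpha=0.3, band=3.0, warmup=3):
    """Return indices of points that lie far from an exponentially smoothed curve.

    level     - the smoothed curve (single exponential smoothing)
    deviation - smoothed absolute distance between data and curve
    A point alerts when its distance exceeds band * deviation. The floor of
    1.0 on the deviation is an arbitrary guard against a near-zero band.
    """
    level = series[0]
    deviation = 0.0
    alerts = []
    for i, value in enumerate(series[1:], start=1):
        error = abs(value - level)
        if i > warmup and error > band * max(deviation, 1.0):
            alerts.append(i)
            continue  # do not let the outlier drag the curve along
        deviation = alpha * error + (1 - alpha) * deviation
        level = alpha * value + (1 - alpha) * level
    return alerts


# Hypothetical password resets per day.
RESETS = [10, 12, 11, 13, 12, 11, 40, 12, 13, 11]
print(smoothed_alerts(RESETS))
```

Because the alert band follows the data, the same code works for a series that hovers around 10 and one that hovers around 1000, which is what a single fixed threshold cannot do.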
  34. 34. Internet Service Provider • Monitoring entire network • Shows scans across customers on port 445 (Windows shares): new worm emerging (copyright (c) 2013 pixlcloud | creating actionable data stories)
  35. 35. Machine Learning - Clustering Users
      Source: Email logs
      Explanation: The graph shows email communications between employees and outside people. By clustering the data, different user groups become visible automatically. It became visible that there was an entire cluster that we could not assign to a known group of users!
      Cluster labels: unknown, product teams, sales and marketing, competition
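
The slide does not say which clustering method was used. As the simplest possible stand-in, the sketch below groups addresses into connected components of the email graph: anyone linked by a chain of messages ends up in the same group. The `EMAILS` pairs and all names in them are hypothetical, and real email graphs usually need a proper community detection algorithm because a few shared contacts merge everything into one component.

```python
from collections import defaultdict

# Hypothetical (sender, recipient) pairs extracted from email logs.
EMAILS = [
    ("ann", "bob"), ("bob", "cat"),                # one internal team
    ("dan", "eve"), ("eve", "partner.example"),    # team with an outside partner
    ("zed", "rival.example"),                      # small group nobody can explain
]


def communication_clusters(pairs):
    """Group addresses into connected components of the communication graph."""
    neighbours = defaultdict(set)
    for sender, recipient in pairs:
        neighbours[sender].add(recipient)
        neighbours[recipient].add(sender)

    seen = set()
    clusters = []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:  # depth-first walk over everything reachable from start
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(neighbours[node] - members)
        seen |= members
        clusters.append(sorted(members))
    return clusters


for cluster in communication_clusters(EMAILS):
    print(cluster)
```

The interesting output is the group that matches no known team, which is the kind of finding the slide describes.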
  36. 36. Intra-Role Anomaly - Random Order. Axes: users, time; value: dc(machines)
  37. 37. Intra-Role Anomaly - With Seriation
  38. 38. Intra-Role Anomaly - Sorted by User Role. Row groups: Administrator, Sales, Development, Finance, Admin???
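
Seriation reorders the rows of a heatmap so that similar rows sit next to each other, which makes blocks and outliers visible. The sketch below uses a greedy nearest-neighbour heuristic, one of many possible seriation methods and not necessarily the one behind the slides. The `ROWS` data is invented, and reading dc(machines) as a distinct count of machines per user and time bucket is an assumption.

```python
def distance(a, b):
    """Manhattan distance between two equally long rows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def seriate(rows):
    """Order row names so that each row is followed by its most similar unused row.

    Starts from the row with the smallest total; ties are broken by name so
    the result is deterministic.
    """
    names = sorted(rows)
    current = min(names, key=lambda n: (sum(rows[n]), n))
    order = [current]
    remaining = [n for n in names if n != current]
    while remaining:
        last = rows[order[-1]]
        current = min(remaining, key=lambda n: (distance(last, rows[n]), n))
        order.append(current)
        remaining.remove(current)
    return order


# Hypothetical distinct machine counts per user (rows) and time bucket (columns).
ROWS = {
    "u1": [1, 1, 1, 1],
    "u2": [5, 6, 5, 6],
    "u3": [1, 1, 2, 1],
    "u4": [5, 5, 6, 6],
    "u5": [1, 2, 1, 1],
    "u6": [9, 1, 9, 2],
}
print(seriate(ROWS))
```

In the reordered list the low-activity users and the high-activity users form contiguous blocks, and the row that fits neither block ends up at the edge where it is easy to spot.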
  39. 39. Graphs - A Story • This looks interesting • What is it? • Green -> Port 53 • Only port 53? • What IPs? • What’s the time behavior? • The graph doesn’t answer these questions
  40. 40. Graphs - A Story • Adding a port histogram • Select DNS traffic and see if other ports light up. Note how this is a user experience challenge!
  41. 41. DNS Traffic - A Closer Look • Linked Views • Histograms for Source, Port (Source), Destination • ||-coord
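
Behind linked views there is a simple data operation: a selection made in one view is turned into a filter that every other view re-aggregates against. The sketch below mimics "select DNS traffic and see if other ports light up". The `FLOWS` records and function names are hypothetical, and linking views through the source address is one possible choice.

```python
from collections import Counter

# Hypothetical flow records.
FLOWS = [
    {"src": "10.0.0.1", "dst": "8.8.8.8", "dport": 53},
    {"src": "10.0.0.1", "dst": "8.8.8.8", "dport": 53},
    {"src": "10.0.0.2", "dst": "192.0.2.10", "dport": 80},
    {"src": "10.0.0.3", "dst": "8.8.8.8", "dport": 53},
    {"src": "10.0.0.3", "dst": "192.0.2.99", "dport": 6667},
]


def select(flows, field, value):
    """The brush: flows the analyst selected in one view."""
    return [f for f in flows if f[field] == value]


def linked(flows, selection, key="src"):
    """All flows that share the linking key with the selection."""
    keys = {f[key] for f in selection}
    return [f for f in flows if f[key] in keys]


def histogram(flows, field):
    """What a histogram view would draw for one field."""
    return Counter(f[field] for f in flows)


dns = select(FLOWS, "dport", 53)
dns_hosts = linked(FLOWS, dns)
print(histogram(dns, "src"))          # who talks DNS
print(histogram(dns_hosts, "dport"))  # which other ports those hosts use
```

In this made-up data, one DNS client also talks to port 6667, which is the kind of follow-up question a single static graph cannot answer.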
  42. 42. Bringing It All Together
  43. 43. Bringing It All Together • Big data backend • Own visualization engine (Web-based) • Visualization workflows
      Screenshot labels: Data Stores, Analytics, Forensics, Models, Admin; a list of source --> destination IP flows; Anomalies; Decomposition (Data, Seasonal, Trend, Anomaly); Details
      Captions: “Hunt” • Explain • Visual Search
  44. 44. Security Visualization Community - Share, discuss, challenge, and learn about security visualization. http://secviz.org • List: secviz.org/mailinglist • Twitter: @secviz
  45. 45. BlackHat Workshop: Visual Analytics - Delivering Actionable Security Intelligence • August 1-6, 2015, Las Vegas, USA • big data | analytics | visualization • http://secviz.org
  46. 46. raffael.marty@pixlcloud.com • Further resources: http://slideshare.net/zrlram, http://secviz.org and @secviz
