Security Insights at Scale



Ensuring the security of a company’s data and infrastructure has largely become a data analytics challenge. It is about finding and understanding patterns and behaviors that indicate malicious activity or deviations from the norm. Data, analytics, and visualization are used together to gain insights and discover that activity. The three components play off each other, but each also has its own inherent challenges. A few examples are given to explore and illustrate some of these challenges.



  1. 1. Security - Insights At Scale. Raffael Marty, VP Security Analytics @ Sophos. May 2016. XLDB 2016, Stanford, USA
  2. 2. Disclaimer: "This presentation was prepared solely by Raffael Marty in his personal capacity. The material, views, and opinions expressed in this presentation are the author's own and do not reflect the views of Sophos Ltd. or its affiliates." © Raffael Marty
  3. 3. Security – Shift Towards Analytics
     Timeline: Past, Present, Future
     Prevention
       • Single instance focus
       • AV, firewalls, IDS
       • Cross entity intelligence
       • Synchronized security
     Detection
       • Data collection and centralization
       • Big data technologies
       • Machine learning attempts
       • Many challenges
       • Prediction?
       • Machine assisted insights
       • UX focus
       • Patterns, behaviors, collaboration
     +
       • Data driven learn
     Why the shift? Attackers use novel and specific methods to compromise each target.
  4. 4. Security – Gaining Insights: Finding novel attacks
  5. 5. Data
     • Types of data
       o Time-series (with lots of categorical fields)
       o Context (spatial data) – Entities, blacklists, etc.
       o Multiple records for one “transaction” (fusion?)
     • Many access use-cases
       o Lookups / joins (external services also)
       o Search, aggregate, compute, … (One interface? (extended) SQL?)
     • Data challenges
       o Collection (many data formats, many transports)
       o Scale (storage cost, access speed)
       o Encryption (transparent, fast)
       o Operational challenges (bottlenecks, etc.)
       o Collaboration (security, transport)
       o How to find relevant insights? Not statistical anomalies!
     • Can we get a reference implementation?
     Figure caption: The proverbial hair ball
  6. 6. Analytics
     • Mostly anomaly / outlier detection! Finding attacker behavior in the data
       o But what’s normal? This is not about statistical outliers!
     • Approaches
       o Cohort analysis (users and machines) -> e.g., clustering
       o Hypothesis implementation -> e.g., beacon detection
       o “Learning” behavior -> e.g., interactive visualization of metrics
     • Analytics challenges
       o Categorical data
       o Large amounts of data
       o Statistical vs. actual anomalies
       o Distance functions
       o Not a ‘closed’ system
     • We need humans in the loop! And that’s where visualization comes in.
     Analytics drives visualization.
  7. 7. Visualization – Why?
     1. Use analytics to prepare and summarize data.
     2. Visualize the output.
     3. Help human analysts make decisions and take actions.
  8. 8. Why Visualization?
     Fields of the flows table: port, dest_network, protocol, src_network, flows
     • SELECT count(distinct protocol) FROM flows;
     • SELECT count(distinct port) FROM flows;
     • SELECT count(distinct src_network) FROM flows;
     • SELECT count(distinct dest_network) FROM flows;
     • SELECT port, count(*) FROM flows GROUP BY port;
     • SELECT protocol, count(CASE WHEN flows < 200 THEN 1 END) AS [<200], count(CASE WHEN flows >= 200 AND flows < 300 THEN 1 END) AS [200 - 299], count(CASE WHEN flows >= 300 AND flows < 350 THEN 1 END) AS [300 - 349], count(CASE WHEN flows >= 350 THEN 1 END) AS [>=350] FROM flows GROUP BY protocol;
     • SELECT port, count(distinct src_network) FROM flows GROUP BY port;
     • SELECT src_network, count(distinct dest_network) FROM flows GROUP BY src_network;
     • SELECT src_network, port, count(distinct dest_network) AS dn, sum(flows) FROM flows GROUP BY src_network, port;
     • SELECT port, protocol, count(*) FROM flows GROUP BY port, protocol;
     • SELECT sum(flows), dest_network FROM flows GROUP BY dest_network;
     • etc.
  9. 9. Visualization Challenges
     • Visualizing 1TB of data?
     • Visualization Mantra by Ben Shneiderman
     • Drives backend requirements
     • Capture visual learnings – automate findings
     Information Visualization Mantra (principle by Ben Shneiderman): Overview, Zoom / Filter, Details on Demand
  10. 10. Sophos – Security Made Simple
     • For non-experts
     • Consolidating security capabilities
     • Open architecture
     • Data science to SOLVE problems, not to highlight issues
     Diagram labels: Analytics, UTM/Next-Gen Firewall, Wireless, Web, Email, Disk Encryption, File Encryption, Endpoint / Next-Gen Endpoint, Mobile, Server, Sophos Central
  11. 11. @raffaelmarty
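
Slide 6 names beacon detection as an example of implementing a hypothesis rather than hunting for statistical outliers. The slides do not say how it is done, so the following is only a minimal sketch under one common assumption: a beaconing host contacts the same destination at nearly constant intervals, which shows up as a low coefficient of variation in the gaps between connections. The function name `find_beacons`, the thresholds `min_events` and `max_cv`, and the toy events are all hypothetical and not taken from the talk.

```python
from collections import defaultdict
from statistics import mean, pstdev


def find_beacons(events, min_events=5, max_cv=0.1):
    """Return {(src, dst): (mean_gap, cv)} for pairs with near-constant intervals.

    events: iterable of (timestamp_seconds, src, dst) tuples.
    min_events and max_cv are illustrative thresholds (assumptions).
    """
    # Collect the connection timestamps for every (source, destination) pair.
    times = defaultdict(list)
    for ts, src, dst in events:
        times[(src, dst)].append(ts)

    beacons = {}
    for pair, stamps in times.items():
        if len(stamps) < min_events:
            continue  # too few connections to judge regularity
        stamps.sort()
        # Gaps between consecutive connections of this pair.
        gaps = [later - earlier for earlier, later in zip(stamps, stamps[1:])]
        avg = mean(gaps)
        if avg <= 0:
            continue  # all connections at the same instant; nothing to measure
        # Coefficient of variation: spread of the gaps relative to their mean.
        cv = pstdev(gaps) / avg
        if cv <= max_cv:
            beacons[pair] = (avg, cv)
    return beacons


# Hypothetical toy data: one host calling out roughly every 60 seconds,
# and one host with irregular, human-looking traffic.
jitter = [0, 1, -1, 0, 2, 0, -1, 1]
events = [(i * 60 + j, "10.0.0.5", "203.0.113.7") for i, j in enumerate(jitter)]
events += [(t, "10.0.0.9", "198.51.100.2")
           for t in [3, 40, 45, 300, 310, 900, 905, 2000]]

for pair, (gap, cv) in find_beacons(events).items():
    print(pair, round(gap, 1), round(cv, 3))
```

On this toy data only the first pair is flagged. A check like this encodes a specific hypothesis about attacker behavior, which is the distinction slide 6 draws between actual and merely statistical anomalies. Real traffic would still need a human in the loop, since legitimate software such as update checkers also polls on a fixed schedule.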