Raffael Marty, CEO
Big Data Visualization
London
February, 2015

An overview of some methods and principles for big data visualization. The presentation quickly hits on the topic of dashboards and some cyber security uses. The topic of a big data lake is also briefly discussed in the context of a cyber security big data setup.

Published in: Data & Analytics, Internet

Big Data Visualization

  1. Big Data Visualization. Raffael Marty, CEO. London, February 2015
  2. Security. Analytics. Insight. Overview: • Visualization • Design Principles • Dashboards • SOC Dashboard • Data Discovery and Exploration • Data Requirements for Visualization • Big Data Lake
  3. I am Raffy - I do Viz! IBM Research
  4. Visualization
  5. Why Visualization? The stats ... and the data ... (Anscombe's quartet: http://en.wikipedia.org/wiki/Anscombe%27s_quartet)
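
The point of the Anscombe's quartet slide can be checked numerically. The sketch below uses the published quartet values and computes a few summary statistics for each of the four datasets; the helper names (`pearson`, `summarize`, `SUMMARIES`) are illustrative and not from the talk. The summaries come out nearly identical, even though the four scatter plots look nothing alike.

```python
from statistics import mean, variance

# Anscombe's quartet: datasets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def summarize(xs, ys):
    """The kind of summary a stats-only report would show."""
    return {
        "mean_x": round(mean(xs), 2),
        "var_x": round(variance(xs), 2),   # sample variance
        "mean_y": round(mean(ys), 2),
        "r": round(pearson(xs, ys), 2),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        print(name, stats)
```

Plotting the four datasets side by side is what reveals the differences; the numbers alone do not.
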
  6. Why Visualization? Human analyst: • pattern detection • remembers context • fantastic intuition • can predict (http://en.wikipedia.org/wiki/Anscombe%27s_quartet)
  7. Visualization To … • Present / Communicate • Discover / Explore
  8. Design Principles
  9. Choosing Visualizations: Objective, Audience, Data
  10. For Example - Lateral Movement
      • Objective: Find attackers in the network moving laterally
        • Defines data needed (netflow, sflow, …)
        • Maybe restrict to a network segment
      • Audience: security analyst, risk team, …
        • Informs how to visualize / present data
      Recon, Weaponize, Deliver, Exploit, Install, C2, Act
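
As a rough illustration of how the lateral-movement objective defines the data needed, the sketch below takes flow records restricted to one network segment and counts how many distinct internal peers each source contacts. Everything here is an assumption for illustration: the `10.0.0.0/8` segment, the sample flows, the fan-out threshold, and the function names. It is one simple heuristic, not the method from the talk.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

INTERNAL = ip_network("10.0.0.0/8")  # assumed internal network segment

# Hypothetical flow records: (source IP, destination IP, destination port)
FLOWS = [
    ("10.1.1.5", "10.1.1.20", 445),
    ("10.1.1.5", "10.1.1.21", 445),
    ("10.1.1.5", "10.1.1.22", 3389),
    ("10.1.1.7", "10.1.1.20", 443),
    ("10.1.1.7", "93.184.216.34", 443),  # outbound, ignored below
]


def internal_fanout(flows, segment=INTERNAL):
    """Count distinct internal destinations contacted by each internal source."""
    peers = defaultdict(set)
    for src, dst, _port in flows:
        if ip_address(src) in segment and ip_address(dst) in segment:
            peers[src].add(dst)
    return {src: len(dsts) for src, dsts in peers.items()}


def candidates(flows, threshold=3):
    """Sources whose internal fan-out reaches the (assumed) threshold."""
    fanout = internal_fanout(flows)
    return sorted(src for src, count in fanout.items() if count >= threshold)


if __name__ == "__main__":
    print(internal_fanout(FLOWS))
    print(candidates(FLOWS))
```

The per-source fan-out is the kind of derived measure that could then feed a link graph or a ranked list, depending on the audience.
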
  11. Principles of Analytic Design, by Edward Tufte
      • Show comparisons, contrasts, differences.
      • Show causality, mechanism, explanation, systematic structure.
      • Show multivariate data; that is, show more than 1 or 2 variables.
  12. Show Context: 42
  13. Show Context: 42 is just a number and means nothing without context
  14. Use Numbers To Highlight Most Important Parts of Data: Numbers, Summaries
  15. Add Context
      Additional information about objects, such as:
      • machine: roles, criticality, location, owner, …
      • user: roles, office location, …
      Diagram labels: source, destination, machine and user context, machine role, user role
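
A minimal sketch of adding machine and user context to a flow record follows. The lookup tables, their contents, and the field names are hypothetical; in practice this context would come from sources such as an asset inventory or a directory service.

```python
# Hypothetical context tables keyed by IP address and by user name.
ASSETS = {
    "10.9.79.6": {"role": "workstation", "criticality": "low", "owner": "alice"},
    "192.168.148.193": {"role": "database", "criticality": "high", "owner": "dba-team"},
}
USERS = {
    "alice": {"role": "finance", "office": "London"},
}


def enrich(flow, assets=ASSETS, users=USERS):
    """Return a copy of the flow with machine and user context for both ends."""
    enriched = dict(flow)
    for side in ("src", "dst"):
        machine = assets.get(flow[side], {})
        user = users.get(machine.get("owner"), {})
        enriched[f"{side}_machine_role"] = machine.get("role", "unknown")
        enriched[f"{side}_criticality"] = machine.get("criticality", "unknown")
        enriched[f"{side}_user_role"] = user.get("role", "unknown")
    return enriched


if __name__ == "__main__":
    print(enrich({"src": "10.9.79.6", "dst": "192.168.148.193", "port": 80}))
```

With the roles attached, a traffic graph can be colored or grouped by machine role and user role instead of by raw IP address.
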
  16. Traffic Flow Analysis With Context
  17. Aesthetics Matter: • Black background • Blue or green colors • Glow (http://www.scifiinterfaces.com/)
  18. B O R I N G
  19. Sexier
  20. Dashboard Design Principles
      • Audience, audience, audience!
      • Comprehensive information (enough context)
      • Highlight important data
      • Use graphics when appropriate
      • Good choice of graphics and design
      • Aesthetically pleasing
      • Enough information to decide if action is necessary
      • No scrolling
      • Real-time vs. batch? (Refresh rates)
      • Clear organization
  21. SOC Dashboards
  22. Mostly Blank
  23. Dashboards For Discovery
      • Disappears too quickly
      • Analysts' focus is on their own screens
      • SOC dashboard just distracts
      • Detailed information not legible
      • Put the detailed dashboards on the analysts' screens!
  24. Use SOC Dashboard For Context
      • Provide analyst with context
        • “What else is going on in the environment right now?”
      • Bring Into Focus
        • Turn something benign into something interesting
      • Disprove
        • Turn something interesting into something benign
      Environment informs detection policies
  25. Show Comparisons: Current Measure, week prior
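
One way to read the comparison slide is to show each current measure next to the same measure from a week earlier. The sketch below assumes a daily series keyed by date; the sample values and the name `week_over_week` are made up for illustration.

```python
from datetime import date, timedelta

# Hypothetical daily counts of firewall blocks.
DAILY_BLOCKS = {
    date(2015, 2, 2): 1200,
    date(2015, 2, 9): 1800,
}


def week_over_week(series, day):
    """Return the current value, the value seven days earlier, and the change in percent."""
    current = series[day]
    prior = series.get(day - timedelta(days=7))
    if not prior:  # no data (or zero) a week earlier: no meaningful percentage
        return {"current": current, "week_prior": prior, "pct_change": None}
    pct = round(100.0 * (current - prior) / prior, 1)
    return {"current": current, "week_prior": prior, "pct_change": pct}


if __name__ == "__main__":
    print(week_over_week(DAILY_BLOCKS, date(2015, 2, 9)))
```
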
  26. What To Put on Screens
      • News feed summary (FS ISAC feeds, mailing lists, threat feeds)
      • Monitoring Twitter or IRC for certain activity / keywords
      • Volumes or metrics (e.g., #firewall blocks, #IDS alerts, #failed transactions)
      • Top N metrics:
        • Top 10 suspicious users
        • Top 10 servers connecting outbound
      Provide context to individual security alerts
      http://raffy.ch/blog/2015/01/15/dashboards-in-the-security-opartions-center-soc/
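
A Top N metric such as "top 10 servers connecting outbound" reduces to counting events per key and keeping the largest counts. The sketch below assumes events are dictionaries with a `server` field; the sample data and the `top_n` helper are illustrative only.

```python
from collections import Counter

# Hypothetical outbound connection events, one per connection.
OUTBOUND = [
    {"server": "srv-a"}, {"server": "srv-b"}, {"server": "srv-a"},
    {"server": "srv-c"}, {"server": "srv-a"}, {"server": "srv-b"},
]


def top_n(events, key, n=10):
    """Return the n most frequent values of `key` with their counts, largest first."""
    return Counter(event[key] for event in events).most_common(n)


if __name__ == "__main__":
    for server, count in top_n(OUTBOUND, "server", n=2):
        print(server, count)
```
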
  27. Data Discovery & Exploration
  28. Visualize Me Lots (>1TB) of Data
  29. Information Visualization Mantra (principle by Ben Shneiderman): Overview, Zoom / Filter, Details on Demand
      • summary / aggregation
      • data mining
      • signal detection (IDS, behavioral, etc.)
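
The three steps of the mantra map naturally onto three operations over the same event set: aggregate for the overview, restrict for zoom / filter, and return raw records for details on demand. The sketch below uses a handful of hypothetical events and illustrative function names; at the data volumes the talk has in mind, the aggregation step would run in the backend rather than in memory.

```python
from collections import Counter

# Hypothetical events; `hour` is the hour of day the event was seen.
EVENTS = [
    {"hour": 9, "src": "10.8.24.80", "dst": "192.168.148.193", "port": 53},
    {"hour": 9, "src": "10.8.50.85", "dst": "192.168.148.193", "port": 53},
    {"hour": 10, "src": "10.9.79.6", "dst": "192.168.148.193", "port": 80},
    {"hour": 10, "src": "10.9.79.6", "dst": "8.8.8.8", "port": 53},
    {"hour": 10, "src": "10.8.48.128", "dst": "192.168.148.193", "port": 80},
]


def overview(events):
    """Overview: aggregate to one count per hour."""
    return dict(Counter(event["hour"] for event in events))


def zoom(events, hour):
    """Zoom / filter: keep only the events from one hour."""
    return [event for event in events if event["hour"] == hour]


def details(events, hour, src):
    """Details on demand: raw records for one source within the selected hour."""
    return [event for event in zoom(events, hour) if event["src"] == src]


if __name__ == "__main__":
    print(overview(EVENTS))
    print(len(zoom(EVENTS, 10)))
    print(details(EVENTS, 10, "10.9.79.6"))
```
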
  30. Visualization Challenges
      • Access to data
      • Parsed data and data context
      • Data architecture for central data access and fast queries
      • Application of data mining (how?, what?, scalable, …)
      • Visualization tools that support
        • Complex visual types (parallel coordinates, treemaps, heat maps, link graphs)
        • Linked views
        • Data mining (clustering, …)
        • Collaboration, information sharing
        • Visual analytics workflow
  31. Big Data Lake
  32. The Big Data Lake
      • One central location to store all cyber security data
      • “Data collected only once and third party software leveraging it”
      • Scalability and interoperability
      • More than deploying an off-the-shelf product from a vendor
      • Data use influences both data formats and technologies to store the data
        • search, analytics, relationships, and distributed processing
        • correlation and statistical summarization
      • What to do with context? Enrich or join?
      • Hard problems:
        • Parsing: can you re-parse? Common naming scheme!
        • Data store capabilities (search, analytics, distributed processing, etc.)
        • Access to data: SQL (even in Hadoop context), how can products access the data?
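
The "enrich or join?" question can be made concrete with a small SQL example. The sketch below takes the "join" side: context lives in its own table and is attached at query time, so events are stored once and context can change without rewriting them. It uses an in-memory SQLite database purely as a stand-in for whatever SQL layer sits on the lake; the table names, columns, and rows are hypothetical.

```python
import sqlite3


def build_lake():
    """Create an in-memory database with an events table and a context table."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE events (src TEXT, dst TEXT, port INTEGER);
        CREATE TABLE assets (ip TEXT PRIMARY KEY, role TEXT);
        INSERT INTO events VALUES
            ('10.8.24.80', '192.168.148.193', 53),
            ('10.9.79.6', '192.168.148.193', 80);
        INSERT INTO assets VALUES
            ('192.168.148.193', 'dns-server'),
            ('10.9.79.6', 'workstation');
        """
    )
    return db


def events_with_context(db):
    """Join context at query time instead of enriching events at ingest."""
    cursor = db.execute(
        """
        SELECT e.src, COALESCE(s.role, 'unknown'),
               e.dst, COALESCE(d.role, 'unknown'),
               e.port
        FROM events AS e
        LEFT JOIN assets AS s ON s.ip = e.src
        LEFT JOIN assets AS d ON d.ip = e.dst
        ORDER BY e.src
        """
    )
    return cursor.fetchall()


if __name__ == "__main__":
    for row in events_with_context(build_lake()):
        print(row)
```

The trade-off is the usual one: joining keeps context current but costs a lookup per query, while enriching at ingest makes queries cheaper but freezes the context as it was when the event arrived.
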
  33. Federated Data Access
      Diagram labels: SIEM; dispatcher; SIEM connector; SIEM console; Prod A; AD / LDAP; HR; …; IDS; FW; Prod B; DBs; Data Lake; SNMP
      Caveats:
      • Dispatcher?
      • Standard access to dispatcher / products enabled
      • Data lake technology?
  34. Multiple Data Stores
      Diagram labels: raw logs; key-value; structured; real-time processing; (un)-structured data; context; SQL; storage; stats; index; queue; distributed processing; access; graph
      Caveat:
      • Need multiple types of data stores
  35. Technologies (Example)
      Diagram labels: raw logs; key-value (Cassandra); columnar (Parquet); real-time processing (Spark); (un)-structured data; context; SQL (Impala, SparkSQL); HDFS; aggregates; index (ES); queue (Kafka); distributed processing (Spark); access; graph (GraphX)
      Caveat:
      • No out-of-the-box solution available
  36. SIEM Integration - Log Management First
      Diagram labels: SIEM; columnar or search engine or log management; processing; SIEM connector; raw logs; SIEM console; SQL or search interface; processing; filtering; HDFS; e.g., Pig; parsing
  37. Simple SIEM Integration
      Diagram labels: raw, csv, json; Flume; log data; SQL (Impala, with SerDe); HDFS; SIEM connector; SIEM; SQL interface; Tableau, etc.; SIEM console
      Requirement:
      • SIEM connector to forward text-based data to Flume.
  38. SIEM Integration - Advanced
      Diagram labels: SIEM; columnar (Parquet); processing; syslog data; SQL (Impala, SparkSQL); HDFS; index (ES); queue (Kafka); access; other data sources; SIEM connector; raw logs; SIEM console; SQL and search interface; Tableau, Kibana, etc.
      Requires parsing and formatting in a SIEM-readable format (e.g., CEF)
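
To give a sense of what formatting into a SIEM-readable format involves, here is a minimal sketch that renders a parsed event as a CEF line: a pipe-delimited header (version, vendor, product, device version, event class ID, name, severity) followed by space-separated `key=value` extension fields. The vendor and product strings and the sample event are made up; the escaping covers only backslash, pipe, and equals sign, so a production formatter would need to follow the full CEF specification.

```python
def _escape_header(value):
    """Escape backslashes and pipes in a CEF header field."""
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def _escape_extension(value):
    """Escape backslashes and equals signs in a CEF extension value."""
    return str(value).replace("\\", "\\\\").replace("=", "\\=")


def to_cef(vendor, product, version, event_id, name, severity, **extension):
    """Render one event as a CEF version 0 line."""
    header = "|".join(
        _escape_header(field)
        for field in (vendor, product, version, event_id, name, severity)
    )
    ext = " ".join(f"{key}={_escape_extension(val)}" for key, val in extension.items())
    return f"CEF:0|{header}|{ext}"


if __name__ == "__main__":
    # Hypothetical event; src, dst, and dpt are standard CEF extension keys.
    print(to_cef("ExampleVendor", "ExampleLake", "1.0", "100", "Outbound|DNS", 3,
                 src="10.8.24.80", dst="8.8.8.8", dpt=53))
```
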
  39. What I am Working On
      • Big data backend
      • Own visualization engine (Web-based)
      • Visualization workflows
      Screenshot labels: Data Stores; Analytics; Forensics; Models; Admin; Anomalies; Decomposition; Data; Seasonal; Trend; Anomaly Details; “Hunt”; Explain; Visual Search
      Screenshot data: 10.9.79.109 --> 3.16.204.150; 10.8.24.80 --> 192.168.148.193; 10.8.50.85 --> 192.168.148.193; 10.8.48.128 --> 192.168.148.193; 10.9.79.6 --> 192.168.148.193; 10.9.79.6; 10.8.48.128; 80; 53; 8.8.8.8; 127.0.0.1
  40. BlackHat Workshop: Visual Analytics - Delivering Actionable Security Intelligence. August 1-6 2015, Las Vegas, USA. big data | analytics | visualization
  41. Security Visualization Community: Share, discuss, challenge, and learn about security visualization. http://secviz.org • List: secviz.org/mailinglist • Twitter: @secviz
  42. Further resources: raffael.marty@pixlcloud.com • http://slideshare.net/zrlram • http://secviz.org and @secviz
