date 2017-04-15
Apply Big Data and Data Lake for processing security data collections Date: 04.02.2017 Gregory Shlyuger, Ph.D. Enterprise ...
Agenda » Cyber Security – Modern Enterprise Threat » SIEM Conceptual Architecture » SIEM Implementation – What can go wr
Cyber Security – Modern Enterprise Threat 62% Increase in Cyber Security Breaches since 2013. More than 200 days Average...
In 2005 Mark Nicolett and Amrit Williams from Gartner introduced the term “Security Information and Event Management” (SIEM). SI...
SIEM Conceptual Architecture Vulnerability Scans User Information Asset Information Threat Intelligence Contextual Data O...
SIEM Components 1. Data Aggregation: Log management aggregates data from many sources, including network, security, server...
SIEM Components 3. Alerting: The automated analysis of correlated events and production of alerts, to notify recipients of ...
SIEM Components 5. Compliance: Applications can be employed to automate the gathering of compliance data, producing reports...
SIEM Components 6. Retention: Employing long-term storage of historical data to facilitate correlation of data over time, a...
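The correlation and alerting components listed above can be illustrated with a minimal sketch. It assumes a simple rule that is not taken from the slides: raise an alert when one source produces several failed logins inside a short time window. The event keys (`ts`, `src_ip`, `outcome`), the threshold, and the window length are all hypothetical.

```python
from collections import defaultdict, deque


def correlate_failed_logins(events, threshold=3, window_seconds=60):
    """Return one alert per burst of `threshold` failures from the same source.

    `events` is an iterable of dicts with hypothetical keys:
    'ts' (epoch seconds), 'src_ip', and 'outcome'.
    """
    recent = defaultdict(deque)  # src_ip -> timestamps of recent failures
    alerts = []
    for event in sorted(events, key=lambda e: e["ts"]):
        if event["outcome"] != "failure":
            continue
        window = recent[event["src_ip"]]
        window.append(event["ts"])
        # Drop failures that have fallen out of the time window.
        while window and event["ts"] - window[0] > window_seconds:
            window.popleft()
        if len(window) >= threshold:
            alerts.append(
                {"src_ip": event["src_ip"], "ts": event["ts"], "count": len(window)}
            )
            window.clear()  # avoid re-alerting on the same burst
    return alerts


sample_events = [
    {"ts": 0, "src_ip": "10.0.0.5", "outcome": "failure"},
    {"ts": 10, "src_ip": "10.0.0.5", "outcome": "failure"},
    {"ts": 15, "src_ip": "10.0.0.9", "outcome": "failure"},
    {"ts": 20, "src_ip": "10.0.0.5", "outcome": "failure"},
    {"ts": 500, "src_ip": "10.0.0.9", "outcome": "failure"},
]
alerts = correlate_failed_logins(sample_events)
```

A production SIEM expresses such rules in its own rule language and evaluates them on a stream; the sorted in-memory list here only stands in for that stream.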
SIEM – What Can Be Wrong With Implementation SIEM 1. Collect Everything 2. Poor Source Data Health 3. Over Complicate N...
Data Lake As SIEM Enhancement Data Lake IS NOT a Replacement for SIEM SIEM • Originated from needs to consolidate Security D...
SIEM / Data Lake Integration – Approach 1. Data source duplicates the stream to both a SIEM connector and the Data Lake.
Pros • Easy deployment through a change of source configuration. • Data in the data lake is independent of SIEM, no downst...
SIEM / Data Lake Integration – Approach 2. Data is sent to a SIEM connector, which splits the data to the SIEM and the Dat...
Pros • Data is already parsed when it gets to the data lake. • Data in SIEM can be linked to raw data in the data lake. ...
SIEM / Data Lake Integration – Approach 3. Data is first sent into the Data Lake and then forwarded via a SIEM connector t...
Pros • Filtering can be applied to reduce the load on the SIEM. • One stream of data consumes less bandwidth. • Data in th...
SIEM / Data Lake Integration – Approach 4. Data is picked up by the SIEM first and then forwarded on to the Data Lake.
Pros • All data from the SIEM (including alerts) can be forwarded to the Data Lake. • Parsed data is available in the Data...
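The four integration approaches differ mainly in where parsing and filtering happen and in what ends up in each store. The sketch below models them as plain functions over in-memory lists so the differences are easy to compare. The `parse` helper, the "host message" input format, and the `keep` and `alert_on` predicates are hypothetical stand-ins, not part of any SIEM product.

```python
def parse(raw):
    """Stand-in for a SIEM connector's parser (hypothetical 'host message' format)."""
    host, _, message = raw.partition(" ")
    return {"host": host, "message": message, "raw": raw}


def approach_1(raw_events):
    # The source duplicates the stream: the lake stores raw data, the SIEM parses its copy.
    lake = list(raw_events)
    siem = [parse(r) for r in raw_events]
    return siem, lake


def approach_2(raw_events):
    # The SIEM connector parses once and splits the parsed data to both destinations.
    parsed = [parse(r) for r in raw_events]
    return list(parsed), list(parsed)


def approach_3(raw_events, keep=lambda raw: True):
    # Data lands in the lake first; only events passing the filter reach the SIEM.
    lake = list(raw_events)
    siem = [parse(r) for r in lake if keep(r)]
    return siem, lake


def approach_4(raw_events, alert_on=lambda event: False):
    # The SIEM ingests first, then forwards parsed events and its alerts to the lake.
    siem = [parse(r) for r in raw_events]
    alerts = [{"alert": True, **event} for event in siem if alert_on(event)]
    lake = siem + alerts
    return siem, lake


raw = ["fw01 DENY tcp 10.0.0.5", "web01 GET /index.html"]
siem_3, lake_3 = approach_3(raw, keep=lambda r: "DENY" in r)
siem_4, lake_4 = approach_4(raw, alert_on=lambda e: "DENY" in e["message"])
```

With the sample input, approach 3 leaves both raw events in the lake while the SIEM receives only the filtered one, and approach 4 adds the generated alert to the lake alongside the parsed events.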
Security Data Analytics Platform Apache Metron – Next SIEM Evolution 2013 - Project Started By Cisco. 2015 - Accepted In...
Use Case – Adding Squid Proxy Log To Metron Platform Implementation What Is Squid? Squid is a caching proxy for the Web s...
Use Case – Adding Squid Proxy Log To Metron Platform Implementation 1. Proxy event needs to be enriched so that th...
Implementation Use Case on Apache Metron
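Before a Squid proxy event can be enriched, its log line has to be split into named fields. The sketch below does this with a Python regular expression over Squid's default access-log layout. It is a stand-in for illustration only and is not Metron's parser; the field names and the sample line are assumptions.

```python
import re

# Squid's default ("native") access.log layout:
# time elapsed client action/status bytes method URL user hierarchy/peer content-type
SQUID_LINE = re.compile(
    r"^(?P<timestamp>\d+\.\d+)\s+(?P<elapsed>\d+)\s+(?P<ip_src_addr>\S+)\s+"
    r"(?P<action>[A-Z_]+)/(?P<code>\d{3})\s+(?P<bytes>\d+)\s+(?P<method>\S+)\s+"
    r"(?P<url>\S+)\s+(?P<user>\S+)\s+(?P<hierarchy>[A-Z_]+)/(?P<peer>\S+)\s+"
    r"(?P<content_type>\S+)$"
)


def parse_squid_line(line):
    """Return a dict of fields for one access-log line, or None if it does not match."""
    match = SQUID_LINE.match(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # Convert the numeric fields so later stages can compare and aggregate them.
    event["timestamp"] = float(event["timestamp"])
    for key in ("elapsed", "code", "bytes"):
        event[key] = int(event[key])
    return event


# Hypothetical sample line in the native format.
sample_line = (
    "1461576382.642    161 127.0.0.1 TCP_MISS/200 103701 GET "
    "http://www.example.com/ - DIRECT/93.184.216.34 text/html"
)
squid_event = parse_squid_line(sample_line)
```

Returning `None` for non-matching lines lets the caller decide whether to drop, count, or quarantine unparseable input.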
Real-Time Enrichment Telemetry Events - BEFORE When you make an outbound http connection to https://partner.mountsina...
Real-Time Enrichment Telemetry Events - AFTER Magic that Metron will do - telemetry event as it is streamed through t...
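The before/after contrast can be sketched as a function that takes a parsed telemetry event and returns a copy with extra context attached. The lookup tables, the IP addresses, and the dotted key names below are hypothetical; a real deployment would query enrichment stores rather than in-memory dictionaries.

```python
from urllib.parse import urlparse

# Hypothetical reference data used only for this illustration.
GEO_BY_IP = {"93.184.216.34": {"country": "US"}}
ASSET_BY_IP = {"10.0.0.5": {"owner": "finance", "hostname": "fin-ws-17"}}


def enrich(event):
    """Return a copy of `event` with domain, geo, and asset context added."""
    enriched = dict(event)  # leave the original event untouched
    enriched["domain"] = urlparse(event["url"]).hostname
    geo = GEO_BY_IP.get(event.get("ip_dst_addr"), {})
    for key, value in geo.items():
        enriched[f"enrichments.geo.ip_dst_addr.{key}"] = value
    asset = ASSET_BY_IP.get(event.get("ip_src_addr"), {})
    for key, value in asset.items():
        enriched[f"enrichments.asset.ip_src_addr.{key}"] = value
    return enriched


before = {
    "ip_src_addr": "10.0.0.5",
    "ip_dst_addr": "93.184.216.34",
    "url": "https://www.example.com/login",
}
after = enrich(before)
```

Events whose addresses are missing from the lookup tables pass through with only the `domain` field added, so enrichment never blocks the stream.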
Thank You
Apply big data and data lake for processing security data collections

Security information and event management (SIEM) data and non-event-based raw data (NERD) are activity feeds for modern cyber-domain network architectures. Each type of cyber domain, such as Software Defined Networks, Virtualization, Service Orchestration, or Cloud/Elastic computing, carries over essential characteristics, although each domain may have slightly different properties. Enriching NERD and SIEM models with raw activity event data allows the raw sensor data flowing through the system to be transformed into enriched data elements that are both descriptive and predictive in nature. This paper details some scenarios for evidence collection, parsing, and enrichment, and the implementation of a k-Nearest Neighbor (kNN) classifier as a proof of concept (POC) for the Apache Metron cyber security framework. For anomaly detection on Hadoop, the use of a Data Lake, data science, and machine learning algorithms indicates that this is a viable approach to collecting and analyzing sensor data and to analytical grid processing in a complex and ambiguous environment.
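As a minimal sketch of the kNN idea mentioned in the abstract, the function below labels a new observation by majority vote among its k nearest training points, using Euclidean distance. It is not the proof-of-concept implementation; the two features (for example, scaled transfer volume and request rate), the labels, and the training points are hypothetical.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Return the majority label among the k training points nearest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical, already-scaled training data: (feature_1, feature_2) -> label.
training_data = [
    ((1.0, 2.0), "normal"),
    ((1.2, 1.8), "normal"),
    ((0.9, 2.2), "normal"),
    ((8.0, 9.0), "anomaly"),
    ((7.5, 9.5), "anomaly"),
    ((8.2, 8.8), "anomaly"),
]
label_far = knn_classify(training_data, (7.9, 9.1))
label_near = knn_classify(training_data, (1.1, 2.1))
```

Because kNN relies on raw distances, features measured on very different scales should be normalized first; otherwise the largest-valued feature dominates the vote.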

Published in: Technology
