Apply Big Data and Data Lake for
processing security data collections
Date: 04.02.2017
Gregory Shlyuger, Ph.D.
Enterprise Technology Architect, Mount Sinai PPS
SPIE Presentation (2017)
Agenda
» Cyber Security – Modern Enterprise Threat
» SIEM Conceptual Architecture
» SIEM Implementation – What can go wrong?
» Data Lake / SIEM Integrations
» Apache Metron, Security Data Analytics Platform – Next SIEM Evolution
» Use Case - Adding Squid Proxy Logs to Metron Platform
Cyber Security – Modern Enterprise Threat
62%
Increase in Cyber Security
Breaches since 2013.
More than 200 days
Average time an advanced
security breach goes unnoticed.
More than 3 Trillion
Total cost of cyber Security
breaches.
1 in 3
Security professionals are not
familiar with cyber security
threats.
In 2005, Mark Nicolett and Amrit
Williams of Gartner introduced
the term “Security Information and Event
Management” (SIEM).
SIEM = SIM + SEM
(Security Information Management + Security Event Management)
SIEM Conceptual Architecture
System Inputs
Event Data: Operating Systems, Applications, Devices, Databases
Contextual Data: Vulnerability Scans, User Information, Asset Information, Threat Intelligence
SIEM
Data Connection, Normalization, Aggregation, Correlation, Logic Rules
System Outputs
Analysis, Reports, Real-Time Monitoring
SIEM Components
1. Data Aggregation:
Log management aggregates data from many sources, including network, security, servers,
databases, applications.
2. Correlation:
Looks for common attributes and links events together into meaningful bundles. This
provides the ability to perform a variety of correlation techniques to integrate different
sources, in order to turn data into useful information.
3. Alerting
The automated analysis of correlated events and production of alerts, to notify recipients of
immediate issues. Alerts can be sent to a dashboard or via third-party channels such
as email.
4. Dashboards
Tools can take event data and turn it into informational charts to assist in seeing patterns, or
identifying activity that is not forming a standard pattern.
5. Compliance
Applications can be employed to automate the gathering of compliance data, producing
reports that adapt to existing security, governance and auditing processes.
6. Retention
Employing long-term storage of historical data to facilitate correlation of data over time, and
to provide the retention necessary for compliance requirements. Long-term log data
retention is critical in forensic investigations, as a network breach is unlikely to be
discovered at the time it occurs.
7. Forensic analysis
The ability to search across logs on different nodes and time periods based on specific
criteria. This mitigates having to aggregate log information in your head or having to search
through thousands and thousands of logs.
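The correlation component above can be illustrated with a small sketch. It is not taken from any SIEM product: the `Event` fields, the 60-second window, and the "at least two distinct sources" rule are all assumptions chosen for illustration. It links events that share a common attribute (here the source IP) into bundles.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since the Unix epoch
    source: str       # which log produced the event, e.g. "firewall"
    src_ip: str       # the common attribute used to link events
    action: str


def _is_cross_source(bundle, min_sources):
    """True when the bundle draws on enough distinct log sources."""
    return len({event.source for event in bundle}) >= min_sources


def correlate(events, window_seconds=60.0, min_sources=2):
    """Bundle events that share src_ip and start within one time window."""
    by_ip = defaultdict(list)
    for event in events:
        by_ip[event.src_ip].append(event)

    bundles = []
    for ip, group in by_ip.items():
        group.sort(key=lambda e: e.timestamp)
        current = [group[0]]
        for event in group[1:]:
            if event.timestamp - current[0].timestamp <= window_seconds:
                current.append(event)
                continue
            if _is_cross_source(current, min_sources):
                bundles.append((ip, current))
            current = [event]
        if _is_cross_source(current, min_sources):
            bundles.append((ip, current))
    return bundles


events = [
    Event(100.0, "firewall", "10.0.0.5", "deny"),
    Event(130.0, "proxy", "10.0.0.5", "TCP_MISS"),
    Event(500.0, "firewall", "10.0.0.9", "allow"),
]
bundles = correlate(events)
```

Here the two events from 10.0.0.5 arrive 30 seconds apart from different sources, so they form one bundle; the lone event from 10.0.0.9 does not.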
SIEM Implementation – What Can Go Wrong?
1. Collect Everything – Collect with a specific plan. Grow your capabilities
methodically and according to your plan.
2. Poor Source Data Health – Ensure signatures are up to date and
configured the way they should be, and that timestamps are correct.
3. Overcomplicated Network Model – Start with a simple, high-level
model. Don’t start with thousands of zones. What are the business requirements?
4. Too Much Focus on Top 10 Events – When you are looking for a bad guy
bent on destruction, or trying to find attacks, you’ll probably never
see them in top 10 lists. The bottom 10 list is more interesting.
5. Lost in Compliance – Don’t rely on off-the-shelf compliance. An off-the-shelf
solution will most likely require customization.
6. Log Search Tool – Don’t chase events in logs; build or use automated
monitoring for incidents.
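Point 4 above (the bottom 10 list) can be sketched in a few lines. The event type names below are invented for illustration; the idea is only to rank event types by ascending frequency instead of descending.

```python
from collections import Counter


def bottom_n(event_types, n=10):
    """Return the n least frequent event types, rarest first."""
    counts = Counter(event_types)
    # Sort by count, then by name so that ties come out in a stable order.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))[:n]


# Hypothetical event stream: two noisy event types and one rare one.
observed = ["login_ok"] * 500 + ["dns_query"] * 300 + ["new_admin_account"]
rarest = bottom_n(observed, n=2)
```

A top 10 report over the same data would be dominated by `login_ok` and `dns_query`; the single `new_admin_account` event only surfaces at the bottom of the ranking.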
Data Lake As SIEM Enhancement
Data Lake IS NOT a Replacement for SIEM
SIEM
• Originated from the need to consolidate security data.
• SIEM is incapable of scaling to big-data volumes of IT data.
Data Lake
• Central location where all security data is collected and stored.
• Runs on commodity hardware.
• Allows Machine Learning and MapReduce to be applied effectively.
SIEM / Data Lake Integration – Approach 1.
Data source duplicates the
stream to both a SIEM
connector and the Data Lake.
Pros
• Easy deployment through a change of source configuration.
• Data in the data lake is independent of the SIEM; no downstream implications.
• Raw data is preserved.
• Fairly nonintrusive for the infrastructure: only source configuration needs to be
changed.
Cons
• Data source needs a way to split data to two destinations.
• Parsing has to be done separately in the data lake.
• Data in the SIEM cannot be linked to its raw data in the data lake.
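A minimal sketch of Approach 1, assuming the two destinations can be modeled as plain callables (`siem_sink` and `lake_sink` are hypothetical names, not part of any product). The source sends each raw record, unparsed, to both.

```python
def duplicate_stream(records, siem_sink, lake_sink):
    """Send every raw record, unchanged, to both destinations."""
    for record in records:
        siem_sink(record)
        lake_sink(record)


# In-memory lists stand in for the SIEM connector and the data lake.
siem_received, lake_received = [], []
duplicate_stream(
    ["raw line 1", "raw line 2"],
    siem_received.append,
    lake_received.append,
)
```

Both destinations hold identical raw records, which is why raw data is preserved but parsing has to happen again on the data lake side.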
SIEM / Data Lake Integration – Approach 2.
Data is sent to a SIEM
connector, which splits the
data to the SIEM and the
Data Lake.
Pros
• Data is already parsed when it gets to the data lake.
• Data in the SIEM can be linked to raw data in the data lake.
Cons
• Connector needs a way to split data to two destinations.
• A connector is needed for every data source.
• SIEM and data lake get the same data.
• To keep raw data, the connector needs a way to forward data in raw format.
• Missing or wrong parsers result in "lost" data.
SIEM / Data Lake Integration – Approach 3.
Data is first sent into
the Data Lake and
then forwarded via a
SIEM connector to the
SIEM.
Pros
• Filtering can be applied to reduce the load on the SIEM.
• One stream of data consumes less bandwidth.
• Data in the Data Lake can be parsed at any time, and parsing can be updated.
Cons
• SIEM connector needs to support the data formats it reads from the Data Lake.
• Data in the Data Lake needs to be parsed separately.
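The filtering advantage of Approach 3 can be sketched as follows. The record fields and the "forward only HTTP errors" predicate are assumptions made up for this example; the point is that the data lake keeps everything while the SIEM receives a reduced stream.

```python
def forward_from_lake(lake_records, keep, siem_sink):
    """Forward only records accepted by `keep`; the lake retains everything."""
    forwarded = 0
    for record in lake_records:
        if keep(record):
            siem_sink(record)
            forwarded += 1
    return forwarded


# Hypothetical records already stored in the data lake.
lake = [
    {"source": "squid", "status": 200},
    {"source": "squid", "status": 403},
    {"source": "firewall", "status": 0},
]
siem_inbox = []
count = forward_from_lake(lake, lambda r: r["status"] >= 400, siem_inbox.append)
```

Only one of the three records reaches the SIEM, while all three stay available in the lake for later parsing or reprocessing.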
SIEM / Data Lake Integration – Approach 4.
Data is picked up by
the SIEM first and then
forwarded on to the
Data Lake.
Pros
• All data from the SIEM (including alerts) can be forwarded to the Data Lake.
• Parsed data is available in the Data Lake.
• Existing environment can be upgraded easily without much change to the
existing setup.
Cons
• SIEM needs a way to export the data to a Data Lake.
• SIEM stays the bottleneck for performance.
• A connector is needed for every data source.
• SIEM and Data Lake get the same data. No pre-filtering for the SIEM.
• Raw data is hard to preserve.
• Missing or wrong parsers result in "lost" data.
Security Data Analytics Platform
Apache Metron – Next SIEM Evolution
2013 - Project started by Cisco.
2015 - Accepted into Apache incubation.
2016 - Apache Metron v0.1 was released.
2017 - Apache Metron v0.3.1 was released.
Use Case – Adding Squid Proxy Logs to Metron Platform
What Is Squid?
Squid is a caching proxy for the Web supporting HTTP, HTTPS,
FTP, and more.
It reduces bandwidth and improves response times by caching and
reusing frequently-requested web pages.
Business Requirements:
Proxy events from Squid logs need to be added in real time to the existing
real-time security monitoring.
1. Each proxy event needs to be enriched so that its domain names are
enriched with the IP.
2. In real time, the IP within the proxy event must be checked against
threat intel feeds.
3. If there is a threat intel hit, an alert needs to be raised.
4. The system should provide the ability to configure rules and
prioritize/score different types of alerts.
5. The end user must be able to see the new telemetry events, completely
enriched, from the new data source.
6. The user should be able to see the alerts prioritized, highest priority
first, with the corresponding data.
7. It must be possible to deploy a machine learning model that derives
additional insights from the stream.
*All of these requirements will need to be implemented without writing any new code.
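Requirements 3, 4 and 6 amount to rule-based scoring followed by sorting. The sketch below only illustrates that logic in Python; it is not how Metron is configured (the slides state that no new code is required), and the rule names, scores, and event fields are hypothetical.

```python
# Hypothetical rules: (name, predicate over an enriched event, score).
RULES = [
    ("threat_intel_hit", lambda e: e.get("threat_intel_hit", False), 10),
    ("denied_request", lambda e: e.get("result_code") == "TCP_DENIED", 3),
]


def triage(event):
    """Return an alert dict if any rule matches the event, else None."""
    matched = [(name, score) for name, predicate, score in RULES if predicate(event)]
    if not matched:
        return None
    return {
        "event": event,
        "rules": [name for name, _ in matched],
        "score": max(score for _, score in matched),
    }


def prioritize(events):
    """Alerts for all matching events, highest score first."""
    alerts = [alert for alert in map(triage, events) if alert is not None]
    return sorted(alerts, key=lambda alert: alert["score"], reverse=True)


events = [
    {"url": "http://a.example", "result_code": "TCP_DENIED"},
    {"url": "http://b.example", "threat_intel_hit": True},
    {"url": "http://c.example", "result_code": "TCP_MISS"},
]
alerts = prioritize(events)
```

The event with a threat intel hit is listed first, the denied request second, and the ordinary cache miss raises no alert at all.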
Implementation Use Case on Apache Metron
Real-Time Enrichment of Telemetry Events – BEFORE
When you make an outbound HTTP connection to
https://partner.mountsinai.org from a given host, the following entry
is added to a Squid file called access.log.
4861576382.3812 161 387.8.445.068 TCP_MISS/200 107501 GET
https://partner.mountsinai.org – DIRECT/199.27.74.04 text/html
Callouts on the entry: Unix epoch time; IP of the host where the connection was
made; the domain name of the outbound connection.
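The entry above follows Squid's native access.log layout (time, elapsed, client address, result code/status, bytes, method, URL, user, hierarchy code/peer, content type). A minimal parsing sketch under that assumption is shown below; the `SquidEntry` type and its field names are invented here, and the sample uses a plain hyphen for the user field, as Squid writes it, where the slide shows a typeset dash.

```python
from typing import NamedTuple


class SquidEntry(NamedTuple):
    timestamp: float     # Unix epoch seconds
    elapsed_ms: int
    client_ip: str
    result_code: str     # e.g. "TCP_MISS"
    http_status: int
    size_bytes: int
    method: str
    url: str
    user: str            # "-" when no user is recorded
    hierarchy_code: str  # e.g. "DIRECT"
    peer_host: str
    content_type: str


def parse_squid_line(line: str) -> SquidEntry:
    """Split one native-format access.log line into typed fields."""
    fields = line.split()
    if len(fields) != 10:
        raise ValueError(f"expected 10 fields, got {len(fields)}")
    ts, elapsed, client, code_status, size, method, url, user, hier, ctype = fields
    result_code, _, status = code_status.partition("/")
    hierarchy_code, _, peer_host = hier.partition("/")
    return SquidEntry(
        float(ts), int(elapsed), client, result_code, int(status),
        int(size), method, url, user, hierarchy_code, peer_host, ctype,
    )


# The (anonymized) sample entry from the slide, joined onto one line.
sample = (
    "4861576382.3812 161 387.8.445.068 TCP_MISS/200 107501 GET "
    "https://partner.mountsinai.org - DIRECT/199.27.74.04 text/html"
)
entry = parse_squid_line(sample)
```

The parser keeps addresses as strings and does not validate them, since the addresses in the slide's sample are anonymized and are not well-formed IPs.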
Real-Time Enrichment of Telemetry Events – AFTER
The magic Metron performs: each telemetry event is processed in real time as it
streams through the platform.
Convert from Unix epoch time to a timestamp.
4861576382.3812 161 387.8.445.068 TCP_MISS/200 107501 GET
https://partner.mountsinai.org – DIRECT/199.27.74.04 text/html
IP of the host where the connection was made: use Metron’s asset enrichment.
Use Metron’s Threat Intel Services to cross-reference the IP with the threat intel feed.
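The three enrichment steps named on this slide can be sketched as plain dictionary lookups. This is an illustration only and does not use Metron's services: the `ASSETS` and `THREAT_INTEL_IPS` tables, the event field names, and the choice to check the peer (destination) IP against the feed are all assumptions.

```python
from datetime import datetime, timezone

# Hypothetical lookup tables standing in for asset and threat intel services.
ASSETS = {"10.0.0.5": {"hostname": "workstation-17", "owner": "finance"}}
THREAT_INTEL_IPS = {"203.0.113.7"}


def enrich(event):
    """Return a copy of the event with derived and looked-up fields added."""
    enriched = dict(event)
    # Step 1: convert Unix epoch seconds to an ISO 8601 UTC timestamp.
    enriched["time_utc"] = datetime.fromtimestamp(
        event["timestamp"], tz=timezone.utc
    ).isoformat()
    # Step 2: asset enrichment for the host that made the connection.
    enriched["asset"] = ASSETS.get(event["client_ip"])
    # Step 3: cross-reference the IP with the threat intel feed.
    enriched["threat_intel_hit"] = event["peer_ip"] in THREAT_INTEL_IPS
    return enriched


event = {"timestamp": 0.0, "client_ip": "10.0.0.5", "peer_ip": "203.0.113.7"}
result = enrich(event)
```

The original event is left untouched; the enriched copy carries the readable timestamp, the asset record, and a flag that a downstream alerting rule could act on.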
Thank You

AI 101: An Introduction to the Basics and Impact of Artificial Intelligence
 
UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6UiPath Test Automation using UiPath Test Suite series, part 6
UiPath Test Automation using UiPath Test Suite series, part 6
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptxOcean lotus Threat actors project by John Sitima 2024 (1).pptx
Ocean lotus Threat actors project by John Sitima 2024 (1).pptx
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
“Building and Scaling AI Applications with the Nx AI Manager,” a Presentation...
 
Mariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceXMariano G Tinti - Decoding SpaceX
Mariano G Tinti - Decoding SpaceX
 
Introduction of Cybersecurity with OSS at Code Europe 2024
Introduction of Cybersecurity with OSS  at Code Europe 2024Introduction of Cybersecurity with OSS  at Code Europe 2024
Introduction of Cybersecurity with OSS at Code Europe 2024
 
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUHCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAU
 

Apply big data and data lake for processing security data collections

  • 1. Apply Big Data and Data Lake for processing security data collections Date: 04.02.2017 Gregory Shlyuger, Ph.D. Enterprise Technology Architect, Mount Sinai PPS SPIE Presentation (2017)
  • 2. Agenda » Cyber Security – Modern Enterprise Threat » SIEM Conceptual Architecture » SIEM Implementation – What can go wrong? » Data Lake / SIEM Integrations » Apache Metron, Security Data Analytics Platform – Next SIEM Evolution » Use Case – Adding Squid Proxy Logs to Metron Platform
  • 3. Cyber Security – Modern Enterprise Threat
      • 62%: increase in cyber security breaches since 2013.
      • More than 200 days: average time an advanced security breach goes unnoticed.
      • More than 3 Trillion: total cost of cyber security breaches.
      • 1 in 3: security professionals who are not familiar with cyber security threats.
  • 4. In 2005, Mark Nicolett and Amrit Williams of Gartner introduced the term “Security Information and Event Management” (SIEM). SIEM = SIM (security information management) + SEM (security event management)
  • 5. SIEM Conceptual Architecture
      • System Inputs, Event Data: Operating Systems, Applications, Devices, Databases
      • System Inputs, Contextual Data: Vulnerability Scans, User Information, Asset Information, Threat Intelligence
      • SIEM: Data Connection, Normalization, Aggregation, Correlation, Logic Rules
      • System Outputs: Analysis, Reports, Real Time Monitoring
  • 6. SIEM Components 1. Data Aggregation: Log management aggregates data from many sources, including network, security, servers, databases, and applications. 2. Correlation: Looks for common attributes and links events together into meaningful bundles. Applying a variety of correlation techniques across different sources turns raw data into useful information.
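To make "links events together into meaningful bundles" concrete, here is a minimal sketch of one possible correlation technique: grouping events that share an attribute and occur close together in time. The field names (`ts`, `src_ip`, `source`), the 60-second window, and the sample events are assumptions made for illustration; they do not come from the slides or from any particular SIEM product.

```python
from collections import defaultdict

def correlate(events, key="src_ip", window=60):
    """Bundle events that share `key` and start within `window` seconds
    of the first event in the bundle."""
    by_key = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        by_key[event[key]].append(event)

    bundles = []
    for group in by_key.values():
        current = [group[0]]
        for event in group[1:]:
            if event["ts"] - current[0]["ts"] <= window:
                current.append(event)      # same burst of activity
            else:
                bundles.append(current)    # close the bundle, start a new one
                current = [event]
        bundles.append(current)
    return bundles

# Hypothetical, already-normalized events from different sources.
events = [
    {"ts": 0, "src_ip": "10.0.0.5", "source": "firewall", "msg": "port scan"},
    {"ts": 20, "src_ip": "10.0.0.5", "source": "server", "msg": "failed login"},
    {"ts": 500, "src_ip": "10.0.0.5", "source": "database", "msg": "query"},
    {"ts": 30, "src_ip": "10.0.0.9", "source": "application", "msg": "login"},
]
bundles = correlate(events)
for bundle in bundles:
    print([e["source"] for e in bundle])
```

The firewall and server events from the same address land in one bundle because they are 20 seconds apart; the later database event starts a new bundle.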
  • 7. SIEM Components 3. Alerting: The automated analysis of correlated events and the production of alerts to notify recipients of immediate issues. Alerts can be sent to a dashboard or via third-party channels such as email. 4. Dashboards: Tools can take event data and turn it into informational charts that help in seeing patterns or identifying activity that does not form a standard pattern.
  • 8. SIEM Components 5. Compliance: Applications can be employed to automate the gathering of compliance data, producing reports that adapt to existing security, governance and auditing processes.
  • 9. SIEM Components 6. Retention: Employing long-term storage of historical data to facilitate correlation of data over time and to provide the retention necessary for compliance requirements. Long-term log data retention is critical in forensic investigations, because a network breach is unlikely to be discovered at the time it occurs. 7. Forensic analysis: The ability to search across logs on different nodes and time periods based on specific criteria. This removes the need to aggregate log information in your head or to search through thousands and thousands of logs.
  • 10. SIEM – What Can Go Wrong With Implementation
      1. Collect Everything: Collect with a specific plan. Grow your capabilities methodically and according to that plan.
      2. Poor Source Data Health: Ensure signatures are up to date and configured the way they should be, and that timestamps are correct.
      3. Overcomplicated Network Model: Start with a simple, high-level model. Don’t start with thousands of zones. What are the business requirements?
      4. Too Much Focus on Top 10 Events: When hunting for an attacker, the attacks you are trying to find will probably never appear in top 10 lists. The bottom 10 list is more interesting.
      5. Lost in Compliance: Don’t rely on off-the-shelf compliance content. An off-the-shelf solution will most likely require customization.
      6. Using a SIEM as a Log Search Tool: Don’t chase events in logs; build or use automated monitoring for incidents.
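The "bottom 10" idea from item 4 can be sketched in a few lines: rank event types by ascending frequency instead of descending. The event type names and counts below are invented for illustration only.

```python
from collections import Counter

def bottom_n(event_types, n=10):
    """Return the n least frequent event types as (type, count) pairs.
    Ties are broken alphabetically so the result is deterministic."""
    counts = Counter(event_types)
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))[:n]

# Hypothetical event stream: routine noise plus a few rare events.
sample = (
    ["login_ok"] * 500
    + ["dns_query"] * 300
    + ["new_admin_account"]
    + ["log_cleared"] * 2
)
rare = bottom_n(sample, n=2)
print(rare)
```

A top 10 list built from the same data would be dominated by `login_ok` and `dns_query`; the rare entries are the ones an analyst may want to look at first.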
  • 11. Data Lake As SIEM Enhancement. A Data Lake IS NOT a replacement for SIEM.
      • SIEM: Originated from the need to consolidate security data.
      • SIEM: Incapable of scaling to Big Data volumes of IT data.
      • Data Lake: Central location where all security data is collected and stored.
      • Data Lake: Runs on commodity hardware.
      • Data Lake: Allows machine learning and MapReduce to be applied effectively.
  • 12. SIEM / Data Lake Integration – Approach 1. Data source duplicates the stream to both a SIEM connector and the Data Lake.
  • 13. SIEM / Data Lake Integration – Approach 1.
      Pros
      • Easy deployment through a change of source configuration.
      • Data in the data lake is independent of SIEM, no downstream implications.
      • Raw data is preserved.
      • Fairly nonintrusive for the infrastructure. Only source configuration needs to be changed.
      Cons
      • Data source needs a way to split data to two destinations.
      • Parsing has to be done separately in the data lake.
      • Data in SIEM cannot be linked to its raw data in the data lake.
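Approach 1 amounts to a "tee" at the source. The sketch below shows that shape under simplifying assumptions: the two destinations are modelled as plain callables (here, `list.append`), standing in for whatever transport a real source would use.

```python
def duplicate_stream(lines, siem_sink, lake_sink):
    """Send every raw line, unchanged, to both destinations."""
    for line in lines:
        siem_sink(line)   # the SIEM connector parses it on its own side
        lake_sink(line)   # the data lake keeps the raw line

# Hypothetical in-memory sinks standing in for real destinations.
siem_received, lake_received = [], []
duplicate_stream(
    ["raw event A", "raw event B"],
    siem_received.append,
    lake_received.append,
)
print(siem_received == lake_received)
```

Because each destination receives the raw line independently, nothing ties a parsed SIEM record back to its raw copy in the lake, which is the third drawback listed above.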
  • 14. SIEM / Data Lake Integration – Approach 2. Data is sent to a SIEM connector, which splits the data to the SIEM and the Data Lake.
  • 15. SIEM / Data Lake Integration – Approach 2.
      Pros
      • Data is already parsed when it gets to the data lake.
      • Data in SIEM can be linked to raw data in the data lake.
      Cons
      • Connector needs a way to split data to two destinations.
      • Need a connector for all data sources.
      • SIEM and data lake get the same data.
      • To keep raw data, connector needs a way to forward data in raw format.
      • Missing or wrong parsers result in "lost" data.
  • 16. SIEM / Data Lake Integration – Approach 3. Data is first sent into the Data Lake and then forwarded via a SIEM connector to the SIEM.
  • 17. SIEM / Data Lake Integration – Approach 3.
      Pros
      • Filtering can be applied to reduce the load on the SIEM.
      • One stream of data consumes less bandwidth.
      • Data in the Data Lake can be parsed at any time, and parsing can be updated.
      Cons
      • SIEM connector needs to support data formats when reading from the Data Lake.
      • Data in the Data Lake needs to be parsed separately.
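The filtering advantage of Approach 3 can be illustrated with a small sketch: every raw line is stored in the lake first, and only lines matching a predicate are forwarded to the SIEM. The predicate (forward only denied requests) and the sample lines are assumptions chosen for illustration.

```python
def ingest(raw_lines, lake, siem, forward_if):
    """Store every raw line in the lake, then forward a filtered
    subset to the SIEM."""
    for line in raw_lines:
        lake.append(line)        # raw data is always preserved
        if forward_if(line):
            siem.append(line)    # only the interesting subset loads the SIEM

# Hypothetical proxy log lines and filter rule.
lines = [
    "TCP_MISS/200 GET http://a.example/",
    "TCP_DENIED/403 GET http://b.example/",
    "TCP_HIT/200 GET http://a.example/logo.png",
]
lake, siem = [], []
ingest(lines, lake, siem, forward_if=lambda line: "TCP_DENIED" in line)
print(len(lake), len(siem))
```

Since the lake holds the unfiltered raw data, the filter rule can be changed later and older data re-parsed without asking the sources to resend anything.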
  • 18. SIEM / Data Lake Integration – Approach 4. Data is picked up by the SIEM first and then forwarded on to the Data Lake.
  • 19. SIEM / Data Lake Integration – Approach 4.
      Pros
      • All data from the SIEM (including alerts) can be forwarded to the Data Lake.
      • Parsed data is available in the Data Lake.
      • Existing environment can be upgraded easily without much change to the existing setup.
      Cons
      • SIEM needs a way to export the data to a Data Lake.
      • SIEM stays the bottleneck for performance.
      • Needs a connector for all data sources.
      • SIEM and Data Lake get the same data. No pre-filtering for SIEM.
      • Raw data is hard to preserve.
      • Missing or wrong parsers result in "lost" data.
  • 20. Apache Metron – Next SIEM Evolution (Security Data Analytics Platform). 2013: Project started by Cisco. 2015: Accepted into Apache incubation. 2016: Apache Metron v0.1 was released. 2017: Apache Metron v0.3.1 was released.
  • 21. Use Case – Adding Squid Proxy Log To Metron Platform. What Is Squid? Squid is a caching proxy for the Web supporting HTTP, HTTPS, FTP, and more. It reduces bandwidth and improves response times by caching and reusing frequently requested web pages. Business Requirements: Proxy events from Squid logs need to be added in real time to the existing real-time security monitoring.
  • 22. Use Case – Adding Squid Proxy Log To Metron Platform: Implementation
      1. Each proxy event needs to be enriched so that domain names are enriched with the IP.
      2. In real time, the IP within the proxy event must be checked against threat intel feeds.
      3. If there is a threat intel hit, an alert needs to be raised.
      4. The system should provide the ability to configure rules and prioritize/score different types of alerts.
      5. The end user must be able to see the new telemetry events, completely enriched, from the new data source.
      6. The user should be able to see the alerts ordered by priority, highest first, with the corresponding data.
      7. It must be possible to deploy a machine learning model that derives additional insights from the stream.
      *All of these requirements will need to be implemented without writing any new code.
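The slides state that Metron meets these requirements through configuration rather than new code. Purely to illustrate the logic behind requirements 1 to 4 and 6, here is a plain-Python sketch; it is not Metron's API or configuration format. The lookup tables, field names, and score value are hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
# Hypothetical stand-ins for a DNS/asset lookup and a threat intel feed.
DOMAIN_TO_IP = {"bad.example": "203.0.113.7", "good.example": "198.51.100.10"}
THREAT_INTEL_IPS = {"203.0.113.7"}
SCORES = {"threat_intel_hit": 10}   # hypothetical rule: score per alert type

def process(event):
    """Enrich one proxy event, check it against threat intel, and score it."""
    enriched = dict(event)
    enriched["ip_dst"] = DOMAIN_TO_IP.get(event["domain"])         # requirement 1
    enriched["is_alert"] = enriched["ip_dst"] in THREAT_INTEL_IPS  # requirements 2, 3
    enriched["score"] = SCORES["threat_intel_hit"] if enriched["is_alert"] else 0
    return enriched

events = [{"domain": "good.example"}, {"domain": "bad.example"}]

# Requirement 6: present the highest-priority events first.
triaged = sorted((process(e) for e in events), key=lambda e: e["score"], reverse=True)
for event in triaged:
    print(event)
```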
  • 23. Implementation Use Case on Apache Metron
  • 24. Real-Time Enrichment Telemetry Events - BEFORE. When you make an outbound HTTP connection to https://partner.mountsinai.org from a given host, the following entry is added to a Squid file called access.log.
      4861576382.3812 161 387.8.445.068 TCP_MISS/200 107501 GET https://partner.mountsinai.org – DIRECT/199.27.74.04 text/html
      • 4861576382.3812: Unix epoch time.
      • 387.8.445.068: IP of the host where the connection was made.
      • https://partner.mountsinai.org: the domain name of the outbound connection.
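To show how such an entry breaks down into fields, here is a minimal parsing sketch. It assumes the ten whitespace-separated columns of Squid's native access.log layout; the output field names are assumptions chosen for illustration, and this is not Metron's own parser. The sample line is the one from the slide, with the identity column written as a plain hyphen.

```python
from urllib.parse import urlsplit

# Column names are illustrative, following Squid's native log column order.
FIELDS = [
    "timestamp", "elapsed", "ip_src_addr", "action_code", "bytes",
    "method", "url", "ident", "hierarchy_peer", "content_type",
]

def parse_squid(line):
    """Split one access.log line into a dict of typed fields."""
    parts = line.split()
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    record["timestamp"] = float(record["timestamp"])   # Unix epoch seconds
    record["elapsed"] = int(record["elapsed"])
    record["bytes"] = int(record["bytes"])
    action, code = record.pop("action_code").split("/")
    record["action"], record["code"] = action, int(code)
    record["domain"] = urlsplit(record["url"]).hostname
    return record

line = ("4861576382.3812 161 387.8.445.068 TCP_MISS/200 107501 GET "
        "https://partner.mountsinai.org - DIRECT/199.27.74.04 text/html")
record = parse_squid(line)
print(record["ip_src_addr"], record["domain"], record["action"], record["code"])
```

The sketch deliberately does not validate IP addresses; the addresses in the slide's sample are not well-formed IPv4 values and are passed through as plain strings.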
  • 25. Real-Time Enrichment Telemetry Events - AFTER. The magic that Metron will do: each telemetry event is processed in real time as it streams through the platform.
      4861576382.3812 161 387.8.445.068 TCP_MISS/200 107501 GET https://partner.mountsinai.org – DIRECT/199.27.74.04 text/html
      • 4861576382.3812: convert from Unix epoch to a timestamp.
      • 387.8.445.068 (IP of the host where the connection was made): use Metron’s asset enrichment.
      • Use Metron’s Threat Intel Services to cross-reference the IP with the threat intel feed.
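The first transformation on this slide, converting a Unix epoch value to a readable timestamp, can be sketched as follows. The choice of UTC and ISO 8601 output is an assumption made for illustration; the slides do not say which representation Metron presents.

```python
from datetime import datetime, timezone

def epoch_to_iso(epoch_seconds):
    """Convert Unix epoch seconds to an ISO 8601 timestamp in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()

print(epoch_to_iso(1_000_000_000))  # → 2001-09-09T01:46:40+00:00
```

Using an explicit UTC timezone keeps the result independent of the local settings of the machine that processes the event.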