Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi

This document discusses using Apache NiFi to build a high-speed cyber security data pipeline. It outlines the challenges of ingesting, transforming, and routing large volumes of security data from various sources to stakeholders such as security operations centers, data scientists, data analysts, and executives. It proposes using NiFi as a centralized data gateway that ingests data from multiple sources through a single entry point, transforms the data according to each destination's needs, and delivers it reliably while avoiding excess network traffic and duplicate data. The document provides an example NiFi flow and reports metrics from processing roughly 20 billion events through 100+ production flows and 1000+ transformations.

Building the High Speed Cyber Security Data Pipeline Using Apache NiFi
Praveen Kanumarlapudi

60% of Small Businesses Fold Within 6 Months of a Cyber Attack.

Global Security Key Stakeholders
• Security Operations Center
  An information security operations center ("ISOC" or "SOC") is a facility where enterprise information systems (websites, applications, databases, data centers and servers, networks, desktops and other endpoints) are monitored, assessed, and defended.
  Technology: SIEM
• Data Scientists
  Security data scientists have the skills to understand complex algorithms, build advanced models for threat and anomaly detection, and apply these concepts to real security data sets in single or clustered environments.
  Technology: Python, R, Big Data, Spark/Scala or MATLAB…
• Data Analysts
  Map and trace the data from system to system to solve a given business or incident problem.
  Design and create data reports, using various reporting tools, that help business executives make better decisions.
  Implement new metrics for the business (KPIs).
  Technology: SQL, SIEM, Big Data, Reporting tools
• Executives
  CSOs, CISOs

Cyber Security 'Big Data' Challenges
• Speed, Volume, and Variety
  • Data Ingestion
  • Cleansing
  • Transformation
• Data reliance
  • Executives – KPI Metrics
  • Data Scientists
  • SOC
  • Data Analysts
• Real-Time context

A Couple of Years Ago!
[Architecture diagram: Data Source → Ingestion → Integration → Delivery]
• Data sources (unstructured and (semi)structured): Network logs, Web logs, AD logs, Infrastructure logs, Application logs, Threat Intel, 3rd Party RG, RDBMS
• Components shown: Syslog servers, SIEM app, Sqoop, Flume, PySpark, SIEM tool, UBA tools
• Consumers: SOC, Data Science, KPI/Reporting

Challenges
• Complexity of Architecture
• Debugging
• Data Source Dependencies
• Lack of Centralized logging
• Multiple Data Copies
• Stress on Network
• Transformations with respect to destination

Solution Framework
• Single data entry point – avoids network traffic and duplicate data flowing around
• Transformations according to destination – reduces the reliance on the source
• Should be capable of handling different formats and different sources
[Flow diagram: Ingest → Clean/Route → Transform for 1 → Route to 1; Transform for 2 → Route to 2; Archive]
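The framework above can be sketched in plain Python as a single entry point that cleans each event once, applies a destination-specific transform, and routes the result. This is a minimal illustration under stated assumptions, not NiFi code and not the presenter's implementation: the destination names (`siem`, `data_lake`), the JSON input format, the choice to archive the raw input, and every function name are hypothetical.

```python
import json
from typing import Callable, Dict, List, Optional

Event = Dict[str, str]


def clean(raw: str) -> Optional[Event]:
    """Parse one raw JSON line; drop it if it is malformed or has no source."""
    try:
        event = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(event, dict) or "source" not in event:
        return None
    return {str(k): str(v).strip() for k, v in event.items()}


# Hypothetical per-destination transforms: each destination gets its own
# shape, so a source never has to change to suit a consumer.
def for_siem(event: Event) -> str:
    return " ".join(f"{k}={v}" for k, v in sorted(event.items()))


def for_data_lake(event: Event) -> str:
    return json.dumps(event, sort_keys=True)


TRANSFORMS: Dict[str, Callable[[Event], str]] = {
    "siem": for_siem,
    "data_lake": for_data_lake,
}


def run_gateway(raw_lines: List[str]) -> Dict[str, List[str]]:
    """Ingest once, archive the raw input, then transform and route."""
    outputs: Dict[str, List[str]] = {name: [] for name in TRANSFORMS}
    outputs["archive"] = []
    for raw in raw_lines:
        outputs["archive"].append(raw)  # assumption: archive the original
        event = clean(raw)
        if event is None:
            continue  # malformed input is archived but not routed
        for name, transform in TRANSFORMS.items():
            outputs[name].append(transform(event))
    return outputs
```

Because every source passes through `run_gateway` exactly once, adding a new destination in this sketch means adding one entry to `TRANSFORMS` rather than building another copy of the ingestion path.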

Challenges
• Good architectural understanding of all systems
• Good amount of coding effort
• Long development hours
• Maintenance overheads
• Maintain the sync between the systems
• Provenance

• Guaranteed delivery
• Processors that support multiple formats
• Easy to develop flows and deploy them in minutes
• Open source with a rich community
The Data Gateway
[Architecture diagram: Data Source → Data Gateway → Delivery]
• Data sources (unstructured and (semi)structured): Network logs, Web logs, AD logs, Infrastructure logs, Application logs, Threat Intel, 3rd Party RG, RDBMS
• Delivery: SOC, Data Science, KPI/Reporting
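A gateway that accepts both unstructured and (semi)structured sources has to bring them into one common shape before any routing happens. The sketch below shows one way to do that in plain Python. It is an illustration only: the three formats (`json`, `csv`, `kv`), the CSV column names, and all function names are assumptions, not details taken from the slides or from NiFi.

```python
import csv
import io
import json
from typing import Dict, List

Record = Dict[str, str]

# Assumed column order for the hypothetical CSV source.
CSV_HEADER: List[str] = ["ts", "user", "action"]


def from_json(line: str) -> Record:
    return {str(k): str(v) for k, v in json.loads(line).items()}


def from_csv(line: str, header: List[str]) -> Record:
    row = next(csv.reader(io.StringIO(line)))
    return dict(zip(header, row))


def from_key_value(line: str) -> Record:
    """Parse space-separated key=value pairs."""
    pairs = (token.split("=", 1) for token in line.split() if "=" in token)
    return {key: value for key, value in pairs}


def normalize(line: str, fmt: str) -> Record:
    """Turn one input line of a known format into a common flat record."""
    if fmt == "json":
        record = from_json(line)
    elif fmt == "csv":
        record = from_csv(line, CSV_HEADER)
    elif fmt == "kv":
        record = from_key_value(line)
    else:
        raise ValueError(f"unsupported format: {fmt}")
    record["format"] = fmt  # remember where the record came from
    return record
```

Once every source is reduced to the same flat record, the downstream transforms for each delivery target only need to understand one input shape.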

Metrics
• 100+ production flows
• ~20 billion events
• 1000+ transformations

Next?
• MiNiFi
• Stateless NiFi
• Registry
• SAM
• Real-Time Model training
• CI/CD, NiFi APIs
