Machine Learning, Analytics & Data Science Conference
Dec 7-8, 2015, Redmond
Visualizing Real-Time Network Alerts to
Identify Command & Control (C2) Infrastructure
Abstract: We have prototyped a near-real-time streaming event pipeline built on Apache Kafka and Apache Spark. In our demo we present a continuously evolving network graph driven by Spark stream processing and security alerts. The network graph is enriched with meta-information and presented to the user for additional analysis and investigation.
At a high level, we illustrate: (1) forwarding relevant Windows events from our cloud servers to an Apache Kafka cluster, (2) consuming the real-time Kafka messages with a Spark cluster in 3-5 s streaming batch intervals, and (3) generating a continuously evolving network graph from the correlated alerts and meta-information, which is displayed using Gephi Streaming. Our solution can correlate and display many thousands of events per second, typically taking about 45 s from host event creation to display.
Firewall security events (Windows Event ID 5156) are forwarded from individual hosts
via Windows Event Forwarding (WEF) to a Windows Event Collector (WEC) server.
Each event records a network connection initiated by or sent to a process, and
provides the name of the process, the source IP address, the destination IP
address, and related fields. This information is later used to generate the
C2 detection graph.
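
As a rough illustration of the fields listed above, the following Python sketch reduces one collected event to the record used downstream and serializes it as a message payload. The input keys (`Computer`, `Application`, `SourceAddress`, `DestAddress`), the `ConnectionEvent` type, and the sample values are assumptions made for this example; they are not taken from the poster and should be checked against the events actually collected.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ConnectionEvent:
    host: str      # machine that logged the event
    process: str   # executable name, lower-cased
    src_ip: str
    dst_ip: str


def normalize(raw: dict) -> ConnectionEvent:
    """Reduce a raw event (hypothetical key names) to the fields the graph needs."""
    # Keep only the executable name, since full paths differ between hosts.
    path = raw["Application"].replace("\\", "/")
    return ConnectionEvent(
        host=raw["Computer"],
        process=path.rsplit("/", 1)[-1].lower(),
        src_ip=raw["SourceAddress"],
        dst_ip=raw["DestAddress"],
    )


def to_message(event: ConnectionEvent) -> bytes:
    """Serialize the record as UTF-8 JSON, a plausible message payload."""
    return json.dumps(asdict(event), sort_keys=True).encode("utf-8")


# Made-up sample event for illustration only.
raw = {
    "Computer": "web-01",
    "Application": "\\device\\harddiskvolume2\\tools\\Beacon.exe",
    "SourceAddress": "10.0.0.5",
    "DestAddress": "203.0.113.9",
}
event = normalize(raw)
print(to_message(event).decode("utf-8"))
```

Reducing the process path to a lower-cased file name is a design choice of this sketch; it makes a later comparison against a list of process names independent of install location and case.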
Once the WEC server has collected the events, it produces them to a topic on
a Kafka cluster. Kafka provides high-performance aggregation and reliable
message brokering between the event collection and event processing
endpoints.
The next graph iteration is received from
Spark and compared to the last known
graph. The differences between the two
graphs are resolved: nodes added, nodes
removed, edges added, edges removed,
etc. The necessary metadata – coloring
and labels – is added to the nodes and
edges, and then the changes are sent to
the Gephi graph visualization tool.
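
The delta step described above can be sketched with plain set differences. This is a minimal illustration, not the demo's code: the function names, the `suspect` attribute, and the edge-id scheme are assumptions, and the event keys `an`, `ae`, `de`, `dn` are intended to follow the JSON event format of the Gephi Graph Streaming plugin, which should be verified against the plugin documentation before use.

```python
import json


def graph_delta(old_nodes, old_edges, new_nodes, new_edges):
    """Compare the last known graph with the next iteration."""
    return {
        "nodes_added": new_nodes - old_nodes,
        "nodes_removed": old_nodes - new_nodes,
        "edges_added": new_edges - old_edges,
        "edges_removed": old_edges - new_edges,
    }


def to_stream_events(delta, suspects):
    """Render the delta as JSON lines, ordered so edges never reference missing nodes."""
    lines = []
    for node in sorted(delta["nodes_added"]):
        # Metadata: a label plus a flag the visualization could map to a color.
        attrs = {"label": node, "suspect": node in suspects}
        lines.append(json.dumps({"an": {node: attrs}}))
    for src, dst in sorted(delta["edges_added"]):
        edge = {"source": src, "target": dst, "directed": True}
        lines.append(json.dumps({"ae": {f"{src}->{dst}": edge}}))
    for src, dst in sorted(delta["edges_removed"]):
        lines.append(json.dumps({"de": {f"{src}->{dst}": {}}}))
    for node in sorted(delta["nodes_removed"]):
        lines.append(json.dumps({"dn": {node: {}}}))
    return lines


# Made-up example: host "b" drops out of the graph, host "c" appears.
old_nodes, old_edges = {"a", "b"}, {("a", "b")}
new_nodes, new_edges = {"a", "c"}, {("a", "c")}
delta = graph_delta(old_nodes, old_edges, new_nodes, new_edges)
lines = to_stream_events(delta, suspects={"a"})
for line in lines:
    print(line)
```

Emitting node additions first and node removals last is a choice made here so that every edge event refers to a node that exists at that point in the stream.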
A Spark cluster is used for event processing and
correlation. Spark consumes from the Kafka event
topic and correlates the message stream. When a
message is found with a blacklisted process, the
originating host is added to a “suspect host” list
(the blue nodes seen on the graph). Any
connection initiated from the suspect host is
highlighted on the graph. This connection
information is used to create a list of nodes and
edges that should appear on the next iteration of
the graph.
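
The correlation rule above can be illustrated in plain Python, independent of Spark. This sketch assumes that each event is a dict with `process`, `src_ip`, and `dst_ip` keys, that the source address identifies the originating host, and that only connections from suspect hosts are kept for the graph; the blacklist entry and all addresses are made up. In a Spark Streaming job the suspect list would have to be held as state across batches, which this sketch models with an ordinary set.

```python
BLACKLIST = frozenset({"beacon.exe"})  # hypothetical blacklisted process names


def correlate(batch, suspects, blacklist=BLACKLIST):
    """Process one batch of events; return (nodes, edges) for the next graph iteration.

    `suspects` is mutated in place so that a host stays suspect in later batches.
    """
    # Pass 1: a host seen running a blacklisted process becomes a suspect host.
    for ev in batch:
        if ev["process"] in blacklist:
            suspects.add(ev["src_ip"])
    # Pass 2: collect every connection initiated from a suspect host.
    nodes, edges = set(), set()
    for ev in batch:
        if ev["src_ip"] in suspects:
            nodes.update((ev["src_ip"], ev["dst_ip"]))
            edges.add((ev["src_ip"], ev["dst_ip"]))
    return nodes, edges


suspects = set()
batch1 = [
    {"process": "beacon.exe", "src_ip": "10.0.0.5", "dst_ip": "203.0.113.9"},
    {"process": "chrome.exe", "src_ip": "10.0.0.6", "dst_ip": "198.51.100.2"},
]
nodes1, edges1 = correlate(batch1, suspects)

# A later batch: the same host connects elsewhere with an ordinary process.
batch2 = [
    {"process": "svchost.exe", "src_ip": "10.0.0.5", "dst_ip": "198.51.100.7"},
]
nodes2, edges2 = correlate(batch2, suspects)
print(sorted(edges1), sorted(edges2))
```

The two-pass structure means a connection that appears earlier in a batch than the blacklisted-process event from the same host is still included for that batch.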
By Todd Lanning (CELA) & Aaron Davis (WDG SMART)
Figure: Data Flow Architecture
1 Event Collection: Firewall Security Events (WEF, WEC)
2 Event Aggregation and Delivery: Kafka Event Stream
3 Spark Processing: Kafka Consumer, Event Correlation, Generate Nodes & Edges
4 Graph Delta Generated: Generate Graph Delta, Update Metadata, Transmit to Gephi
5 Draw Visualization
