SlideShare a Scribd company logo
1 of 19
Download to read offline
FEBRUARY 22, 2018, WARSAW
ENHANCING SPARK
INCREASE STREAMING CAPABILITIES OF YOU APPLICATIONS
Kamil Folkert, Tomasz Mirowski
BUSINESS RULES - ALERTS
ANALYTICAL MODEL
SOURCE LAYER
• SECURITY LOGS
MESSAGE LAYER
• TIME WINDOW
PROCESSING LAYER
DATA
STREAM
FILTERING
AGGREGATED
WINDOW
PERPETUALLY
GROWING
REFERENCE
DATA
EXTERNAL
ACCESS
SOURCE
SYSTEM
EVENT
EVENT
EVENT
SOURCE
SYSTEM
EVENT
EVENT
EVENT
JOINING
STREAMS
TEMPORARY
FILES
TIME
OUTPUT
METHODS
EVENT
EVENT
EVENT
SNAPPYDATA
SNAPPYDATA: REAL TIME + OLTP
SNAPPYDATA
SNAPPYDATA
DATA
STREAM
FILTERING
AGGREGATED
WINDOW
PERPETUALLY
GROWING
REFERENCE
DATA
EXTERNAL
ACCESS
val eventsDF = jedis.hgetAll(”eventDictionary")
.toList.toDF.withColumnRenamed("_1","key").withColumnRenamed("_2", "value")
application_name log_type source_host ldap_group status
application1 application_server server1 userGroup security_breach
… … … … …
DATA
STREAM
FILTERING
AGGREGATED
WINDOW
PERPETUALLY
GROWING
REFERENCE
DATA
EXTERNAL
ACCESS
snsc.sql("create stream table applicationStream (”time_stamp timestamp," +
" application_name string, log_type string,” +
…
) " +
" using directkafka_stream options(" +
" rowConverter 'io.streaming.RowsConverter' ," +
s" kafkaParams 'metadata.broker.list->$brokerList;auto.offset.reset->largest'," +
s" topics '$kafkaTopic')"
)
DATA
STREAM
FILTERING
AGGREGATED
WINDOW
PERPETUALLY
GROWING
REFERENCE
DATA
EXTERNAL
ACCESS
val resultStream: SchemaDStream = snsc.registerCQ(
"select application_name, log_type" +
"count(source_host) as source" +
"from applicationStream window (duration 1 seconds, slide 1 seconds)" +
"group by application_name, log_type"
)
DATA
STREAM
FILTERING
AGGREGATED
WINDOW
PERPETUALLY
GROWING
REFERENCE
DATA
EXTERNAL
ACCESS
resultStream.foreachDataFrame(df => {
val dfApplications = df.groupBy("application_name")
.agg(sum(”log_type"))
dfApplications.write.insertInto("application_name")
val resultStream: SchemaDStream = snsc.registerCQ(
"select application_name, log_type" +
"count(source_host) as source" +
"from applicationStream window (duration 1 seconds, slide 1 seconds)" +
"group by application_name, log_type"
)
DATA
STREAM
FILTERING
AGGREGATED
WINDOW
PERPETUALLY
GROWING
REFERENCE
DATA
EXTERNAL
ACCESS
snsc.sql("create table eventTable(application string, event string)
using row")
resultStream.foreachDataFrame(df => {
val result = df.collect()
...
val stmt = conn.prepareStatement("put into eventTable
(application_name, log_type) values (?,?+(nvl(
(select application_name from eventTable where
application_name= ?),0)))")
DATA
STREAM
FILTERING
AGGREGATED
WINDOW
PERPETUALLY
GROWING
REFERENCE
DATA
EXTERNAL
ACCESS
DATA
STREAM
FILTERING
AGGREGATED
WINDOW
PERPETUALLY
GROWING
REFERENCE
DATA
EXTERNAL
ACCESS
FEBRUARY 22, 2018, WARSAW
ENHANCING SPARK
INCREASE STREAMING CAPABILITIES OF YOU APPLICATIONS
Kamil Folkert, Tomasz Mirowski

More Related Content

Similar to Enhancing Spark - increase streaming capabilities of your applications - Kamil Folkert, Tomasz Mirowski, 3Soft

Overall Security Process Review CISC 6621Agend.docx
Overall Security Process Review CISC 6621Agend.docxOverall Security Process Review CISC 6621Agend.docx
Overall Security Process Review CISC 6621Agend.docx
karlhennesey
 

Similar to Enhancing Spark - increase streaming capabilities of your applications - Kamil Folkert, Tomasz Mirowski, 3Soft (20)

Collecting Endpoint Security Logs Through Big Data Technology - Dedi Dwianto
Collecting Endpoint Security Logs Through Big Data Technology - Dedi DwiantoCollecting Endpoint Security Logs Through Big Data Technology - Dedi Dwianto
Collecting Endpoint Security Logs Through Big Data Technology - Dedi Dwianto
 
EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?EDA Meets Data Engineering – What's the Big Deal?
EDA Meets Data Engineering – What's the Big Deal?
 
Overall Security Process Review CISC 6621Agend.docx
Overall Security Process Review CISC 6621Agend.docxOverall Security Process Review CISC 6621Agend.docx
Overall Security Process Review CISC 6621Agend.docx
 
McAfee - Enterprise Security Manager (ESM) - SIEM
McAfee - Enterprise Security Manager (ESM) - SIEMMcAfee - Enterprise Security Manager (ESM) - SIEM
McAfee - Enterprise Security Manager (ESM) - SIEM
 
Become a Threat Hunter by Hamza Beghal
Become a Threat Hunter by Hamza BeghalBecome a Threat Hunter by Hamza Beghal
Become a Threat Hunter by Hamza Beghal
 
Infosec 2014: Intelligence as a Service: The Future of Frontline Security
Infosec 2014: Intelligence as a Service: The Future of Frontline SecurityInfosec 2014: Intelligence as a Service: The Future of Frontline Security
Infosec 2014: Intelligence as a Service: The Future of Frontline Security
 
Big security for big data
Big security for big dataBig security for big data
Big security for big data
 
SplunkLive! Stockholm 2015 breakout - Getting started with Splunk Enterprise
SplunkLive! Stockholm 2015 breakout - Getting started with Splunk EnterpriseSplunkLive! Stockholm 2015 breakout - Getting started with Splunk Enterprise
SplunkLive! Stockholm 2015 breakout - Getting started with Splunk Enterprise
 
CryptTech 2015
CryptTech 2015CryptTech 2015
CryptTech 2015
 
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
Strata Presentation: One Billion Objects in 2GB: Big Data Analytics on Small ...
 
Dataviz For Cyber Security
Dataviz For Cyber SecurityDataviz For Cyber Security
Dataviz For Cyber Security
 
CEP and SOA: An Open Event-Driven Architecture for Risk Management
CEP and SOA: An Open Event-Driven Architecture for Risk ManagementCEP and SOA: An Open Event-Driven Architecture for Risk Management
CEP and SOA: An Open Event-Driven Architecture for Risk Management
 
[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...
[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...
[OpenInfra Days Korea 2018] (Track 4) CloudEvents 소개 - 상호 운용 가능성을 극대화한 이벤트 데이...
 
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity IndustryData Streaming with Apache Kafka in the Defence and Cybersecurity Industry
Data Streaming with Apache Kafka in the Defence and Cybersecurity Industry
 
ISSA Siem Fraud
ISSA Siem FraudISSA Siem Fraud
ISSA Siem Fraud
 
Started from the Bottom: Exploiting Data Sources to Uncover ATT&CK Behaviors
Started from the Bottom: Exploiting Data Sources to Uncover ATT&CK BehaviorsStarted from the Bottom: Exploiting Data Sources to Uncover ATT&CK Behaviors
Started from the Bottom: Exploiting Data Sources to Uncover ATT&CK Behaviors
 
SIEM - Activating Defense through Response by Ankur Vats
SIEM - Activating Defense through Response by Ankur VatsSIEM - Activating Defense through Response by Ankur Vats
SIEM - Activating Defense through Response by Ankur Vats
 
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR ModernizationApache Kafka for Cybersecurity and SIEM / SOAR Modernization
Apache Kafka for Cybersecurity and SIEM / SOAR Modernization
 
AWS Summit Auckland - Sponsor Presentation - Splunk
AWS Summit Auckland - Sponsor Presentation - SplunkAWS Summit Auckland - Sponsor Presentation - Splunk
AWS Summit Auckland - Sponsor Presentation - Splunk
 
Presentation data security solutions certified ibm business partner for ibm...
Presentation   data security solutions certified ibm business partner for ibm...Presentation   data security solutions certified ibm business partner for ibm...
Presentation data security solutions certified ibm business partner for ibm...
 

More from Evention

Stream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian HueskeStream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian Hueske
Evention
 

More from Evention (20)

The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...The Factorization Machines algorithm for building recommendation system - Paw...
The Factorization Machines algorithm for building recommendation system - Paw...
 
A/B testing powered by Big data - Saurabh Goyal, Booking.com
A/B testing powered by Big data - Saurabh Goyal, Booking.comA/B testing powered by Big data - Saurabh Goyal, Booking.com
A/B testing powered by Big data - Saurabh Goyal, Booking.com
 
Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...
Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...
Near Real-Time Fraud Detection in Telecommunication Industry - Burak Işıklı, ...
 
Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...
Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...
Assisting millions of active users in real-time - Alexey Brodovshuk, Kcell; K...
 
Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...
Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...
Machine learning security - Pawel Zawistowski, Warsaw University of Technolog...
 
Building a Modern Data Pipeline: Lessons Learned - Saulius Valatka, Adform
Building a Modern Data Pipeline: Lessons Learned - Saulius Valatka, AdformBuilding a Modern Data Pipeline: Lessons Learned - Saulius Valatka, Adform
Building a Modern Data Pipeline: Lessons Learned - Saulius Valatka, Adform
 
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data ArtisansApache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
Apache Flink: Better, Faster & Uncut - Piotr Nowojski, data Artisans
 
Privacy by Design - Lars Albertsson, Mapflat
Privacy by Design - Lars Albertsson, MapflatPrivacy by Design - Lars Albertsson, Mapflat
Privacy by Design - Lars Albertsson, Mapflat
 
Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...
Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...
Elephants in the cloud or how to become cloud ready - Krzysztof Adamski, GetI...
 
Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...
Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...
Deriving Actionable Insights from High Volume Media Streams - Jörn Kottmann, ...
 
7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...
7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...
7 Days of Playing Minesweeper, or How to Shut Down Whistleblower Defense with...
 
Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...
Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...
Big Data Journey at a Big Corp - Tomasz Burzyński, Maciej Czyżowicz, Orange P...
 
Stream processing with Apache Flink - Maximilian Michels Data Artisans
Stream processing with Apache Flink - Maximilian Michels Data ArtisansStream processing with Apache Flink - Maximilian Michels Data Artisans
Stream processing with Apache Flink - Maximilian Michels Data Artisans
 
Scaling Cassandra in all directions - Jimmy Mardell Spotify
Scaling Cassandra in all directions - Jimmy Mardell SpotifyScaling Cassandra in all directions - Jimmy Mardell Spotify
Scaling Cassandra in all directions - Jimmy Mardell Spotify
 
Big Data for unstructured data Dariusz Śliwa
Big Data for unstructured data Dariusz ŚliwaBig Data for unstructured data Dariusz Śliwa
Big Data for unstructured data Dariusz Śliwa
 
Elastic development. Implementing Big Data search Grzegorz Kołpuć
Elastic development. Implementing Big Data search Grzegorz KołpućElastic development. Implementing Big Data search Grzegorz Kołpuć
Elastic development. Implementing Big Data search Grzegorz Kołpuć
 
H2 o deep water making deep learning accessible to everyone -jo-fai chow
H2 o deep water   making deep learning accessible to everyone -jo-fai chowH2 o deep water   making deep learning accessible to everyone -jo-fai chow
H2 o deep water making deep learning accessible to everyone -jo-fai chow
 
That won’t fit into RAM - Michał Brzezicki
That won’t fit into RAM -  Michał  BrzezickiThat won’t fit into RAM -  Michał  Brzezicki
That won’t fit into RAM - Michał Brzezicki
 
Stream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian HueskeStream Analytics with SQL on Apache Flink - Fabian Hueske
Stream Analytics with SQL on Apache Flink - Fabian Hueske
 
Hopsworks Secure Streaming as-a-service with Kafka Flinkspark - Theofilos Kak...
Hopsworks Secure Streaming as-a-service with Kafka Flinkspark - Theofilos Kak...Hopsworks Secure Streaming as-a-service with Kafka Flinkspark - Theofilos Kak...
Hopsworks Secure Streaming as-a-service with Kafka Flinkspark - Theofilos Kak...
 

Recently uploaded

Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Valters Lauzums
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
pyhepag
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理
cyebo
 
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
pyhepag
 
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
pyhepag
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
DilipVasan
 
一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理
pyhepag
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Riyadh +966572737505 get cytotec
 

Recently uploaded (20)

2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting2024 Q2 Orange County (CA) Tableau User Group Meeting
2024 Q2 Orange County (CA) Tableau User Group Meeting
 
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
Data Analytics for Digital Marketing Lecture for Advanced Digital & Social Me...
 
How I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prisonHow I opened a fake bank account and didn't go to prison
How I opened a fake bank account and didn't go to prison
 
Machine Learning for Accident Severity Prediction
Machine Learning for Accident Severity PredictionMachine Learning for Accident Severity Prediction
Machine Learning for Accident Severity Prediction
 
一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理一比一原版西悉尼大学毕业证成绩单如何办理
一比一原版西悉尼大学毕业证成绩单如何办理
 
一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理一比一原版纽卡斯尔大学毕业证成绩单如何办理
一比一原版纽卡斯尔大学毕业证成绩单如何办理
 
Artificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdfArtificial_General_Intelligence__storm_gen_article.pdf
Artificial_General_Intelligence__storm_gen_article.pdf
 
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
一比一原版加利福尼亚大学尔湾分校毕业证成绩单如何办理
 
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
一比一原版(Monash毕业证书)莫纳什大学毕业证成绩单如何办理
 
Exploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptxExploratory Data Analysis - Dilip S.pptx
Exploratory Data Analysis - Dilip S.pptx
 
一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理一比一原版阿德莱德大学毕业证成绩单如何办理
一比一原版阿德莱德大学毕业证成绩单如何办理
 
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
Data Visualization Exploring and Explaining with Data 1st Edition by Camm sol...
 
2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call2024 Q1 Tableau User Group Leader Quarterly Call
2024 Q1 Tableau User Group Leader Quarterly Call
 
Pre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptxPre-ProductionImproveddsfjgndflghtgg.pptx
Pre-ProductionImproveddsfjgndflghtgg.pptx
 
Easy and simple project file on mp online
Easy and simple project file on mp onlineEasy and simple project file on mp online
Easy and simple project file on mp online
 
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotecAbortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
Abortion pills in Dammam Saudi Arabia// +966572737505 // buy cytotec
 
Slip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp ClaimsSlip-and-fall Injuries: Top Workers' Comp Claims
Slip-and-fall Injuries: Top Workers' Comp Claims
 
basics of data science with application areas.pdf
basics of data science with application areas.pdfbasics of data science with application areas.pdf
basics of data science with application areas.pdf
 
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflictSupply chain analytics to combat the effects of Ukraine-Russia-conflict
Supply chain analytics to combat the effects of Ukraine-Russia-conflict
 
Generative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdfGenerative AI for Trailblazers_ Unlock the Future of AI.pdf
Generative AI for Trailblazers_ Unlock the Future of AI.pdf
 

Enhancing Spark - increase streaming capabilities of your applications - Kamil Folkert, Tomasz Mirowski, 3Soft