SlideShare a Scribd company logo
1 of 26
Download to read offline
THE POWER OF
INTELLIGENT FLOWS
REAL-TIME IOT BOTNET CLASSIFICATION WITH APACHE NIFI
Andre Fucs de Miranda - Fluenda
Andy LoPresto - Hortonworks
Agenda
- Who are the two blokes in front of you
- A brief prologue
- Logs! Logs! Logs!
- The challenge
- The solution
- Wrapping up
Who are the two blokes in front of you
Andre Fucs de Miranda



- Nearly 20 years working with
information cyber security



- Logging aficionado (i.e. security
data engineer)



- Apache NiFi PMC Member



@trixpan @trixpan
Andy LoPresto



- Financial security & device
firmware at Apple, TigerText, etc.



- PII, PCI & EPHI encryption &
cracking



- Apache NiFi PMC Member



@yolopey @alopresto
A brief prologue
The Botnet Kill Chain & the Honeypot
Reconnaissanc
e
Weaponization Delivery Exploitation Installation C & C
Actions on
Objective
Botnet

developer
Low /
Medium
interaction
honeypot

High
interaction 

--

Sandboxing
The Botnet Kill Chain & the Honeypot
Delivery Exploitation Installation
Step 1
Logon to
system
Step 2
Execute
predefined
sequence of
commands
Step 3
Try to install
some sort of
persistence
The demo environment
- A handful of EC2 instances
running:
- Cowrie - Medium interaction
SSH / Telnet honeypot
- MiNiFi

- An EC2 instance running:
- NiFi 1.3.0 (with security
enabled)
Flow Design Approach
" Don’t be prescriptive
" Treat everything as data
" Don’t be limited by prior
expectations
" Start from the end
Logs! Logs! Logs!
MiNiFi Process Group
" Tailing a log file being written by
cowrie
" Pushing to Amazon S3
○ Could stream via NiFi Site
to Site
○ MiNiFi extensibility
○ Shows multiple capabilities
○ Decoupled/no lock in
The data being ingested
- Cowrie logs include:
- Username / Password
- Commands executed (and parameters)
- Files downloaded
- Single line JSON entries
- Easy to parse
- Textbook machine readable log format
- Perfect match to NiFi processors such as:
- SplitText
- EvaluateJSONPath
Cowrie log example
Simple Cowrie log ingestion with NiFi
The challenge
The challenge
The challenge
- Logs in isolation rarely will provide the reader with a
meaningful view over what is happening

- Verbosity means sensors generate lots of “events”, but
who cares about a bot trying to `cat /proc/mounts` ?

- Bots use semi-random values to make detection more
difficult.
The solution
Logs are data too...
This looks familiar...
Locality-sensitive hashing
A type of algorithm that can be used to “group” similar items together
and may provide a similarity score between two particular items.
Areas of application:
- Genome-wide association study
- Anti-spam (e.g. TLSH, Spamsum/SSDeep)
- Near-duplicate detection
- etc
NiFi + SpamSum + TLSH = WIN!
Wrapping up
Key points
- Treat everything as data
- Be flexible on how you build your data flows.
- Apparently unrelated domains may speed up your results
- Use MiNiFi to aggregate data at the edge whenever possible
- NiFi rocks!*

* Disclaimer: We may be a bit
Future Steps
" Automate IP blocking & firewall rules (ML)
" Continuously update signature definition list with new
sigs
" Analyze epidemiology & spread vectors
" Follow evolution of malware families
" Support attribution of samples
Further reading
Mysterious Hajime botnet has pwned 300,000 IoT devices

https://www.theregister.co.uk/2017/04/27/hajime_iot_botnet/
Identifying unknown files by using fuzzy hashing

https://www.honeynet.org/node/811
Classifying Malware using Import API and Fuzzy Hashing – impfuzzy

http://blog.jpcert.or.jp/2016/05/classifying-mal-a988.html
Template and samples:

https://github.com/fluenda/dataworks_summit_iot_botnet
Thank you

More Related Content

What's hot

LLAP: Building Cloud First BI
LLAP: Building Cloud First BILLAP: Building Cloud First BI
LLAP: Building Cloud First BIDataWorks Summit
 
Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi an...
Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi an...Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi an...
Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi an...DataWorks Summit
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBryan Bende
 
IoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiIoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiDataWorks Summit
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesDataWorks Summit
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataWorks Summit
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0DataWorks Summit
 
Devnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFiDevnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFiBryan Bende
 
NiFi Developer Guide
NiFi Developer GuideNiFi Developer Guide
NiFi Developer GuideDeon Huang
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks
 
Zero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightZero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightDataWorks Summit
 
Seattle spark-meetup-032317
Seattle spark-meetup-032317Seattle spark-meetup-032317
Seattle spark-meetup-032317Nan Zhu
 
Big Data Storage - Comparing Speed and Features for Avro, JSON, ORC, and Parquet
Big Data Storage - Comparing Speed and Features for Avro, JSON, ORC, and ParquetBig Data Storage - Comparing Speed and Features for Avro, JSON, ORC, and Parquet
Big Data Storage - Comparing Speed and Features for Avro, JSON, ORC, and ParquetDataWorks Summit
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsTimothy Spann
 
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFiMission to NARs with Apache NiFi
Mission to NARs with Apache NiFiHortonworks
 

What's hot (20)

LLAP: Building Cloud First BI
LLAP: Building Cloud First BILLAP: Building Cloud First BI
LLAP: Building Cloud First BI
 
Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi an...
Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi an...Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi an...
Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi an...
 
Building Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFiBuilding Data Pipelines for Solr with Apache NiFi
Building Data Pipelines for Solr with Apache NiFi
 
IoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFiIoT with Apache MXNet and Apache NiFi and MiniFi
IoT with Apache MXNet and Apache NiFi and MiniFi
 
Streaming SQL
Streaming SQLStreaming SQL
Streaming SQL
 
From Device to Data Center to Insights
From Device to Data Center to InsightsFrom Device to Data Center to Insights
From Device to Data Center to Insights
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
 
Dataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFiDataflow Management From Edge to Core with Apache NiFi
Dataflow Management From Edge to Core with Apache NiFi
 
Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0Meet HBase 2.0 and Phoenix 5.0
Meet HBase 2.0 and Phoenix 5.0
 
Devnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFiDevnexus 2018 - Let Your Data Flow with Apache NiFi
Devnexus 2018 - Let Your Data Flow with Apache NiFi
 
NiFi Developer Guide
NiFi Developer GuideNiFi Developer Guide
NiFi Developer Guide
 
Building a Smarter Home with Apache NiFi and Spark
Building a Smarter Home with Apache NiFi and SparkBuilding a Smarter Home with Apache NiFi and Spark
Building a Smarter Home with Apache NiFi and Spark
 
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
Hortonworks Data in Motion Webinar Series Part 7 Apache Kafka Nifi Better Tog...
 
Zero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsightZero ETL analytics with LLAP in Azure HDInsight
Zero ETL analytics with LLAP in Azure HDInsight
 
Seattle spark-meetup-032317
Seattle spark-meetup-032317Seattle spark-meetup-032317
Seattle spark-meetup-032317
 
Apache Nifi Crash Course
Apache Nifi Crash CourseApache Nifi Crash Course
Apache Nifi Crash Course
 
Big Data Storage - Comparing Speed and Features for Avro, JSON, ORC, and Parquet
Big Data Storage - Comparing Speed and Features for Avro, JSON, ORC, and ParquetBig Data Storage - Comparing Speed and Features for Avro, JSON, ORC, and Parquet
Big Data Storage - Comparing Speed and Features for Avro, JSON, ORC, and Parquet
 
Running Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration OptionsRunning Apache NiFi with Apache Spark : Integration Options
Running Apache NiFi with Apache Spark : Integration Options
 
Mission to NARs with Apache NiFi
Mission to NARs with Apache NiFiMission to NARs with Apache NiFi
Mission to NARs with Apache NiFi
 
The Avant-garde of Apache NiFi
The Avant-garde of Apache NiFiThe Avant-garde of Apache NiFi
The Avant-garde of Apache NiFi
 

Similar to Real-Time IoT Botnet Classification with Apache NiFi and Locality-Sensitive Hashing

iShelf - Inventory management project
iShelf - Inventory management projectiShelf - Inventory management project
iShelf - Inventory management projectFrancescoCassini
 
Basics of tcp ip
Basics of tcp ipBasics of tcp ip
Basics of tcp ipKumar
 
Kamailio World 2018: Having fun with new stuff
Kamailio World 2018: Having fun with new stuffKamailio World 2018: Having fun with new stuff
Kamailio World 2018: Having fun with new stuffOlle E Johansson
 
Teensy Programming for Everyone
Teensy Programming for EveryoneTeensy Programming for Everyone
Teensy Programming for EveryoneNikhil Mittal
 
Bsides Tampa Blue Team’s tool dump.
Bsides Tampa Blue Team’s tool dump.Bsides Tampa Blue Team’s tool dump.
Bsides Tampa Blue Team’s tool dump.Alexander Kot
 
Lares from LOW to PWNED
Lares from LOW to PWNEDLares from LOW to PWNED
Lares from LOW to PWNEDChris Gates
 
Detecting Web Browser Heap Corruption Attacks - Stephan Chenette, Moti Joseph...
Detecting Web Browser Heap Corruption Attacks - Stephan Chenette, Moti Joseph...Detecting Web Browser Heap Corruption Attacks - Stephan Chenette, Moti Joseph...
Detecting Web Browser Heap Corruption Attacks - Stephan Chenette, Moti Joseph...Stephan Chenette
 
Jaime Blasco - Fighting Advanced Persistent Threat (APT) with Open Source Too...
Jaime Blasco - Fighting Advanced Persistent Threat (APT) with Open Source Too...Jaime Blasco - Fighting Advanced Persistent Threat (APT) with Open Source Too...
Jaime Blasco - Fighting Advanced Persistent Threat (APT) with Open Source Too...RootedCON
 
Honeypots.ppt1800363876
Honeypots.ppt1800363876Honeypots.ppt1800363876
Honeypots.ppt1800363876Momita Sharma
 
MachinePulse at the November Open Hardware Meetup, Mumbai 2014
MachinePulse at the November Open Hardware Meetup, Mumbai 2014MachinePulse at the November Open Hardware Meetup, Mumbai 2014
MachinePulse at the November Open Hardware Meetup, Mumbai 2014MachinePulse
 
DEF CON 27 - DANIEL ROMERO and MARIO RIVAS - why you should fear your mundane...
DEF CON 27 - DANIEL ROMERO and MARIO RIVAS - why you should fear your mundane...DEF CON 27 - DANIEL ROMERO and MARIO RIVAS - why you should fear your mundane...
DEF CON 27 - DANIEL ROMERO and MARIO RIVAS - why you should fear your mundane...Felipe Prado
 
Shmoocon 2013 - OpenStack Security Brief
Shmoocon 2013 - OpenStack Security BriefShmoocon 2013 - OpenStack Security Brief
Shmoocon 2013 - OpenStack Security Briefopenfly
 
What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)Brian Brazil
 

Similar to Real-Time IoT Botnet Classification with Apache NiFi and Locality-Sensitive Hashing (20)

iShelf - Inventory management project
iShelf - Inventory management projectiShelf - Inventory management project
iShelf - Inventory management project
 
The LabRat - Physical backdoor hacks and IOT primer
The LabRat - Physical backdoor hacks and IOT primerThe LabRat - Physical backdoor hacks and IOT primer
The LabRat - Physical backdoor hacks and IOT primer
 
Data analysis with pandas
Data analysis with pandasData analysis with pandas
Data analysis with pandas
 
Data Analysis With Pandas
Data Analysis With PandasData Analysis With Pandas
Data Analysis With Pandas
 
Routing_Article
Routing_ArticleRouting_Article
Routing_Article
 
Basics of tcp ip
Basics of tcp ipBasics of tcp ip
Basics of tcp ip
 
Kamailio World 2018: Having fun with new stuff
Kamailio World 2018: Having fun with new stuffKamailio World 2018: Having fun with new stuff
Kamailio World 2018: Having fun with new stuff
 
Teensy Programming for Everyone
Teensy Programming for EveryoneTeensy Programming for Everyone
Teensy Programming for Everyone
 
OS Fingerprinting
OS FingerprintingOS Fingerprinting
OS Fingerprinting
 
Bsides Tampa Blue Team’s tool dump.
Bsides Tampa Blue Team’s tool dump.Bsides Tampa Blue Team’s tool dump.
Bsides Tampa Blue Team’s tool dump.
 
2015 moloch recipes
2015 moloch recipes2015 moloch recipes
2015 moloch recipes
 
Lares from LOW to PWNED
Lares from LOW to PWNEDLares from LOW to PWNED
Lares from LOW to PWNED
 
Detecting Web Browser Heap Corruption Attacks - Stephan Chenette, Moti Joseph...
Detecting Web Browser Heap Corruption Attacks - Stephan Chenette, Moti Joseph...Detecting Web Browser Heap Corruption Attacks - Stephan Chenette, Moti Joseph...
Detecting Web Browser Heap Corruption Attacks - Stephan Chenette, Moti Joseph...
 
Jaime Blasco - Fighting Advanced Persistent Threat (APT) with Open Source Too...
Jaime Blasco - Fighting Advanced Persistent Threat (APT) with Open Source Too...Jaime Blasco - Fighting Advanced Persistent Threat (APT) with Open Source Too...
Jaime Blasco - Fighting Advanced Persistent Threat (APT) with Open Source Too...
 
IoT 2.pptx
IoT 2.pptxIoT 2.pptx
IoT 2.pptx
 
Honeypots.ppt1800363876
Honeypots.ppt1800363876Honeypots.ppt1800363876
Honeypots.ppt1800363876
 
MachinePulse at the November Open Hardware Meetup, Mumbai 2014
MachinePulse at the November Open Hardware Meetup, Mumbai 2014MachinePulse at the November Open Hardware Meetup, Mumbai 2014
MachinePulse at the November Open Hardware Meetup, Mumbai 2014
 
DEF CON 27 - DANIEL ROMERO and MARIO RIVAS - why you should fear your mundane...
DEF CON 27 - DANIEL ROMERO and MARIO RIVAS - why you should fear your mundane...DEF CON 27 - DANIEL ROMERO and MARIO RIVAS - why you should fear your mundane...
DEF CON 27 - DANIEL ROMERO and MARIO RIVAS - why you should fear your mundane...
 
Shmoocon 2013 - OpenStack Security Brief
Shmoocon 2013 - OpenStack Security BriefShmoocon 2013 - OpenStack Security Brief
Shmoocon 2013 - OpenStack Security Brief
 
What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)What does "monitoring" mean? (FOSDEM 2017)
What does "monitoring" mean? (FOSDEM 2017)
 

More from DataWorks Summit

Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisDataWorks Summit
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiDataWorks Summit
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...DataWorks Summit
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...DataWorks Summit
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal SystemDataWorks Summit
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExampleDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixDataWorks Summit
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiDataWorks Summit
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsDataWorks Summit
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureDataWorks Summit
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EngineDataWorks Summit
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...DataWorks Summit
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudDataWorks Summit
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiDataWorks Summit
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerDataWorks Summit
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouDataWorks Summit
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkDataWorks Summit
 

More from DataWorks Summit (20)

Data Science Crash Course
Data Science Crash CourseData Science Crash Course
Data Science Crash Course
 
Floating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache RatisFloating on a RAFT: HBase Durability with Apache Ratis
Floating on a RAFT: HBase Durability with Apache Ratis
 
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFiTracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
Tracking Crime as It Occurs with Apache Phoenix, Apache HBase and Apache NiFi
 
HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...HBase Tales From the Trenches - Short stories about most common HBase operati...
HBase Tales From the Trenches - Short stories about most common HBase operati...
 
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
Optimizing Geospatial Operations with Server-side Programming in HBase and Ac...
 
Managing the Dewey Decimal System
Managing the Dewey Decimal SystemManaging the Dewey Decimal System
Managing the Dewey Decimal System
 
Practical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist ExamplePractical NoSQL: Accumulo's dirlist Example
Practical NoSQL: Accumulo's dirlist Example
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and PhoenixScaling Cloud-Scale Translytics Workloads with Omid and Phoenix
Scaling Cloud-Scale Translytics Workloads with Omid and Phoenix
 
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFiBuilding the High Speed Cybersecurity Data Pipeline Using Apache NiFi
Building the High Speed Cybersecurity Data Pipeline Using Apache NiFi
 
Supporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability ImprovementsSupporting Apache HBase : Troubleshooting and Supportability Improvements
Supporting Apache HBase : Troubleshooting and Supportability Improvements
 
Security Framework for Multitenant Architecture
Security Framework for Multitenant ArchitectureSecurity Framework for Multitenant Architecture
Security Framework for Multitenant Architecture
 
Presto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything EnginePresto: Optimizing Performance of SQL-on-Anything Engine
Presto: Optimizing Performance of SQL-on-Anything Engine
 
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
Introducing MlFlow: An Open Source Platform for the Machine Learning Lifecycl...
 
Extending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google CloudExtending Twitter's Data Platform to Google Cloud
Extending Twitter's Data Platform to Google Cloud
 
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFiEvent-Driven Messaging and Actions using Apache Flink and Apache NiFi
Event-Driven Messaging and Actions using Apache Flink and Apache NiFi
 
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache RangerSecuring Data in Hybrid on-premise and Cloud Environments using Apache Ranger
Securing Data in Hybrid on-premise and Cloud Environments using Apache Ranger
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Computer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near YouComputer Vision: Coming to a Store Near You
Computer Vision: Coming to a Store Near You
 
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache SparkBig Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
Big Data Genomics: Clustering Billions of DNA Sequences with Apache Spark
 

Recently uploaded

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 

Recently uploaded (20)

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 

Real-Time IoT Botnet Classification with Apache NiFi and Locality-Sensitive Hashing

  • 1. THE POWER OF INTELLIGENT FLOWS REAL-TIME IOT BOTNET CLASSIFICATION WITH APACHE NIFI Andre Fucs de Miranda - Fluenda Andy LoPresto - Hortonworks
  • 2. Agenda - Who are the two blokes in front of you - A brief prologue - Logs! Logs! Logs! - The challenge - The solution - Wrapping up
  • 3. Who are the two blokes in front of you Andre Fucs de Miranda
 
 - Nearly 20 years working with information cyber security
 
 - Logging aficionado (i.e. security data engineer)
 
 - Apache NiFi PMC Member
 
 @trixpan @trixpan Andy LoPresto
 
 - Financial security & device firmware at Apple, TigerText, etc.
 
 - PII, PCI & EPHI encryption & cracking
 
 - Apache NiFi PMC Member
 
 @yolopey @alopresto
  • 5. The Botnet Kill Chain & the Honeypot Reconnaissanc e Weaponization Delivery Exploitation Installation C & C Actions on Objective Botnet
 developer Low / Medium interaction honeypot
 High interaction 
 --
 Sandboxing
  • 6. The Botnet Kill Chain & the Honeypot Delivery Exploitation Installation Step 1 Logon to system Step 2 Execute predefined sequence of commands Step 3 Try to install some sort of persistence
  • 7. The demo environment - A handful of EC2 instances running: - Cowrie - Medium interaction SSH / Telnet honeypot - MiNiFi
 - An EC2 instance running: - NiFi 1.3.0 (with security enabled)
  • 8. Flow Design Approach " Don’t be prescriptive " Treat everything as data " Don’t be limited by prior expectations " Start from the end
  • 10. MiNiFi Process Group " Tailing a log file being written by cowrie " Pushing to Amazon S3 ○ Could stream via NiFi Site to Site ○ MiNiFi extensibility ○ Shows multiple capabilities ○ Decoupled/no lock in
  • 11. The data being ingested - Cowrie logs include: - Username / Password - Commands executed (and parameters) - Files downloaded - Single line JSON entries - Easy to parse - Textbook machine readable log format - Perfect match to NiFi processors such as: - SplitText - EvaluateJSONPath
  • 13. Simple Cowrie log ingestion with NiFi
  • 16. The challenge - Logs in isolation rarely will provide the reader with a meaningful view over what is happening
 - Verbosity means sensors generate lots of “events”, but who cares about a bot trying to `cat /proc/mounts` ?
 - Bots use semi-random values to make detection more difficult.
  • 18. Logs are data too...
  • 20. Locality-sensitive hashing A type of algorithm that can be used to “group” similar items together and may provide a similarity score between two particular items. Areas of application: - Genome-wide association study - Anti-spam (e.g. TLSH, Spamsum/SSDeep) - Near-duplicate detection - etc
  • 21. NiFi + SpamSum + TLSH = WIN!
  • 23. Key points - Treat everything as data - Be flexible on how you build your data flows. - Apparently unrelated domains may speed up your results - Use MiNiFi to aggregate data at the edge whenever possible - NiFi rocks!*
 * Disclaimer: We may be a bit
  • 24. Future Steps " Automate IP blocking & firewall rules (ML) " Continuously update signature definition list with new sigs " Analyze epidemiology & spread vectors " Follow evolution of malware families " Support attribution of samples
  • 25. Further reading Mysterious Hajime botnet has pwned 300,000 IoT devices
 https://www.theregister.co.uk/2017/04/27/hajime_iot_botnet/ Identifying unknown files by using fuzzy hashing
 https://www.honeynet.org/node/811 Classifying Malware using Import API and Fuzzy Hashing – impfuzzy
 http://blog.jpcert.or.jp/2016/05/classifying-mal-a988.html Template and samples:
 https://github.com/fluenda/dataworks_summit_iot_botnet