Advanced Threat Detection on Streaming Data

Advanced Threat Detection on Streaming Data
using Kafka, Storm and HBase

Published in: Software

  1. Advanced Threat Detection on Streaming Data
     Carol McDonald, Solution Architect
     Strata + Hadoop World, March 2016
     © 2016 MapR Technologies
  2. Meeting Advanced Threats Head On
     • Solutionary: Managed Security Services Provider
         – Provides Threat Intelligence as a Service
  3. Real-time Detection of Advanced Threats
     • Objective:
         – Provide real-time threat intelligence on trillions of messages per year
         – Store and process large volumes of unstructured security data
         – Combine machine learning and predictive analytics
  4. Event-based Detection of Advanced Threats
     [Diagram labels: Threat Alerts; Store and Process Unstructured Data; Anomaly Detection; Real-time Threat Intelligence; Predictive Analytics; Machine Learning]
  5. Meeting Advanced Threats Head On
     • Challenges:
         – Expanding data storage in an RDBMS is expensive ($$)
         – Could not process unstructured data at scale
     [Diagram labels: Scaling Unstructured Data Processing; Challenges; RDBMS Economics; Unstructured Data]
  6. What Did the Solution Need to Do?
     [Diagram: Data Sources (Security Feeds, HTTP, Syslog, Firewall, Other); stages Collect Data, Process Data, Store Data, Serve Data, each marked "?"]
  7. How to do this with High Performance at Scale?
     • Parallel, Partitioned = fast, scalable
  8. Solution: Stream Processing Architecture
     • Data Ingest:
         – Kafka or MapR Streams: fast distributed messaging
     [Diagram: Sources (Security Feeds, HTTP, Syslog, Firewall, Other); Data Ingest; Topics]
  9. Fast Distributed Messaging
     • Topics organize events into categories
     • Topics decouple producers from consumers
  10. Fast Distributed Messaging
      • Topics are partitioned for fast throughput and scalability
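The partitioning idea on slides 9 and 10 can be sketched in a few lines of Python. This is a toy in-memory model, not the Kafka or MapR Streams client API: the `Topic` class, the topic name, the partition count, and the CRC32 hash are all assumptions made for illustration (Kafka's default partitioner uses a different hash function).

```python
import zlib
from collections import defaultdict


class Topic:
    """Toy in-memory topic: one append-only log per partition (illustrative only)."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.num_partitions = num_partitions
        self.partitions = defaultdict(list)  # partition id -> ordered list of messages

    def partition_for(self, key):
        # Hash the key so all events with the same key land in the same partition.
        return zlib.crc32(key.encode("utf-8")) % self.num_partitions

    def send(self, key, value):
        pid = self.partition_for(key)
        self.partitions[pid].append((key, value))
        return pid

    def read(self, pid, offset=0):
        # A consumer reads one partition independently, starting at its own offset.
        return self.partitions[pid][offset:]


# Hypothetical usage: events keyed by source IP, spread over four partitions.
topic = Topic("firewall-events", num_partitions=4)
for i in range(8):
    topic.send(f"10.0.0.{i % 3}", f"event-{i}")
```

Because the partition is derived from the key, events from one source stay in order within their partition, while different partitions can be written and read in parallel.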
  11. How to do this with High Performance at Scale?
      • Parallel, Partitioned:
          – Messaging
  12. Complex Event Processing with Storm and Esper
      • Stream Processing:
          – Storm: distributed real-time computation
          – Esper: Complex Event Processing
      [Diagram labels: Data Ingest; Topics; Stream Processing; Kafka Spout; Parser Bolt; Enrich Bolts; Esper; Kafka Bolt; Topic; Esper Spout; Alert Bolts; Cross-topology correlation of events]
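The spout-to-bolt flow named in the slide 12 diagram can be approximated with chained Python generators. This is only a single-process sketch of the data flow, not Storm's API: the function names mirror the diagram labels, while the three-field log format, the `geo_lookup` table, and the alert text are assumptions.

```python
def kafka_spout(lines):
    """Stand-in for the Kafka spout: emits raw log lines one at a time."""
    yield from lines


def parser_bolt(stream):
    """Parse 'ip_src activity outcome' lines into event dicts; skip malformed lines."""
    for line in stream:
        parts = line.split()
        if len(parts) != 3:
            continue
        ip_src, activity, outcome = parts
        yield {"ip_src": ip_src, "ec_activity": activity, "ec_outcome": outcome}


def enrich_bolt(stream, geo_lookup):
    """Attach a country to each event (the lookup table is hypothetical)."""
    for event in stream:
        yield {**event, "country": geo_lookup.get(event["ip_src"], "unknown")}


def alert_bolt(stream):
    """Emit an alert string for every failed activity."""
    for event in stream:
        if event["ec_outcome"] == "Failure":
            yield f"ALERT {event['ip_src']} ({event['country']}): {event['ec_activity']} failed"


raw = ["10.0.0.5 Logon Failure", "garbage", "10.0.0.6 Logon Success"]
geo = {"10.0.0.5": "US"}
alerts = list(alert_bolt(enrich_bolt(parser_bolt(kafka_spout(raw)), geo)))
```

In Storm each of these stages would run as many parallel tasks across a cluster; the generator chain only shows how tuples move from one stage to the next.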
  13. Complex Event Processing with Esper
      • Detect a related set or pattern of events within a time window
      • Example Pattern: Excess Login Failure
          – Same user, same source login failure

          SELECT *
          FROM Event(ip_src IS NOT NULL AND ec_activity = 'Logon' AND ec_outcome = 'Failure').std:groupwin(ip_src).win:time(300 sec)
          GROUP BY ip_src
          HAVING COUNT(*) = 10
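For readers unfamiliar with EPL, the query on slide 13 can be approximated in plain Python as a per-source sliding window. This is a sketch of the same idea, not Esper's evaluation semantics: the `LoginFailureDetector` class and the explicit `now` argument (in seconds) are assumptions, while the field names, the 300-second window, and the threshold of 10 come from the slide.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 300
THRESHOLD = 10


class LoginFailureDetector:
    def __init__(self, window=WINDOW_SECONDS, threshold=THRESHOLD):
        self.window = window
        self.threshold = threshold
        self.failures = defaultdict(deque)  # ip_src -> timestamps of recent failures

    def on_event(self, event, now):
        """Return True when this event brings ip_src to exactly `threshold` failures in the window."""
        ip = event.get("ip_src")
        if ip is None or event.get("ec_activity") != "Logon" or event.get("ec_outcome") != "Failure":
            return False
        times = self.failures[ip]
        times.append(now)
        # Expire failures that have left the time window.
        while times and times[0] <= now - self.window:
            times.popleft()
        return len(times) == self.threshold


detector = LoginFailureDetector()
failure = {"ip_src": "10.0.0.5", "ec_activity": "Logon", "ec_outcome": "Failure"}
# Twelve failures from the same source, one per second.
results = [detector.on_event(failure, now=t) for t in range(12)]
```

Only the tenth failure returns `True`; the eleventh and twelfth push the count past 10, which mirrors the `= 10` comparison in the `HAVING` clause and avoids repeating the alert on every later failure.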
  14. How to do this with High Performance at Scale?
      • Parallel, Partitioned:
          – Processing
  15. Real-time Detection of Advanced Threats: Examples
      • Data transferred from critical database servers
      • Large traffic flows from a host to a given IP address
      • Employee accessing database servers at unusual hours
      • User logging in from two different countries within a short window
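The last example on slide 15, a user logging in from two different countries within a short window, might be detected along these lines. This is a minimal sketch under stated assumptions: the `GeoVelocityDetector` class, the one-hour default window, and the idea that each login event already carries a country (for example from an enrichment step) are not taken from the slides.

```python
class GeoVelocityDetector:
    def __init__(self, window_seconds=3600):
        self.window = window_seconds
        self.last_login = {}  # user -> (timestamp, country) of the most recent login

    def on_login(self, user, country, ts):
        """Return an alert string if the country changed within the window, else None."""
        previous = self.last_login.get(user)
        self.last_login[user] = (ts, country)
        if previous is None:
            return None
        prev_ts, prev_country = previous
        if country != prev_country and ts - prev_ts <= self.window:
            return f"{user}: {prev_country} -> {country} within {ts - prev_ts}s"
        return None


geo_detector = GeoVelocityDetector()
first = geo_detector.on_login("alice", "US", ts=0)
second = geo_detector.on_login("alice", "FR", ts=600)   # country changed after 10 minutes
third = geo_detector.on_login("alice", "FR", ts=700)    # same country, no alert
```

Keeping only the most recent login per user keeps the state small; a production rule would likely also account for VPNs and shared accounts.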
  16. Complex Event Processing with Storm and Esper
      • Cross-topology correlation of events
  17. Solution: Stream Processing Architecture
      • NoSQL Storage
          – HBase: fast scalable storage and caching
          – Elasticsearch: indexing for real-time search analytics
      [Diagram labels: Stream Processing; HDFS Bolt; Index Bolt; HBase Bolt; MapR-FS; MapR-DB]
  18. Scalability with HBase (MapR-DB)
      • Storage Model:
          – RDBMS: normalized schema → joins for queries can cause a bottleneck
          – HBase: de-normalized schema → data that is read together is stored together
      [Diagram: example tables with columns Key, colB, colC]
  19. MapR-DB (HBase API) is Designed to Scale
      • Fast reads and writes by key!
      • Data is automatically partitioned by key range!
      [Diagram: three key ranges, each holding a table with columns Key, colB, colC]
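Key-range partitioning as described on slide 19 can be illustrated with a small in-memory model. This is not the HBase or MapR-DB client API: the `RangePartitionedTable` class and the fixed split keys are assumptions, and a real table splits its regions automatically as data grows.

```python
import bisect


class RangePartitionedTable:
    def __init__(self, split_keys):
        # split_keys are the sorted start keys of every region after the first.
        self.split_keys = sorted(split_keys)
        self.regions = [dict() for _ in range(len(self.split_keys) + 1)]

    def region_for(self, row_key):
        # Binary search over the split keys finds the region that owns row_key.
        return bisect.bisect_right(self.split_keys, row_key)

    def put(self, row_key, columns):
        self.regions[self.region_for(row_key)].setdefault(row_key, {}).update(columns)

    def get(self, row_key):
        return self.regions[self.region_for(row_key)].get(row_key)


# Hypothetical table with three regions: keys < "h", "h" <= keys < "p", keys >= "p".
table = RangePartitionedTable(["h", "p"])
table.put("alice", {"colB": "val"})
table.put("mallory", {"colB": "val", "colC": "xxx"})
table.put("zed", {"colC": "val"})
```

A read or write by key touches exactly one region, which is why lookups stay fast as regions are spread across more servers.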
  20. How to do this with High Performance at Scale?
      • Parallel, Partitioned:
          – Storage
  21. Solution: Stream Processing Architecture
      • Serve Data:
          – Machine Learning: threat modeling, anomaly detection
          – Security Analytics
      [Diagram labels: NoSQL Storage; MapR-FS; MapR-DB]
  22. Data Driven Forensics Investigation
      • What can the data tell us?
          – What happened within a time range?
          – How did the threat get in?
          – What are all the activities associated with a specific IP/user?
          – How much data was affected?
          – Has this occurred elsewhere in the past?
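Two of the questions on slide 22 (what happened within a time range, and what activity is associated with a specific IP) map naturally onto a range scan over sorted row keys. The sketch below assumes a composite row key of the form `ip|timestamp`; that key design, the `EventStore` class, and the sample events are illustrative assumptions, not the schema used in the talk.

```python
import bisect


def row_key(ip, ts):
    # Zero-pad the timestamp (non-negative, up to 10 digits) so that
    # lexicographic order matches numeric order.
    return f"{ip}|{ts:010d}"


class EventStore:
    def __init__(self):
        self.keys = []  # row keys kept in sorted order, as in a key-ordered store
        self.rows = {}  # row key -> activity description

    def put(self, ip, ts, activity):
        key = row_key(ip, ts)
        if key not in self.rows:
            bisect.insort(self.keys, key)
        self.rows[key] = activity

    def scan(self, ip, start_ts, end_ts):
        """Return (key, activity) pairs for one IP with start_ts <= ts < end_ts."""
        lo = bisect.bisect_left(self.keys, row_key(ip, start_ts))
        hi = bisect.bisect_left(self.keys, row_key(ip, end_ts))
        return [(k, self.rows[k]) for k in self.keys[lo:hi]]


store = EventStore()
store.put("10.0.0.5", 100, "login failure")
store.put("10.0.0.5", 250, "database read")
store.put("10.0.0.50", 150, "port scan")
store.put("10.0.0.9", 120, "login")
in_range = store.scan("10.0.0.5", 0, 200)
```

Because all rows for one IP are adjacent and ordered by time, both questions are answered by reading one contiguous slice of keys rather than filtering the whole table.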
  23. Solution: Stream Processing Architecture
  24. Key to Real Time: Event-based Data Flows
      • Key to Scale = Parallel, Partitioned:
          – Messaging
          – Processing
          – Storage
  25. Building a Complete Data Architecture
      [Diagram labels: Sources/Apps; Stream Processing; Bulk Processing; Web-Scale Storage (MapR-FS); Database (MapR-DB); Event Streaming (MapR Streams)]
  26. Key to Real Time: Convergence
      [Diagram labels: Apps (Customer Experience; Data Architecture Optimization; Security Investigation & Event Management; Operational Intelligence; Managed Services & Custom Apps); High Availability; Data Protection; Unified Security; Real Time; Multi-tenancy; Unified Management & Monitoring; Converged Data Platform (Event Streaming; Database; Storage)]
  27. Why Hadoop for Security Analytics?
      • Cost-effective for storing and analyzing large volumes of data in real time
      • Provides search and query, plus machine learning for activity correlation and anomaly detection
      • When it comes to Hadoop, select an enterprise distribution (e.g. MapR Converged Data Platform) so you can focus on your primary objective
  28. To Learn More:
      • http://learn.mapr.com/
  29. To Learn More:
      • Download example code
          – https://github.com/caroljmcdonald/mapr-streams-sparkstreaming-hbase
      • Read explanation of example code
          – https://www.mapr.com/blog/spark-streaming-hbase
  30. Q&A
      • Engage with us!
          – @mapr
          – https://www.mapr.com/blog/author/carol-mcdonald
          – mapr-technologies
