Detecting Threats with Analytics and Machine Learning (ML)
Shomiron DAS GUPTA (GCIA)
Founder, CEO
NETMONASTERY Inc.
#SACON
Agenda
■ Dissecting detection systems
■ Why do we need “analytics”
■ Learning systems
■ Anomaly / Heuristics / Dictionaries
■ Machine Learning Use Cases
■ Why ML works / fails
Dissecting Detection Systems
■ Signature based
■ Anomaly engines
■ Analytics workbench
■ Learning systems
Why do we need “Analytics”
The Real Need for Analytics
■ Cyber security refresh rate
■ Custom payloads from attackers
■ Servers not the target
■ Speed with volume
Learning Systems
■ Heuristic learning
■ Anomaly engines
■ Spot / Baseline / Profilers
■ Time series analytics
■ Classifiers
■ Unassisted learning
Heuristics
■ Virus detection, OS Rootkits
[Timeline figure: variants v1, v2, v3, v4; time axis marked Day 1, Day 6, Day 14, Day 42]
Anomaly
■ DDoS Detection, Protocol Obfuscation, Malformed Data Streams, Application Breach
Fixed Anomaly Model Structure: could be traffic behavior, protocol behavior, or application behavior.
[Diagram labels: Fixed Anomaly Model Structure, Realtime Data]
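A fixed anomaly model can be pictured as a static set of constraints that every realtime record is checked against. The sketch below is only an illustration of that idea for protocol behavior, using DNS query names as the data stream. The `FixedModel` class, the `violations` function, and the allowed character set are assumptions made for this example, not something taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedModel:
    """Hypothetical fixed model of expected DNS query-name behavior."""
    max_label_len: int = 63    # DNS label length limit
    max_name_len: int = 253    # textual limit for a full domain name
    # Assumed character set for "normal" names in this example.
    allowed: frozenset = frozenset("abcdefghijklmnopqrstuvwxyz0123456789-_.")


def violations(qname, model=FixedModel()):
    """Return the list of fixed-model rules that a query name breaks."""
    found = []
    name = qname.rstrip(".").lower()
    labels = name.split(".")
    if len(name) > model.max_name_len:
        found.append("name too long")
    if any(len(label) > model.max_label_len for label in labels):
        found.append("label too long")
    if any(label == "" for label in labels):
        found.append("empty label")
    if any(ch not in model.allowed for ch in name):
        found.append("unexpected character")
    return found


# Realtime data would arrive as a stream; a short list stands in for it here.
stream = ["www.example.com", "a" * 64 + ".example.com", "bad name.example.com"]
for qname in stream:
    print(qname[:20], violations(qname))
```

Because the model never changes, anything outside these constraints is reported regardless of what the environment normally looks like.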
Spot / Baseline / Profilers
■ Unordered Action - new rule, new device, long dead user, database user event
LEARN PHASE (Data → Build Model): Transcode model with feature aggregation performed on realtime data flows
EVAL PHASE (Data → Evaluation): Identification of outliers based on pre-approved model
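The learn and eval phases can be sketched as a small profiler that records which values it has seen for each field, then flags anything new. This is a minimal sketch under assumed inputs: the `Profiler` class and the event fields (`user`, `device`) are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class Profiler:
    """Hypothetical spot profiler: learn seen values, then flag unseen ones."""

    def __init__(self):
        # field name -> set of values observed during the learn phase
        self.seen = defaultdict(set)

    def learn(self, event):
        """LEARN PHASE: aggregate each field's value into the model."""
        for field, value in event.items():
            self.seen[field].add(value)

    def evaluate(self, event):
        """EVAL PHASE: return fields whose values were never seen while learning."""
        return sorted(
            field for field, value in event.items()
            if value not in self.seen.get(field, set())
        )


profiler = Profiler()
for event in (
    {"user": "alice", "device": "fw-1"},
    {"user": "bob", "device": "fw-1"},
):
    profiler.learn(event)

print(profiler.evaluate({"user": "alice", "device": "fw-1"}))
print(profiler.evaluate({"user": "mallory", "device": "fw-2"}))
```

A "long dead user" check would need a last-seen timestamp per value as well; that is left out here to keep the example short.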
Time Series Analytics
■ DDoS, Flow Outliers, protocol breach, zombies
THRESHOLDING: Fixed limits are set to detect breach in activity
DYNAMIC THRESHOLDING: Moving window analysis of time series data
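The difference between the two approaches can be shown on a small series. The sketch below assumes a moving-window mean plus `k` standard deviations as the dynamic limit; that particular rule, the function names, and the parameters `window`, `k`, and `min_sigma` are choices made for this example rather than anything stated on the slide.

```python
from collections import deque
from statistics import mean, pstdev


def fixed_threshold(series, limit):
    """Flag indexes where the value exceeds a fixed limit."""
    return [i for i, x in enumerate(series) if x > limit]


def dynamic_threshold(series, window=5, k=3.0, min_sigma=1.0):
    """Flag indexes where the value exceeds mean + k * std of the previous window."""
    recent = deque(maxlen=window)
    flagged = []
    for i, x in enumerate(series):
        if len(recent) == window:
            mu = mean(recent)
            # min_sigma keeps a perfectly flat window from flagging tiny changes
            sigma = max(pstdev(recent), min_sigma)
            if x > mu + k * sigma:
                flagged.append(i)
        recent.append(x)
    return flagged


busy_link = [10, 11, 9, 10, 12, 10, 11, 80, 10, 9]
quiet_link = [1, 1, 2, 1, 1, 1, 20]

print(fixed_threshold(busy_link, limit=50), dynamic_threshold(busy_link))
print(fixed_threshold(quiet_link, limit=50), dynamic_threshold(quiet_link))
```

On the quiet link the spike to 20 stays under the fixed limit of 50, while the moving window still flags it because it is far outside that link's own recent behavior.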
Classifiers
■ SPAM, Botnets, Authentication Anomalies
Clustering Process
- Suitable feature selection (PCA)
- Training set (static / dynamic)
- Cleaning training data
- Regression to find mean
- Operations
- Feedback and Re-tuning
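A few of the steps above (cleaning the training data, finding a mean per class, operating, and re-tuning on feedback) can be sketched with a nearest-centroid classifier. This technique is substituted here purely for illustration and the slide does not name it. Feature selection with PCA is not shown, and the two features (links per message, fraction of uppercase characters) are hypothetical.

```python
from math import dist
from statistics import fmean


def clean(rows):
    """Drop rows with missing feature values (a stand-in for data cleaning)."""
    return [(x, label) for x, label in rows if all(v is not None for v in x)]


def train(rows):
    """Compute the mean feature vector (centroid) for each label."""
    by_label = {}
    for features, label in clean(rows):
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(fmean(column) for column in zip(*vectors))
        for label, vectors in by_label.items()
    }


def classify(model, features):
    """Operations: assign the label whose centroid is closest."""
    return min(model, key=lambda label: dist(model[label], features))


# Hypothetical features: (links per message, fraction of uppercase characters)
training = [
    ((8, 0.60), "spam"),
    ((10, 0.50), "spam"),
    ((1, 0.10), "ham"),
    ((0, 0.05), "ham"),
    ((None, 0.20), "ham"),  # removed by clean()
]
model = train(training)
print(classify(model, (7, 0.40)), classify(model, (1, 0.10)))

# Feedback and re-tuning: an analyst labels a new sample, then the model is rebuilt.
training.append(((3, 0.10), "ham"))
model = train(training)
```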
Unassisted Learning
■ SPAM, DNS Detection, L2 Attacks
[Diagram: Self Adjusting Loop. Labels: Data, Profiler, Model, Alert, Analyst, Feedback]
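One way to picture a self-adjusting loop is a model that learns from unlabelled data as it arrives, raises alerts on large deviations, and absorbs analyst feedback when an alert turns out to be benign. The sketch below assumes a running mean and standard deviation (Welford's method) over a single numeric metric, such as DNS queries per minute from a host. The class name, the `tolerance` and `warmup` parameters, and the feedback rule are all assumptions for illustration.

```python
class SelfAdjustingModel:
    """Hypothetical profiler and model with an analyst feedback path."""

    def __init__(self, tolerance=3.0, warmup=5, min_std=1.0):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations (Welford)
        self.tolerance = tolerance
        self.warmup = warmup
        self.min_std = min_std

    def _update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        return (self.m2 / self.n) ** 0.5 if self.n else 0.0

    def observe(self, x):
        """Return True if x raises an alert; otherwise learn from x."""
        limit = self.tolerance * max(self.std, self.min_std)
        if self.n >= self.warmup and abs(x - self.mean) > limit:
            return True
        self._update(x)
        return False

    def feedback(self, x, false_positive):
        """Analyst feedback: a benign alert is folded back into the model."""
        if false_positive:
            self._update(x)


model = SelfAdjustingModel()
for value in [10, 11, 9, 10, 12, 10]:
    model.observe(value)

first = model.observe(40)            # alert raised
model.feedback(40, false_positive=True)
second = model.observe(40)           # accepted after feedback
print(first, second)
```

Values that raise an alert are not learned until an analyst confirms them as benign, which is what keeps the loop from silently adapting to an attack.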
When is ML working
■ Credible / Clean training data
■ Positive and timely feedback
■ Picking the right features
■ Consistent feature variation
■ Consistent data pattern
Where does ML work
■ DNS based detection
■ DDoS / Traffic anomaly
■ SPAM Mail filters
■ Authentication
■ Application modelling
■ Threat Intelligence
ML is failing
■ Variance challenge
■ The “stale dataset” problem
■ Mass labelling
■ Complex selection challenges
Getting Started with ML
■ Programming in R / Python
■ Data platforms - Splunk, DNIF
■ Infrastructures - Generic Hadoop, Hortonworks
https://dnif.it
Get started with 100Gb free every month forever
Shomiron DAS GUPTA
shomiron@netmonastery.com
+91 9820336050
Thank You!
