The internet and the spread of computing devices, from desktop computers to smartphones, have raised many security and privacy concerns. Protecting networks at scale requires automated systems that detect attacks on them. Traditional intrusion detection methods can detect previously known attacks, but they struggle with new, unknown attacks, which makes machine learning a strong candidate for addressing this challenge.
In this report, we investigate the use of machine learning for network intrusion detection by reviewing work done in this field. In particular, we look at the work of Pascoal et al.
1. Using Machine Learning in Networks Intrusion Detection Systems
OMAR SHAYA
Georg-August-Universität Göttingen
2. Sections
➤ Introduction
➤ Intrusion Detection Methodologies
➤ A Machine Learning Based IDS (Intrusion Detection System)
➤ Challenges of Using Machine Learning in Intrusion Detection
➤ Summary
➤ References
➤ Appendix
4. Increasing attacks on computer networks and the need for automated detection
• The internet and computer systems have raised numerous security and privacy issues
• Explosive growth in network use, driven by e.g. the internet, wireless networks, and cloud computing
• Thus, malicious attacks on networks have increased year over year
• Need to automate systems that detect these attacks
  • Traditional detection is based on known attacks
  • But what about attacks that were not seen before?
  • Machine learning?
INTRODUCTION
5. Definition: intrusion & intrusion detection
"Intrusion is an attempt to compromise CIA (Confidentiality, Integrity, Availability), or to bypass the security mechanisms of a computer or network"
"Intrusion detection is the process of monitoring the events occurring in a computer system or network, and analyzing them for signs of intrusion"
7. There are 3 main Detection Methodologies
• Signature-based Detection (SD)
  • A signature is a string or pattern that corresponds to a known attack or threat
  • SD compares patterns against captured events to recognize possible intrusions
  • Uses the knowledge accumulated about specific attacks and system vulnerabilities
  • Also known as Knowledge-based Detection or Misuse Detection
• Anomaly-based Detection (AD)
  • An anomaly is a deviation from "normal" behavior
  • Profiles of normal behavior are derived from monitoring network traffic
  • AD compares normal profiles with observed events to recognize attacks
• Stateful Protocol Analysis (SPA)
  • SPA depends on vendor-developed generic profiles for specific protocols
  • Protocols are based on standards from international standards organizations
• Hybrid IDS use multiple methodologies
  • SD and AD are complementary methods: the former concerns certain (known) attacks and the latter focuses on unknown attacks
INTRUSION DETECTION METHODOLOGIES
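The contrast between signature-based and anomaly-based detection, and how a hybrid IDS combines them, can be sketched in a few lines of Python. This is a minimal illustration rather than a real IDS: the signature patterns, the `requests_per_minute` feature, the function names, and the threshold `factor` are all hypothetical choices made for the example.

```python
# Hypothetical signatures: byte patterns assumed to correspond to known attacks.
SIGNATURES = [b"/etc/passwd", b"' OR '1'='1"]


def signature_detect(payload: bytes) -> bool:
    """SD: flag an event if it contains any known attack pattern."""
    return any(sig in payload for sig in SIGNATURES)


def anomaly_detect(requests_per_minute: float, normal_profile: list,
                   factor: float = 3.0) -> bool:
    """AD: flag an event that deviates strongly from the normal profile.

    The profile is a list of request rates observed during normal operation.
    Using 'factor' times the largest normal value is an illustrative
    threshold only, not a recommended rule.
    """
    return requests_per_minute > factor * max(normal_profile)


def hybrid_detect(payload: bytes, requests_per_minute: float,
                  normal_profile: list) -> bool:
    """Hybrid IDS: raise an alert if either methodology fires."""
    return (signature_detect(payload)
            or anomaly_detect(requests_per_minute, normal_profile))


profile = [10.0, 12.0, 9.0, 11.0]  # made-up "normal" request rates

# Known attack pattern at a normal rate: caught by SD.
known_attack = hybrid_detect(b"GET /../../etc/passwd", 10.0, profile)
# No known pattern, but an unusual rate: caught by AD only.
unknown_attack = hybrid_detect(b"GET /index.html", 500.0, profile)
# No known pattern and a normal rate: no alert.
benign = hybrid_detect(b"GET /index.html", 11.0, profile)
```

The second call shows why the two methods are complementary: the payload matches no signature, so only the deviation from the normal profile reveals it.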
8. There are 3 main Detection Methodologies
• Hybrid IDS use multiple methodologies
  • E.g. SD and AD are complementary methods
  • SD concerns certain (known) attacks and AD focuses on unknown attacks
Signature-based Detection (SD)*
• SD compares patterns against captured events to recognize possible intrusions
• A signature is a string or pattern that corresponds to a known attack or threat
• Uses the knowledge accumulated about specific attacks and system vulnerabilities
Anomaly-based Detection (AD)
• AD compares normal profiles with observed events to recognize attacks
• An anomaly is a deviation from "normal" behavior
• Profiles of normal behavior are derived from monitoring network traffic
Stateful Protocol Analysis (SPA)
• SPA depends on vendor-developed generic profiles for specific protocols
• The "stateful" in SPA indicates that the IDS can know and trace protocol states (e.g., pairing requests with replies)
• Protocols are based on standards from international standards organizations
* Also known as Knowledge-based Detection or Misuse Detection
9. Pros and cons of Intrusion Detection Methods
Table 1: Pros and cons of intrusion detection methodologies. Source [2]
Signature-based Detection (SD)
• Pros
  • Simplest and effective method to detect attacks
  • Detail contextual analysis
• Cons
  • Ineffective with unknown attacks and variants of known attacks
  • Little understanding to states and protocols
  • Hard to keep signatures/patterns up to date
  • Time consuming to maintain the knowledge
Anomaly-based Detection (AD)
• Pros
  • Effective to detect new and unforeseen vulnerabilities
  • Less dependent on OS
  • Facilitate detections of privilege abuse
• Cons
  • Weak profiles accuracy due to observed events
  • Unavailable during rebuilding of behavior profiles
  • Difficult to trigger alerts in right time
Stateful Protocol Analysis (SPA)
• Pros
  • Know and trace protocol states
  • Distinguish unexpected sequences of commands
• Cons
  • Resource consuming to protocol state tracing and examination
  • Unable to inspect attacks looking like benign protocol behaviors
  • Might be incompatible to dedicated OSs or APs
10. A MACHINE LEARNING BASED IDS
IDS: Intrusion Detection System
11. Machine learning in anomaly detection
• Anomaly-based Detection (AD)
  • Easy when it is possible to characterize what is normal in the data using a simple mathematical model, e.g. a normal distribution
  • Most interesting real-world systems have complex behavior that doesn't follow such a distribution
  • Machine learning is useful for learning the characteristics of the system from observed data
• Feature selection is the process of selecting a subset of relevant features (variables, predictors) for use in model construction. Feature selection techniques are used for three reasons:
  • Simplification of models to make them easier to interpret
  • Shorter training times
  • Enhanced generalization by reducing overfitting
• Outlier detection: an outlier is an observation point that is distant from other observations
12. Robust Feature Selection and Robust PCA for Internet Traffic Anomaly Detection
• Couples a feature selection algorithm with an outlier detection method
• Uses robust statistics tools in both procedures
  • Reliable results even in the presence of outliers
• Feature selection based on a robust mutual information estimator
  • MI (Mutual Information): an information-theoretic metric that captures both linear and non-linear dependencies
• Outlier detection based on robust PCA (Principal Component Analysis)
  • PCA is a mathematical procedure used to reduce the dimensionality of a problem
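To illustrate why mutual information is attractive for feature selection, the sketch below compares it with the Pearson correlation on a purely non-linear dependency. Note the assumptions: this is the plain plug-in MI estimate for discrete data, not the robust estimator used in [1], and the data (`ys = xs**2`) and function names are made up for the example.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of MI (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        p_indep = (px[x] / n) * (py[y] / n)
        mi += p_joint * math.log2(p_joint / p_indep)
    return mi


def pearson(xs, ys):
    """Pearson correlation coefficient: measures linear dependency only."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


# y is fully determined by x, but the relationship is not linear.
xs = [-2, -1, 0, 1, 2] * 20
ys = [x * x for x in xs]

corr = pearson(xs, ys)            # zero: no linear dependency
mi = mutual_information(xs, ys)   # clearly positive: dependency detected
```

A correlation-based filter would discard `xs` as irrelevant to `ys` here, while an MI-based filter would keep it.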
13. Robust Feature Selection and Robust PCA for Internet Traffic Anomaly Detection
• Feature selection
  • Important preprocessing step (filter)
  • Reduces the dimensionality of high-dimensional data
  • Removes irrelevant data
  • Increases learning accuracy
  • Gives significant performance gains
14. Robust Feature Selection and Robust PCA for Internet Traffic Anomaly Detection
• Robust statistics
  • Reliable results even in the presence of outliers
Example:
• In a normal distribution, the inner 95% lies within "center ± 1.96 × spread"
• Center: instead of the mean, take the median
• Spread: instead of the SD (standard deviation), take the MAD (median absolute deviation)
Source [1]
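The "center ± 1.96 × spread" example can be tried out directly. The sketch below is an illustration under stated assumptions, not the procedure from [1]: the data values are made up, the function names are hypothetical, and the MAD is multiplied by the constant 1.4826, a common convention (assumed here, not stated on the slide) that makes it comparable to the standard deviation for normally distributed data.

```python
import statistics


def classical_interval(values, z=1.96):
    """Non-robust version: center = mean, spread = standard deviation."""
    center = statistics.mean(values)
    spread = statistics.stdev(values)
    return center - z * spread, center + z * spread


def robust_interval(values, z=1.96, scale=1.4826):
    """Robust version: center = median, spread = scaled MAD."""
    center = statistics.median(values)
    mad = statistics.median([abs(v - center) for v in values])
    spread = scale * mad  # 'scale' is an assumed consistency constant
    return center - z * spread, center + z * spread


def flag_outliers(values, interval):
    """Return the values that fall outside the given (low, high) interval."""
    low, high = interval
    return [v for v in values if v < low or v > high]


# Made-up measurements: eight regular values and two extreme ones.
data = [10, 11, 9, 10, 12, 10, 11, 9, 1000, 1000]

flagged_classical = flag_outliers(data, classical_interval(data))
flagged_robust = flag_outliers(data, robust_interval(data))
```

With this data, the two extreme values inflate the mean and standard deviation so much that the classical interval still contains them and nothing is flagged. The median and MAD are barely affected, so the robust interval flags both extreme values.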
15. Dataset creation for training and testing (1/2)
• Dataset collected by mirroring the traffic passing through the switch of:
  • A private laboratory network of 17 inter-connected PCs
    • 10 for users producing licit traffic
    • 1 for the server, 1 for measurements
    • 5 for attacks
• Licit traffic
  • File sharing (BitTorrent)
  • Video streaming (IPTV over TCP)
  • Web browsing (HTTP)
• Attacks
  • Botnets
    • Port-scans: identify other targets vulnerable to infection
    • Snapshots: a type of identity theft for stealing personal information
  • Other botnet attacks are not used, e.g. spyware, malware, denial of service, and email spam
    • Happen uniquely at host level
    • Can be detected by e.g. anti-virus, monitoring at routers/firewalls, email scanning
16. Dataset creation for training and testing (2/2)
• Customer usage profiles
  • (a) Soft browsing (HTTP only)
  • (b) File sharing machine (BitTorrent only)
  • (c) File sharing user (BitTorrent and HTTP)
  • (d) Heavy user (HTTP, BitTorrent, and Streaming)
• Network scenarios
  • (B) Business user
    • 100% (a)
  • (R) Residential user
    • 30% (b), 40% (c), 30% (d)
• Attack intensities
  • (1) 6% (5% snapshot, 1% port-scan)
  • (2) 20% (15% snapshot, 5% port-scan)
  • (3) 35% (30% snapshot, 5% port-scan)
Table 2. Source [1]
17. Results (1/3)
• 6 types of anomaly detectors A-B
  • A: feature selection method, B: outlier detection method
  • R (robust)
  • NR (non-robust)
  • None (no method)
• Performance measures
  • Nr Ftrs: number of selected features
  • Recall: probability that an observation is classified as an anomaly when it is in fact an anomaly
  • False positive rate (FPR): probability that an observation is classified as an anomaly when it is in fact a regular observation
  • Precision: probability that an observation is anomalous given that it is classified as an anomaly
Table 3. Source [1]
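The three performance measures follow directly from the counts of a confusion matrix. The sketch below shows the arithmetic; the function name and the example counts are hypothetical and are not taken from Table 3.

```python
def detection_metrics(tp, fp, tn, fn):
    """Compute recall, false positive rate, and precision.

    tp: anomalies classified as anomalies (true positives)
    fp: regular observations classified as anomalies (false positives)
    tn: regular observations classified as regular (true negatives)
    fn: anomalies classified as regular (false negatives)
    A measure is reported as NaN when its denominator is zero.
    """
    nan = float("nan")
    recall = tp / (tp + fn) if (tp + fn) else nan
    fpr = fp / (fp + tn) if (fp + tn) else nan
    precision = tp / (tp + fp) if (tp + fp) else nan
    return recall, fpr, precision


# Made-up example: 8 anomalies, all detected, plus 2 false alarms
# among 92 regular observations.
recall, fpr, precision = detection_metrics(tp=8, fp=2, tn=90, fn=0)
```

In this made-up case recall is 1 because no anomaly is missed, yet precision is only 0.8 because one in five alerts is a false alarm. This is why recall, FPR, and precision are reported together.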
18. Results (2/3)
• R-R detector achieved the best results
  • Recall is always 1
  • In scenarios B1, B2, B3, and R3, performance is maximal
  • FPR and precision are close to their optimal values
• Improvement over the non-robust version is high
  • Low recall means a large percentage of anomalies are not correctly identified
  • In B2, B3, and R3, recall improved from 0.167, 0.273, and 0.125 to 1
• Feature selection
  • Feature selection reduces Nr Ftrs and improves performance
  • B3 and R3: no feature selection is sometimes better than non-robust feature selection
Table 3. Source [1]
19. Results (3/3)
• Compare R-NR (top) and R-R (bottom)
• Any point with a score or distance larger than a threshold (the lines) is considered an anomaly
• In the R-NR case there is confusion around snapshots
  • Thus the poor recall value of 0.125
  • Proximity in behavior between snapshots and some HTTP and BitTorrent traffic fools the non-robust outlier detector
  • All consist of small file uploads
Fig. 2. Source [1]
21. CHALLENGES OF USING MACHINE LEARNING IN INTRUSION DETECTION
22. Outliers, cost of error, semantics, and evaluation
• Outlier detection
  • Hard to define "normal" in network traffic, as usage varies in every session and with new applications (diversity of network traffic)
• High cost of errors
  • Cost of misclassification is extremely high
  • False positive: expensive analyst time
  • False negative: can cause serious damage to an organization
  • Errors in other applications of ML are not as expensive, e.g. product recommendations, OCR, spam detection
• Semantic gap
  • Currently, evaluation only assesses the capability to identify deviations from the normal profile (which could be good or bad)
  • Need to interpret results from the operator's point of view: what do they mean?
• Difficulties with evaluation
  • Designing sound evaluation schemes can be more difficult than building the detector itself
  • Lack of public data sets for assessing anomaly detection
  • Hard to obtain real data sets for many reasons, e.g. the risk of leaking personal data
  • Simulated data is not accurate
26. References
[1] C. Pascoal, M. Oliveira, R. Valadas, P. Filzmoser, P. Salvador and A. Pacheco. Robust Feature Selection and Robust PCA for Internet Traffic Anomaly Detection. In Proceedings IEEE INFOCOM, pages 1755-1763, 2012
[2] H. Liao, C. Lin, Y. Lin and K. Tung. Intrusion Detection System: A Comprehensive Review. In Journal of Network and Computer Applications, pages 16-24, 2013
[3] R. Sommer and V. Paxson. Outside the Closed World: On Using Machine Learning For Network Intrusion Detection. In IEEE Symposium on Security and Privacy, pages 305-316, 2010
[4] Feature Selection. https://en.wikipedia.org/wiki/Feature_selection on 6 August 2015
[5] Outlier. https://en.wikipedia.org/wiki/Outlier on 6 August 2015
[6] Anomaly Detection - Using Machine Learning to Detect Abnormalities in Time Series Data. http://blogs.technet.com/b/machinelearning/archive/2014/11/05/anomaly-detection-using-machine-learning-to-detect-abnormalities-in-time-series-data.aspx on 6 August 2015
27. Precision and Recall
APPENDIX
Source: Dr. Stephan Sigg's slides from the Machine Learning and Pervasive Computing course, SoSe 2015