2. About Me
Education:
Master of Science, Computer Information Technology, Purdue University
Thesis Topic: Application of Hardware-Based Machine Learning for Intrusion Detection Using Cognitive Processors
Bachelor of Science, Computer Engineering, Software, IUST
Awards:
2nd Place in the Soccer Simulation 3D League competition of the RoboCup German Open, April 2005
1st Place in the Rescue Simulation League competition of the RoboCup German Open, April 2005
3rd Place in the Simulation 3D League of the RoboCup International Competition, July 2005, Japan
Industrial experience:
4. Goal of This Presentation
Is not:
The science behind machine learning
The science behind artificial neural networks
Creating a super-intelligent robot
Is:
Applying machine learning in day-to-day life
Thinking like a data scientist
Extracting meaningful information from data
5. Caution
I cannot explain everything
You cannot get every detail
Try to get the big picture
Pick up some useful keywords
Connect them with your daily work
6. Problem
New technologies come to market, and with them new vulnerabilities are added to our systems.
Nowadays many devices connect to the internet: not only computers but also TVs, refrigerators, cell phones, doors and even small sensors.
Today’s markets are less tolerant of downtime caused by security issues or attacks.
Attacks such as Denial of Service can cause serious problems by making a service unavailable and increasing downtime.
7. Intrusion Detection Systems
Intrusion detection systems use two approaches to detect malicious traffic:
◦ Signature-based detection, which relies on a previously created list of known attacks
◦ Anomaly detection
The signature-based approach cannot detect novel attacks or zero-day attacks.
Anomaly detection uses machine learning algorithms; however, most of them are resource intensive.
Performance and response time are crucial; fast detection is a MUST.
8. Signature-based IDS
Signature-based intrusion detection systems need to check traffic against thousands or even millions of patterns gathered from previously executed attacks.
Novel attacks, or previous attacks with even minor changes, are almost impossible to detect at run time.
To add the signature of an attack to the base system, the attack first needs to be detected and analyzed, and then its pattern must be created.
9. Definitions
Machine learning (ML): a system that can learn from data.
Embedded system: a computer system, often with real-time computing constraints.
Cognitive processor: a processing unit that uses the idea of a neural network to work like the human brain. Like the brain, it consists of small units called neurons. Each neuron in this computational model has its own memory and its own logic for operating on that memory.
IDS: intrusion detection system.
RCE: Restricted Coulomb Energy, a hyperspherical classifier.
KNN: K-Nearest Neighbor, a non-parametric method for classification and regression.
10. Machine Learning
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.” – T. Mitchell (1997)
17. Applications of ML - Banking & Telecom
Identify:
Prospective customers
Dissatisfied customers
Good customers
Bad payers
Obtain:
More effective advertising
Less credit risk
Less fraud
Decreased churn rate
18. Applications of ML - Computer & Internet
Computer interfaces:
Troubleshooting wizards
Handwriting and speech
Brain waves
Internet:
Hit ranking
Spam filtering
Text categorization
Text translation
Recommendation
19. KNN
An object is classified by a majority vote of
its neighbors, with the object being assigned
to the class most common among
its k nearest neighbors
The neighbors are taken from a set of
objects for which the class (for k-NN
classification) is known.
If k = 1, then the object is simply assigned to
the class of that single nearest neighbor.
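The majority-vote rule above can be sketched in a few lines. This is a minimal illustration, not the classifier used in the project: the function name `knn_classify`, the Manhattan (L1) distance, and the toy two-feature data are all assumptions made for the example.

```python
from collections import Counter


def knn_classify(query, samples, labels, k=3):
    """Return the class most common among the k samples nearest to query."""
    # Manhattan (L1) distance is an assumption; any distance metric would do.
    ranked = sorted(
        (sum(abs(a - b) for a, b in zip(query, sample)), label)
        for sample, label in zip(samples, labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: two features per object, classes 1 and 2 (made up for the demo).
train = [(10, 12), (11, 14), (200, 190), (210, 205)]
classes = [1, 1, 2, 2]

print(knn_classify((12, 13), train, classes, k=3))
print(knn_classify((205, 200), train, classes, k=1))
```

With k = 1 the vote reduces to copying the class of the single nearest neighbor, as described above.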
20. Artificial Neural Networks
Artificial neural networks are built out of a densely interconnected set of simple units, where
each unit takes a number of real-valued inputs (possibly the outputs of other units) and
produces a single real-valued output (which may become the input to many other units).
The human brain is estimated to contain a densely interconnected network of approximately 10^11 neurons, each connected, on average, to 10^4 others.
21. RCE
The architecture of the RCE network
contains two layers: A hidden layer and an
output layer.
The hidden layer is fully interconnected to
all components of an input pattern
The output layer is sparsely connected to
the hidden layer; each hidden unit projects
its output to one and only one output unit.
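To make the two-layer structure more concrete, here is a simplified software sketch of the general RCE idea. It is not the CM1K's exact behavior. Each hidden unit is modeled as a prototype with a center, a radius (its influence field), and exactly one category, which mirrors the "one hidden unit, one output unit" wiring. The class name `RCE`, the L1 distance, and the `max_radius` parameter are assumptions made for the example.

```python
def l1(a, b):
    """Manhattan distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


class RCE:
    def __init__(self, max_radius=20):
        self.max_radius = max_radius
        self.units = []  # each hidden unit is [center, radius, category]

    def learn(self, x, category):
        covered = False
        radius = self.max_radius
        for unit in self.units:
            d = l1(x, unit[0])
            if unit[2] == category:
                # An existing unit of the right category already fires for x.
                covered = covered or d < unit[1]
            else:
                # Shrink units of other categories so they no longer contain x,
                # and keep the new unit's field away from their centers.
                unit[1] = min(unit[1], d)
                radius = min(radius, d)
        if not covered:
            self.units.append([tuple(x), radius, category])

    def classify(self, x):
        """Return the category if exactly one category fires, else None."""
        hits = {u[2] for u in self.units if l1(x, u[0]) < u[1]}
        return hits.pop() if len(hits) == 1 else None


net = RCE(max_radius=20)
net.learn((10, 10), 1)
net.learn((60, 60), 2)
net.learn((14, 14), 1)  # already covered, so no new hidden unit is added
print(len(net.units), net.classify((12, 12)), net.classify((35, 35)))
```

In this sketch an input that falls inside no field, or inside fields of different categories, is reported as `None` rather than being forced into a class.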
22. IDS Literature Review
A signature-based IDS watches network packets and compares that traffic to a database of known attacks, called signatures. However, there is a time gap between an attack first appearing and the time the system can detect it (Barman 2012).
In 2010, Stuxnet, a computer worm, affected a nuclear facility. It was designed to harm PLC systems (Falliere, 2011).
In 2004, Baker and Prasanna proposed a methodology for building an efficient IDS using FPGAs. They showed that this methodology is 8 times faster than a shift-and-compare architecture. Although they reached high throughput, the number of false-positive errors increased.
In 2013, Yoon et al. suggested a multicore-based IDS. Shared resources in processors create many problems and add considerable complexity to developing systems on those processors. They tried to detect malicious behavior using statistical analysis.
24. Architecture (2) – CM1K
It features 1024 neurons working in parallel, implementing two non-linear classifiers.
Learns and recognizes patterns of up to 256 bytes (1 byte per component)
Classifies patterns into up to 32,768 categories
Choice of Restricted Coulomb Energy (RCE) or K-Nearest Neighbor (KNN) classifiers
Low cost, small footprint, low power consumption (0.5 W)
Recognition time independent of the number of neurons
25. Methodology - Data Collection
A small packet sniffer was developed, based on the libpcap library.
The sniffer is installed on an embedded device, a Raspberry Pi.
Once it reads a packet header, it stores it in CSV format.
26. Methodology – Data Collection (2)
To obtain the required samples, a small isolated LAN was set up.
Normal packets such as ping, traceroute and other TCP streams were generated in this network.
Anomalous packets were gathered by running network attacks with the Netwox toolset.
The dataset has 10 features.
27. Methodology – Data Collection (3)
Features
src_ip
dst_ip
tos
len
id
off
ttl
prt
src_p
dst_p
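The sketch below shows one way the ten features could be pulled out of a raw packet and written as CSV. It is a Python illustration only, not the libpcap-based sniffer used in the project. The mapping of the feature names onto IPv4 header fields (for example `off` as the flags/fragment-offset word and `prt` as the protocol number) is an assumption based on their names, and `extract_features` and `to_csv` are hypothetical helpers.

```python
import csv
import io
import socket
import struct

FEATURES = ["src_ip", "dst_ip", "tos", "len", "id", "off", "ttl", "prt", "src_p", "dst_p"]


def extract_features(packet: bytes) -> dict:
    """Parse raw bytes starting at the IPv4 header into the ten features."""
    (ver_ihl, tos, length, ident, off, ttl, proto, _checksum, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20]
    )
    header_len = (ver_ihl & 0x0F) * 4  # the IHL field counts 32-bit words
    src_p = dst_p = 0  # left at 0 for protocols without ports (assumption)
    if proto in (6, 17) and len(packet) >= header_len + 4:  # 6 = TCP, 17 = UDP
        src_p, dst_p = struct.unpack("!HH", packet[header_len:header_len + 4])
    return {
        "src_ip": socket.inet_ntoa(src), "dst_ip": socket.inet_ntoa(dst),
        "tos": tos, "len": length, "id": ident, "off": off,
        "ttl": ttl, "prt": proto, "src_p": src_p, "dst_p": dst_p,
    }


def to_csv(rows) -> str:
    """Serialize feature rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FEATURES)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hand-built UDP packet used only as demo input (not captured traffic).
demo = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 28, 0x1234, 0, 64, 17, 0,
                   socket.inet_aton("192.168.0.2"), socket.inet_aton("192.168.0.1"))
demo += struct.pack("!HHHH", 5000, 53, 8, 0)
print(to_csv([extract_features(demo)]))
```

Note that the IP addresses come out as dotted strings here; they would still need a numeric encoding before being handed to a classifier that only accepts byte values.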
28. Methodology - Data Normalization
Only 1 byte is available for each feature, and 1 byte cannot store values higher than 255.
The CM1K chip accepts only integer values, so the values were rounded.
Collected data must therefore be normalized to fit this range. This was achieved using the following formula:
$$x_{new} = r_{down} + \frac{x - x_{min}}{x_{max} - x_{min}} \times (r_{up} - r_{down})$$
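As a worked illustration of the normalization step, the function below rescales one feature value into a byte range and rounds it to an integer. The function name `normalize`, the default range of 0 to 255, and the handling of a constant feature are assumptions; the slides state only that each feature must fit in one byte and that values were rounded.

```python
def normalize(x, x_min, x_max, r_down=0, r_up=255):
    """Rescale x from [x_min, x_max] to [r_down, r_up] and round to an integer."""
    if x_max == x_min:
        # A constant feature carries no information; map it to the lower bound.
        return r_down
    scaled = r_down + (x - x_min) / (x_max - x_min) * (r_up - r_down)
    # Python's round() rounds exact halves to the nearest even integer.
    return int(round(scaled))


# Example: a 16-bit field such as a port number squeezed into one byte.
print(normalize(1500, 0, 65535))
print(normalize(65535, 0, 65535))
```

For a 16-bit field, the value 1500 scales to about 5.84 and is stored as 6, which shows how much resolution is lost when wide fields are compressed into one byte.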
29. Methodology - Classification and Training
A class column was added to the dataset. For normal data the class is ‘1’, and for data gathered from anomalous traffic the class is ‘2’.
10 pairs of test/train files were prepared. Each file contained 512 samples of normal traffic and 512 samples of anomalous traffic.
The data must be sent from the Arduino board to the CM1K.
After the CM1K was trained, the Arduino board loaded the test file into the chip.
The chip sends back the distances between the test samples and the trained model, starting from the shortest distance.
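The preparation of the test/train pairs described above could look roughly like the following sketch. The slides do not say how samples were selected, so the random sampling, the fixed seed, and the choice to keep the train and test files of a pair disjoint are all assumptions, as is the helper name `make_pairs`.

```python
import random


def make_pairs(normal, anomaly, n_pairs=10, per_class=512, seed=0):
    """Build n_pairs of (train, test) files with per_class samples per class."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        # Draw enough samples for both files without reusing any (assumption).
        n = rng.sample(normal, 2 * per_class)
        a = rng.sample(anomaly, 2 * per_class)
        # Class 1 marks normal traffic, class 2 marks anomalous traffic.
        train = [(s, 1) for s in n[:per_class]] + [(s, 2) for s in a[:per_class]]
        test = [(s, 1) for s in n[per_class:]] + [(s, 2) for s in a[per_class:]]
        pairs.append((train, test))
    return pairs


# Placeholder feature rows standing in for the captured packets.
normal_rows = [(i, i % 256) for i in range(2000)]
anomaly_rows = [(i, 255 - i % 256) for i in range(2000)]
pairs = make_pairs(normal_rows, anomaly_rows)
print(len(pairs), len(pairs[0][0]), len(pairs[0][1]))
```

Each resulting file holds 1024 labeled rows, 512 per class, matching the file sizes given above.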
30. Methodology - Classification and Training Using CM1K
The algorithm can be chosen before the training step. Either RCE or KNN can be selected by changing a data register on the Arduino board.
31. Methodology - Classification and Training Using NSL-KDD Dataset
The KDD Cup '99 dataset was created by processing the tcpdump portions of the 1998 DARPA Intrusion Detection System (IDS) Evaluation dataset.
NSL-KDD was proposed to solve some of the problems of the KDD’99 dataset.
The NSL-KDD dataset has 41 features and provides thousands of data samples for both training and testing.
Using the same method as before, the CM1K was trained and then tested with both the KNN and RCE algorithms.
From the test and training samples, 10 pairs of identically structured data files were created. Each sample file has 1024 samples.
34. Conclusion
CM1K provides parallelism with low cost and energy consumption
CM1K provides classification algorithms at the hardware level
Although KNN was more accurate, RCE used fewer neurons.
Having good data is a big challenge
This project can be used for any classification problem
I2C is not a good communication bus, as it creates a bottleneck
36. Time Line
[Gantt chart: project timeline; horizontal axis in days from 0 to 140; series: Start, Days Completed]
Tasks shown:
Developing Packet Sniffer
Get the components
Design of the system
Installing Packet Sniffer on Raspberry Pi
Soldering complete and approved by advisor
Gathering Sample from Network
Developing Classifier Code on Arduino
Training the Chip
Testing the IDS with random Data
Post testing modification
37. References
Cheng (2006). Albert Mo Kim Cheng. On-Time and Scalable Intrusion Detection in Embedded Systems. Real-Time Systems Laboratory, Department of Computer Science, University of Houston.
Axelsson (1999). Research in intrusion-detection systems: A survey. TR 98-17, Department of Computer Engineering, Chalmers University of Technology, Göteborg, Sweden, December 1998. Revised August 19, 1999.
Kerschbaum (2001). Florian Kerschbaum, Eugene H. Spafford, Diego Zamboni. Using internal sensors and embedded detectors for intrusion detection. Center for Education and Research in Information Assurance and Security, 1315 Recitation Building, Purdue University.
Tavallaee (2009). Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, and Ali A. Ghorbani. A Detailed Analysis of the KDD CUP 99 Data Set.
Hripcsak (2005). G. Hripcsak and A. Rothschild. Agreement, the F-Measure, and Reliability in Information Retrieval. Retrieved from http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1090460/pdf/296.pdf