This document presents research on improved competitive learning neural networks for network intrusion and fraud detection. It discusses machine learning concepts like classification, clustering, artificial neural networks, and competitive learning. It then introduces an improved competitive learning network (ICLN) algorithm and a supervised version called SICLN. The paper compares the performance of ICLN, SICLN, k-means, and SOM clustering algorithms on intrusion detection datasets like KDD99 and a transaction fraud dataset, evaluating based on metrics like accuracy, precision, and recall. The SICLN was shown to achieve slightly better performance than the other methods on these tasks.
Attractive light wid
1. 1
W W W . W E B S I T E . C O
M
ATT R A C T I V E 2 0 1 7 . A L L R I G H T S
A TT R A C T I V E
April - May 2019
ISLAMIC AZAD UNIVERSITY OF RASHT
In the Name of God
Faculty of Engineering
Improved competitive learning neural networks for network intrusion and fraud detection
Benyamin Moadab, Saba Zahedi Rad
Professor: Elham Khoshkerdar
2. 2
Table of Contents
1. The basic concepts (machine learning, clustering, classification, artificial neural networks, competitive learning, intrusion detection systems)
2. Introduction
3. Background
4. Algorithm
5. Experimental comparisons
6. Evaluation metrics
7. Discussions
8. Conclusion
3. 3
Machine learning
Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without explicit instructions, relying instead on patterns and inference. It is regarded as a subset of artificial intelligence.
4. 4
Types of machine learning
5. 5
Classification
In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.
6. 6
Clustering
Cluster analysis, or clustering, is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to objects in other clusters.
It is a main task of exploratory data mining and a common technique for statistical data analysis, used in many areas, including machine learning, pattern recognition, image analysis, information retrieval, bioinformatics, data compression, and computer graphics.
7. 7
Types of clustering
Clustering algorithms can be categorized according to their cluster model. Only the most prominent examples are listed here, since there are probably more than 100 published clustering algorithms. Not all of them provide a model for their clusters, so not all of them can be easily categorized.
• Connectivity-based clustering (hierarchical clustering): single linkage on Gaussian data
• Centroid-based clustering: k-means separates the data into Voronoi cells
• Distribution-based clustering: on Gaussian-distributed data, EM works well
• Density-based clustering: density clustering with DBSCAN
8. 8
Artificial Neural Networks (ANN)
Artificial neural networks (ANNs), or more simply neural networks, are computing systems and computing methods for machine learning, knowledge representation, and, ultimately, applying the acquired knowledge to predict the output responses of complex systems. The main idea behind these networks is inspired, to some extent, by the way the biological nervous system processes data and information in order to learn and create knowledge. The key element of this idea is the creation of new structures for the information processing system.
9. 9
10. 10
Artificial Neural Networks (ANN)
Complex neural network
11. 11
Competitive learning
Competitive learning is a form of unsupervised
learning in artificial neural networks, in which nodes
compete for the right to respond to a subset of the
input data.
A variant of Hebbian learning, competitive learning
works by increasing the specialization of each node
in the network. It is well suited to finding clusters
within data.
12. 12
Intrusion Detection System
Intrusion detection is the task of identifying and detecting any unauthorized use of, misuse of, or damage to the system by both internal and external users. Intrusion detection and prevention is today considered one of the main mechanisms for securing networks and computer systems, and is generally used alongside firewalls as a complementary security measure.
Architecture of Intrusion Detection Systems
The different architectures of intrusion detection systems are:
1. Host Based Intrusion Detection System (HIDS)
2. Network Based Intrusion Detection System (NIDS)
3. Distributed Intrusion Detection System (DIDS)
[Figure: Types of intrusion detection systems. Labels: Intrusion Detection System, Host Based IDS, Log File Monitoring, File Integrity Checker, Network Based IDS]
13. 13
Introduction
• Fraud detection and network intrusion detection are critical to e-commerce businesses.
Both the credit card fraud detection and network intrusion detection domains present the following challenges to data mining:
• There are millions of transactions each day.
• The data are highly skewed.
• Data labels are not immediately available.
• It is hard to track users' behaviors.
ICLN
SICLN
14. 14
[Figure: Fraud report procedure. Labels: Scam, Statement, Place order, Deduct money, Dispute charge, Chargeback]
15. 15
Background
• The techniques for fraud detection and intrusion detection fall into two categories: "statistical techniques" and "data mining techniques".
• Data mining based network intrusion detection techniques can be categorized into "misuse detection" and "anomaly detection".
Multilayer Perceptron (MLP)
Self-Organizing Map (SOM)
Unsupervised Niche Clustering (UNC)
Hybrid model
16. 16
One-layer perceptron
$w_1 x_1 + w_2 x_2 + \theta = 0$
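The line above is the decision boundary of a two-input perceptron. As a minimal sketch, assuming a step activation that outputs 1 on the positive side of the line and 0 otherwise, the unit can be written as follows. The function name and the weight values are hypothetical and chosen only for illustration.

```python
def perceptron_output(x1, x2, w1, w2, theta):
    """Return 1 if the point lies on the positive side of the line
    w1*x1 + w2*x2 + theta = 0, otherwise 0 (assumed step activation)."""
    return 1 if w1 * x1 + w2 * x2 + theta > 0 else 0


# Hypothetical weights: the boundary is the line x1 + x2 - 1.5 = 0,
# which makes the unit behave like a logical AND on binary inputs.
w1, w2, theta = 1.0, 1.0, -1.5
outputs = [perceptron_output(a, b, w1, w2, theta) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

Only the input (1, 1) falls on the positive side of this particular line, so a single unit is enough for linearly separable cases like this one.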
17. 17
Multilayer Perceptron
18. 18
Algorithm
Standard competitive learning network (SCLN)
Improved competitive learning network (ICLN)
1. The limitation of SCLN
2. New update rules in ICLN
3. The ICLN algorithm
Supervised improved competitive learning network (SICLN)
1. The objective function
2. The SICLN algorithm
3. The SICLN vs. the ICLN
19. 19
Improved competitive learning network
1. The limitation of SCLN
The SCLN consists of two layers of neurons: the distance measure layer and the competitive layer.
The distance measure layer consists of m weight vectors $W = \{w_1, w_2, \ldots, w_m\}$.
The distances calculated in the distance measure layer become the input of the competitive layer.
Each bit of the output vector is either 0 or 1.
The update is calculated by the standard competitive learning rule:
$w_j(r+1) = w_j(r) + z(r)\,(x - w_j(r))$
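One step of the standard competitive learning rule can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the distance is taken to be Euclidean, the winner is the neuron with the smallest distance, z is treated as a learning rate between 0 and 1, and the name scln_step is hypothetical.

```python
import math


def scln_step(x, weights, z):
    """One SCLN update: find the winner, then apply
    w_j(r+1) = w_j(r) + z(r) * (x - w_j(r)) to the winner only."""
    # Distance measure layer (Euclidean distance is an assumption).
    distances = [math.dist(x, w) for w in weights]
    # Competitive layer: the closest neuron wins; output bits are 0 or 1.
    winner = distances.index(min(distances))
    output = [1 if j == winner else 0 for j in range(len(weights))]
    # Reward-only rule: only the winner moves toward the input.
    weights[winner] = [wj + z * (xi - wj) for wj, xi in zip(weights[winner], x)]
    return output


weights = [[0.0, 0.0], [4.0, 4.0]]
output = scln_step([1.0, 1.0], weights, z=0.5)
print(output, weights)
```

Because only the winner is updated, a neuron that never wins stays where it was initialized.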
20. 20
The drawback of the SCLN
(a) Initial weight vectors (b) Clustering result
21. 21
2. New update rules in ICLN
The ICLN changes the SCLN's reward-only rule to a reward-punish rule.
The update formula for the losing neurons:
$w_j(r+1) = w_j(r) - z_2(r)\,K(d(x, w_j))\,(x - w_j(r))$
The effect of the ICLN update rules
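The reward-punish idea can be sketched as a single update step. Several parts of this sketch are assumptions not given on the slide: the winner is assumed to be rewarded with the standard rule using a rate z1, the distance is assumed to be Euclidean, and the kernel K is assumed to be a Gaussian of the distance. The names icln_step and gaussian_kernel are hypothetical.

```python
import math


def gaussian_kernel(d):
    """Assumed kernel K: shrinks the punishment as the distance grows."""
    return math.exp(-d * d)


def icln_step(x, weights, z1, z2, kernel):
    """One reward-punish update over all neurons; returns the winner index."""
    distances = [math.dist(x, w) for w in weights]
    winner = distances.index(min(distances))
    for j, w in enumerate(weights):
        if j == winner:
            step = z1                            # reward: move toward x
        else:
            step = -z2 * kernel(distances[j])    # punish: move away from x
        weights[j] = [wj + step * (xi - wj) for wj, xi in zip(w, x)]
    return winner


weights = [[0.0, 0.0], [1.0, 0.0]]
winner = icln_step([0.0, 0.0], weights, z1=0.5, z2=0.5, kernel=gaussian_kernel)
print(winner, weights)
```

In this example the losing neuron at (1, 0) is pushed away from the input along the line joining them, while the winner, which already sits on the input, does not move.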
22. 22
Supervised improved competitive learning network
1. The objective function
The SICLN uses an objective function Obj(X, W) to measure the quality of the clustering result.
• $\mathrm{Obj}(X, W) = a \times \mathrm{Imp}(X, W) + b \times \mathrm{Sct}(X, W)$
The purpose of the objective function is to minimize the impurity of the resulting clusters while keeping the number of clusters to a minimum.
The impurity of the whole result is the weighted average of the impurity of each cluster:
• $\mathrm{Imp}(X, W) = \sum_{i=1}^{m} |w_i| \times \mathrm{Imp}(X, w_i)$
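To make the structure of the objective concrete, here is a small sketch. The slide does not define the per-cluster impurity, the cluster weight, or Sct, so all three are assumptions here: the impurity of a cluster is taken as the fraction of its members outside the majority class, the weight of a cluster as its share of all data points, and Sct is simply passed in by the caller. All function names are hypothetical.

```python
from collections import Counter


def cluster_impurity(labels):
    """Assumed impurity: fraction of members outside the majority class."""
    if not labels:
        return 0.0
    majority_count = Counter(labels).most_common(1)[0][1]
    return 1.0 - majority_count / len(labels)


def total_impurity(clusters):
    """Weighted sum of per-cluster impurity; each weight is the assumed
    share of all points that fall in that cluster."""
    n = sum(len(c) for c in clusters)
    if n == 0:
        return 0.0
    return sum(len(c) / n * cluster_impurity(c) for c in clusters)


def objective(clusters, sct, a=1.0, b=1.0):
    """Obj = a * Imp + b * Sct, with the Sct value supplied by the caller."""
    return a * total_impurity(clusters) + b * sct


# Illustrative labels only, not data from the experiments.
clusters = [["black", "black", "gray", "black"], ["gray", "gray", "gray", "gray"]]
imp = total_impurity(clusters)
obj = objective(clusters, sct=0.5, a=2.0, b=1.0)
print(imp, obj)
```

The coefficients a and b trade off purity against the second term, so a larger b penalizes that term more strongly relative to impurity.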
23. 23
w1 and w5 are labeled as "Black" because they have more black point members than gray point members.
w2 and w4 are labeled as "Gray" because they have more gray point members than black point members.
w3 is labeled as "unknown" because all of its members are missing labels.
w6 is labeled as "unknown" because it has no data members.
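The labeling described above amounts to a majority vote over the labeled members of each neuron. A minimal sketch follows, assuming that a missing label is represented by None and that ties are broken arbitrarily (the slide does not say how ties are handled). The function name and the member lists are hypothetical.

```python
from collections import Counter


def label_neuron(member_labels):
    """Label a neuron by the majority class of its labeled members.
    None marks a member whose label is missing."""
    known = [label for label in member_labels if label is not None]
    if not known:
        return "unknown"  # no members, or every member is unlabeled
    return Counter(known).most_common(1)[0][0]


# Illustrative member lists, not the points from the figure.
members = {
    "w1": ["Black", "Black", "Gray"],
    "w2": ["Gray", "Gray", "Black"],
    "w3": [None, None],
    "w6": [],
}
labels = {name: label_neuron(m) for name, m in members.items()}
print(labels)
```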
After the learning step, the SICLN will reconstruct a new
network based on the trained network.
In the reconstruction step, a neuron is split into two new neurons if it
contains many members belonging to other classes.
On the other hand, two neighboring neurons are merged into one if they
belong to the same class.
24. 24
3. The SICLN vs. the ICLN
While the ICLN can cluster data into its natural groups, the SICLN uses labels to guide the clustering process.
The ICLN groups data into clusters by gathering closer data points into the same group.
As a supervised clustering algorithm, the SICLN minimizes both the impurity of the groups and the number of groups.
25. 25
Experimental comparisons
In this section, we compare the performance of the SICLN and the ICLN with k-means and SOM on three data sets:
• The Iris data
• The KDD 1999 data
• The Vesta transaction data
26. 26
Evaluation metrics
The outputs of a prediction or detection model fall into four categories:
1) true positive (TP)
2) true negative (TN)
3) false positive (FP)
4) false negative (FN)
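The accuracy, precision, and recall reported in the comparisons follow from these four counts by their standard definitions. A short sketch is given below; the function name and the counts are hypothetical and are not taken from the experiments. It assumes that the denominators are non-zero.

```python
def evaluate(tp, tn, fp, fn):
    """Compute accuracy, precision, and recall from the four outcome counts.
    Assumes tp + fp > 0 and tp + fn > 0."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are right
        "precision": tp / (tp + fp),     # share of raised alarms that are real
        "recall": tp / (tp + fn),        # share of real positives that are caught
    }


# Illustrative counts only.
scores = evaluate(tp=90, tn=880, fp=10, fn=20)
print(scores)
```

On highly skewed data, accuracy alone can look high even when many positives are missed, which is why precision and recall are reported alongside it.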
27. 27
Performance comparison on the Iris data
k-Means SOM ICLN SICLN
28. 28
Performance of the SICLN on Iris data with missing labels
29. 29
Network intrusion detection: KDD-99 data
Algorithm   Number of clusters   Accuracy   Precision   Recall
k-means     10                   99.57%     98.60%      99.54%
SOM         10                   99.62%     98.89%      99.45%
ICLN        5                    99.58%     98.59%      99.59%
SICLN       9                    99.66%     98.92%      99.60%
Each connection is labeled as "normal" or as a particular type of attack:
• Neptune
• Smurf
• IP sweep
• Back DoS
30. 30
ROC curves of SICLN, k-means, SOM, and ICLN on KDD-99 data
SICLN
SOM
ICLN
k-means
31. 31
Misclassification rate on individual classes
32. 32
Data flow of Vesta data for fraud analysis
OLAP
OLTP
33. 33
Discussions
34. 34
35. 35
Conclusion
We have proposed and developed two clustering algorithms:
(1) The ICLN, an unsupervised clustering algorithm that improves on the standard competitive learning neural network.
(2) The SICLN, a supervised clustering algorithm that introduces a supervised mechanism into the ICLN.
The SICLN is a supervised clustering algorithm derived from the ICLN.
The reconstruction step makes the SICLN completely independent of the number of initial clusters.
The experimental comparison demonstrates that the SICLN has excellent performance in solving classification problems using clustering approaches. The SICLN:
1. achieves a low misclassification rate in solving classification problems;
2. is able to deal with both labeled and unlabeled data;
3. has the capability to achieve high performance even when part of the data labels are missing;
4. is able to classify highly skewed data;
5. has the capability to identify unseen patterns;
6. is completely independent of the initial number of clusters.
36. 36
Contact Us
Call us: 09116997485
Email: Beny.modab@gmail.com
Address: Islamic Azad University of Rasht
Many thanks to the students of Computer Engineering (Information Technology and Software) at Rasht University of Technology.
Prepared by: Students at Azad University of Rasht
37. 37
For your attention