UNIBA: http://www.uniba.it DIB: http://www.di.uniba.it KDDE: http://kdde.di.uniba.it
SYNTHESIS OF AN INTRUSION DETECTION
ALGORITHM BASED ON DEEP LEARNING AND
REASSIGNMENT OF TRAINING LABELS
Advisor
Prof.ssa Annalisa Appice
Co-Advisor
Dott.ssa Giuseppina Andresini
Department of Computer Science, University of Bari Aldo Moro
Student
Francesco Paolo Caforio
Via Orabona, 4 - 70125 Bari - Italy
Motivations
o Today's computer systems are complex and prone to
vulnerabilities
o New types of attacks are designed to bypass
sophisticated prevention and detection mechanisms
o Hackers design new attacks whose behavior is as similar as
possible to normal network traffic
Thesis Objective
o Synthesis of an Intrusion Detection System
o Data segmentation
o Identifying examples on the segment boundary and changing
their labels
o Deep learning
CD-IDS
CD-IDS: Pre-processing
o One-hot encoding: transforms categorical attributes into a vector of
numeric attributes
o Scaling: rescales each attribute to mean 0 and standard deviation 1
o Missing-value replacement: replaces undefined values
  o Null → 0
  o Infinity → Max
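The three pre-processing steps can be sketched in plain Python as follows. This is a minimal illustration, not the thesis code: the function names, the sample values, and the choice of the column's largest finite value as "Max" are assumptions.

```python
import math
from statistics import mean, pstdev

def one_hot(values):
    """Map each categorical value to a 0/1 indicator vector (categories sorted by name)."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

def replace_missing(column):
    """Null (None or NaN) -> 0, Infinity -> largest finite value in the column (assumed meaning of Max)."""
    finite = [v for v in column if v is not None and math.isfinite(v)]
    col_max = max(finite) if finite else 0.0
    cleaned = []
    for v in column:
        if v is None or math.isnan(v):
            cleaned.append(0.0)
        elif v == math.inf:
            cleaned.append(col_max)
        else:
            cleaned.append(float(v))
    return cleaned

def standardize(column):
    """Rescale a numeric column to mean 0 and standard deviation 1."""
    mu, sigma = mean(column), pstdev(column)
    if sigma == 0:
        return [0.0 for _ in column]
    return [(v - mu) / sigma for v in column]

# Hypothetical attribute values, for illustration only.
protocol = ["tcp", "udp", "tcp", "icmp"]
duration = [2.0, None, math.inf, 6.0]

encoded_protocol = one_hot(protocol)
clean_duration = replace_missing(duration)
scaled_duration = standardize(clean_duration)
```

Missing values are replaced before scaling here so that the mean and standard deviation are computed on finite numbers only; the slide does not state the order of the steps.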
CD-IDS: Dimensionality reduction
o Multi-level autoencoder (2+1+2 layers)
o Encoder + Decoder
o Central layer with 10 neurons
[Figure: autoencoder mapping the input data to the reconstructed data]
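To make the 2+1+2 structure concrete, the sketch below pushes one example through an untrained stack of dense layers and reads the 10-value code from the central layer. It only illustrates the data flow: the hidden sizes 80-30-10-30-80 are the NSL-KDD values reported in the autoencoder table of this deck, while the input width of 100, the final reconstruction layer, the ReLU activations, and the random weights are assumptions. Training (minimizing the reconstruction error) is omitted.

```python
import random

def relu(z):
    return max(0.0, z)

def identity(z):
    return z

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for a dense layer (untrained)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def build_autoencoder(n_features, hidden_sizes, rng):
    """Hidden layers plus an assumed output layer as wide as the input."""
    sizes = [n_features] + hidden_sizes + [n_features]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def forward(layers, x):
    """Return the activations of every layer, starting with the input itself."""
    activations = [x]
    for i, (weights, biases) in enumerate(layers):
        act = identity if i == len(layers) - 1 else relu
        activations.append(dense(activations[-1], weights, biases, act))
    return activations

rng = random.Random(0)
n_features = 100                      # hypothetical input width
hidden_sizes = [80, 30, 10, 30, 80]   # 2 encoder layers + central layer + 2 decoder layers
layers = build_autoencoder(n_features, hidden_sizes, rng)

example = [rng.random() for _ in range(n_features)]
activations = forward(layers, example)
code = activations[3]                 # output of the central layer
reconstruction = activations[-1]
```

After training, `code` would be the reduced representation of the example that the later stages consume.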
CD-IDS: Segmentation
o Training a Support Vector Machine
o Estimation of the reliability (distance to support vectors) of
classifying an example into a class (normal/attack)
o Segment creation (normal vs attack)
  o If $c_0(x) > 0.50$, $x \in C_{attack}$
  o Otherwise $x \in C_{normal}$
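The segment-creation rule can be written in a few lines. The sketch assumes that c_0(x) is already available as a confidence in [0, 1] that x is an attack; how that value is derived from the trained Support Vector Machine is not shown, and the scores are hypothetical.

```python
def segment(confidences, threshold=0.50):
    """Split example indices into the attack and normal segments."""
    attack_segment, normal_segment = [], []
    for index, c0 in enumerate(confidences):
        if c0 > threshold:
            attack_segment.append(index)
        else:
            normal_segment.append(index)
    return attack_segment, normal_segment

# Hypothetical confidence values c_0(x) for five examples.
scores = [0.91, 0.12, 0.50, 0.58, 0.03]
attack_idx, normal_idx = segment(scores)
```

An example with a confidence of exactly 0.50 falls into the normal segment, because the rule uses a strict inequality.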
CD-IDS: Class Reassignment
o Identification of the examples least reliably assigned to class
$C_{normal}$ and change of their class
[Figure: NSL-KDD training set]
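One possible reading of the reassignment step is sketched below. It rests on several assumptions that the slide does not confirm: reliability is measured by how far the confidence c_0(x) lies below the 0.50 threshold, the changed examples receive the attack label, and the number of re-labeled examples (`n_relabel`, a hypothetical parameter name) is fixed in advance.

```python
def reassign_labels(labels, confidences, n_relabel, threshold=0.50):
    """Relabel the n_relabel least reliable members of the normal segment as attacks."""
    normal_segment = [i for i, c0 in enumerate(confidences) if c0 <= threshold]
    # Least reliable first: confidence closest to the threshold.
    ranked = sorted(normal_segment, key=lambda i: threshold - confidences[i])
    new_labels = list(labels)
    for i in ranked[:n_relabel]:
        new_labels[i] = "attack"
    return new_labels

# Hypothetical training labels and confidence values.
labels = ["normal", "normal", "attack", "normal", "normal"]
confidences = [0.05, 0.45, 0.90, 0.30, 0.49]
relabeled = reassign_labels(labels, confidences, n_relabel=2)
```

In this toy run the examples at indices 4 and 1 sit closest to the boundary, so they are the two that change class.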
CD-IDS: CNN
o A Convolutional Neural Network learns a classifier from labeled
data in the form of images (matrices)
o Each connection → 3 × 10 grayscale image + class (normal/attack)
[Figure: 3 × 10 image with rows labeled "Training example", "Closest training normal example", "Closest training attack example"]
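A minimal sketch of how such a 3 × 10 matrix could be assembled is given below. It assumes that each connection is already reduced to a 10-value vector, that "closest" means smallest Euclidean distance, and that an example is never paired with itself; none of these details is stated on the slide. The conversion of the values to grayscale intensities is not shown.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest(example, candidates):
    """Closest candidate to the example, skipping the example object itself."""
    others = (c for c in candidates if c is not example)
    return min(others, key=lambda c: euclidean(example, c))

def to_image(example, normal_examples, attack_examples):
    """Stack the example, its closest normal and its closest attack as three rows."""
    return [
        list(example),
        list(nearest(example, normal_examples)),
        list(nearest(example, attack_examples)),
    ]

# Hypothetical 10-value vectors.
normal_examples = [[0.0] * 10, [1.0] * 10]
attack_examples = [[5.0] * 10, [9.0] * 10]
x = [0.9] * 10

image = to_image(x, normal_examples, attack_examples)
```

The resulting matrix, paired with the class of the example, would be one training instance for the CNN.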
CD-IDS: CNN

| Layer | Hyperparameters | Output shape |
| --- | --- | --- |
| input | | (None, 3, 10, 1) |
| conv0 | Conv2D layer; 32 filters, size (2 × 2); activation function: relu | (None, 2, 9, 32) |
| dropout_0 | dropout = 0.3 | (None, 2, 9, 32) |
| conv1 | Conv2D layer; 16 filters, size (2 × 4); activation function: relu | (None, 1, 6, 16) |
| dropout_1 | dropout = 0.3 | (None, 1, 6, 16) |
| flatten_1 | | (None, 96) |
| dense_1 | n_classes = 2; activation function: softmax | (None, 2) |
| output | | |
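The output shapes in the architecture table can be checked with a short calculation. The sketch assumes stride 1 and no padding, which is what the listed shapes imply; the helper name is hypothetical.

```python
def conv2d_valid_shape(height, width, kernel, filters):
    """Output shape of a stride-1, unpadded 2D convolution."""
    kernel_height, kernel_width = kernel
    return (height - kernel_height + 1, width - kernel_width + 1, filters)

input_shape = (3, 10, 1)
conv0_shape = conv2d_valid_shape(input_shape[0], input_shape[1], (2, 2), 32)
conv1_shape = conv2d_valid_shape(conv0_shape[0], conv0_shape[1], (2, 4), 16)
flatten_size = conv1_shape[0] * conv1_shape[1] * conv1_shape[2]
```

Dropout layers leave the shape unchanged, so only the two convolutions and the flattening step alter the dimensions.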
Empirical evaluation: Data

| Dataset | Training set (Normal) | Training set (Attack) | Training set (Total) | Testing set (Normal) | Testing set (Attack) | Testing set (Total) |
| --- | --- | --- | --- | --- | --- | --- |
| NSL-KDDTest+ | 67343 | 58630 | 125973 | 9711 | 12833 | 22544 |
| NSL-KDDTest-21 | 67343 | 58630 | 125973 | 2152 | 9698 | 11850 |
| UNSW-NB15 | 56000 | 119341 | 175341 | 37000 | 45332 | 82332 |
| CICIDS2017 | 80000 | 20000 | 100000 | 80000 | 20000 | 100000 |

o Datasets
  o NSL-KDD
  o UNSW-NB15
  o CICIDS2017
    o Dataset organized in 10 folders
    o Every folder contains one training set and nine different testing sets
    o Each file contains 100,000 examples (80,000 normal and 20,000 attacks)
Empirical evaluation: Autoencoder
o Number of neurons per layer

| Autoencoder layer | NSL-KDD | UNSW-NB15 | CICIDS2017 |
| --- | --- | --- | --- |
| 1 | 80 | 100 | 55 |
| 2 | 30 | 40 | 30 |
| 3 | 10 | 10 | 10 |
| 4 | 30 | 40 | 30 |
| 5 | 80 | 100 | 55 |
Empirical evaluation: Segmentation
o Which algorithm best separates normal examples from
attacks?
  o Fuzzy C-Means (F C-M) - unsupervised
  o Gaussian Mixture Model (GMM) - unsupervised
  o Support Vector Machine (SVM) - supervised
o Purity index
  o $\Omega = \{w_1, w_2, \ldots, w_K\}$ - set of clusters
  o $C = \{c_1, c_2, \ldots, c_J\}$ - set of classes

$$\mathrm{purity}(\Omega, C) = \frac{1}{N} \sum_{k} \max_{j} \left| w_k \cap c_j \right|$$
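The purity index can be computed directly from its definition: each cluster contributes the size of its majority class, and the sum is divided by the total number of examples N. The sketch below uses hypothetical cluster assignments and labels.

```python
from collections import Counter

def purity(cluster_ids, class_labels):
    """Fraction of examples that belong to the majority class of their cluster."""
    per_cluster = {}
    for cluster, label in zip(cluster_ids, class_labels):
        per_cluster.setdefault(cluster, Counter())[label] += 1
    well_clustered = sum(max(counts.values()) for counts in per_cluster.values())
    return well_clustered / len(class_labels)

# Hypothetical assignment of eight examples to two clusters.
cluster_ids = [0, 0, 0, 1, 1, 1, 1, 1]
class_labels = ["normal", "normal", "attack",
                "attack", "attack", "attack", "normal", "attack"]

score = purity(cluster_ids, class_labels)
```

Here cluster 0 contributes 2 well-clustered examples and cluster 1 contributes 4, so the purity is 6 / 8.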
Empirical evaluation: Segmentation (NSL-KDD)
o Purity index as the segmentation algorithm varies

| Method | Well-clustered examples | Examples in training set | Purity index |
| --- | --- | --- | --- |
| Fuzzy C-Means | 111966 | 125973 | 0.8888 |
| Gaussian Mixture Model | 110672 | 125973 | 0.8785 |
| Support Vector Machine | 123582 | 125973 | 0.9810 |
Empirical evaluation: Classification
o Classifier training after the re-labeling step as the number
of re-labeled examples varies
  o Evaluation of accuracy on the training set to see if there is a
  threshold
  o Evaluation of accuracy on the testing set
o Metrics
  o True Positive (TP), False Positive (FP), True Negative (TN), False Negative (FN)
  o Overall Accuracy (OA)
  o Precision (P)
  o Recall (R)
  o F-Measure (F1-Score)
  o True Positive Rate (TPR)
  o False Positive Rate (FPR)
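For reference, the listed metrics follow from the four confusion-matrix counts as sketched below. The counts are hypothetical, and treating "attack" as the positive class is an assumption.

```python
def metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "OA": (tp + tn) / (tp + fp + tn + fn),
        "P": precision,
        "R": recall,
        "F1": 2 * precision * recall / (precision + recall),
        "TPR": recall,            # TPR is the same quantity as recall
        "FPR": fp / (fp + tn),
    }

# Hypothetical counts, for illustration only.
scores = metrics(tp=90, fp=20, tn=80, fn=10)
```

A good detector combines a high TPR (most attacks are caught) with a low FPR (few normal connections raise an alarm).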
Empirical evaluation: Classification
(configurations selected by TP on the training set)

| Dataset | Configuration | OA | F1-Score | FPR | TPR |
| --- | --- | --- | --- | --- | --- |
| NSL-KDDTest+ | 4500 | 0.9128 | 0.9271 | 0.1683 | 0.9741 |
| NSL-KDDTest-21 | 8500 | 0.8716 | 0.9256 | 0.5999 | 0.9761 |
| UNSW-NB15 | 500 | 0.9163 | 0.9411 | 0.2229 | 0.9817 |
| CICIDS2017 | 100 | 0.9813 | 0.9523 | 0.0071 | 0.9349 |
Empirical evaluation: Classification
(configuration selection by TP on the training set)
o NSL-KDD
  o Configuration with the highest number of TP in the training set
o UNSW-NB15 and CICIDS2017
  o Configuration with the first local maximum peak in TP values in the
  training set
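The two selection rules can be sketched as follows. The TP counts are hypothetical, the function names are illustrative, and counting the first or last configuration as a possible peak is an assumption the slides do not spell out.

```python
def highest_tp(tp_by_config):
    """Configuration with the largest number of true positives."""
    return max(tp_by_config, key=tp_by_config.get)

def first_local_maximum(tp_by_config):
    """First configuration whose TP count exceeds that of both neighbours."""
    configs = sorted(tp_by_config)
    values = [tp_by_config[c] for c in configs]
    for i in range(len(values)):
        left_ok = i == 0 or values[i] > values[i - 1]
        right_ok = i == len(values) - 1 or values[i] > values[i + 1]
        if left_ok and right_ok:
            return configs[i]
    return configs[-1]  # no strict peak found (e.g. a flat curve)

# Hypothetical TP counts on the training set per number of re-labeled examples.
tp_by_config = {100: 50, 500: 65, 1000: 60, 1500: 80}

best_global = highest_tp(tp_by_config)
best_first_peak = first_local_maximum(tp_by_config)
```

With these counts the global rule picks 1500, while the first-peak rule stops at 500 because the TP count drops at 1000.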
Empirical evaluation: Classification
(testing set)

NSL-KDDTest+

| Configuration | OA | F1-Score | FPR | TPR |
| --- | --- | --- | --- | --- |
| 100 | 0.901215 | 0.912697 | 0.10658 | 0.907114 |
| 500 | 0.888396 | 0.900095 | 0.104727 | 0.883192 |
| 1000 | 0.893985 | 0.905913 | 0.109463 | 0.896595 |
| 1500 | 0.903078 | 0.915129 | 0.116569 | 0.917946 |
| 2000 | 0.90228 | 0.914906 | 0.12491 | 0.922855 |
| 2500 | 0.900816 | 0.913821 | 0.129544 | 0.92379 |
| 3000 | 0.902369 | 0.915427 | 0.131809 | 0.928232 |
| 3500 | 0.905696 | 0.918966 | 0.138812 | 0.939375 |
| 4000 | 0.901127 | 0.914745 | 0.13943 | 0.931816 |
| 4500* | 0.912793 | 0.927099 | 0.168263 | 0.974129 |
| 5000 | 0.912349 | 0.926999 | 0.173926 | 0.977636 |
| 5500 | 0.90849 | 0.92287 | 0.161878 | 0.961739 |
| 6000 | 0.904143 | 0.919877 | 0.178457 | 0.966648 |
| 6500 | 0.90157 | 0.917812 | 0.182885 | 0.96548 |
| 7000 | 0.90228 | 0.919147 | 0.194831 | 0.975766 |
| 7500 | 0.898066 | 0.916369 | 0.211616 | 0.981064 |
| 8000 | 0.89709 | 0.914762 | 0.199362 | 0.970077 |
| 8500 | 0.895538 | 0.914547 | 0.218721 | 0.982 |
| 9000 | 0.895271 | 0.914286 | 0.218309 | 0.98122 |
| 9500 | 0.883783 | 0.905843 | 0.246113 | 0.982077 |
| 10000 | 0.886622 | 0.907812 | 0.237669 | 0.980675 |

NSL-KDDTest-21

| Configuration | OA | F1-Score | FPR | TPR |
| --- | --- | --- | --- | --- |
| 100 | 0.820759 | 0.889005 | 0.433086 | 0.877088 |
| 500 | 0.795865 | 0.871446 | 0.427509 | 0.845432 |
| 1000 | 0.808523 | 0.880648 | 0.437732 | 0.863168 |
| 1500 | 0.829283 | 0.895252 | 0.450743 | 0.891421 |
| 2000 | 0.832068 | 0.897454 | 0.464684 | 0.897917 |
| 2500 | 0.830211 | 0.896566 | 0.480483 | 0.899154 |
| 3000 | 0.834768 | 0.899651 | 0.481877 | 0.905032 |
| 3500 | 0.842025 | 0.905032 | 0.508364 | 0.919777 |
| 4000 | 0.832911 | 0.899113 | 0.513476 | 0.909775 |
| 4500 | 0.86616 | 0.921941 | 0.582714 | 0.965766 |
| 5000 | 0.868692 | 0.923643 | 0.589684 | 0.970406 |
| 5500 | 0.855359 | 0.914845 | 0.568309 | 0.949371 |
| 6000 | 0.858059 | 0.916823 | 0.582714 | 0.955867 |
| 6500 | 0.857468 | 0.916382 | 0.578996 | 0.95432 |
| 7000 | 0.866835 | 0.922465 | 0.588755 | 0.967932 |
| 7500 | 0.870211 | 0.924785 | 0.601766 | 0.974943 |
| 8000 | 0.861013 | 0.918767 | 0.586896 | 0.960404 |
| 8500* | 0.871561 | 0.925596 | 0.599907 | 0.976181 |
| 9000 | 0.870633 | 0.925026 | 0.600372 | 0.97515 |
| 9500 | 0.870127 | 0.924835 | 0.608271 | 0.976284 |
| 10000 | 0.869114 | 0.92416 | 0.605483 | 0.974428 |

* Selected configuration
Empirical evaluation: Classification
(testing set)

UNSW-NB15

| Configuration | OA | F1-Score | FPR | TPR |
| --- | --- | --- | --- | --- |
| 100 | 0.912907 | 0.938727 | 0.230518 | 0.980208 |
| 500 | 0.91634 | 0.941082 | 0.222875 | 0.981666 |
| 1000 | 0.906559 | 0.933416 | 0.212196 | 0.962285 |
| 1500 | 0.914355 | 0.939302 | 0.211982 | 0.973639 |
| 2000 | 0.912673 | 0.938682 | 0.235214 | 0.982068 |
| 2500 | 0.907021 | 0.935274 | 0.263393 | 0.986987 |
| 3000 | 0.913523 | 0.93972 | 0.250214 | 0.990355 |
| 3500 | 0.907865 | 0.935232 | 0.240196 | 0.977342 |
| 4000 | 0.907238 | 0.935232 | 0.256357 | 0.984004 |
| 4500 | 0.907455 | 0.935188 | 0.24925 | 0.980987 |
| 5000 | 0.903993 | 0.933535 | 0.280607 | 0.990615 |
| 5500 | 0.898563 | 0.928939 | 0.262464 | 0.974125 |
| 6000 | 0.889763 | 0.921867 | 0.250286 | 0.955481 |
| 6500 | 0.885497 | 0.919927 | 0.286875 | 0.966382 |
| 7000 | 0.890368 | 0.924324 | 0.308554 | 0.983711 |
| 7500 | 0.890699 | 0.924224 | 0.298214 | 0.979345 |
| 8000 | 0.879857 | 0.918621 | 0.368286 | 0.996296 |
| 8500 | 0.885572 | 0.921033 | 0.316625 | 0.980451 |
| 9000 | 0.884294 | 0.920315 | 0.323304 | 0.981708 |
| 9500 | 0.881602 | 0.918665 | 0.333214 | 0.982403 |
| 10000 | 0.88 | 0.91795 | 0.346429 | 0.986249 |

CICIDS2017

| Configuration | OA | F1-Score | FPR | TPR |
| --- | --- | --- | --- | --- |
| 100 | 0.981319 | 0.952298 | 0.007089 | 0.934952 |
| 500 | 0.975459 | 0.937567 | 0.011868 | 0.924763 |
| 1000 | 0.969574 | 0.922991 | 0.016626 | 0.914375 |
| 1500 | 0.966377 | 0.916227 | 0.023496 | 0.925869 |
| 2000 | 0.963825 | 0.912081 | 0.031327 | 0.944437 |
Empirical evaluation: Classification
o Results

| Dataset | OA with T.L.E. | OA without T.L.E. | F1-Score with T.L.E. | F1-Score without T.L.E. |
| --- | --- | --- | --- | --- |
| NSL-KDDTest+ | 0.9128 | 0.8875 | 0.9271 | 0.8987 |
| NSL-KDDTest-21 | 0.8716 | 0.7911 | 0.9256 | 0.8677 |
| UNSW-NB15 | 0.9163 | 0.9092 | 0.9411 | 0.9349 |
| CICIDS2017 | 0.9813 | 0.9785 | 0.9523 | 0.9445 |
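The size of the improvement can be read off the results table by subtracting the score without T.L.E. from the score with T.L.E. The short calculation below uses only the values reported in that table; the variable names are illustrative.

```python
# (with T.L.E., without T.L.E.) pairs taken from the results table.
results = {
    "NSL-KDDTest+":   {"OA": (0.9128, 0.8875), "F1": (0.9271, 0.8987)},
    "NSL-KDDTest-21": {"OA": (0.8716, 0.7911), "F1": (0.9256, 0.8677)},
    "UNSW-NB15":      {"OA": (0.9163, 0.9092), "F1": (0.9411, 0.9349)},
    "CICIDS2017":     {"OA": (0.9813, 0.9785), "F1": (0.9523, 0.9445)},
}

# Absolute gain of each metric, rounded to four decimals.
gains = {
    dataset: {metric: round(with_tle - without_tle, 4)
              for metric, (with_tle, without_tle) in per_metric.items()}
    for dataset, per_metric in results.items()
}

largest_oa_gain = max(gains, key=lambda dataset: gains[dataset]["OA"])
```

The gain is positive for every dataset and metric, and the largest OA gain is on NSL-KDDTest-21.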
Empirical evaluation: State of the Art

NSL-KDDTest+

| Method | OA | F1-Score |
| --- | --- | --- |
| CD-IDS | 0.9128 | 0.9271 |
| Li et al. [12] | 0.7914 | 0.7912 |
| Kim et al. [10] | - | 0.8100 |
| Know et al. [11] | - | 0.89 |
| Naaser et al. [18] | 0.85 | - |
| Yan et al. [26] | 0.793 | - |
| Kherlenchimeg et al. [26] | 0.80 | - |

NSL-KDDTest-21

| Method | OA | F1-Score |
| --- | --- | --- |
| CD-IDS | 0.8716 | 0.9256 |
| Li et al. [12] | 0.8184 | 0.9001 |
| Kim et al. [10] | - | 0.79 |
| Know et al. [11] | - | 0.62 |
| Naaser et al. [18] | 0.70 | - |

UNSW-NB15

| Method | OA | F1-Score |
| --- | --- | --- |
| CD-IDS | 0.9163 | 0.9411 |
| Kim et al. [10] | - | 0.90 |
| Yan et al. [26] | 0.8825 | - |

CICIDS2017

| Method | OA | F1-Score |
| --- | --- | --- |
| CD-IDS | 0.9813 | 0.9523 |
| Kim et al. [10] | - | 0.89 |
Conclusions and future developments
o Segmentation followed by re-labeling of training examples
yields a more accurate intrusion detection model than
training without re-labeling
o Future developments
  o Methodological: use of a Generative Adversarial Network
  (GAN), in which a generator creates "synthetic" data similar
  to real data and a discriminator distinguishes the
  constructed data from real data
  o Technological: use of Apache Spark for image
  construction

More Related Content

Similar to Synthesis of an intrusion detection algorithm based on deep learning and reassignment of training labels

SafeML: Safety Monitoring of Machine Learning Classifiers through Statistical...
SafeML: Safety Monitoring of Machine Learning Classifiers through Statistical...SafeML: Safety Monitoring of Machine Learning Classifiers through Statistical...
SafeML: Safety Monitoring of Machine Learning Classifiers through Statistical...Koorosh Aslansefat
 
Black-Box attacks against Neural Networks - technical project presentation
Black-Box attacks against Neural Networks - technical project presentationBlack-Box attacks against Neural Networks - technical project presentation
Black-Box attacks against Neural Networks - technical project presentationRoberto Falconi
 
Skin melanoma stage detection - CNN.pptx
Skin melanoma stage detection - CNN.pptxSkin melanoma stage detection - CNN.pptx
Skin melanoma stage detection - CNN.pptxVishalLabde
 
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSISFUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSISIrene Pochinok
 
Biometric presentation attack detection
Biometric presentation attack detectionBiometric presentation attack detection
Biometric presentation attack detectionGautam Saxena
 
A scalable collaborative filtering framework based on co clustering
A scalable collaborative filtering framework based on co clusteringA scalable collaborative filtering framework based on co clustering
A scalable collaborative filtering framework based on co clusteringAllenWu
 
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
Maxim Kazantsev
 
Complex models in ecology: challenges and solutions
Complex models in ecology: challenges and solutionsComplex models in ecology: challenges and solutions
Complex models in ecology: challenges and solutionsPeter Solymos
 
Bigger Data v Better Math
Bigger Data v Better MathBigger Data v Better Math
Bigger Data v Better MathBrent Schneeman
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsValery Tkachenko
 
An Artificial Immune Network for Multimodal Function Optimization on Dynamic ...
An Artificial Immune Network for Multimodal Function Optimization on Dynamic ...An Artificial Immune Network for Multimodal Function Optimization on Dynamic ...
An Artificial Immune Network for Multimodal Function Optimization on Dynamic ...Fabricio de França
 
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017MLconf
 
Intrusion Detection System for Classification of Attacks with Cross Validation
Intrusion Detection System for Classification of Attacks with Cross ValidationIntrusion Detection System for Classification of Attacks with Cross Validation
Intrusion Detection System for Classification of Attacks with Cross Validationinventionjournals
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAndrea Dal Pozzolo
 
Injection Attack detection using ML for
Injection Attack detection using ML  forInjection Attack detection using ML  for
Injection Attack detection using ML forKhazane Hassan
 
From DNA Sequence Variation to .NET Bits and Bobs
From DNA Sequence Variation to .NET Bits and BobsFrom DNA Sequence Variation to .NET Bits and Bobs
From DNA Sequence Variation to .NET Bits and BobsSource Conference
 

Similar to Synthesis of an intrusion detection algorithm based on deep learning and reassignment of training labels (20)

SafeML: Safety Monitoring of Machine Learning Classifiers through Statistical...
SafeML: Safety Monitoring of Machine Learning Classifiers through Statistical...SafeML: Safety Monitoring of Machine Learning Classifiers through Statistical...
SafeML: Safety Monitoring of Machine Learning Classifiers through Statistical...
 
Black-Box attacks against Neural Networks - technical project presentation
Black-Box attacks against Neural Networks - technical project presentationBlack-Box attacks against Neural Networks - technical project presentation
Black-Box attacks against Neural Networks - technical project presentation
 
Skin melanoma stage detection - CNN.pptx
Skin melanoma stage detection - CNN.pptxSkin melanoma stage detection - CNN.pptx
Skin melanoma stage detection - CNN.pptx
 
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSISFUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
 
2012 predictive clusters
2012 predictive clusters2012 predictive clusters
2012 predictive clusters
 
Biometric presentation attack detection
Biometric presentation attack detectionBiometric presentation attack detection
Biometric presentation attack detection
 
A scalable collaborative filtering framework based on co clustering
A scalable collaborative filtering framework based on co clusteringA scalable collaborative filtering framework based on co clustering
A scalable collaborative filtering framework based on co clustering
 
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS
FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

FUNCTION OF RIVAL SIMILARITY IN A COGNITIVE DATA ANALYSIS

 
Complex models in ecology: challenges and solutions
Complex models in ecology: challenges and solutionsComplex models in ecology: challenges and solutions
Complex models in ecology: challenges and solutions
 
Layering Based Network Intrusion Detection System to Enhance Network Attacks ...
Layering Based Network Intrusion Detection System to Enhance Network Attacks ...Layering Based Network Intrusion Detection System to Enhance Network Attacks ...
Layering Based Network Intrusion Detection System to Enhance Network Attacks ...
 
Bigger Data v Better Math
Bigger Data v Better MathBigger Data v Better Math
Bigger Data v Better Math
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpoints
 
1530 track2 humphrey
1530 track2 humphrey1530 track2 humphrey
1530 track2 humphrey
 
An Artificial Immune Network for Multimodal Function Optimization on Dynamic ...
An Artificial Immune Network for Multimodal Function Optimization on Dynamic ...An Artificial Immune Network for Multimodal Function Optimization on Dynamic ...
An Artificial Immune Network for Multimodal Function Optimization on Dynamic ...
 
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
Ryan West, Machine Learning Engineer, Nexosis at MLconf ATL 2017
 
Intrusion Detection System for Classification of Attacks with Cross Validation
Intrusion Detection System for Classification of Attacks with Cross ValidationIntrusion Detection System for Classification of Attacks with Cross Validation
Intrusion Detection System for Classification of Attacks with Cross Validation
 
Adaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud DetectionAdaptive Machine Learning for Credit Card Fraud Detection
Adaptive Machine Learning for Credit Card Fraud Detection
 
AI and Deep Learning
AI and Deep Learning AI and Deep Learning
AI and Deep Learning
 
Injection Attack detection using ML for
Injection Attack detection using ML  forInjection Attack detection using ML  for
Injection Attack detection using ML for
 
From DNA Sequence Variation to .NET Bits and Bobs
From DNA Sequence Variation to .NET Bits and BobsFrom DNA Sequence Variation to .NET Bits and Bobs
From DNA Sequence Variation to .NET Bits and Bobs
 

Recently uploaded

Isolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxIsolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxGOWTHAMIM22
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Sérgio Sacani
 
Triploidy ...............................pptx
Triploidy ...............................pptxTriploidy ...............................pptx
Triploidy ...............................pptxCherry
 
GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)Areesha Ahmad
 
NUMERICAL Proof Of TIme Electron Theory.
NUMERICAL Proof Of TIme Electron Theory.NUMERICAL Proof Of TIme Electron Theory.
NUMERICAL Proof Of TIme Electron Theory.syedmuneemqadri
 
Film Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfFilm Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfPharmatech-rx
 
mixotrophy in cyanobacteria: a dual nutritional strategy
mixotrophy in cyanobacteria: a dual nutritional strategymixotrophy in cyanobacteria: a dual nutritional strategy
mixotrophy in cyanobacteria: a dual nutritional strategyMansiBishnoi1
 
IISc Bangalore M.E./M.Tech. courses and fees 2024
IISc Bangalore M.E./M.Tech. courses and fees 2024IISc Bangalore M.E./M.Tech. courses and fees 2024
IISc Bangalore M.E./M.Tech. courses and fees 2024SciAstra
 
-case selection and treatment planing.pptx
-case selection and treatment planing.pptx-case selection and treatment planing.pptx
-case selection and treatment planing.pptxmohamedturki866
 
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Sérgio Sacani
 
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...Sahil Suleman
 
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanPlasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanmuralinath2
 
B lymphocytes, Receptors, Maturation and Activation
B lymphocytes, Receptors, Maturation and ActivationB lymphocytes, Receptors, Maturation and Activation
B lymphocytes, Receptors, Maturation and ActivationBhanu Krishan
 
Cellular Communication and regulation of communication mechanisms to sing the...
Cellular Communication and regulation of communication mechanisms to sing the...Cellular Communication and regulation of communication mechanisms to sing the...
Cellular Communication and regulation of communication mechanisms to sing the...Nistarini College, Purulia (W.B) India
 
GBSN - Microbiology Lab (Microbiology Lab Safety Procedures)
GBSN -  Microbiology Lab (Microbiology Lab Safety Procedures)GBSN -  Microbiology Lab (Microbiology Lab Safety Procedures)
GBSN - Microbiology Lab (Microbiology Lab Safety Procedures)Areesha Ahmad
 
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Sérgio Sacani
 
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...yogeshlabana357357
 
SCHISTOSOMA HEAMATOBIUM life cycle .pdf
SCHISTOSOMA HEAMATOBIUM life cycle  .pdfSCHISTOSOMA HEAMATOBIUM life cycle  .pdf
SCHISTOSOMA HEAMATOBIUM life cycle .pdfDebdattaGhosh6
 
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...TALAPATI ARUNA CHENNA VYDYANAD
 

Recently uploaded (20)

Isolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptxIsolation of AMF by wet sieving and decantation method pptx
Isolation of AMF by wet sieving and decantation method pptx
 
Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...Jet reorientation in central galaxies of clusters and groups: insights from V...
Jet reorientation in central galaxies of clusters and groups: insights from V...
 
Triploidy ...............................pptx
Triploidy ...............................pptxTriploidy ...............................pptx
Triploidy ...............................pptx
 
GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)GBSN - Microbiology Lab (Compound Microscope)
GBSN - Microbiology Lab (Compound Microscope)
 
NUMERICAL Proof Of TIme Electron Theory.
NUMERICAL Proof Of TIme Electron Theory.NUMERICAL Proof Of TIme Electron Theory.
NUMERICAL Proof Of TIme Electron Theory.
 
Film Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdfFilm Coated Tablet and Film Coating raw materials.pdf
Film Coated Tablet and Film Coating raw materials.pdf
 
mixotrophy in cyanobacteria: a dual nutritional strategy
mixotrophy in cyanobacteria: a dual nutritional strategymixotrophy in cyanobacteria: a dual nutritional strategy
mixotrophy in cyanobacteria: a dual nutritional strategy
 
IISc Bangalore M.E./M.Tech. courses and fees 2024
IISc Bangalore M.E./M.Tech. courses and fees 2024IISc Bangalore M.E./M.Tech. courses and fees 2024
IISc Bangalore M.E./M.Tech. courses and fees 2024
 
-case selection and treatment planing.pptx
-case selection and treatment planing.pptx-case selection and treatment planing.pptx
-case selection and treatment planing.pptx
 
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
Constraints on Neutrino Natal Kicks from Black-Hole Binary VFTS 243
 
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
Alternative method of dissolution in-vitro in-vivo correlation and dissolutio...
 
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyanPlasma proteins_ Dr.Muralinath_Dr.c. kalyan
Plasma proteins_ Dr.Muralinath_Dr.c. kalyan
 
B lymphocytes, Receptors, Maturation and Activation
B lymphocytes, Receptors, Maturation and ActivationB lymphocytes, Receptors, Maturation and Activation
B lymphocytes, Receptors, Maturation and Activation
 
Cellular Communication and regulation of communication mechanisms to sing the...
Cellular Communication and regulation of communication mechanisms to sing the...Cellular Communication and regulation of communication mechanisms to sing the...
Cellular Communication and regulation of communication mechanisms to sing the...
 
GBSN - Microbiology Lab (Microbiology Lab Safety Procedures)
GBSN -  Microbiology Lab (Microbiology Lab Safety Procedures)GBSN -  Microbiology Lab (Microbiology Lab Safety Procedures)
GBSN - Microbiology Lab (Microbiology Lab Safety Procedures)
 
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
Chemistry Data Delivery from the US-EPA Center for Computational Toxicology a...
 
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
Extensive Pollution of Uranus and Neptune’s Atmospheres by Upsweep of Icy Mat...
 
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
Soil and Water Conservation Engineering (SWCE) is a specialized field of stud...
 
SCHISTOSOMA HEAMATOBIUM life cycle .pdf
SCHISTOSOMA HEAMATOBIUM life cycle  .pdfSCHISTOSOMA HEAMATOBIUM life cycle  .pdf
SCHISTOSOMA HEAMATOBIUM life cycle .pdf
 
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
Virulence Analysis of Citrus canker caused by Xanthomonas axonopodis pv. citr...
 

Synthesis of an intrusion detection algorithm based on deep learning and reassignment of training labels

  • 1. UNIBA: http://www.uniba.it DIB: http://www.di.uniba.it KDDE: http://kdde.di.uniba.it SYNTHESIS OF AN INTRUSION DETECTION ALGORITHM BASED ON DEEP LEARNING AND REASSIGNMENT OF TRAINING LABELS Advisor Prof.ssa Annalisa Appice Co-Advisor Dott.ssa Giuseppina Andresini Department of Computer Science, University of Bari Aldo Moro Student Francesco Paolo Caforio Via Orabona, 4 - 70125 Bari - Italy
  • 2. Motivations Synthesis of an intrusion detection algorithm based on deep learning and reassignment of training labels 1 o Today's computer systems are complex and prone to vulnerabilities o New types of attacks are designed and built to traverse sophisticated prevention and detection mechanisms o Hackers design new attacks whose behavior is as similar as possible to normal network traffic
  • 3. Thesis Objective 2 o Synthesis of a Intrusion Detection System o Data segmentation o Identifying examples on the segment boundary and changing their labels o Deep learning Synthesis of an intrusion detection algorithm based on deep learning and reassignment of training labels
  • 4. CD-IDS
  • 5. CD-IDS: Pre-processing
       o One-hot encoding: transforms attributes into a vector of numeric attributes
       o Scaling: constructs attributes with a standard normal distribution (mean 0, standard deviation 1)
       o Missing value replacement: removes undefined values
         o Null → 0
         o Infinity → Max
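The three pre-processing steps on slide 5 can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the thesis code: "Max" is read here as the largest finite value of the same column, and the function names `replace_missing`, `one_hot`, and `standardize` are hypothetical.

```python
import numpy as np

def replace_missing(X):
    """Null (NaN) -> 0; +Infinity -> largest finite value in the column (assumed meaning of 'Max')."""
    X = np.array(X, dtype=float)
    for j in range(X.shape[1]):
        col = X[:, j]  # view into X, so assignments below modify X
        finite = col[np.isfinite(col)]
        col_max = finite.max() if finite.size else 0.0
        col[np.isnan(col)] = 0.0
        col[np.isposinf(col)] = col_max
    return X

def one_hot(values):
    """Map a categorical attribute to one 0/1 column per category."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    out = np.zeros((len(values), len(categories)))
    out[np.arange(len(values)), [index[v] for v in values]] = 1.0
    return out, categories

def standardize(X):
    """Scale every column to mean 0 and standard deviation 1."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns unchanged instead of dividing by zero
    return (X - mean) / std

raw = [[1.0, np.nan], [np.inf, 2.0], [3.0, 4.0]]
clean = replace_missing(raw)
scaled = standardize(clean)
protocol, names = one_hot(["tcp", "udp", "tcp"])
```

In practice the scaling statistics would be estimated on the training set and reused on the testing set; the sketch omits that split for brevity.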
  • 6. CD-IDS: Dimensionality reduction
       o Multi-level autoencoder (2+1+2 layers)
       o Encoder + Decoder
       o Central layer with 10 neurons
       (Figure: data and reconstructed data)
  • 7. CD-IDS: Segmentation
       o Training a Support Vector Machine
       o Estimation of the reliability (distance to support vectors) of classifying an example into a class (normal/attack)
       o Segment creation (normal vs attack)
         o If $c_0(x) > 0.50$, $x \in C_{attack}$
         o Otherwise $x \in C_{normal}$
  • 8-9. CD-IDS: Class Reassignment
       o Identify the examples least reliably assigned to class $C_{normal}$ and change their class
       (Figure: NSL-KDD training set)
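The segmentation rule and the class reassignment step can be sketched together. The sketch assumes that $c_0(x)$ is the confidence that an example is an attack, that "least reliably assigned" means the normal-segment examples with the highest attack confidence, and that re-labeled examples become attacks (label 1). The function names and the parameter `k` (the number of re-labeled examples) are hypothetical.

```python
import numpy as np

def segment(attack_confidence, threshold=0.50):
    """True where an example falls in the attack segment, False for the normal segment."""
    return np.asarray(attack_confidence, dtype=float) > threshold

def relabel_least_reliable_normals(labels, attack_confidence, k):
    """Re-label as attack (1) the k normal-segment examples closest to the boundary."""
    labels = np.array(labels, copy=True)
    conf = np.asarray(attack_confidence, dtype=float)
    normal_idx = np.flatnonzero(~segment(conf))
    # Inside the normal segment, a higher attack confidence means a less reliable assignment.
    order = normal_idx[np.argsort(-conf[normal_idx], kind="stable")]
    chosen = order[:k]
    labels[chosen] = 1
    return labels, chosen

confidence = [0.10, 0.45, 0.90, 0.30, 0.49]
labels = [0, 0, 1, 0, 0]  # 0 = normal, 1 = attack
new_labels, changed = relabel_least_reliable_normals(labels, confidence, k=2)
```

The confidence values would come from the trained Support Vector Machine; here they are made-up numbers chosen only to show the ordering.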
  • 10. CD-IDS: CNN
        o A Convolutional Neural Network learns a classifier from labeled data in the form of images (matrices)
        o Each connection → 3 × 10 grayscale image + class (normal/attack)
        (Figure: the three image rows are the training example, the closest training normal example, and the closest training attack example)
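One way to build the 3 × 10 image described on slide 10 is to stack the encoded example with its nearest normal and nearest attack training examples. The sketch below assumes 10-dimensional encoded vectors (the autoencoder's central layer), Euclidean distance, and the row order shown in the figure labels; none of these choices is confirmed by the slides, and `nearest` and `to_image` are hypothetical names.

```python
import numpy as np

def nearest(x, candidates):
    """Return the candidate row with the smallest Euclidean distance to x."""
    distances = np.linalg.norm(candidates - x, axis=1)
    return candidates[np.argmin(distances)]

def to_image(x, train_normal, train_attack):
    """Stack example, closest normal, closest attack into a (3, 10, 1) single-channel image."""
    rows = np.stack([x, nearest(x, train_normal), nearest(x, train_attack)])
    return rows[..., np.newaxis]

rng = np.random.default_rng(0)
train_normal = rng.normal(0.0, 1.0, size=(50, 10))  # made-up encoded normal examples
train_attack = rng.normal(3.0, 1.0, size=(50, 10))  # made-up encoded attack examples
example = train_normal[7] + 0.01
image = to_image(example, train_normal, train_attack)
```

Rescaling the values to a grayscale range is left out; the slides do not say how that is done.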
  • 11. CD-IDS: CNN
        Layer       Hyperparameters                                            Output shape
        input                                                                  (None, 3, 10, 1)
        conv0       Conv 2D, 32 filters, size (2 × 2), activation: relu        (None, 2, 9, 32)
        dropout_0   dropout = 0.3                                              (None, 2, 9, 32)
        conv1       Conv 2D, 16 filters, size (2 × 4), activation: relu        (None, 1, 6, 16)
        dropout_1   dropout = 0.3                                              (None, 1, 6, 16)
        flatten_1                                                              (None, 96)
        dense_1     n_classes = 2, activation: softmax                         (None, 2)
        output
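The output shapes in the layer table follow from simple arithmetic if the convolutions use stride 1 and no padding, which is an assumption that the listed shapes imply but the slide does not state. The helper names below are hypothetical, and the parameter counts are derived values, not figures from the slides.

```python
def conv2d_output(height, width, kernel, filters):
    """Output (height, width, channels) of a stride-1, unpadded 2D convolution."""
    kh, kw = kernel
    return (height - kh + 1, width - kw + 1, filters)

def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    kh, kw = kernel
    return kh * kw * in_channels * filters + filters

conv0 = conv2d_output(3, 10, (2, 2), 32)               # (2, 9, 32)
conv1 = conv2d_output(conv0[0], conv0[1], (2, 4), 16)  # (1, 6, 16)
flat = conv1[0] * conv1[1] * conv1[2]                  # 96 inputs to the dense layer

total_params = (
    conv2d_params((2, 2), 1, 32)     # conv0: 160
    + conv2d_params((2, 4), 32, 16)  # conv1: 4112
    + flat * 2 + 2                   # dense_1 with 2 classes: 194
)
```

Dropout layers change neither the shape nor the parameter count, which is why they are absent from the calculation.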
  • 12. Empirical evaluation: Data
        Dataset          Training normal  Training attack  Training set  Testing normal  Testing attack  Testing set
        NSL-KDDTest+     67343            58630            125973        9711            12833           22544
        NSL-KDDTest-21   67343            58630            125973        2152            9698            11850
        UNSW-NB15        56000            119341           175341        37000           45332           82332
        CICIDS2017       80000            20000            100000        80000           20000           100000
        o Datasets
          o NSL-KDD
          o UNSW-NB15
          o CICIDS2017
            o Dataset organized in 10 folders
            o Each folder contains one training set and nine testing sets
            o Each file contains 100,000 examples (80,000 genuine and 20,000 attacks)
  • 13. Empirical evaluation: Autoencoder
        o Number of neurons per layer
        Layer - AE   NSL-KDD   UNSW-NB15   CICIDS2017
        1            80        100         55
        2            30        40          30
        3            10        10          10
        4            30        40          30
        5            80        100         55
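The layer sizes above fix the shape of the autoencoder but not its input width, activations, or training procedure. The NumPy sketch below only illustrates the data flow through an untrained network with those sizes; the input width of 41 features, the ReLU hidden activations, the linear output layer, and the random weights are all assumptions made for the illustration.

```python
import numpy as np

# Neurons per hidden layer, as listed on slide 13.
HIDDEN_SIZES = {
    "NSL-KDD": [80, 30, 10, 30, 80],
    "UNSW-NB15": [100, 40, 10, 40, 100],
    "CICIDS2017": [55, 30, 10, 30, 55],
}

def init_autoencoder(n_features, hidden, seed=0):
    """Random (untrained) weights for input -> hidden layers -> reconstructed input."""
    rng = np.random.default_rng(seed)
    sizes = [n_features] + hidden + [n_features]
    return [(rng.normal(0.0, 0.1, (a, b)), np.zeros(b)) for a, b in zip(sizes[:-1], sizes[1:])]

def encode_decode(x, params):
    """Return the central-layer code and the reconstruction."""
    central = len(params) // 2 - 1  # index of the layer that outputs the 10-neuron code
    h, code = x, None
    for i, (w, b) in enumerate(params):
        h = h @ w + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)  # ReLU on hidden layers (assumed)
        if i == central:
            code = h
    return code, h

batch = np.ones((4, 41))  # 4 made-up examples with an assumed 41 input features
params = init_autoencoder(41, HIDDEN_SIZES["NSL-KDD"])
code, reconstruction = encode_decode(batch, params)
```

Only the 10-value `code` would be passed on to the later stages; the reconstruction matters during training, where it is compared with the input.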
  • 14. Empirical evaluation: Segmentation
        o Which algorithm best separates normal examples from attacks?
          o Fuzzy C-Means (F C-M) - unsupervised
          o Gaussian Mixture Model (GMM) - unsupervised
          o Support Vector Machine (SVM) - supervised
        o Purity index
          o $\Omega = \{w_1, w_2, \ldots, w_K\}$ - set of clusters
          o $C = \{c_1, c_2, \ldots, c_J\}$ - set of classes
        $$\mathrm{purity}(\Omega, C) = \frac{1}{N} \sum_{k} \max_{j} |w_k \cap c_j|$$
  • 15. Empirical evaluation: Segmentation (NSL-KDD)
        o Purity index as the segmentation algorithm varies
        Method                   Well clustered examples   Training set examples   Purity index
        Fuzzy C-Means            111966                    125973                  0.8888
        Gaussian Mixture Model   110672                    125973                  0.8785
        Support Vector Machine   123582                    125973                  0.9810
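The purity index from slide 14 counts, for every cluster, the examples that belong to its majority class and divides the total by the number of examples. A minimal sketch follows; the function name `purity` and the toy labels are illustrative, while the final line reuses the Support Vector Machine counts from the table above.

```python
from collections import Counter

def purity(cluster_ids, class_labels):
    """Fraction of examples that belong to the majority class of their cluster."""
    by_cluster = {}
    for cluster, label in zip(cluster_ids, class_labels):
        by_cluster.setdefault(cluster, Counter())[label] += 1
    well_clustered = sum(max(counts.values()) for counts in by_cluster.values())
    return well_clustered / len(class_labels)

# Toy example: two clusters of three examples, each with one misplaced example.
toy = purity([0, 0, 0, 1, 1, 1], ["normal", "normal", "attack", "attack", "attack", "normal"])

# Well clustered examples divided by training set size, as reported for the SVM.
svm_purity = 123582 / 125973
```
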
  • 16. Empirical evaluation: Classification
        o Classifier training after the re-labeling step as the number of re-labeled examples varies
          o Evaluation of accuracy on the training set to see if there is a threshold
          o Evaluation of accuracy on the testing set
        o Metrics
          o True Positive (TP), False Positive (FP), True Negative (TN), False Negative (FN)
          o Overall Accuracy (OA)
          o Precision (P)
          o Recall (R)
          o F-Measure (F1-Score)
          o True Positive Rate (TPR)
          o False Positive Rate (FPR)
  • 17-19. Empirical evaluation: Classification (TP -- training set)
        Dataset          Configuration   OA       F1-Score   FPR      TPR
        NSL-KDDTest+     4500            0.9128   0.9271     0.1683   0.9741
        NSL-KDDTest-21   8500            0.8716   0.9256     0.5999   0.9761
        UNSW-NB15        500             0.9163   0.9411     0.2229   0.9817
        CICIDS2017       100             0.9813   0.9523     0.0071   0.9349
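The metrics listed on slide 16 are linked, so the table can be cross-checked. The sketch below rebuilds the confusion matrix for NSL-KDDTest+ from the class counts on slide 12 (12833 attacks, 9711 normal examples), reading 0.9741 as the true positive rate and 0.1683 as the false positive rate, which is the only reading that reproduces the reported OA and F1-Score. It assumes that attacks are the positive class and that the metrics are measured on the testing set; the function names are hypothetical.

```python
def confusion_from_rates(tpr, fpr, n_attack, n_normal):
    """Expected TP, FP, TN, FN given the two rates and the class sizes."""
    tp = tpr * n_attack
    fp = fpr * n_normal
    return tp, fp, n_normal - fp, n_attack - tp

def metrics(tp, fp, tn, fn):
    """Overall accuracy, precision, recall (= TPR), F1-Score, and FPR."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "OA": (tp + tn) / (tp + fp + tn + fn),
        "P": precision,
        "R": recall,
        "F1": 2 * precision * recall / (precision + recall),
        "FPR": fp / (fp + tn),
    }

tp, fp, tn, fn = confusion_from_rates(tpr=0.9741, fpr=0.1683, n_attack=12833, n_normal=9711)
result = metrics(tp, fp, tn, fn)
```

Because the rates are rounded to four decimals, the reconstructed counts are fractional; the comparison with the table is therefore only meaningful to about four decimals.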
  • 20. Empirical evaluation: Classification (TP -- training set)
        o NSL-KDD
          o Configuration with the highest number of TP in the training set
        o UNSW-NB15 and CICIDS2017
          o Configuration with the first local maximum peak in TP values in the training set
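The two selection rules on slide 20 can be sketched as follows. The sketch treats a value at either end of the series as a peak when it exceeds its single neighbour, and falls back to the global maximum when no strict peak exists; both conventions are assumptions, and the TP counts in the example are made up.

```python
def highest_tp(tp_by_config):
    """Configuration with the highest number of training-set true positives."""
    return max(tp_by_config, key=tp_by_config.get)

def first_local_maximum(tp_by_config):
    """First configuration (in increasing order) whose TP count exceeds its neighbours."""
    configs = sorted(tp_by_config)
    values = [tp_by_config[c] for c in configs]
    for i, value in enumerate(values):
        above_left = i == 0 or value > values[i - 1]
        above_right = i == len(values) - 1 or value > values[i + 1]
        if above_left and above_right:
            return configs[i]
    return highest_tp(tp_by_config)  # no strict peak, e.g. a constant series

# Hypothetical training-set TP counts per number of re-labeled examples.
tp_counts = {100: 50, 500: 60, 1000: 55, 1500: 70}
picked_by_peak = first_local_maximum(tp_counts)
picked_by_max = highest_tp(tp_counts)
```

With these made-up counts the two rules disagree: the first local peak is at 500, while the global maximum is at 1500.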
  • 21. Empirical evaluation: Classification (testing set)
        NSL-KDDTest+
        Configuration   OA         F1-Score   FPR        TPR
        100             0.901215   0.912697   0.10658    0.907114
        500             0.888396   0.900095   0.104727   0.883192
        1000            0.893985   0.905913   0.109463   0.896595
        1500            0.903078   0.915129   0.116569   0.917946
        2000            0.90228    0.914906   0.12491    0.922855
        2500            0.900816   0.913821   0.129544   0.92379
        3000            0.902369   0.915427   0.131809   0.928232
        3500            0.905696   0.918966   0.138812   0.939375
        4000            0.901127   0.914745   0.13943    0.931816
        4500 *          0.912793   0.927099   0.168263   0.974129
        5000            0.912349   0.926999   0.173926   0.977636
        5500            0.90849    0.92287    0.161878   0.961739
        6000            0.904143   0.919877   0.178457   0.966648
        6500            0.90157    0.917812   0.182885   0.96548
        7000            0.90228    0.919147   0.194831   0.975766
        7500            0.898066   0.916369   0.211616   0.981064
        8000            0.89709    0.914762   0.199362   0.970077
        8500            0.895538   0.914547   0.218721   0.982
        9000            0.895271   0.914286   0.218309   0.98122
        9500            0.883783   0.905843   0.246113   0.982077
        10000           0.886622   0.907812   0.237669   0.980675
        NSL-KDDTest-21
        Configuration   OA         F1-Score   FPR        TPR
        100             0.820759   0.889005   0.433086   0.877088
        500             0.795865   0.871446   0.427509   0.845432
        1000            0.808523   0.880648   0.437732   0.863168
        1500            0.829283   0.895252   0.450743   0.891421
        2000            0.832068   0.897454   0.464684   0.897917
        2500            0.830211   0.896566   0.480483   0.899154
        3000            0.834768   0.899651   0.481877   0.905032
        3500            0.842025   0.905032   0.508364   0.919777
        4000            0.832911   0.899113   0.513476   0.909775
        4500            0.86616    0.921941   0.582714   0.965766
        5000            0.868692   0.923643   0.589684   0.970406
        5500            0.855359   0.914845   0.568309   0.949371
        6000            0.858059   0.916823   0.582714   0.955867
        6500            0.857468   0.916382   0.578996   0.95432
        7000            0.866835   0.922465   0.588755   0.967932
        7500            0.870211   0.924785   0.601766   0.974943
        8000            0.861013   0.918767   0.586896   0.960404
        8500 *          0.871561   0.925596   0.599907   0.976181
        9000            0.870633   0.925026   0.600372   0.97515
        9500            0.870127   0.924835   0.608271   0.976284
        10000           0.869114   0.92416    0.605483   0.974428
        * configuration highlighted in the slide
  • 22. Empirical evaluation: Classification (testing set)
        UNSW-NB15
        Configuration   OA         F1-Score   FPR        TPR
        100             0.912907   0.938727   0.230518   0.980208
        500             0.91634    0.941082   0.222875   0.981666
        1000            0.906559   0.933416   0.212196   0.962285
        1500            0.914355   0.939302   0.211982   0.973639
        2000            0.912673   0.938682   0.235214   0.982068
        2500            0.907021   0.935274   0.263393   0.986987
        3000            0.913523   0.93972    0.250214   0.990355
        3500            0.907865   0.935232   0.240196   0.977342
        4000            0.907238   0.935232   0.256357   0.984004
        4500            0.907455   0.935188   0.24925    0.980987
        5000            0.903993   0.933535   0.280607   0.990615
        5500            0.898563   0.928939   0.262464   0.974125
        6000            0.889763   0.921867   0.250286   0.955481
        6500            0.885497   0.919927   0.286875   0.966382
        7000            0.890368   0.924324   0.308554   0.983711
        7500            0.890699   0.924224   0.298214   0.979345
        8000            0.879857   0.918621   0.368286   0.996296
        8500            0.885572   0.921033   0.316625   0.980451
        9000            0.884294   0.920315   0.323304   0.981708
        9500            0.881602   0.918665   0.333214   0.982403
        10000           0.88       0.91795    0.346429   0.986249
        CICIDS2017
        Configuration   OA         F1-Score   FPR        TPR
        100             0.981319   0.952298   0.007089   0.934952
        500             0.975459   0.937567   0.011868   0.924763
        1000            0.969574   0.922991   0.016626   0.914375
        1500            0.966377   0.916227   0.023496   0.925869
        2000            0.963825   0.912081   0.031327   0.944437
  • 23. Empirical evaluation: Classification
        o Results
        Dataset          OA (T.L.E.)   OA (without T.L.E.)   F1-Score (T.L.E.)   F1-Score (without T.L.E.)
        NSL-KDDTest+     0.9128        0.8875                0.9271              0.8987
        NSL-KDDTest-21   0.8716        0.7911                0.9256              0.8677
        UNSW-NB15        0.9163        0.9092                0.9411              0.9349
        CICIDS2017       0.9813        0.9785                0.9523              0.9445
  • 24. Empirical evaluation: State of the Art
        NSL-KDDTest+
        Method                      OA       F1-Score
        CD-IDS                      0.9128   0.9271
        Li et al. [12]              0.7914   0.7912
        Kim et al. [10]             -        0.8100
        Know et al. [11]            -        0.89
        Naaser et al. [18]          0.85     -
        Yan et al. [26]             0.793    -
        Kherlenchimeg et al. [26]   0.80     -
        NSL-KDDTest-21
        Method                      OA       F1-Score
        CD-IDS                      0.8716   0.9256
        Li et al. [12]              0.8184   0.9001
        Kim et al. [10]             -        0.79
        Know et al. [11]            -        0.62
        Naaser et al. [18]          0.70     -
        UNSW-NB15
        Method                      OA       F1-Score
        CD-IDS                      0.9163   0.9411
        Kim et al. [10]             -        0.90
        Yan et al. [26]             0.8825   -
        CICIDS2017
        Method                      OA       F1-Score
        CD-IDS                      0.9813   0.9523
        Kim et al. [10]             -        0.89
  • 25. Conclusions and future developments
        o Segmentation followed by re-labeling of examples yields a more accurate intrusion detection model
        o Future developments
          o Methodological: use of a Generative Adversarial Network (GAN), in which a generator creates "synthetic" data similar to real data and a discriminator distinguishes the constructed data from real data
          o Technological: use of Apache Spark for image construction