2. Presentation Outline
• Abstract
• Introduction
• Background
• System Model
• Problem Statement
• Project Analysis/Project Implementation
• Experimental Results
• Conclusion
• Future Works
• References
3. Abstract
Our project objective is to detect network intrusions, motivated by
the growing need for cyber safety in today's world. The reasons
include the uncertainty in identifying the types of attacks and the
increased complexity of advanced cyber attacks. Our motivation is to
reduce malware interruptions within computer systems. In our project
we used several supervised machine learning classification algorithms:
Naive Bayes, Decision Tree, K-Neighbors and Logistic Regression.
Among these, the Decision Tree classifier yielded the best accuracy,
99.60%.
4. Introduction
In the modern world, fast-paced technological advancements have
encouraged every organization to adopt information and communication
technology (ICT). This creates an environment where every action is
routed through the ICT system, making the organization vulnerable if
the security of that system is compromised. This calls for a
multilayered detection and protection scheme that can handle truly
novel attacks on the system and can autonomously adapt to new data.
5. Background
We knew beforehand that a large amount of data would need to be
collected for this project. Some of the preparatory work is listed
as follows:
A. Datasets Description
B. Identifying Network Parameters
6. A. Datasets Description
The DARPA intrusion detection (ID) evaluation program of 1998 was managed
and prepared by MIT Lincoln Labs. Its main objective was to analyze and
conduct research in ID. A standardized dataset was prepared, which
included various types of intrusions imitating a military environment,
and was made publicly available.
B. Identifying Network Parameters
Hyperparameter tuning, that is, finding the optimum set of parameters to
achieve the desired result, is by itself a separate field with plenty of
scope for future research. In this work, the learning rate was kept
constant at 0.01 while the other parameters were optimized.
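The idea of holding one parameter fixed while searching over the others can be sketched as a small grid search. This is only an illustration: the parameter names `batch_size` and `epochs`, their candidate values, and the `toy_score` function are hypothetical placeholders that do not come from the project; only the fixed value 0.01 is taken from the slide.

```python
from itertools import product

LEARNING_RATE = 0.01  # held constant, as stated on the slide

# Hypothetical search space: names and values are illustrative assumptions.
search_space = {
    "batch_size": [32, 64, 128],
    "epochs": [10, 20],
}


def candidate_configs(space, learning_rate):
    """Yield every combination of the searched parameters, each with the fixed learning rate."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        config["learning_rate"] = learning_rate
        yield config


def toy_score(config):
    """Placeholder score (NOT a real validation accuracy); replace with model evaluation."""
    return -(abs(config["batch_size"] - 64) + abs(config["epochs"] - 20))


configs = list(candidate_configs(search_space, LEARNING_RATE))
best = max(configs, key=toy_score)
print(f"Tried {len(configs)} configurations; best: {best}")
```

In a real run, `toy_score` would be replaced by a function that trains the model with the given configuration and returns its validation accuracy.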
7. System Model
The proposed and existing algorithms were implemented in Jupyter
Notebook under Anaconda Individual Edition 2020.11, on an Intel
Core i5 8th Generation processor (2.84 GHz CPU) with 8 GB RAM,
running Microsoft Windows 10.
8. Problem Statement
Distinguishing anomalous activity from normal activity in
network traffic is very difficult and time-consuming.
Therefore, a method is needed that can detect network
intrusions and that reflects current network traffic.
9. Project Analysis/ Project Implementation
We started by researching the various factors that challenge
intrusion detection systems (IDSes). IDSes are prone to false
alarms, or false positives. Consequently, organizations need to
fine-tune their IDS products when they first install them. This
includes properly configuring their intrusion detection systems to
recognize what normal traffic on their network looks like compared
to potentially malicious activity.
However, despite the inefficiencies they cause, false positives don't usually cause serious damage to the
actual network and simply lead to configuration improvements.
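To make the notion of a false alarm concrete, the false positive rate can be computed as the share of truly normal records that the detector flags as anomalous. The sketch below uses a handful of made-up labels purely for illustration; the function name `false_positive_rate` and the sample lists are assumptions, not project data.

```python
def false_positive_rate(y_true, y_pred, positive="anomaly"):
    """Return FP / (FP + TN): the fraction of non-positive records flagged as positive."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return fp / (fp + tn) if (fp + tn) else 0.0


# Made-up example labels (illustrative only).
y_true = ["normal", "normal", "anomaly", "normal", "anomaly", "normal"]
y_pred = ["normal", "anomaly", "anomaly", "normal", "normal", "normal"]

fpr = false_positive_rate(y_true, y_pred)
print(f"False positive rate: {fpr:.2f}")  # 1 of 4 normal records flagged -> 0.25
```

Tracking this rate before and after a configuration change is one simple way to check whether tuning has reduced false alarms.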
14. Experimental Results
We followed several steps to prepare our model: Preprocessing of Data,
Splitting of Data, Exploratory Data Analysis (EDA), Feature Engineering,
Feature Selection, Training/Fitting the model on the Train Set,
Hyperparameter Tuning, and lastly Predicting over the Test Set.
The model we created was a classification model, so we fitted Naive Bayes,
Decision Tree, K-Neighbors and Logistic Regression classifiers on the train
set. After evaluating them, the best model was chosen, which turned out to
be the Decision Tree classifier.
Lastly, we predicted the outcomes on the Test Set. We had created a column called
'class', the dependent (outcome) variable, which had only two values, namely
'anomaly' and 'normal'. Predicting the value of this 'class' column was
our objective.
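The comparison described above can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the project's notebook: the data is synthetic (generated with `make_classification`), `GaussianNB` is assumed as the Naive Bayes variant, and all settings such as `test_size=0.2` are illustrative. Accuracies on this synthetic data will not match the reported 99.60%.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed features (NOT the project's dataset).
X, y_numeric = make_classification(n_samples=500, n_features=10, random_state=0)
# Mirror the two values of the 'class' column described on the slide.
y = ["anomaly" if value == 1 else "normal" for value in y_numeric]

# Hold out a test set; the 80/20 split is an assumption.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

models = {
    "Naive Bayes": GaussianNB(),  # assumed variant
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Neighbors": KNeighborsClassifier(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Fit each classifier on the train set and score it on the test set.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

best_name = max(scores, key=scores.get)
for name, acc in scores.items():
    print(f"{name}: {acc:.4f}")
print(f"Best model on this synthetic data: {best_name}")
```

With the real dataset, `X` and `y` would come from the preprocessed feature columns and the 'class' column respectively.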
15. Conclusion
From the code and output above, using different classification
algorithms, we obtained an intrusion detection accuracy of around
99.60%, and the model can be tested on other datasets. We used
libraries provided by Python to implement this project. After the
experiments, the Decision Tree algorithm gave us the best test
accuracy, 99.60%. The reason it outperforms the others is that it is
not restricted by the properties of the dataset. SVC requires its
parameters to be set appropriately, and a neural network requires a
large and complex dataset. In the future, we may try to carry out
the prediction over different datasets. As the data becomes more
complex, we can apply neural networks to make accurate predictions.
16. Future Works
1. Cloud-centric product development
2. Shift from product- to service-focused models
3. Growing connectivity
4. Wireless, two-way communication
17. Guided By:-
Dr. Amiya Ranjan Panda
Presented By:-
Sankhanil Parai - 1806240
Subhrajyoti Payra - 1806260
Unmil Mukhopadhyay - 1806265
Subhrajit Paul - 1806352
Abhijit Kumar - 1806275