The main weaknesses of Intrusion Detection Systems (IDSs) lie in their sensitivity to errors and in the inconsistent and inequitable ways in which they have often been evaluated. Most previous efforts focused on improving the overall accuracy of these models by increasing the detection rate and decreasing the false alarm rate, which is an important issue. However, Machine Learning (ML) algorithms can classify all or most of the records of the minority classes into one of the majority classes with negligible impact on overall performance. The severity of the threats posed by the small classes, the shortcomings of previous efforts to address this issue, and the need to improve the performance of IDSs were the motivations for this work. In this paper, a stratified sampling method and different cost-function schemes were combined with the Extreme Learning Machine (ELM) method, using kernels and activation functions, to build competitive ID solutions that improved the performance of these systems and reduced the occurrence of the accuracy paradox problem. The main experiments were performed using the UNB ISCX2012 dataset. The experimental results showed that ELM models with a polynomial function outperformed other models in overall accuracy, recall, and F-score. They also competed with the traditional model on the Normal, DoS, and SSH classes.
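The accuracy paradox and the role of stratified sampling mentioned in this abstract can be illustrated with a minimal Python sketch. The class counts, the 20% sampling fraction, and the helper names below are hypothetical and chosen only for illustration; this is not the authors' ELM pipeline.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(labels, fraction, seed=0):
    """Return sorted indices keeping roughly `fraction` of every class (at least one record each)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for indices in by_class.values():
        k = max(1, round(len(indices) * fraction))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)

def accuracy(y_true, y_pred):
    """Fraction of records whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_recall(y_true, y_pred):
    """Recall of each class: correctly predicted records / records of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}

# Hypothetical, highly imbalanced label set (class names reuse those in the abstract).
y_true = ["Normal"] * 980 + ["DoS"] * 15 + ["SSH"] * 5
# A degenerate model that assigns every record to the majority class.
y_pred = ["Normal"] * len(y_true)

overall = accuracy(y_true, y_pred)          # high, despite missing every attack
recalls = per_class_recall(y_true, y_pred)  # 0.0 for both minority classes
sample_counts = Counter(y_true[i] for i in stratified_sample(y_true, 0.2))
```

Overall accuracy stays high while the recall of both minority classes is zero, which is why per-class metrics are needed alongside accuracy. The stratified sample keeps every class represented in proportion to its size.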
An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...ijctcm
This paper reports on the empirical evaluation of five machine learning algorithms (J48, BayesNet, OneR, NB and ZeroR) using ten performance criteria: accuracy, precision, recall, F-Measure, incorrectly classified instances, kappa statistic, mean absolute error, root mean squared error, relative absolute error, and root relative squared error. The aim of this paper is to find out which classifier performs best for an intrusion detection system (IDS). Machine learning is one of the methods used in IDSs. Based on this study, it can be concluded that the J48 decision tree is more suitable than the other four algorithms. We also compared the performance of IDS classifiers using seven feature reduction techniques.
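Several of the criteria listed in this abstract can be computed directly from true and predicted labels. The sketch below, in plain Python, covers precision, recall, F-Measure and the kappa statistic (Cohen's kappa). The label values and predictions are hypothetical, not results from the paper.

```python
from collections import Counter

def precision_recall_f(y_true, y_pred, positive):
    """Precision, recall and F-measure for one class treated as positive."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

def cohen_kappa(y_true, y_pred):
    """Agreement beyond chance; undefined when chance agreement equals 1."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical labels: 4 attack records followed by 6 normal records.
y_true = ["attack"] * 4 + ["normal"] * 6
y_pred = ["attack"] * 3 + ["normal"] + ["normal"] * 5 + ["attack"]

precision, recall, f_measure = precision_recall_f(y_true, y_pred, "attack")
kappa = cohen_kappa(y_true, y_pred)
```

Kappa corrects raw agreement for the agreement expected by chance, so it is lower than plain accuracy whenever the class distribution is skewed.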
ANALYSIS OF MACHINE LEARNING ALGORITHMS WITH FEATURE SELECTION FOR INTRUSION ...IJNSA Journal
In recent times, various machine learning classifiers have been used to improve network intrusion detection, and researchers have proposed many solutions for intrusion detection in the literature. These classifiers are typically trained on older datasets, which limits their detection accuracy, so there is a need to train them on the latest dataset. In this paper, UNSW-NB15, the latest dataset, is used to train machine learning classifiers. The selected classifiers, K-Nearest Neighbors (KNN), Stochastic Gradient Descent (SGD), Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB), are drawn from the taxonomy of classifiers based on lazy and eager learners. Chi-Square, a filter-based feature selection technique, is applied to the UNSW-NB15 dataset to reduce irrelevant and redundant features. The performance of the classifiers is measured in terms of Accuracy, Mean Squared Error (MSE), Precision, Recall, F1-Score, True Positive Rate (TPR) and False Positive Rate (FPR), with and without the feature selection technique, and a comparative analysis of these classifiers is carried out.
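Filter-based Chi-Square feature selection can be sketched as follows, assuming categorical features and using the classic Pearson contingency-table statistic. The feature names and values are hypothetical and are not taken from UNSW-NB15; library implementations may compute the statistic differently.

```python
from collections import Counter

def chi_square(feature, labels):
    """Pearson chi-square statistic between one categorical feature and the class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))
    feature_totals = Counter(feature)
    label_totals = Counter(labels)
    stat = 0.0
    for f_val, f_count in feature_totals.items():
        for l_val, l_count in label_totals.items():
            expected = f_count * l_count / n  # count expected under independence
            stat += (observed[(f_val, l_val)] - expected) ** 2 / expected
    return stat

def select_top_k(features, labels, k):
    """Rank features by chi-square score and keep the k highest."""
    scores = {name: chi_square(values, labels) for name, values in features.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores

# Hypothetical binary features for eight records (1 = attack, 0 = normal).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "flag_syn":   [1, 1, 1, 1, 0, 0, 0, 0],  # perfectly aligned with the class
    "is_weekend": [1, 0, 1, 0, 1, 0, 1, 0],  # independent of the class
    "proto_tcp":  [1, 1, 1, 0, 1, 0, 0, 0],  # partially informative
}

selected, scores = select_top_k(features, labels, k=2)
```

A score of zero means the feature is statistically independent of the class in this sample, so a filter method would discard it first.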
Secure Multiparty Computation during Privacy Preserving Data Mining: Inscruta...cscpconf
The Internet today poses a great security challenge for the Indian healthcare sector. In today's growing environment, most computation is performed jointly, involving inputs from all the hospitals. Such computations use confidential data of the involved hospitals to compute the result. Each hospital has confidential data that it would not like to share with other hospitals. Privacy preservation is of great concern, as no hospital can be trusted in a real scenario. In this paper we propose an efficient protocol for computation. This paper is an extension of our previous work, in which we defined and compared single and multi trusted third party (TTP) protocols. This paper uses the multi-TTP protocol, in which TTPs are selected at runtime from a pool of TTPs and computation is performed by more than one TTP, since TTPs can be corrupted and correctness of computation is a major concern. The proposed secure protocol uses encrypted inputs for computation to maintain the privacy of inputs, and inscrutablizers to make the identity of hospitals ambiguous. In addition, a security analysis of the protocol is presented.
GROUP FUZZY TOPSIS METHODOLOGY IN COMPUTER SECURITY SOFTWARE SELECTIONijfls
In today's interconnected world, the risk of malware is a major concern for users. Antivirus software is a tool to prevent, discover, and eliminate malware such as computer worms, trojan horses, computer viruses, spyware and adware. In the competitive IT environment, because many antivirus products with diverse features are available, evaluating them is a contentious and complicated issue for users, and one that has a significant impact on the efficiency of computer defense systems. The antivirus selection problem can be formulated as a multiple criteria decision making problem. This paper proposes an antivirus evaluation model for computer users based on group fuzzy TOPSIS. We study a real-world case of antivirus software and define criteria for the antivirus selection problem. Seven alternatives were selected from among the most popular antivirus products on the market, and seven criteria were determined by the experts. The study is followed by sensitivity analyses of the results, which also give valuable insights into the needs and solutions for different users in different conditions.
Comparative Study on Machine Learning Algorithms for Network Intrusion Detect...ijtsrd
Networks have brought convenience to the world by permitting flexible transfer of information; however, they also expose a large number of vulnerabilities. A Network Intrusion Detection System helps network and system administrators detect network security violations in their organizations. Identifying unknown and new attacks is one of the leading challenges in Intrusion Detection System research. Deep learning, a subfield of machine learning, is concerned with algorithms based on the structure and function of the brain, known as artificial neural networks. Improvements in such learning algorithms would increase the detection probability of the IDS and its detection rate for unknown attacks. We suggest a deep learning approach to implement an enhanced and efficient IDS. Priya N | Ishita Popli "Comparative Study on Machine Learning Algorithms for Network Intrusion Detection System" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-5 | Issue-1 , December 2020, URL: https://www.ijtsrd.com/papers/ijtsrd38175.pdf Paper URL : https://www.ijtsrd.com/computer-science/computer-network/38175/comparative-study-on-machine-learning-algorithms-for-network-intrusion-detection-system/priya-n
A novel ensemble modeling for intrusion detection system IJECEIAES
The vast increase in data through internet services has made computer systems more vulnerable and difficult to protect from malicious attacks. Intrusion detection systems (IDSs) must be more potent in monitoring intrusions. Therefore, an effective intrusion detection system architecture is built that employs a simple classification model and produces low false alarm rates and high accuracy. Notably, IDSs handle enormous amounts of data traffic that contain redundant and irrelevant features, which affect the performance of the IDS negatively. Good feature selection approaches reduce unrelated and redundant features and attain better classification accuracy in IDSs. This paper proposes a novel ensemble model for IDS based on two algorithms: Fuzzy Ensemble Feature Selection (FEFS) and Fusion of Multiple Classifier (FMC). FEFS is a unification of five feature scores. These scores are obtained by using feature-class distance functions. Aggregation is done using the fuzzy union operation. The FMC, on the other hand, is the fusion of three classifiers and works based on an ensemble decisive function. Experiments on the KDD Cup 99 data set have shown that our proposed system outperforms well-known methods such as Support Vector Machines (SVMs), K-Nearest Neighbor (KNN) and Artificial Neural Networks (ANNs). Our examinations clearly confirmed the value of using an ensemble methodology for modeling IDSs, and hence our system is robust and efficient.
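The aggregation step described for FEFS can be sketched with the standard (max-based) fuzzy union. This is only an illustration under assumptions: the abstract does not give the distance functions or membership scaling, so the sketch assumes each score list is already scaled to [0, 1], and it uses three hypothetical score lists rather than the five in the paper.

```python
def fuzzy_union(*membership_lists):
    """Standard fuzzy union: element-wise maximum of the membership degrees."""
    return [max(values) for values in zip(*membership_lists)]

def top_features(fused, k):
    """Indices of the k features with the highest fused score."""
    order = sorted(range(len(fused)), key=lambda i: fused[i], reverse=True)
    return order[:k]

# Hypothetical per-feature scores from three scoring functions, assumed scaled to [0, 1].
score_a = [0.2, 0.8, 0.5, 0.1]
score_b = [0.3, 0.6, 0.9, 0.2]
score_c = [0.1, 0.7, 0.4, 0.3]

fused = fuzzy_union(score_a, score_b, score_c)
selected = top_features(fused, k=2)
```

With the max-based union, a feature is retained if any single scoring function rates it highly, which favors recall of useful features over agreement between scores.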
IMPLEMENTATION OF RISK ANALYZER MODEL FOR UNDERTAKING THE RISK ANALYSIS OF PR...IJDKP
The RISK ANALYZER model was implemented as a Knowledge-Based System for the purpose of undertaking risk analysis for proposed construction projects in a selected domain. The Fuzzy Decision Variables (FDVs) that cause differences between the initial and final contract sums of building projects were identified, the likelihood of occurrence of the risks was determined, and a Knowledge-Based System that ranks the risks was constructed using the Java programming language and a graphical user interface. The Knowledge-Based System is composed of a Knowledge Base for storing data, an Inference Engine for controlling and directing the use of knowledge for problem solving, and a User Interface that helps the user retrieve, use and alter data in the Knowledge Base. The developed Knowledge-Based System was compiled, implemented and validated with data from previously completed projects. The client could utilize the Knowledge-Based System to undertake proposed building projects
Comparative Performance Analysis of Machine Learning Techniques for Software ...csandit
Machine learning techniques can be used to analyse data from different perspectives and enable developers to retrieve useful information. Machine learning techniques have proven useful for software bug prediction. In this paper, a comparative performance analysis of different machine learning techniques is carried out for software bug prediction on publicly available data sets. The results show that most of the machine learning methods performed well on software bug datasets.
Requirement analysis, architectural design and formal verification of a multi...ijcsit
This paper presents an approach based on the analysis, design, and formal verification of a multi-agent
based university Information Management System (IMS). University IMS accesses information, creates
reports and facilitates teachers as well as students. An orchestrator agent manages the coordination
between all agents. It also manages the database connectivity for the whole system. The proposed IMS is
based on BDI agent architecture, which models the system based on belief, desire, and intentions. The
correctness properties of safety and liveness are specified by First-order predicate logic.
Benchmarks for Evaluating Anomaly Based Intrusion Detection SolutionsIJNSA Journal
Anomaly-based Intrusion Detection Systems (IDS) have gained increased popularity over time. There are many proposed anomaly-based systems using different Machine Learning (ML) algorithms and techniques, however there is no standard benchmark to compare them based on quantifiable measures. In this paper, we propose a benchmark that measures both accuracy and performance to produce objective metrics that can be used in the evaluation of each algorithm implementation. We then use this benchmark to compare accuracy as well as the performance of four different Anomaly-based IDS solutions based on various ML algorithms. The algorithms include Naive Bayes, Support Vector Machines, Neural Networks, and K-means Clustering. The benchmark evaluation is performed on the popular NSL-KDD dataset. The experimental results show the differences in accuracy and performance between these Anomaly-based IDS solutions on the dataset. The results also demonstrate how this benchmark can be used to create useful metrics for such comparisons.
A Software Measurement Using Artificial Neural Network and Support Vector Mac...ijseajournal
Today, software measurement is based on various techniques such as neural networks, genetic algorithms, fuzzy logic, etc. This study examines the efficiency of applying a support vector machine with a Gaussian radial basis kernel function to the software measurement problem to increase performance and accuracy. Support vector machines (SVMs) are an innovative approach to constructing learning machines that minimize generalization error. There is a close relationship between SVMs and Radial Basis Function (RBF) classifiers. Both have found numerous applications, such as optical character recognition, object detection, face verification, and text categorization. The results demonstrate that the accuracy and generalization performance of the SVM with a Gaussian radial basis kernel function are better than those of the RBFN. We also examine and summarize several advantages of the SVM compared with the RBFN.
MITIGATION TECHNIQUES TO OVERCOME DATA HARM IN MODEL BUILDING FOR MLijaia
Given the impact of Machine Learning (ML) on individuals and society, understanding how harm might occur throughout the ML life cycle is more critical than ever. By offering a framework to identify distinct potential sources of downstream harm in the ML pipeline, the paper demonstrates the importance of choices made throughout the distinct phases of data collection, development, and deployment, which extend far beyond model training alone. Relevant mitigation techniques are also suggested, to be used instead of merely relying on generic notions of what counts as fairness.
A Survey of Security of Multimodal Biometric SystemsIJERA Editor
A biometric system is essentially a pattern recognition system used in an adversarial environment. Like any conventional security system, a biometric system is exposed to malicious adversaries, who can manipulate data to make the system ineffective by compromising its integrity. Current theory and design methods of biometric systems do not take into account the vulnerability to such adversarial attacks. Therefore, evaluating classical design methods to determine whether they lead to secure systems is an open problem. To make biometric systems secure, it is necessary to understand and evaluate the threats and thus to develop effective countermeasures and robust system designs, both technical and procedural, if necessary. Accordingly, extending the theory and design methods of biometric systems is mandatory to safeguard the security and reliability of biometric systems in adversarial environments.
UBIQUITOUS COMPUTING AND SCRUM SOFTWARE ANALYSIS FOR COMMUNITY SOFTWAREijseajournal
Ubiquitous computing has special properties compared with the traditional desktop computing paradigm. In ubiquitous computing, more computational power is available and accessible in the environment through any device. Many machines present an interface to a single user while hiding all the devices in the background. Software engineering has great significance in the area of g which was utilized and satisfactory for a wide range of ubiquitous computing applications. Because few software engineering approaches have been identified, this was the fundamental obstacle to proposing a general-level architecture. To reach this objective, a step was taken at a very initial level to seek out all the credible software engineering challenges in the many ubiquitous computing applications that have been developed so far. We attempt to highlight all approaches that have ever been used to build ubiquitous applications. This alysts to construct the general level engineering for the ubiquitous applications
Software Defect Prediction Using Radial Basis and Probabilistic Neural NetworksEditor IJCATR
Defects in the modules of software systems are a major problem in software development. A variety of data mining techniques are used to predict software defects, such as regression, association rules, clustering, and classification. This paper is concerned with classification-based software defect prediction. It investigates the effectiveness of a radial basis function neural network and a probabilistic neural network in terms of prediction accuracy and defect prediction. The conclusion to be drawn from this work is that the neural networks used here provide an acceptable level of accuracy but a poor defect prediction ability. Probabilistic neural networks perform consistently better with respect to the two performance measures used across all datasets. It may be advisable to use a range of software defect prediction models to complement each other rather than relying on a single technique.
A SURVEY ON TECHNIQUES REQUIREMENTS FOR INTEGRATEING SAFETY AND SECURITY ENGI...IJCSES Journal
Nowadays, safety and security have become integrated requirements for information systems, as a new generation of infrastructure systems is distributed across networks. This raises the question of whether these systems are safety-critical, especially since they were tested in a closed, separated environment and are now deployed in an uncontrollable environment, namely the internet, where the number of threats is enormous. It also opens the door to new development approaches that take safety and security into consideration during the system development life cycle and, most importantly, identify hazards, risks and threats. We conduct a survey exploring the technical languages that scholars have created to combine safety and security requirements engineering, as well as accident analysis technique languages.
COMPUTER INTRUSION DETECTION BY TWOOBJECTIVE FUZZY GENETIC ALGORITHMcscpconf
The purpose of this paper is to describe a two-objective fuzzy genetics-based learning algorithm and discuss its use in detecting intrusions in a computer network. Experiments were performed with the KDD Cup data set, which contains information on computer networks during normal behavior and intrusive behavior. The performance of the final fuzzy classification system has been investigated using the intrusion detection problem as a high-dimensional classification problem. This task is formulated as an optimization problem with two objectives: to minimize the number of fuzzy rules and to maximize the classification rate. We show a two-objective genetic algorithm for finding non-dominated solutions of the fuzzy rule selection problem.
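The notion of non-dominated solutions for these two objectives (fewer fuzzy rules, higher classification rate) can be made concrete with a small Pareto filter. The candidate rule sets below are hypothetical, and the sketch covers only the dominance check, not the genetic search itself.

```python
def dominates(a, b):
    """True if a is no worse than b on both objectives and strictly better on at least one."""
    rules_a, rate_a = a
    rules_b, rate_b = b
    no_worse = rules_a <= rules_b and rate_a >= rate_b
    strictly_better = rules_a < rules_b or rate_a > rate_b
    return no_worse and strictly_better

def non_dominated(candidates):
    """Keep the candidates that no other candidate dominates (the Pareto front)."""
    return sorted(c for c in candidates
                  if not any(dominates(other, c) for other in candidates))

# Hypothetical rule sets as (number of fuzzy rules, classification rate).
candidates = [(5, 0.80), (10, 0.90), (8, 0.85), (12, 0.90), (6, 0.78), (20, 0.95)]
front = non_dominated(candidates)
```

Here (12, 0.90) is dropped because (10, 0.90) reaches the same rate with fewer rules, and (6, 0.78) is dropped because (5, 0.80) is better on both objectives. The remaining points represent the trade-off a user would choose from.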
A LITERATURE SURVEY AND ANALYSIS ON SOCIAL ENGINEERING DEFENSE MECHANISMS AND...IJNSA Journal
Social engineering attacks can be severe and hard to detect. Therefore, to prevent such attacks, organizations should be aware of social engineering defense mechanisms and security policies. To that end, the authors developed a taxonomy of social engineering defense mechanisms, designed a survey to measure employee awareness of these mechanisms, proposed a model of Social Engineering InfoSec Policies (SE-IPs), and designed a survey to measure the incorporation level of these SE-IPs. After analyzing the data from the first survey, the authors found that more than half of employees are not aware of social engineering attacks. The paper also analyzed a second set of survey data, which found that on average, organizations incorporated just over fifty percent of the identified formal SE-IPs. Such worrisome results show that organizations are vulnerable to social engineering attacks, and serious steps need to be taken to elevate awareness against these emerging security threats.
Intrusion Detection System (IDS) Development Using Tree-Based Machine Learnin...IJCNCJournal
The paper proposes a two-phase classification method for detecting anomalies in network traffic, aiming to tackle the challenges of imbalance and feature selection. The study uses Information Gain to select relevant features and evaluates its performance on the CICIDS-2018 dataset with various classifiers. Results indicate that the ensemble classifier achieved the highest accuracy, precision, and recall. The proposed method addresses challenges in intrusion detection and highlights the effectiveness of ensemble classifiers in improving anomaly detection accuracy. Also, the quantity of pertinent characteristics chosen by Information Gain has a considerable impact on the F1-score and detection accuracy. Specifically, the Ensemble Learning achieved the highest accuracy of 98.36% and F1-score of 97.98% using the relevant selected features.
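Information Gain, the feature selection criterion named in this abstract, is the reduction in class entropy obtained by splitting the records on a feature. A minimal sketch for categorical features follows; the feature values and labels are hypothetical and are not drawn from CICIDS-2018.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature, labels):
    """Class entropy minus the weighted entropy after splitting on the feature."""
    groups = defaultdict(list)
    for value, label in zip(feature, labels):
        groups[value].append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical records: two attacks and two benign flows.
labels = ["attack", "attack", "benign", "benign"]
packet_rate = ["high", "high", "low", "low"]  # separates the classes perfectly
protocol = ["tcp", "udp", "tcp", "udp"]       # carries no class information

gain_rate = information_gain(packet_rate, labels)
gain_protocol = information_gain(protocol, labels)
```

Features would then be ranked by gain and only the highest-scoring ones passed to the classifiers. Continuous features would first need to be discretized, which this sketch does not cover.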
Intrusion detection system for imbalance ratio class using weighted XGBoost c...TELKOMNIKA JOURNAL
The rapid development of the internet of things (IoT) has given it an important role in daily activities. As it develops, IoT is very vulnerable to attacks, which creates concerns for IoT users. An intrusion detection system (IDS) can work efficiently and look for suspicious activity in the network. Many data sets have already been collected; however, problems arise when dealing with big data and high data imbalance. This article proposes the XGBoost model to improve the detection performance for all types of attacks, using the BotIoT dataset to evaluate the system framework and controlling unbalanced data using the imbalance ratio of each class weight (CW). The experimental results show that the proposed approach greatly increases the detection rate for infrequent intrusions.
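One way to turn an imbalance ratio into class weights is sketched below. The abstract does not give the exact formula, so this assumes the weight of a class is the majority class count divided by that class's count. The class names and counts are hypothetical.

```python
from collections import Counter

def imbalance_ratio_weights(labels):
    """Assumed scheme: weight = (size of largest class) / (size of this class)."""
    counts = Counter(labels)
    majority = max(counts.values())
    return {cls: majority / n for cls, n in counts.items()}

# Hypothetical class distribution with one dominant and two rare attack types.
labels = ["DDoS"] * 1000 + ["Reconnaissance"] * 100 + ["Theft"] * 10

class_weights = imbalance_ratio_weights(labels)
# One weight per record, in the form accepted by learners that support sample weights.
sample_weights = [class_weights[y] for y in labels]

# Total weight contributed by each class after reweighting.
weighted_mass = Counter()
for y, w in zip(labels, sample_weights):
    weighted_mass[y] += w
```

After reweighting, every class contributes the same total weight to the training loss, so errors on rare classes are no longer negligible. Per-record weights of this kind can be supplied to a gradient boosting learner through its sample-weight argument.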
Progress of Machine Learning in the Field of Intrusion Detection Systemsijcisjournal
With the growth in the use of the Internet and local area networks, malicious attacks and intrusions into
computer systems are increasing. Implementing intrusion detection systems has become extremely
important to help maintain good network security. Support vector machines (SVMs), a classic pattern
recognition tool, have been widely used in intrusion detection. They can handle very large data with high
efficiency, are easy to use, and exhibit good prediction behavior. This paper presents a new SVM model
enriched with a Gaussian kernel function based on the features of the training data for intrusion detection.
The new model is tested with the CICIDS2017 dataset. The tests show better detection efficiency and a
better false alarm rate, which can give better coverage and make detection more efficient.
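For readers unfamiliar with the Gaussian kernel mentioned above, a minimal sketch follows. It shows only the standard kernel function k(x, y) = exp(-gamma * ||x - y||^2) and the Gram matrix an SVM solver would consume; the value of gamma and the toy points are assumptions, and the abstract does not say how the authors derive the kernel from the training features.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, gamma=0.5):
    """Pairwise kernel values for a list of feature vectors."""
    return [[gaussian_kernel(p, q, gamma) for q in points] for p in points]


# Hypothetical two-feature flow records.
points = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
for row in gram_matrix(points):
    print([round(v, 4) for v in row])
```

Nearby records get kernel values close to 1 and distant ones values close to 0, which is what lets the SVM draw non-linear decision boundaries.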
Comparative Performance Analysis of Machine Learning Techniques for Software ...csandit
Machine learning techniques can be used to analyse data from different perspectives and enable
developers to retrieve useful information. Machine learning techniques have proven useful
for software bug prediction. In this paper, a comparative performance analysis of
different machine learning techniques is carried out for software bug prediction on publicly
available data sets. Results showed that most of the machine learning methods performed well on
software bug datasets.
Requirement analysis, architectural design and formal verification of a multi...ijcsit
This paper presents an approach based on the analysis, design, and formal verification of a multi-agent
based university Information Management System (IMS). The university IMS accesses information, creates
reports, and assists teachers as well as students. An orchestrator agent manages the coordination
between all agents. It also manages the database connectivity for the whole system. The proposed IMS is
based on the BDI agent architecture, which models the system in terms of beliefs, desires, and intentions. The
correctness properties of safety and liveness are specified in first-order predicate logic.
Benchmarks for Evaluating Anomaly Based Intrusion Detection SolutionsIJNSA Journal
Anomaly-based Intrusion Detection Systems (IDS) have gained increased popularity over time. Many anomaly-based systems have been proposed using different Machine Learning (ML) algorithms and techniques; however, there is no standard benchmark to compare them based on quantifiable measures. In this paper, we propose a benchmark that measures both accuracy and performance to produce objective metrics that can be used in the evaluation of each algorithm implementation. We then use this benchmark to compare the accuracy as well as the performance of four different anomaly-based IDS solutions based on various ML algorithms. The algorithms include Naive Bayes, Support Vector Machines, Neural Networks, and K-means Clustering. The benchmark evaluation is performed on the popular NSL-KDD dataset. The experimental results show the differences in accuracy and performance between these anomaly-based IDS solutions on the dataset. The results also demonstrate how this benchmark can be used to create useful metrics for such comparisons.
A Software Measurement Using Artificial Neural Network and Support Vector Mac...ijseajournal
Today, software measurement is based on various techniques such as neural networks, genetic
algorithms, and fuzzy logic. This study examines the effectiveness of applying a support vector machine with a
Gaussian radial basis kernel function to the software measurement problem to increase performance and
accuracy. Support vector machines (SVMs) are an innovative approach to constructing learning machines that
minimize generalization error. There is a close relationship between SVMs and Radial Basis Function
(RBF) classifiers. Both have found numerous applications, such as optical character recognition, object
detection, face verification, and text categorization. The results demonstrate that the accuracy and
generalization performance of the SVM with a Gaussian radial basis kernel function is better than that of the
RBF network (RBFN). We also examine and summarize several advantages of the SVM over the RBFN.
MITIGATION TECHNIQUES TO OVERCOME DATA HARM IN MODEL BUILDING FOR MLijaia
Given the impact of Machine Learning (ML) on individuals and society, understanding how harm might
occur throughout the ML life cycle is more critical than ever. By offering a framework to
determine distinct potential sources of downstream harm in the ML pipeline, the paper demonstrates the
importance of choices throughout the distinct phases of data collection, development, and deployment, which
extend far beyond just model training. Relevant mitigation techniques are also suggested, to be used
instead of merely relying on generic notions of what counts as fairness.
A Survey of Security of Multimodal Biometric SystemsIJERA Editor
A biometric system is essentially a pattern recognition system used in an adversarial environment. Like any
conventional security system, a biometric system is exposed to malicious adversaries, who can manipulate
data to make the system ineffective by compromising its integrity. Current theory and design methods of
biometric systems do not take into account the vulnerability to such adversarial attacks. Therefore, evaluating
whether classical design methods lead to secure systems is an open problem. In order
to make biometric systems secure, it is necessary to understand and evaluate the threats and thus develop
effective countermeasures and robust system designs, both technical and procedural. Accordingly,
the theory and design methods of biometric systems must be extended to safeguard the security and
reliability of biometric systems in adversarial environments.
UBIQUITOUS COMPUTING AND SCRUM SOFTWARE ANALYSIS FOR COMMUNITY SOFTWAREijseajournal
Ubiquitous processing has the special property when contrasted with customary desktop figuring framework. Really in ubiquitous, more computational force in the earth by utilizing any gadget everything was accessible and open. Numerous machines give the interface to the single client by concealing every one of the gadgets from the foundation. Programming Engineering has unbelievable meaning in the area of g which was utilized and satisfactory for a wide range of ubiquitous figuring application. Because of less programming designing methodologies were distinguished, this was fundamental issue in the root to propose general level engineering. To reach on this objective, step was taken at exceptionally introductory level to seek out all the believable programming
designing difficulties in numerous ubiquitous registering applications that were at that point grew as such.
Attempt to highlight all methodologies that ever been utilized to built up the universal applications. This alysts to construct the general level engineering for the ubiquitous applications
Software Defect Prediction Using Radial Basis and Probabilistic Neural NetworksEditor IJCATR
Defects in modules of software systems are a major problem in software development. A variety of data mining
techniques are used to predict software defects, such as regression, association rules, clustering, and classification. This paper is concerned
with classification-based software defect prediction. It investigates the effect of using a radial basis function neural
network and a probabilistic neural network on prediction accuracy and defect prediction. The conclusion to be drawn from this work is
that the neural networks used here provide an acceptable level of accuracy but a poor defect prediction ability. Probabilistic neural
networks perform consistently better with respect to the two performance measures used across all datasets. It may be advisable to use
a range of software defect prediction models to complement each other rather than relying on a single technique.
A SURVEY ON TECHNIQUES REQUIREMENTS FOR INTEGRATEING SAFETY AND SECURITY ENGI...IJCSES Journal
Nowadays, safety and security have become an integrated requirement for information systems, as a new generation of infrastructure systems is distributed throughout networks. This raises the question of whether these systems are safety-critical, especially since they were tested in a closed, separated environment and are now deployed in an uncontrollable environment, namely the internet, where the number of threats is enormous. It also calls for new development approaches that take safety and security into consideration during the system development life cycle and, most importantly, identify hazards, risks, and threats. We conduct a survey exploring the technical languages that scholars have created to combine safety and security requirement engineering with accident analysis techniques.
COMPUTER INTRUSION DETECTION BY TWOOBJECTIVE FUZZY GENETIC ALGORITHMcscpconf
The purpose of this paper is to describe a two-objective fuzzy genetics-based learning algorithm
and discuss its use to detect intrusions in a computer network. Experiments were performed
with the KDD-cup data set, which has information on computer networks during normal behavior
and intrusive behavior. The performance of the final fuzzy classification system has been
investigated using the intrusion detection problem as a high-dimensional classification problem.
This task is formulated as an optimization problem with two objectives: to minimize the number of
fuzzy rules and to maximize the classification rate. We show a two-objective genetic algorithm
for finding non-dominated solutions of the fuzzy rule selection problem.
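The notion of non-dominated solutions under these two objectives can be sketched as follows. This is only the Pareto filter, not the genetic algorithm itself, and the candidate rule sets (number of rules, classification rate) are hypothetical values chosen for illustration.

```python
def dominates(a, b):
    """True if candidate a is at least as good as b on both objectives
    (fewer rules, higher classification rate) and strictly better on one."""
    rules_a, rate_a = a
    rules_b, rate_b = b
    no_worse = rules_a <= rules_b and rate_a >= rate_b
    strictly_better = rules_a < rules_b or rate_a > rate_b
    return no_worse and strictly_better


def non_dominated(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical rule sets as (number_of_fuzzy_rules, classification_rate).
candidates = [(3, 0.90), (5, 0.95), (5, 0.92), (8, 0.95)]
print(non_dominated(candidates))
```

Each surviving pair represents a different trade-off between compactness and accuracy, so the final choice among them is left to the analyst.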
A LITERATURE SURVEY AND ANALYSIS ON SOCIAL ENGINEERING DEFENSE MECHANISMS AND...IJNSA Journal
Social engineering attacks can be severe and hard to detect. Therefore, to prevent such attacks, organizations should be aware of social engineering defense mechanisms and security policies. To that end, the authors developed a taxonomy of social engineering defense mechanisms, designed a survey to measure employee awareness of these mechanisms, proposed a model of Social Engineering InfoSec Policies (SE-IPs), and designed a survey to measure the incorporation level of these SE-IPs. After analyzing the data from the first survey, the authors found that more than half of employees are not aware of social engineering attacks. The paper also analyzed a second set of survey data, which found that on average, organizations incorporated just over fifty percent of the identified formal SE-IPs. Such worrisome results show that organizations are vulnerable to social engineering attacks, and serious steps need to be taken to elevate awareness against these emerging security threats.
Most network environments continue to face an ever-increasing number of security threats. In earlier times,
firewalls were used as a security checkpoint in the network environment. Recently the use of Intrusion
Detection Systems (IDS) has greatly increased because they work more constructively and robustly than
firewalls. An IDS constantly observes the incoming and outgoing traffic of a
network in order to diagnose suspicious behavior. In real scenarios most environments are dynamic
in nature, which leads to the problem of concept drift: learning from data whose
statistical attributes change over time. Concept drift is especially hard to handle if the dataset is class-imbalanced. In
this review paper, IDS is studied along with different approaches to incremental learning.
Based on this study, a new approach is proposed that applies a voting rule to incremental learning. Further,
the existing fuzzy rule method is compared with the proposed approach.
A PROPOSED MODEL FOR DIMENSIONALITY REDUCTION TO IMPROVE THE CLASSIFICATION C...IJNSA Journal
Over the past few years, intrusion protection systems have become a mature research area in the field of computer networks. The problem of excessive features has a significant impact on intrusion detection performance. Many previous studies have used machine learning algorithms to classify network traffic as harmful or normal. Therefore, to improve accuracy, we must reduce the dimensionality of the data used. A new model design based on a combination of feature selection and machine learning algorithms is proposed in this paper. This model depends on selected genes from every feature to increase the accuracy of intrusion detection systems. From the feature set, we selected only those features that affect attack detection. The performance has been evaluated based on a comparison of several known algorithms. The NSL-KDD dataset is used for examining classification. The proposed model outperformed the other learning approaches with an accuracy of 98.8%.
Optimizing cybersecurity incident response decisions using deep reinforcemen...IJECEIAES
The main purpose of this paper is to explore and investigate the role of deep reinforcement learning (DRL) in optimizing the post-alert incident response process in security incident and event management (SIEM) systems. Although machine learning is used at multiple levels of SIEM systems, the last-mile decision process is often ignored. Few papers have reported efforts to use machine learning to improve the post-alert decision and incident response processes, and all the reported efforts applied only shallow (traditional) machine learning approaches to solve the problem. This paper explores the possibility of solving the problem using DRL approaches. The main attraction of DRL models is their ability to make accurate decisions based on live streams of data without the need for prior training, and they have proved very successful in other fields of application. Using standard datasets, a number of experiments have been conducted with different DRL configurations. The results showed that DRL models can provide highly accurate decisions without the need for prior training.
An effective approach for tackling network security
problems is the intrusion detection system (IDS). These kinds of
systems play a key role in network security, as they can detect
different types of attacks in networks, including DoS, U2R, Probe,
and R2L. In addition, IDS are an increasingly key part of a
system’s defense. Various approaches to IDS are now in use,
but they are unfortunately relatively ineffective. Data mining techniques
and artificial intelligence play an important role in security
services. In this paper, we present a comparative study of three well-known
intelligent algorithms: Radial Basis
Functions (RBF), Multilayer Perceptrons (MLP), and Support
Vector Machines (SVM). The main interest of this work is to benchmark
the performance of these three intelligent algorithms. This is done
using a dataset of about 9,000 connections, randomly chosen from
the KDD'99 10% dataset. In addition, we investigate the
performance of these algorithms in terms of their attack classification
accuracy. The simulation results are also analyzed and
discussed. It has been observed that SVM with a
linear kernel (Linear-SVM) performs better than MLP
and RBF in terms of detection accuracy and processing speed.
DETECTION OF ATTACKS IN WIRELESS NETWORKS USING DATA MINING TECHNIQUESIAEME Publication
With the progressive increase of network applications and electronic devices (computers, mobile phones, Android devices, etc.), attack and intrusion detection is becoming a very challenging task in the area of cybercrime detection. In this context, most existing attack detection approaches rely mainly on a finite set of attacks. However, these solutions are vulnerable: they fail to detect some attacks when the sources of information are ambiguous or imperfect. Few approaches have started investigating in this direction. Following this trend, this paper investigates the role of machine learning approaches (ANN, SVM) in classifying TCP connection traffic as normal or suspicious. However, using ANN or SVM individually is expensive. This paper therefore proposes combining two classifiers, an artificial neural network (ANN) classifier and a support vector machine (SVM). Additionally, the proposed solution allows the obtained classification results to be visualized. The accuracy of the proposed solution has been compared with the results of other classifiers. Experiments have been conducted with different network connections selected from the NSL-KDD DARPA dataset. Empirical results show that combining ANN and SVM techniques for attack detection is a promising direction.
ATTACK DETECTION AVAILING FEATURE DISCRETION USING RANDOM FOREST CLASSIFIERCSEIJJournal
The widespread use of the Internet has the adverse effect of making systems vulnerable to cyber attacks. Defensive
mechanisms like firewalls and IDSs have evolved with many research contributions in these
areas. Machine learning techniques have been successfully used in these defense mechanisms, especially
IDSs. Although they are effective to some extent in identifying new patterns and variants of existing
malicious patterns, many attacks still go undetected. The objective is to develop an algorithm for
detecting malicious domains based on passive traffic measurements. In this paper, an anomaly-based
intrusion detection system is deployed, built on an ensemble-based machine learning classifier called Random Forest
with gradient boosting. The NSL-KDD cup dataset is used for analysis, and out of 41 features, 32
features were identified as significant using feature discretion. Our observations confirm the conjecture
that both feature selection and stochastic-based genetic operators improve the accuracy and the
effectiveness. Training time is reduced by 98.59% and accuracy improves to
98.75%.
The main goal of Intrusion Detection Systems (IDSs) is
to detect intrusions. This kind of detection system is a
significant tool in traditional computer-based systems for ensuring
cyber security. An IDS model can be faster and reach more accurate
detection rates by selecting the most relevant features from the
input dataset. Feature selection is an important stage of any IDS: it
selects the optimal subset of features, which makes
training faster and reduces complexity while
preserving or enhancing the performance of the system. In this
paper, we propose a method based on dividing the input
dataset into different subsets according to each attack. We then
perform feature selection using an information gain
filter for each subset. The optimal feature set is then generated by
combining the feature sets obtained for each attack.
Experimental results on the NSL-KDD dataset show
that the proposed feature selection method, with fewer features,
improves the system accuracy while decreasing the
complexity. Moreover, a comparative study of the
efficiency of the feature selection technique is performed using different
classification methods. To enhance the overall performance,
another stage is conducted using Random Forest and PART in a
voting learning algorithm. The results indicate that the best
accuracy is achieved when using the product probability rule.
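The product probability rule for combining classifiers can be sketched as below: each classifier's class probabilities are multiplied class by class and renormalised. The probability values and class names are hypothetical, and this is an illustration of the general rule rather than the authors' code.

```python
def product_rule(prob_dists):
    """Fuse per-classifier class-probability dicts by class-wise product."""
    classes = list(prob_dists[0])
    combined = {c: 1.0 for c in classes}
    for dist in prob_dists:
        for c in classes:
            combined[c] *= dist[c]
    total = sum(combined.values())
    if total == 0:
        # Every class was vetoed by some classifier; fall back to uniform.
        return {c: 1.0 / len(classes) for c in classes}
    return {c: v / total for c, v in combined.items()}


def predict(prob_dists):
    """Return the class with the highest fused probability."""
    fused = product_rule(prob_dists)
    return max(fused, key=fused.get)


# Hypothetical outputs of two classifiers for one connection record.
random_forest = {"normal": 0.6, "dos": 0.4}
part_rules = {"normal": 0.3, "dos": 0.7}
print(product_rule([random_forest, part_rules]))
print(predict([random_forest, part_rules]))
```

Because the scores are multiplied, a class that any single classifier rates near zero is effectively vetoed, which is the main behavioural difference from averaging.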
Three layer hybrid learning to improve intrusion detection system performanceIJECEIAES
In imbalanced network traffic, malicious cyberattacks can be hidden in a large amount of normal traffic, making it difficult for intrusion detection systems (IDS) to detect them. Anomaly-based IDS with machine learning is therefore a solution. However, a single machine learning algorithm cannot accurately detect all types of attacks. Therefore, a hybrid model that combines long short-term memory (LSTM) and random forest (RF) in three layers is proposed. Building the hybrid model starts with Nearmiss-2 class balancing, which reduces normal samples without increasing minority samples. Then, feature selection is performed using chi-square and RF. Next, hyperparameter tuning is performed to obtain the optimal model. In the first and second layers, LSTM and RF are used for binary classification to separate normal data from attack data, while the third layer uses RF for multiclass classification. The hybrid model, verified using the CSE-CIC-IDS2018 dataset, showed better performance than a single algorithm. For multiclass classification, the hybrid model achieved 99.76% accuracy, 99.76% precision, 99.76% recall, and 99.75% F1-score.
A NOVEL EVALUATION APPROACH TO FINDING LIGHTWEIGHT MACHINE LEARNING ALGORITHM...IJNSA Journal
Building practical and efficient intrusion detection systems in computer networks is important in industry today, and machine learning provides a set of effective algorithms to detect network
intrusion. To find appropriate algorithms for building such systems, it is necessary to evaluate various types of machine learning algorithms based on specific criteria. In this paper, we propose a novel evaluation formula which incorporates six indexes into a comprehensive measurement: precision, recall, root mean square error, training time, sample complexity, and practicability. The aim is to
find algorithms which have a high detection rate and low training time, need fewer training samples, and are easy
to use in terms of constructing, understanding, and analyzing models. A detailed evaluation process is designed to
obtain all necessary assessment indicators, and six kinds of machine learning algorithms are evaluated.
Experimental results illustrate that Logistic Regression shows the best overall performance.
Forecasting number of vulnerabilities using long short-term neural memory net...IJECEIAES
Cyber-attacks are launched through the exploitation of existing vulnerabilities in software, hardware, systems, and/or networks. Machine learning algorithms can be used to forecast the number of post-release vulnerabilities. Traditional neural networks work as a black box; hence it is unclear how past data points are used in inferring subsequent data points. However, the long short-term memory network (LSTM), a variant of the recurrent neural network, is able to address this limitation by introducing loops in its network to retain and utilize past data points for future calculations. Building on the previous finding, we further enhance the prediction of the number of vulnerabilities by developing a time series-based sequential model using a long short-term memory neural network. Specifically, this study developed a supervised machine learning model for non-linear sequential time series forecasting with a long short-term memory neural network to predict the number of vulnerabilities for the three vendors with the highest number of vulnerabilities published in the National Vulnerability Database (NVD), namely Microsoft, IBM, and Oracle. Our proposed model outperforms the existing models with a prediction root mean squared error (RMSE) as low as 0.072.
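For reference, the RMSE metric quoted above is the square root of the mean squared difference between forecasts and observations. A minimal sketch follows; the monthly counts are hypothetical and unrelated to the paper's data, and the abstract does not state on what scale its reported RMSE was computed.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length numeric sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sum(squared_errors) / len(actual))


# Hypothetical monthly vulnerability counts and a model's forecasts.
observed = [12, 15, 9, 20]
forecast = [10, 15, 11, 18]
print(round(rmse(observed, forecast), 4))
```

RMSE is expressed in the same units as the forecast quantity, so values are only comparable between models evaluated on the same scaling of the data.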
EFFICIENT ATTACK DETECTION IN IOT DEVICES USING FEATURE ENGINEERING-LESS MACH...ijcsit
Through the generalization of deep learning, the research community has addressed critical challenges in
the network security domain, such as malware identification and anomaly detection. However, researchers have yet to
discuss deploying these models on Internet of Things (IoT) devices for day-to-day operations. IoT devices are often
limited in memory and processing power, rendering the compute-intensive deep learning environment
unusable. This research proposes a way to overcome this barrier by bypassing feature engineering in the
deep learning pipeline and using raw packet data as input. We introduce a feature engineering-less
machine learning (ML) process to perform malware detection on IoT devices. Our proposed model,
"Feature engineering-less ML (FEL-ML)," is a lighter-weight detection algorithm that expends no extra
computations on "engineered" features. It effectively accelerates the low-powered IoT edge. It is trained
on unprocessed byte-streams of packets. Aside from providing better results, it is quicker than traditional
feature-based methods. FEL-ML facilitates resource-sensitive network traffic security with the added
benefit of eliminating the significant investment by subject matter experts in feature engineering.
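One way to picture "raw packet data as input" is to turn each packet's bytes directly into a fixed-length numeric vector, with no hand-crafted features. The sketch below is an assumption for illustration only: the helper name `packet_to_vector`, the length of 64 and the scaling to [0, 1] are hypothetical choices, not details taken from the FEL-ML paper.

```python
def packet_to_vector(packet: bytes, length: int = 64) -> list[float]:
    """Truncate or zero-pad raw packet bytes, then scale each byte to [0, 1]."""
    if length <= 0:
        raise ValueError("length must be positive")
    clipped = packet[:length]
    # bytes(n) yields n zero bytes, used here as padding for short packets.
    padded = clipped + bytes(length - len(clipped))
    return [b / 255.0 for b in padded]
```

Every packet then maps to a vector of the same size, which is what a model with a fixed input shape needs.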
Vehicular Ad Hoc Networks (VANETs) have become a viable technology for improving traffic flow and road safety. Because of its effectiveness and scalability, the Wingsuit Search-based Optimised Link State Routing Protocol (WS-OLSR) is frequently used for data distribution in VANETs. However, the selection of MultiPoint Relays (MPRs) plays a pivotal role in WS-OLSR's performance. This paper presents an improved MPR selection algorithm tailored to WS-OLSR, designed to enhance overall routing efficiency and reduce overhead. The analysis found that the current OLSR protocol has problems such as redundant HELLO and TC message packets and failure to update routing information in time, so a WS-OLSR routing protocol based on an improved MPR selection algorithm is proposed. Firstly, factors such as node mobility and link changes are comprehensively considered to reflect network topology changes, and the broadcast cycle of node HELLO messages is controlled according to those topology changes. Secondly, a new MPR selection algorithm is proposed that takes link stability and nodes into account. Finally, we evaluate its effectiveness in terms of packet delivery ratio, end-to-end delay, and control message overhead. Simulation results demonstrate the superior performance of our improved MPR selection algorithm compared to traditional approaches.
A Novel Medium Access Control Strategy for Heterogeneous Traffic in Wireless ...IJCNCJournal
So far, Wireless Body Area Networks (WBANs) have played a pivotal role in driving the development of intelligent healthcare systems with broad applicability across various domains. Each WBAN consists of one or more types of sensors that can be embedded in clothing, attached directly to the body, or even implanted beneath an individual's skin. These sensors typically serve a single application. However, the traffic generated by each sensor may have distinct requirements. This diversity necessitates a dual approach: tailored treatment based on the specific needs of each traffic type and the fulfillment of application requirements, such as reliability and timeliness. Nevertheless, the presence of energy constraints and the unreliable nature of wireless communications make QoS provisioning in such networks a non-trivial task. In this context, the current paper introduces a novel Medium Access Control (MAC) strategy for the regular traffic applications of WBANs, designed to significantly enhance efficiency when compared to the established MAC protocols IEEE 802.15.4 and IEEE 802.15.6, with a particular focus on improving reliability, timeliness, and energy efficiency.
May_2024 Top 10 Read Articles in Computer Networks & Communications.pdfIJCNCJournal
The International Journal of Computer Networks & Communications (IJCNC) is a bimonthly, open access, peer-reviewed journal that publishes articles contributing new results in all areas of Computer Networks & Communications. The journal focuses on all technical and practical aspects of Computer Networks & Data Communications. The goal of this journal is to bring together researchers and practitioners from academia and industry to focus on advanced networking concepts and to establish new collaborations in these areas.
A Topology Control Algorithm Taking into Account Energy and Quality of Transm...IJCNCJournal
The efficient use of energy in wireless sensor networks is critical for extending node lifetime. The network topology is one of the factors that have a significant impact on the energy usage at the nodes and the quality of transmission (QoT) in the network. In this paper, we propose a topology control algorithm for software-defined wireless sensor networks (SDWSNs). Our method formulates topology control as a nonlinear programming (NP) problem with the objective of optimizing two metrics: maximum communication range and desired degree. This NP problem is solved at the SDWSN controller by employing a genetic algorithm (GA) to determine the best topology. The simulation results show that the proposed algorithm outperforms the MaxPower algorithm in terms of average node degree and energy expansion ratio.
Multi-Server user Authentication Scheme for Privacy Preservation with Fuzzy C...IJCNCJournal
The integration of artificial intelligence technology with a scalable Internet of Things (IoT) platform facilitates diverse smart communication services, allowing remote users to access services from anywhere at any time. The multi-server environment within IoT introduces a flexible security service model, enabling users to interact with any server through a single registration. To ensure secure and privacy-preserving services for resources, an authentication scheme is essential. Zhao et al. recently introduced a user authentication scheme for the multi-server environment, utilizing passwords and smart cards, claiming resilience against well-known attacks. This paper conducts cryptanalysis on Zhao et al.'s scheme, focusing on denial of service and privacy attacks, and also reveals a lack of user-friendliness. Subsequently, we propose a new multi-server user authentication scheme for privacy preservation with fuzzy commitment over the IoT environment, addressing the shortcomings of Zhao et al.'s scheme. Formal security verification of the proposed scheme is conducted using the ProVerif simulation tool. Through both formal and informal security analyses, we demonstrate that the proposed scheme is resilient against various known attacks and those identified in Zhao et al.'s scheme.
Advanced Privacy Scheme to Improve Road Safety in Smart Transportation SystemsIJCNCJournal
In a Vehicular Ad-Hoc Network (VANET), vehicles continuously exchange spatiotemporal data with neighboring vehicles, thereby establishing a comprehensive 360-degree traffic awareness system. Vehicular network safety applications facilitate the transmission of messages between nearby vehicles at regular intervals, enhancing drivers' contextual understanding of the driving environment and significantly improving traffic safety. Privacy schemes in VANETs are vital to safeguard the identities of vehicles and of their owners or drivers. Privacy schemes prevent unauthorized parties from linking a vehicle's communications to a specific real-world identity by employing techniques such as pseudonyms, randomization, or cryptographic protocols. Nevertheless, these communications frequently contain important vehicle information that malevolent groups could use to monitor the vehicle over a long period. Acquiring this shared data can facilitate the reconstruction of vehicle trajectories, thereby posing a risk to the privacy of the driver. Developing effective and scalable privacy-preserving protocols for communication in vehicle networks is therefore of the highest priority. These protocols aim to reduce the transmission of confidential data while ensuring the required level of communication. This paper proposes an Advanced Privacy Vehicle Scheme (APV) that periodically changes pseudonyms to protect vehicle identities and improve privacy. The APV scheme utilizes a concept called the silent period, which involves changing the pseudonym of a vehicle periodically based on the tracking of neighboring vehicles. The pseudonym is a temporary identifier that vehicles use to communicate with each other in a VANET. By changing the pseudonym regularly, the APV scheme makes it difficult for unauthorized entities to link a vehicle's communications to its real-world identity.
The proposed APV is compared to the SLOW, RSP, CAPS, and CPN techniques. The results indicate that APV achieves better privacy metrics and offers enhanced safety for vehicles during transportation in the smart city.
April 2024 - Top 10 Read Articles in Computer Networks & CommunicationsIJCNCJournal
DEF: Deep Ensemble Neural Network Classifier for Android Malware DetectionIJCNCJournal
Malware is one of the threats to the security of computer networks and information systems. Since malware instances are sufficiently available, there is increased interest among researchers in the use of Artificial Intelligence (AI). Of late, AI-enabled methods such as machine learning (ML) and deep learning have paved the way for solving many real-world problems. Because this is a learning-based approach, accumulated training samples help improve the quality of training and thus malware detection accuracy. Existing deep learning methods focus on learning-based malware detection systems. However, there is a need to improve the state of the art through an ensemble approach. Towards this end, in this paper we propose a framework known as the Deep Ensemble Framework (DEF) for automatic malware detection. The framework obtains features from training samples. From a given malware instance, a grayscale image is generated. A separate process extracts the opcode sequences. Convolutional Neural Network (CNN) and Long Short Term Memory (LSTM) techniques are applied to the grayscale image and the opcode sequence, respectively. Afterwards, a stacking ensemble is employed to achieve efficient malware detection and classification. Malware samples collected from Internet sources and Microsoft are used for the empirical study. An algorithm known as Ensemble Learning for Automatic Malware Detection (EL-AML) is proposed to realize our framework. Another algorithm, named Pre-Process, is proposed to assist the EL-AML algorithm in obtaining the intermediate features required by CNN and LSTM. The empirical study reveals that our framework outperforms many existing methods in terms of speed-up and accuracy.
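The step of generating a grayscale image from a malware instance is commonly done by reading the file's bytes as pixel intensities. The sketch below shows that general idea under stated assumptions: the function name `bytes_to_grayscale`, the row width of 16 and the zero padding are hypothetical choices, not the Pre-Process algorithm from the paper.

```python
def bytes_to_grayscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Arrange bytes row by row into a 2-D grid of 0-255 pixel intensities."""
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])  # each byte becomes one pixel
        row += [0] * (width - len(row))        # zero-pad the final short row
        rows.append(row)
    return rows
```

The resulting grid can be passed to an image model once it is resized to whatever input shape that model expects.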
High Performance NMF Based Intrusion Detection System for Big Data IOT TrafficIJCNCJournal
With the emergence of smart devices and the Internet of Things (IoT), millions of users connected to the network produce massive network traffic datasets. These vast Big Data datasets of network traffic are challenging to store, process and analyse using a single computer. In this paper we developed a parallel implementation, using a High Performance Computer (HPC), of the Non-Negative Matrix Factorization technique as an engine for an Intrusion Detection System (HPC-NMF-IDS). The large IoT traffic datasets, on the order of millions of samples, are distributed evenly across all the computing cores for both storage and speed-up purposes. The distribution of the computing tasks involved in the matrix factorization takes into account the reduction of the communication cost between the computing cores. The experiments we conducted on the proposed HPC-NMF-IDS give better results than traditional ML-based intrusion detection systems. We could train the HPC model with datasets of one million samples in only 31 seconds, instead of the 40 minutes needed using one processor, that is a speed-up of 87 times. Moreover, we obtained an excellent detection accuracy rate of 98% for the KDD dataset.
IoT Guardian: A Novel Feature Discovery and Cooperative Game Theory Empowered...IJCNCJournal
Cyber intrusion attacks increasingly target the Internet of Things (IoT) ecosystem, exploiting vulnerable devices and networks. Malicious activities must be identified early to minimize damage and mitigate threats. Using actual benign and attack traffic from the CICIoT2023 dataset, this work aims to evaluate and benchmark machine-learning techniques for IoT intrusion detection. There are four main phases to the system. First, the CICIoT2023 dataset is refined to remove irrelevant features and clean up missing and duplicate data. The second phase employs statistical models and artificial intelligence to discover novel features. The most significant features are then selected in the third phase based on cooperative game theory. Using the original CICIoT2023 dataset and a dataset containing only novel features, we train and evaluate a variety of machine learning classifiers. On the original dataset, Random Forest achieved the highest accuracy of 99%. With the novel features, Random Forest's performance dropped only slightly (96%), while other models achieved significantly lower accuracy. As a whole, the work makes substantial contributions to tailored feature engineering, feature selection, and rigorous benchmarking of IoT intrusion detection techniques. IoT networks and devices face continuously evolving threats, making it necessary to develop robust intrusion detection systems.
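The abstract does not say which cooperative-game concept drives the feature selection. A common choice is the Shapley value, which credits each feature with its average marginal contribution over all orderings. The sketch below assumes that choice; the feature names and the toy scoring function are hypothetical and are not taken from the CICIoT2023 study.

```python
from itertools import permutations


def shapley_values(features, value):
    """Exact Shapley value per feature; value() maps a frozenset to a score."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))  # factorial cost: small sets only
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            # Marginal gain from adding f to the features already present.
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    return {f: t / len(orderings) for f, t in totals.items()}


# Hypothetical toy game: two useful features with a synergy, one useless one.
BASE = {"pkt_rate": 0.5, "flag_count": 0.3, "noise": 0.0}


def toy_score(subset):
    bonus = 0.2 if {"pkt_rate", "flag_count"} <= subset else 0.0
    return sum(BASE[f] for f in subset) + bonus


phi = shapley_values(list(BASE), toy_score)
```

In this toy game the synergy bonus is split evenly between the two interacting features and the useless feature receives zero, so ranking by `phi` would keep the first two. In practice `value` would be a model's validation score on a feature subset, and the exact enumeration would give way to sampling.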
Enhancing Traffic Routing Inside a Network through IoT Technology & Network C...IJCNCJournal
IoT networking uses real items as stationary or mobile nodes. Mobile nodes complicate networking. Internet of Things (IoT) networks carry many control overhead messages because devices are mobile. These messages are generated by the constant flow of control data such as device identity, geographical position, node mobility, device configuration, and others. Network clustering is a popular method for managing communication overhead. Many cluster-based routing methods have been developed to address system restrictions. Node clustering based on the Internet of Things (IoT) protocol may be used to cluster all network nodes according to predefined criteria. Each cluster will have a Smart Designated Node. SDN cluster management is efficient. Many intelligent nodes remain in the network. The network design spreads these signals. This paper presents an intelligent and responsive routing approach for clustered nodes in IoT networks. An existing method builds a new sub-area clustered topology. The Nodes Clustering Based on the Internet of Things (NCIoT) method improves message transmission between any two nodes. This will facilitate the secure and reliable interchange of healthcare data between professionals and patients. NCIoT is a system that organizes nodes in the Internet of Things (IoT) by grouping them together based on their proximity. It also picks SDN routes for these nodes. This approach involves selecting one option from a range of choices and preparing for likely outcomes. Addressing limitations on activities is a primary focus during the review process. Predictive inquiry employs the process of analyzing data to forecast and anticipate future events. This document provides an explanation of compact units. The Predictive Inquiry Small Packets (PISP) improved its backup system and partnered with SDN to establish a routing information table for each intelligent node, resulting in higher routing performance.
Both primary and secondary routes are available for use. The simulation findings indicate that NCIoT algorithms outperform CBR protocols. Enhancements lead to a substantial 78% boost in network performance. In addition, the end-to-end latency dropped by 12.5%. The PISP methodology produces 5.9% more inquiry packets compared to alternative approaches. The algorithms are constructed and evaluated against academic ones.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation covers the working procedure of Shahjalal Fertilizer Company Limited (SFCL), a government-owned company of the Bangladesh Chemical Industries Corporation under the Ministry of Industries.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
The Internet of Things (IoT) is a revolutionary concept that connects everyday objects and devices to the internet, enabling them to communicate, collect, and exchange data. Imagine a world where your refrigerator notifies you when you’re running low on groceries, or streetlights adjust their brightness based on traffic patterns – that’s the power of IoT. In essence, IoT transforms ordinary objects into smart, interconnected devices, creating a network of endless possibilities.
Here is a blog on the role of electrical and electronics engineers in IOT. Let's dig in!!!!
For more such content visit: https://nttftrg.com/
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Water billing management system project report.pdfKamal Acharya
Our project entitled “Water Billing Management System” aims is to generate Water bill with all the charges and penalty. Manual system that is employed is extremely laborious and quite inadequate. It only makes the process more difficult and hard.
The aim of our project is to develop a system that is meant to partially computerize the work performed in the Water Board like generating monthly Water bill, record of consuming unit of water, store record of the customer and previous unpaid record.
We used HTML/PHP as front end and MYSQL as back end for developing our project. HTML is primarily a visual design environment. We can create a android application by designing the form and that make up the user interface. Adding android application code to the form and the objects such as buttons and text boxes on them and adding any required support code in additional modular.
MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software. It is a stable ,reliable and the powerful solution with the advanced features and advantages which are as follows: Data Security.MySQL is free open source database that facilitates the effective management of the databases by connecting them to the software.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
HEAP SORT ILLUSTRATED WITH HEAPIFY, BUILD HEAP FOR DYNAMIC ARRAYS.
Heap sort is a comparison-based sorting technique based on Binary Heap data structure. It is similar to the selection sort where we first find the minimum element and place the minimum element at the beginning. Repeat the same process for the remaining elements.
Heap Sort (SS).ppt FOR ENGINEERING GRADUATES, BCA, MCA, MTECH, BSC STUDENTS
ADDRESSING IMBALANCED CLASSES PROBLEM OF INTRUSION DETECTION SYSTEM USING WEIGHTED EXTREME LEARNING MACHINE
International Journal of Computer Networks & Communications (IJCNC) Vol.11, No.5, September 2019
DOI: 10.5121/ijcnc.2019.11503
Mohammed Awad (1) and Alaeddin Alabdallah (2)
(1) Faculty of E&IT, Dept. of Computer Systems Engineering, Arab American University, Palestine
(2) Faculty of E&IT, Dept. of Computer Engineering, An-Najah National University, Palestine
ABSTRACT
The main issues with Intrusion Detection Systems (IDS) are their sensitivity to errors and the inconsistent and inequitable ways in which they have often been evaluated. Most previous efforts were concerned with improving the overall accuracy of these models by increasing the detection rate and decreasing the false alarm rate, which is an important issue. However, Machine Learning (ML) algorithms can classify all or most of the records of the minor classes into one of the main classes with negligible impact on overall performance. The severity of the threats posed by the small classes, the shortcomings of previous efforts to address this issue, and the need to improve the performance of IDSs were the motivations for this work. In this paper, stratified sampling and different cost-function schemes were combined with the Extreme Learning Machine (ELM) method, using kernels and activation functions, to build competitive ID solutions that improve the performance of these systems and reduce the occurrence of the accuracy paradox problem. The main experiments were performed using the UNB ISCX2012 dataset. The experimental results showed that ELM models with the polynomial function outperform other models in overall accuracy, recall, and F-score. They were also competitive with the traditional model in the Normal, DoS and SSH classes.
KEYWORDS
Machine Learning, Weighted Extreme Learning Machine, Intrusion detection system, Accuracy, UNB
ISCX2012.
1. INTRODUCTION
As the number of services offered by computational systems over computer networks increases, it becomes necessary to maintain the reliability, integrity, and availability of these systems, which makes their information security more important. A major problem is the growing number of attackers targeting these systems [1]. Cyber-attacks can cause significant economic damage to companies and organizations, and can thereby threaten the national security of any country [2]. Intrusion attacks have also grown more complex due to the exponential growth of mobile devices and cloud environments. Intrusion detection (ID) in cyberspace is a multi-disciplinary problem. One side of the problem is cyber-security; the other side comprises the statistical, knowledge-based and machine learning fields, which produce the pool of solutions. This paper focuses on machine learning solutions to the ID problem.
The security problem becomes more complicated because of the high connectivity of the world via the internet. Studies of communication, computer network systems, protocols, and services, which are the main components of the internet, reveal faults that are widely distributed across most computing components of the system. These faults are the cause of previous, current and future attacks. In [3], the authors present some of these facts, showing that TCP protocols suffer from a list of security flaws.
As mentioned in [4], ID solutions fall into three common methodological classes. The first class is misuse or signature-based IDS. In this approach, known normal and abnormal rules or patterns are learned from labeled data in the training phase, and the generated models are then used to make predictions for unseen data. Although these models achieve high accuracy in detecting known attacks and some variants of unknown attacks, they fail to detect zero-day attacks. The second class is anomaly-based IDS. It depends on the closed world hypothesis [5], which supposes that the model is able to capture all normal behaviors in the training phase; the developed models are then used to measure the deviation from normal behavior in the testing phase in order to predict unseen data as normal or anomalous. This approach can detect zero-day attacks, but its total accuracy is no better than that of the first class. The third class is the hybrid approach, which combines both previous approaches in one model.
The network ID field has a wide set of open issues, some of which are illustrated in the following paragraphs. First, there is the scalability of ML algorithms or any other tool used to solve the ID problem. Computer networks generate a huge volume of traffic, which keeps increasing due to the expansion of Internet services, the growing number of mobile devices and the movement toward Internet of Things (IoT) technology [6]. The second issue concerns labeling the records collected from the traffic correctly. This process needs extra effort from experts, which increases the need to benefit from the huge number of unlabeled records alongside the correctly labeled ones. The third issue is related to anomaly detection methods: the data collector is unable to aggregate a pure set that includes all variants of either normal or abnormal traffic, because zero-day and new attacks keep appearing. In short, it is impossible to have a closed world in our domain. The question is whether incremental learning by the ML algorithm can address this dilemma, which stems from a closed world assumption that is impractical in our domain. The fourth issue, which this paper focuses on, is multifaceted: it concerns the sensitivity of the IDS to errors. Most works in this field are concerned with increasing the detection rate and decreasing the false alarm rate (FAR) in order to improve system accuracy [7] [8]. Even if the number of misclassified records is small, in huge traffic it represents a big problem for the clients of network services when normal traffic is treated as an anomaly, and it creates a big headache for network administrators, who must handle a huge number of false alarms. On the other hand, the exact detection of abnormal traffic helps the system administrator solve the problem easily. Most studies evaluated the performance of their approaches using the NSL-KDD dataset [10] [12]; they succeeded in improving the overall accuracy while the minor classes were largely misclassified, a phenomenon called the accuracy paradox [10]. The detection of minor attacks becomes a crucial issue [11] when these attacks pose a high level of security risk.
In this paper, we are interested in improving the accuracy of IDS for new attacks and in mitigating the accuracy paradox problem. To this end, a weighted Extreme Learning Machine (ELM) with different weight schemes was combined with stratified sampling and with optimization of some of the algorithm's parameters. Weighted ELM (WELM) is a fast and simple neural network (NN) that avoids the time-consuming iterative process of feedforward neural networks (FFNN). Furthermore, the evaluation phase was carried out in a consistent and fair way; it took into account the reasons for data selection and the way the different tests were performed. The experiments were performed on one benchmark dataset, UNB ISCX2012, which includes real-time contemporary traffic for normal and attack behaviors. It was generated systematically, which makes it a modifiable, extendable, and reproducible dataset. It includes four types of attack scenarios: inside network infiltration, Hypertext Transfer Protocol (HTTP) denial of service, IRC Botnet Distributed Denial of Service (DDoS), and brute force SSH. These are some of the open issues in this field, and others are covered in the literature. They prevent many of these efforts, especially those developed using anomaly-based methods, from being deployed in operational real-world environments [5]. Awareness of the pressing need to improve the power and dynamics of the security tools that protect contemporary computing systems underlines the great interest of researchers from both communities in improving IDSs.
2. RELATED WORKS
The ID problem has attracted great interest from researchers. Part of these efforts concentrated on reviewing the problem from different points of view. One recent survey [16] studied different categories of anomaly ID methods: classification, statistical, information theory, and clustering. The machine learning and data mining community has suggested many techniques to overcome the deficiency of its models in predicting the small classes of ID problems. Different approaches to the imbalanced classes problem are suggested in [17]. Oversampling and undersampling are the two common resampling methods in the literature, while cost functions have been added to different ML algorithms to address their sensitivity to imbalanced classes. Different cost functions have been suggested in related works and applied with different data mining and machine learning tools, such as ANNs, SVM, clustering, DATA, naive Bayes, and EA [17]. In [18], an enhanced method called sample selected extreme learning machine (SS-ELM) is used for intrusion detection on cloud servers. The selected samples are then given to the fog nodes/MEC hosts for training. This design can bring down the training time and increase detection accuracy. Several network ID models were proposed and tested in the last decade. These models were built on a well-known dataset called NSL-KDD [15]. Most of these efforts concentrated on either binary (normal versus abnormal) prediction or multi-class classification.
On the other hand, a few efforts tried to build sub-models, as in [19]. That paper develops a new hybrid model consisting of Meta Pagging, Random Tree, REP Tree, AdaBoostM1, Decision Stump, and Naïve Bayes, which can be used to estimate the intrusion scope threshold degree based on the optimal features of the network transaction data made available for training. The results showed a significant reduction in the computational and time complexity involved in determining the feature association impact scale. The authors in [20] proposed a hybrid model for detecting different classes of DoS attacks. In this model, the Particle Swarm Optimization algorithm is used for feature selection, and SVM is then used to build a model for predicting the different classes of DoS attacks. These efforts and others follow the advice to narrow the scope of the ID problem in order to reduce the FAR when building ML models. To compare the performance of supervised and unsupervised ML models as ID solutions, the authors in [21] built a framework and ran a number of experiments. They demonstrate that the supervised learning model does better if the test data contain known attacks or variants of known attacks, while both have similar performance when the dataset contains unknown attacks.
Many efforts have been made to generate benchmark datasets of contemporary, real-time traffic; one of these was carried out by ISCX. A systematic approach was used to generate a modifiable, extendable, and reproducible dataset [14], known as UNB ISCX2012. It includes real traffic related to the FTP, HTTP, IMAP, POP3, SMTP and SSH protocols. The UNB ISCX2012 dataset includes four types of attacks in addition to normal traffic: inside network infiltration, HTTP denial of service, IRC Botnet DDoS, and brute force SSH. In [22], the author used a supervised ML method to detect DDoS that depends on network entropy estimation, co-clustering, information gain ratio, and the Extra-Trees algorithm. The unsupervised phase of the approach reduces the normal traffic data that is irrelevant for DDoS detection, which lowers false-positive rates and increases accuracy. Experiments were performed using the NSL-KDD, UNB ISCX 12 and UNSW-NB15 datasets. The authors in [23] applied a hybrid scheme that combines deep learning and a support vector machine to improve accuracy on the classes of the ISCX IDS UNB dataset. The results indicated that the combined model outperforms SVM alone in terms of both accuracy and run-time efficiency.
Another kind of hybrid model was introduced in the literature for our problem, this time combining multiple kernels [24]. The Multiple Adaptive Reduced Kernel Extreme Learning Machine model (MARK-ELM) was proposed. This work proposed a framework that used the AdaBoost method to combine each set of Reduced Kernel Multi-class ELM models in order to increase the detection accuracy and decrease false alarms. Twelve combined models were evaluated. Seven of them achieved greater than 99% total accuracy, but only one of them exceeded 30% on the U2R class, reaching 60.87%, which confirms the existence of the accuracy paradox problem in these experiments. Another multi-level ID model was proposed in [9]. It passed through three phases. In the first phase, the categorical records were used to generate a set of rules for binary (normal versus abnormal) prediction using the well-known Classification and Regression Trees (CART) algorithm. The second phase built three predictive models using SVM, Naïve Bayes, and NNs in order to determine the exact attack category for only three of the attack classes; U2R attacks were excluded because of the insufficient number of records, which confirms the existence of the imbalanced class problem. This phase used the raw data features in one run and features generated using the Discrete Wavelet Transformation (DWT) method in another; the models built using the latter set of features performed better than those built on the raw features. In the last phase, a visual analytics tool called iPCA was deployed to perform visual and reasoned analysis of the results. This is a notable response to the recommendation made in [5] about the clarity of interpreting the results at the evaluation step of our problem.
The author in [25] used the UNB ISCX2012 dataset to build a multi-class classification solution for the ID problem. SVM with Gaussian radial basis function (RBF) and polynomial kernels, MLPNN and Naïve Bayes algorithms were deployed to build different models. The SVM with the polynomial kernel performed best. There are two remarks on this work. First, the number of records of this dataset as reported in that paper is inconsistent with the real number of records of the UNB ISCX dataset. They assumed that the numbers of records of Botnet and DoS attacks are 5 and 40 respectively, while the correct numbers for these classes are 37460 and 3776 respectively. Second, “All the tests were carried out on the same training and testing dataset”, a subset that was selected randomly with respect to the huge classes. These experiments therefore do not fairly reflect the true performance of the algorithms on this dataset or on any other subset of it. Many ML algorithms have been used, and many techniques and enhancements have been deployed, to improve ID solutions; they could increase the detection rate and decrease the false alarms overall, but they failed to detect the rare but serious attacks.
In [26], the authors present a framework for anomaly detection that relies on Bayesian optimization to tune the parameters of the SVM-RBF, Random Forest, and k-Nearest Neighbor algorithms. The performance of the proposed model was evaluated using the ISCX 2012 dataset. The results are effective in terms of accuracy, precision, low false alarm rate, and recall. The work in [27] proceeds in two stages: building prediction models for each type of attack separately and optimizing the model with the highest accuracy; then building a prediction model for all attacks together using deep learning with the smallest number of features, and optimizing it to achieve the highest accuracy. The model was applied to the UNB ISCX 2012 dataset. In this work, we deploy the Extreme Learning Machine (ELM) with different weight schemes, combined with stratified sampling and with optimization of some of the algorithm's parameters, to improve the accuracy of IDS for new attacks and to mitigate the accuracy paradox problem.
3. METHODOLOGY
This section describes the proposed method, which aims to improve the prediction accuracy for the small but serious classes of the ID problem while preserving the overall accuracy. It starts by emphasizing the dataset selection considerations. Then, it describes the preprocessing steps, which include data-type portability, data cleaning, feature selection, and stratified sampling. Next, it presents the deployed WELM model used to implement the ID solutions. Finally, it covers the evaluation process.
3.1 Dataset Selection Considerations
There are still shortages in the available datasets in the ID domain, in spite of the great efforts exerted in this field [14] [13]. These datasets fall into two categories: simulation-based datasets and real-time datasets. Most of the notable public benchmark datasets are simulation-based; they cannot reflect the nature of contemporary traffic, and these old datasets cannot be modified, extended or reproduced. Public real-time datasets are often subject to heavy anonymization in order to preserve privacy. Dataset anonymization is the process of hiding critical data such as payload content, real IP addresses, and others. CAIDA (2011) and LBNL are examples of public real-time datasets that are heavily anonymized, with the payload totally removed. Furthermore, most datasets suffer from labeling problems regarding correctness or completeness. Some of the important datasets in the ID field are KDD-CUP99, NSL-KDD, UNB ISCX 2012 and the Kyoto University dataset. We are interested in a contemporary, real-time dataset. The UNB ISCX 2012 dataset is a benchmark dataset that includes real-time contemporary traffic for normal and attack behaviors, and it was generated systematically, which makes it modifiable, extendable, and reproducible. It also includes two small classes, as is evident from figure 1, so it is an important and sufficient choice for this scope and was selected for the primary experiments in this work. To prove the proposed idea, we performed experiments based on the general method in this paper using the complete records of all attacks, in addition to randomly selected normal subsets of the same size as the attack records; these subsets have small classes. This is evident from figure 1, which shows the distribution of the records for these subsets.
3.2 Pre-processing Phase
Data preprocessing includes many steps [28] that depend on the nature of the data. Different preprocessing sub-steps were used: data-type portability, data cleaning, feature selection, and stratified sampling. The UNB ISCX 2012 ID Evaluation Dataset consists of 19 features, which are listed in table 1. As a feature selection step, the cumulative and redundant features were excluded, so only the 12 main features were used in our experiments. The selected features fall into two categories, nominal and numerical. The nominal features were converted to numeric features. Most features did not follow a balanced scale, so they were treated using a data cleaning method. The generated tag feature refers to one of the following classes: Normal, inside network infiltration, HTTP denial of service, IRC Botnet DDoS and brute force SSH.
Figure 1. The records distribution of the UNB ISCX 2012 dataset used to build the experiments.
Table 1. The UNB ISCX 2012 features list.
Main features: Application Name; Total Source Bytes; Total Destination Bytes; Total Destination Packets; Total Source Packets; Direction; Source TCP Flags Description; Destination TCP Flags Description; Protocol Name; Source Port; Destination Port; Tag
Accumulative and redundant features: Time Start; Time End; sourcePayloadAsBase64; sourcePayloadAsUTF; destinationPayloadAsBase64; destinationPayloadAsUTF; dataroot_Id
Note that this dataset was collected over seven days, only four of which included attack scenarios, with one attack class per day. Each attack record is therefore classified according to the day on which it appeared, as listed in table 2. Thus, all records labeled as attacks on the day of the inside network infiltration scenario are classified as inside network infiltration attacks, and so on.
Table 2. The distribution of the records in the UNB ISCX 2012 dataset in the days which included attack scenarios.
Day (named by its attack scenario)                | Attack | Normal
Infiltrating the network from inside              | 20358  | 255170
HTTP Denial of Service                            | 3776   | 167604
Distributed Denial of Service using an IRC Botnet | 37460  | 534238
Brute Force SSH                                   | 5219   | 392376
Sum of the records                                | 66813  | 1349388
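As a rough illustration of why the class counts in table 2 matter, the following sketch computes the accuracy of a hypothetical degenerate classifier that labels every record as Normal. The classifier is an assumption made purely for illustration and is not one of the models in this paper; only the record counts come from table 2.

```python
# Record counts taken from table 2 (days that contain attack scenarios).
attack_counts = {
    "Infiltration": 20358,
    "HTTP DoS": 3776,
    "IRC Botnet DDoS": 37460,
    "Brute Force SSH": 5219,
}
normal_count = 1349388

total = normal_count + sum(attack_counts.values())

# Hypothetical degenerate classifier: predicts "Normal" for every record,
# so every normal record is correct and every attack record is missed.
accuracy = normal_count / total
attack_recall = 0.0

print(f"total records: {total}")
print(f"accuracy of the all-Normal classifier: {accuracy:.4f}")
print(f"recall on every attack class: {attack_recall:.1f}")
for name, count in attack_counts.items():
    print(f"{name}: {count / total:.4%} of all records")
```

Such a classifier reaches an overall accuracy of about 95% while detecting no attacks at all, which is the accuracy paradox in its simplest form.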
The distribution of the records in the days which included attack scenarios is shown in table 2. It is clear that there is a sufficient number of records for each attack class and a tremendous number of normal records. This phase started with converting the nominal or categorical data to sequential numeric values, as a data-type portability step. Two common methods [26] were then considered for addressing the imbalanced scale of the features in the data cleaning phase:
Standardization: One of the common data transformation methods. It rescales the data of each feature to have zero mean and unit variance, and is represented by the following equation:
$$x'_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j} \qquad (1)$$
where $\mu_j$ is the mean of feature $j$, $\sigma_j$ is the standard deviation of feature $j$, and $x_{ij}$ is the $j$-th attribute of record $i$.
Min-Max Scaling: It scales all attributes into the range $[0, 1]$ and is represented by the following equation:
$$x'_{ij} = \frac{x_{ij} - \min_j}{\max_j - \min_j} \qquad (2)$$
where $\min_j$ and $\max_j$ represent the minimum and maximum values of feature $j$, and $x_{ij}$ is the $j$-th attribute of record $i$. The standardization method was selected to clean the imbalanced scale of the features.
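The data-type portability and scaling steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the helper names (`encode_nominal`, `standardize`, `min_max_scale`) and the toy protocol and byte values are assumptions made for the example.

```python
import numpy as np

def encode_nominal(values):
    """Map each distinct nominal value to a sequential integer code."""
    codes = {}
    encoded = []
    for v in values:
        if v not in codes:
            codes[v] = len(codes)  # next unused code, in order of appearance
        encoded.append(codes[v])
    return np.array(encoded, dtype=float), codes

def standardize(X):
    """Rescale every column to zero mean and unit variance."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0  # guard against constant columns
    return (X - mu) / sigma

def min_max_scale(X):
    """Rescale every column into the range [0, 1]."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # guard against constant columns
    return (X - lo) / span

# Toy records (hypothetical values): one nominal and one numeric feature.
protocol, mapping = encode_nominal(["tcp_ip", "udp_ip", "tcp_ip", "icmp_ip"])
total_source_bytes = np.array([1200.0, 64.0, 530000.0, 98.0])
X = np.column_stack([protocol, total_source_bytes])

Z = standardize(X)    # standardization, the method selected in this work
M = min_max_scale(X)  # min-max scaling, shown for comparison
print(mapping)
print(Z.mean(axis=0), Z.std(axis=0))
print(M.min(axis=0), M.max(axis=0))
```

Sequential integer codes impose an arbitrary order on nominal values, so this encoding is a design choice that follows the text rather than a general recommendation.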
3.3 Stratified Sampling
Stratified sampling is a statistical sampling method and an alternative to the well-known random sampling method. It is used to generate new subsets of the data that have the same sample fraction for each class as in the main corpus. The following equation defines the sample fraction:
$$f_c = \frac{n_c^{S}}{n^{S}} \qquad (3)$$
where $f_c$ is the fraction of class $c$, which is the same in the main set and in any subset, $n^{S}$ is the number of records in an arbitrary set $S$, and $n_c^{S}$ is the number of records belonging to class $c$ in the arbitrary set $S$. Stratified sampling guarantees that any generated subset includes records from all classes and that the class ratios in these subsets are the same as in the main dataset, while the records of each class are selected randomly each time. When minor classes are present and random sampling is used, some models will be built without learning anything about these classes. This was the reason for using this method.
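A minimal sketch of stratified sampling, assuming NumPy and a toy label vector, is given below. The function names `stratified_sample` and `class_fractions` and the class sizes are hypothetical; the point is only that the per-class fractions of the subset match those of the full set.

```python
import numpy as np

def stratified_sample(labels, fraction, rng):
    """Return sorted indices of a subset that keeps each class's share."""
    labels = np.asarray(labels)
    chosen = []
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)
        # Keep at least one record per class so minor classes never vanish.
        k = max(1, int(round(fraction * idx.size)))
        chosen.append(rng.choice(idx, size=k, replace=False))
    return np.sort(np.concatenate(chosen))

def class_fractions(labels):
    """Sample fraction of every class in a label vector."""
    classes, counts = np.unique(labels, return_counts=True)
    return dict(zip(classes.tolist(), (counts / counts.sum()).tolist()))

# Toy imbalanced label vector (hypothetical sizes).
labels = np.array(["Normal"] * 900 + ["DDoS"] * 80 + ["SSH"] * 20)
rng = np.random.default_rng(0)

subset_idx = stratified_sample(labels, fraction=0.1, rng=rng)
full_fractions = class_fractions(labels)
subset_fractions = class_fractions(labels[subset_idx])
print(full_fractions)
print(subset_fractions)
```

With purely random sampling of 100 records, the 20 SSH records could easily be represented by zero or one record; the per-class draw above rules that out.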
3.4 Weighted Extreme Learning Machine
ELM is a feed-forward neural network, but it does not suffer from the time-consuming iterative process of the feed-forward back-propagation neural network. This is achieved by selecting the weights and biases of the hidden layer randomly, so it is a fast and simple method. Two other features of ELM are worth noting. First, it can deploy different feature mappings and kernels. Second, it can build multi-class classification solutions easily in one model, without the need to combine multiple binary classifiers. The method described in [17] was used in this paper as a WELM solution to address the unbalanced classes in the ID problem. A single hidden layer feedforward neural network (SLFFN) was used; its architecture is shown in figure 2. Consider any dataset $(X, T)$, where $X \in \mathbb{R}^{N \times m}$ is the feature matrix, which includes $N$ records $x_i$ and $m$ features, while $T \in \mathbb{R}^{N \times c}$ is the target matrix. As in any neural network algorithm, both the feature and target matrices are numerical matrices obtained from the output of the preprocessing phase. The algorithm starts by asking the user to determine the activation function and the number of hidden neurons, denoted by $G$ and $L$ respectively. Then the weight matrix $A \in \mathbb{R}^{m \times L}$ and the bias vector $b \in \mathbb{R}^{L}$ of the hidden neurons are generated randomly, which saves time and makes this algorithm faster.
Figure. 2. Extreme Learning Machine Network
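Different feature mappings can be used in ELM, each corresponding to a different activation
function. In the examples below, $a$ and $b$ are the randomly generated weights and bias of a
hidden neuron and $x$ is an input record:
• Sigmoid function
$$G(a, b, x) = \frac{1}{1 + \exp\left(-(a \cdot x + b)\right)} \tag{4}$$
• Gaussian function
$$G(a, b, x) = \exp\left(-b\,\lVert x - a \rVert^{2}\right) \tag{5}$$
Then the output of the hidden neurons H, for $N$ records and $L$ hidden neurons, is computed using the following equation:
$$H = \begin{bmatrix} G(a_1, b_1, x_1) & \cdots & G(a_L, b_L, x_1) \\ \vdots & \ddots & \vdots \\ G(a_1, b_1, x_N) & \cdots & G(a_L, b_L, x_N) \end{bmatrix}_{N \times L} \tag{6}$$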
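The random feature mapping described above can be sketched in a few lines. This is a minimal illustration, assuming NumPy and the standard ELM forms of the sigmoid and Gaussian nodes; the function name `hidden_layer_output` and the sampling ranges for the weights and biases are hypothetical choices, not taken from the paper.

```python
import numpy as np

def hidden_layer_output(X, A, b, activation="sigmoid"):
    """Map an (N, m) feature matrix to the (N, L) hidden-layer output H.

    A is an (m, L) matrix of random input weights (one column per hidden
    neuron) and b is an (L,) vector of random biases. Names are illustrative.
    """
    if activation == "sigmoid":
        return 1.0 / (1.0 + np.exp(-(X @ A + b)))
    if activation == "gaussian":
        # Columns of A act as centres, b as non-negative width factors.
        sq_dist = ((X[:, None, :] - A.T[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-b * sq_dist)
    raise ValueError(f"unknown activation: {activation}")

rng = np.random.default_rng(0)
N, m, L = 6, 4, 3                      # records, features, hidden neurons
X = rng.standard_normal((N, m))        # stand-in for preprocessed features
A = rng.uniform(-1.0, 1.0, size=(m, L))
b = rng.uniform(0.0, 1.0, size=L)

H_sigmoid = hidden_layer_output(X, A, b, "sigmoid")
H_gaussian = hidden_layer_output(X, A, b, "gaussian")
print(H_sigmoid.shape, H_gaussian.shape)
```

Because the hidden parameters are drawn once and never updated, no iterative training takes place in this step.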
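The output layer consists of $c$ neurons, where $c$ is the number of classes in the problem, and the
weight matrix of the output layer is denoted by $\beta$. Solving the problem means finding the
value of $\beta$ that maximizes the marginal distance and minimizes the weighted cumulative
error, which is represented by the following equation:
$$\text{Minimize: } \frac{1}{2}\lVert \beta \rVert^{2} + \frac{C}{2}\sum_{i=1}^{N} w_{ii}\,\lVert \xi_i \rVert^{2} \quad \text{subject to: } h(x_i)\,\beta = t_i^{\top} - \xi_i^{\top},\; i = 1, \dots, N \tag{7}$$
The other form of the previous equation is:
$$\text{Minimize: } \frac{1}{2}\lVert \beta \rVert^{2} + \frac{C}{2}\,\lVert W^{1/2}\,(T - H\beta) \rVert^{2} \tag{8}$$
Here $W$ is a diagonal matrix of size $N \times N$, $w_{ii}$ is the weight of the record $x_i$, $C$ is the
regularization parameter, $h(x_i)$ is the row of $H$ that belongs to $x_i$, $t_i$ is its target, and $\xi_i$ is the error of the
sample $x_i$, which equals the difference between the target value and the actual output.
Reformulating the problem with Lagrange multipliers, based on the Karush-Kuhn-Tucker (KKT) theorem,
gives:
$$\mathcal{L} = \frac{1}{2}\lVert \beta \rVert^{2} + \frac{C}{2}\sum_{i=1}^{N} w_{ii}\,\lVert \xi_i \rVert^{2} - \sum_{i=1}^{N}\sum_{j=1}^{c} \alpha_{ij}\left(h(x_i)\,\beta_j - t_{ij} + \xi_{ij}\right) \tag{9}$$
where $\alpha_{ij}$ is the Lagrange multiplier, which is a constant, and $\beta_j$ is the $j$-th column of $\beta$. In the next step, the partial derivatives are
taken with respect to $\beta$, $\xi$, and $\alpha$:
$$\frac{\partial \mathcal{L}}{\partial \beta} = 0 \Rightarrow \beta = H^{\top}\alpha, \qquad \frac{\partial \mathcal{L}}{\partial \xi_i} = 0 \Rightarrow \alpha_i = C\,w_{ii}\,\xi_i, \qquad \frac{\partial \mathcal{L}}{\partial \alpha_i} = 0 \Rightarrow h(x_i)\,\beta - t_i^{\top} + \xi_i^{\top} = 0 \tag{10}$$
Solving these equations produces two forms for $\beta$: the first requires the inverse of an $N \times N$
matrix, and the second the inverse of an $L \times L$ matrix. The first one is better
when the dataset is small, and it can be reformulated in kernel form, while the other is
better for huge datasets.
$$\beta = H^{\top}\left(\frac{I}{C} + W H H^{\top}\right)^{-1} W\,T \tag{11}$$
$$\beta = \left(\frac{I}{C} + H^{\top} W H\right)^{-1} H^{\top} W\,T \tag{12}$$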
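Finally, the output of the complete network is calculated using the following equation:
$$f(x) = h(x)\,\beta \tag{13}$$
where $h(x)$ is the hidden-layer output for the record $x$ and $\beta$ is the weight matrix of the output layer.
Because ID problems involve large datasets, equation (12) was used to solve this problem. We
are concerned with finding the best number of hidden neurons $L$, regularization parameter $C$,
activation function and weight scheme to build the ELM ID solution.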
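As an illustration of the closed-form training step, the sketch below solves for the output weights in both forms, assuming the standard weighted ELM solutions (one inverting an N x N matrix, the other an L x L matrix). The names `welm_output_weights` and `welm_predict`, the one-hot target encoding, and the random toy data are assumptions made for the example; this is not the MATLAB code used in the experiments.

```python
import numpy as np

def welm_output_weights(H, T, w, C=1.0, form="LxL"):
    """Solve for the output weights beta of a weighted ELM.

    H: (N, L) hidden-layer output, T: (N, c) target matrix,
    w: (N,) per-record weights (the diagonal of W), C: regularization.
    """
    H = np.asarray(H, dtype=float)
    T = np.asarray(T, dtype=float)
    w = np.asarray(w, dtype=float)
    N, L = H.shape
    if form == "NxN":
        # beta = H^T (I/C + W H H^T)^-1 W T, suited to small N
        WH = w[:, None] * H
        return H.T @ np.linalg.solve(np.eye(N) / C + WH @ H.T, w[:, None] * T)
    if form == "LxL":
        # beta = (I/C + H^T W H)^-1 H^T W T, suited to large N
        HtW = H.T * w
        return np.linalg.solve(np.eye(L) / C + HtW @ H, HtW @ T)
    raise ValueError(f"unknown form: {form}")

def welm_predict(H, beta):
    """Return the index of the output neuron with the largest response."""
    return np.argmax(H @ beta, axis=1)

rng = np.random.default_rng(1)
H = rng.random((20, 5))                 # toy hidden-layer output
y = rng.integers(0, 3, size=20)         # toy labels for 3 classes
T = np.eye(3)[y]                        # one-hot targets (assumed encoding)
w = rng.uniform(0.1, 1.0, size=20)      # toy per-record weights

beta_small = welm_output_weights(H, T, w, C=10.0, form="NxN")
beta_large = welm_output_weights(H, T, w, C=10.0, form="LxL")
labels = welm_predict(H, beta_large)
print(np.allclose(beta_small, beta_large))
```

The two forms give the same output weights up to numerical error; the choice between them only affects the size of the linear system that has to be solved.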
3.5 Accuracy Paradox and Cost-Function Scheme
The accuracy paradox occurs frequently when pattern recognition models are built
using unbalanced classes: it is easier for the ML algorithms to classify all or most of the
records of the small classes into one or more of the major classes, and this has a negligible
effect on the total accuracy. The problem gets worse when these minor classes are crucial in
the environment. The cost function is one of the methods suggested to address this problem
[17], [18]; it affects the learning process by giving different weights to the records that belong to
different classes. Different cost-function schemes were used. The first scheme is the default weight
scheme, where all classes have the same weight value, which equals 1. The second scheme [17]
depends on the ratio of the number of records in the corpus to the number of records in
each class; the following equation is used to calculate the weight for each class:
$$w_i = \frac{N}{n_i} \tag{14}$$
For all equations in this sub-section, $w_i$ represents the weight for all records that belong to class $i$, $N$ represents the
number of records in the corpus, and $n_i$ represents the number of records in the corpus that belong
to class $i$.
The third scheme [16] uses the golden ratio multiplied by the inverse of the
number of records belonging to each class; it is illustrated by the following equation, where $\varphi$ denotes the golden ratio:
$$w_i = \frac{\varphi}{n_i} \tag{15}$$
Due to the convergence issues of the deployed algorithms, the first method was used with the
WSVM, while the third one was used with WELM, which performed well in addressing the
imbalanced classes of the ID problem.
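The three weight schemes can be compared on a small label vector. The sketch below is a hedged illustration: the function name `class_weights`, the scheme labels, and the use of 0.618 as the golden-ratio constant are assumptions made for the example, since the text does not state the numeric value that was used.

```python
import numpy as np

def class_weights(y, scheme="default", golden=0.618):
    """Return a {class: weight} mapping for the given cost-function scheme.

    scheme is one of "default" (all ones), "ratio" (N / n_i) or
    "golden" (golden / n_i). The value of `golden` is an assumption.
    """
    classes, counts = np.unique(np.asarray(y), return_counts=True)
    total = counts.sum()
    if scheme == "default":
        weights = np.ones(len(classes))
    elif scheme == "ratio":
        weights = total / counts
    elif scheme == "golden":
        weights = golden / counts
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return dict(zip(classes.tolist(), weights.tolist()))

# Toy labels: class 0 is the majority, class 1 the minority.
y = np.array([0] * 8 + [1] * 2)
w_default = class_weights(y, "default")
w_ratio = class_weights(y, "ratio")
w_golden = class_weights(y, "golden")
print(w_default, w_ratio, w_golden)
```

In both non-default schemes the minority class receives the larger weight, so errors on its records cost more during training.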
3.6 General Method Procedure
The general procedure used in performing all experiments of WELM with different
kernels and activation functions on the UNB ISCX2012 dataset is shown in the pseudo code (Algorithm)
below, which illustrates in detail the general structure of the proposed model.
Algorithm: The general procedure that was used in building the ID model
Input: Dataset UNB ISCX2012, cost-function, number of hidden layer neurons, other parameters.
Output: P number of models, Result Object;
Data Preprocessing:
// Converting nominal fields into numerical
for a field in the dataset, do
    if is_nominal(field) then
        U = unique(field);
        newField = zeros(size(field));
        for k = 1 to size(field), do
            /* Find the index of the unique element that has the same value
               as element k of the field column */
            newField[k] = index_of(U, field[k]);
        end for;
        /* Replace the old field with the new numeric field */
        field = newField;
    end if;
end for;
// Applying the standardization method on the dataset fields
for a field in the dataset, do
    mu = mean(field); sigma = std(field);
    for variable = 1 to size(field), do
        field[variable] = (field[variable] - mu) / sigma;
    end for;
end for;
// Partitioning the dataset into P sub-sets and calculating the fraction of each class i
// Generating the weight array whose elements represent a weight for each distinct class i
if cost-function == default then
    W = ones(num_Classes)
else
    for i = 1 to num_Classes, do
        if cost-function == second then
            W[i] = N / Size(class i);
        else if cost-function == third then
            if Size(class i) > average class size then
                W[i] = golden_ratio / Size(class i);
            else
                W[i] = 1 / Size(class i);
            end if;
        end if;
    end for;
end if;
Main:
for i = 1 to P do
    Train = dataset[all sub-sets except i];
    Test = dataset[i];
    if model == WELM then
        model[i] = WELM(Train, W, activation function, number of hidden neurons);
    end if;
    ResultObject = computeAllMetrics(model[i], Train, Test) /* Compute the overall accuracy and
        the accuracy for each class for the training data */
end for;
return P number of models, ResultObject.
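The two preprocessing steps of the algorithm, converting nominal fields to numeric codes and standardizing each field, can be sketched as follows. This assumes that "standardization" means z-score scaling; the helper names `encode_nominal` and `standardize` and the toy columns are hypothetical.

```python
import numpy as np

def encode_nominal(column):
    """Replace each nominal value by the index of its unique value."""
    uniques, codes = np.unique(np.asarray(column), return_inverse=True)
    return codes.astype(float), uniques

def standardize(column):
    """Scale a numeric column to zero mean and unit standard deviation."""
    column = np.asarray(column, dtype=float)
    sigma = column.std()
    if sigma == 0:
        # A constant field carries no information; map it to zeros.
        return np.zeros_like(column)
    return (column - column.mean()) / sigma

protocol = ["tcp", "udp", "tcp", "icmp"]        # toy nominal field
codes, uniques = encode_nominal(protocol)
scaled = standardize([1.0, 2.0, 3.0])           # toy numeric field
print(codes, uniques, scaled)
```

Standardizing after encoding keeps fields with large raw ranges, such as byte counts, from dominating the hidden-layer mapping.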
4. EXPERIMENTS AND RESULTS
The experiments were performed on the UNB ISCX2012 dataset. The WELM algorithm was used
to perform multi-class classification experiments on the dataset. Before addressing the
algorithm itself, the best data normalization method to use in the experiments had to be
determined. This selection depends on the properties of the data in the dataset; the
existence of outlier records is a crucial property of ID datasets. The MATLAB version of
the WELM algorithm proposed in [17] was used to perform parts of our experiments. The
performance of the WELM algorithm depends on the selection of its parameters:
the number of hidden layer neurons (L), the activation function of
the hidden layer neurons, and the regularization parameter (C). Two activation functions were used:
the sigmoidal and the Gaussian activation functions.
These parameters, together with the weight scheme, determine the ELM ID solution that is built. All
combinations of the proposed model parameters are listed in table 3. The cross-validation method
called leave-one-out [7] was mainly used to build, evaluate and validate all models. With this
method the experiments are repeated n times: the data are divided into n
partitions, and in each round the i-th partition is used for the testing phase while the other n-1 partitions are used
for building the model. Twofold cross-validation is more generalized than tenfold. In twofold
cross-validation, the training and testing phases are each performed on a distinct 50 percent of the
dataset records, while in tenfold cross-validation the training phase is performed on 90 percent of
the dataset records and the remaining 10 percent are used to perform the testing
phase. Therefore, twofold cross-validation was mainly used to perform the experiments. The stratified
sampling method was used to maintain, in every partition, the same ratio of the number of records of each class to the
number of records of all classes as in the complete set. The recent work
[25] was used as the baseline for evaluating our models built on the UNB
ISCX2012 dataset. It states that 1 percent of the records of each attack class were selected randomly to
build the models, and 10 percent to test the built models. There is an inconsistency between the
number of records mentioned in [25] and the correct number of records. Therefore, the same numbers of
records as in [25] were used in our primary experiments on this dataset, in order
to make a consistent and equitable assessment. The overall accuracy and the F-score
were used in the parameter optimization phase, while the confusion
matrix, recall, precision, F-score, FP, miss-detection and miss-classification, in addition to the overall
accuracy, were used to measure the performance of the proposed models. Several metrics and
definitions [24] are used in evaluating multi-class pattern recognition models. Some of these
are the confusion matrix, True Positive (TP), False Positive (FP), True Negative (TN), False
Negative (FN), Accuracy, Precision, Recall, Detection Rate, Misclassification, G-mean,
F-measure and the Receiver Operating Characteristic (ROC) curve.
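Stratified partitioning for twofold cross-validation can be sketched as below. This is an illustrative implementation that assumes NumPy and deals the records of each class to the folds in round-robin fashion after shuffling; the function name `stratified_folds` and the toy label vector are hypothetical.

```python
import numpy as np

def stratified_folds(y, k=2, seed=0):
    """Assign every record to one of k folds, preserving class ratios."""
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    fold_of = np.empty(len(y), dtype=int)
    for cls in np.unique(y):
        # Shuffle the records of this class, then deal them out in turn.
        idx = rng.permutation(np.flatnonzero(y == cls))
        fold_of[idx] = np.arange(len(idx)) % k
    return fold_of

# Toy labels: 90 records of class 0 and 10 records of class 1.
y = np.array([0] * 90 + [1] * 10)
fold_of = stratified_folds(y, k=2)

for fold in range(2):
    test_mask = fold_of == fold
    train_mask = ~test_mask
    # Each half keeps the 9:1 class ratio of the complete set.
    print(fold, np.bincount(y[train_mask]), np.bincount(y[test_mask]))
```

With plain random splitting, a class with only a handful of records could end up entirely in one partition, which is the situation that stratification avoids.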
Table 3. Various combinations of parameters that were used to build the paper's model.

| Algorithm | Kernels, Activation Function | Weight | The optimized Parameters, Number of Hidden Neurons |
|---|---|---|---|
| WELM | Sigmoidal | | |
| WELM | Gaussian | | |
It is important to clarify the concept of each term; these concepts, oriented toward the ID problem,
are given in the following paragraphs.
TP: the attack records that are predicted as the correct type of attack.
FP: the records that are predicted as attacks while they are normal.
TN: the normal records that are classified correctly.
FN: the records that are predicted as normal while they are attacks.
Misclassification: the hostile records that are predicted as the wrong type of attack.
FAR: the rate of normal records that are classified as attacks.
Confusion matrix: It is one of the common methods used to view the results of pattern
recognition models; it is a two-dimensional square matrix. The fields of both dimensions
are the classes of the problem, and the values of the cells represent the distribution of the
predicted records over the target classes.
Accuracy: It is one of the main metrics used to measure the overall performance of pattern
recognition models; it is represented by the following formula:
$$\text{Accuracy} = \frac{\text{number of correctly classified records}}{\text{total number of records}}$$
Precision: It is the ratio of the records that are correctly predicted as a certain class to all
records predicted as that class. It is calculated using the following equation:
$$\text{Precision}_k = \frac{\text{records correctly predicted as class } k}{\text{all records predicted as class } k}$$
Recall or Sensitivity: It is the ratio of the correctly predicted records of one of the attack
classes to the number of records belonging to that class in the target table.
Specificity: It is the ratio of the normal records that are predicted as normal to the number of
records belonging to the normal class in the target table.
G-mean: It is an overall metric that measures the geometric mean of the specificity for the normal class
and the sensitivity for all hostile classes; it is used to measure the performance in cases where
imbalanced classes exist. With $K$ hostile classes, the following formula is used to measure the G-mean metric:
$$G\text{-mean} = \left(\text{Specificity} \times \prod_{k=1}^{K} \text{Sensitivity}_k\right)^{\frac{1}{K+1}}$$
F-measure or F-score: It is the harmonic mean of precision and recall, and it is calculated using
the following equation:
$$F\text{-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
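The metrics defined above can all be derived from a multi-class confusion matrix. The sketch below is a minimal example that assumes rows hold the actual classes and columns the predicted classes; the function name `per_class_metrics` and the toy matrix are hypothetical. Because the specificity of the normal class equals the recall of that class in a multi-class matrix, the G-mean is computed here as the geometric mean of all per-class recalls.

```python
import numpy as np

def per_class_metrics(cm):
    """Compute accuracy, per-class precision/recall/F-score and G-mean.

    cm[i, j] is the number of records of actual class i predicted as class j.
    """
    cm = np.asarray(cm, dtype=float)
    correct = np.diag(cm).copy()
    predicted = cm.sum(axis=0)          # records predicted as each class
    actual = cm.sum(axis=1)             # records belonging to each class
    precision = np.divide(correct, predicted,
                          out=np.zeros_like(correct), where=predicted > 0)
    recall = np.divide(correct, actual,
                       out=np.zeros_like(correct), where=actual > 0)
    denom = precision + recall
    f_score = np.divide(2 * precision * recall, denom,
                        out=np.zeros_like(correct), where=denom > 0)
    accuracy = correct.sum() / cm.sum()
    g_mean = recall.prod() ** (1.0 / len(recall))
    return accuracy, precision, recall, f_score, g_mean

# Toy matrix: class 0 = normal, class 1 = attack.
cm = [[8, 2],
      [1, 9]]
accuracy, precision, recall, f_score, g_mean = per_class_metrics(cm)
print(accuracy, precision, recall, f_score, g_mean)
```

A class that is never predicted gets precision, recall and F-score of zero here, which also drives the G-mean to zero; this is why the G-mean is sensitive to ignored minority classes while the overall accuracy is not.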
4.1 UNB ISCX2012 Dataset Experiments
Evaluation inconsistency is one of the important issues related to ID solutions, and it is one of the
points that have been taken into account in this work. The dataset was collected over seven days;
three of them had normal records only, while the other four days each had a distinct attack scenario,
as illustrated before. The number of normal and attack records in these four days is
shown in table 2. The recently published work [25] was mainly used to evaluate our work on this dataset.
Table 4 shows the number of records that were used in the experiments of [25]. The author used
11 percent of the more frequent classes (1 percent for the training phase and 10 percent for the testing phase)
and all records of the less frequent attacks, which are Botnet and DoS. As shown in table 2,
the numbers of records for the Botnet and DoS attacks are 37460 and 3776 respectively,
which shows an inconsistency between the number of records in the dataset and the
number of records mentioned in [25]. To overcome this problem, two sets of experiments were
performed. The primary experiments had the same numbers of records as in [25] for each class;
these are shown in table 4.
This enables us to make a consistent and fair comparison. The secondary experiments were
performed using the complete attack records, in addition to normal records that were randomly
selected from the normal records of the day that included the botnet attack scenario. The
number of normal records selected was equal to the number of attack records; this is based on
the fact that most network traffic is normal. The secondary experiments were built on the general
method of this work. To make the results of the experiments based on randomly selected records
more representative, each experiment was repeated ten times using a different set of records
each time, and then the average results were calculated.
Table 4: Number of records of UNB ISCX2012 dataset as they are included in [24]
Class Name # of Train records # of Test records
Infiltrating the network from inside 60 605
HTTP Denial of Service 4 36
Distributed DoS using an IRC Botnet 3 2
Brute Force SSH 46 463
Normal 1227 12285
Sum of the records 1340 13391
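The test counts in Table 4 make the accuracy paradox concrete. The short calculation below is an illustration added for clarity: it evaluates a hypothetical degenerate model that labels every test record as Normal, using only the record counts from the table.

```python
# Test-record counts per class, copied from Table 4.
test_counts = {
    "Infiltrating the network from inside": 605,
    "HTTP Denial of Service": 36,
    "Distributed DoS using an IRC Botnet": 2,
    "Brute Force SSH": 463,
    "Normal": 12285,
}

total = sum(test_counts.values())
attack_records = total - test_counts["Normal"]

# A model that predicts "Normal" for everything is right on every normal
# record and wrong on every attack record.
accuracy_all_normal = test_counts["Normal"] / total
attack_recall_all_normal = 0 / attack_records

print(f"total test records: {total}")
print(f"accuracy of the all-Normal model: {accuracy_all_normal:.2%}")
print(f"attack recall of the all-Normal model: {attack_recall_all_normal:.2%}")
```

Such a model reaches about 91.74 percent overall accuracy while detecting none of the 1106 attack records, which is why per-class recall and F-score are reported alongside the overall accuracy.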
4.2 Discussion of the Results
The WELM algorithm was applied to the UNB ISCX2012 dataset. In the primary experiments,
different weight schemes were employed and different parameters related to the algorithm
were optimized. These experiments showed that the WELM algorithms with the default
weight scheme had better performance with respect to the overall
accuracy and the F-score for all classes. Although the second weight scheme with the WELM
algorithm failed to improve the overall accuracy and the F-score for all classes, it succeeded in
improving the recall in the vast majority of cases, but this was associated with a reduction in
precision. Table 5 shows the results of the WELM Gaussian RB and Sigmoidal functions for all
classes of the dataset, together with the optimized parameters.
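Table 5: The results of the WELM Gaussian RB and Sigmoidal Functions on the UNB ISCX2012 during 2 tests

| Test# | Kernels, Activation Function | W | Optimized parameter: # of Hidden Neurons | The overall accuracy | F-score Normal | F-score L2L | F-score DoS | F-score Botnet | F-score SSH |
|---|---|---|---|---|---|---|---|---|---|
| Test1 | Sigmoidal | Deft. 1 | 500 | 99.3% | 99.7% | 94.4% | 39.3% | 0.0% | 98.9% |
| Test1 | Gaussian RB | 1 | 500 | 99.1% | 99.6% | 93.1% | 35.3% | 0.0% | 98.5% |
| Test2 | Sigmoidal | | 2000 | 96.2% | 98.9% | 63.8% | 15.2% | 1.8% | 98.9% |
| Test2 | Gaussian RB | | 2000 | 97.6% | 99.1% | 81.1% | 23.5% | 1.2% | 98.9% |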
Figure 3 shows the accuracy for the 5 classes of the UNB ISCX2012 dataset in Test 1 when applying
WELM with the Sigmoidal function and the Gaussian RB function. The accuracy results
obtained in Test 1 with the two activation functions are
approximately the same in all 5 classes. On the other hand, as shown in figure 4, the
accuracy results obtained in Test 2 show that WELM with the Gaussian RB function outperforms the sigmoidal function
in some classes of the dataset.
Figure. 3. The WELM Sigmoidal and Gaussian RB result in Test 1
Figure. 4. The WELM Sigmoidal and Gaussian RB result in Test 2
The recently published work [25], which shares the concern of our work, was used to make a
consistent and equitable evaluation. The results of applying the SVM algorithm on the UNB
ISCX2012 dataset are given in [25]. As shown in Table 5, the polynomial kernel performs better than
the Gaussian RBF kernel with SVM; those results came from applying the experiments on
a random subset of the data, which was insufficient. The optimization step decreased the performance
of the SVM algorithm with Gaussian RBF, which did not reflect the true behavior of the
algorithm. In contrast, the primary experiments on the UNB ISCX2012 dataset in our
paper were repeated ten times for each scenario, and the average of those rounds was used to
evaluate the applied models.
The WELM model outperforms the SVM model proposed in [25] in the accuracy of the Normal, SSH
and DoS classes when the same activation function is applied, as shown in table
6.
Table 6: The results of the WELM with Polynomial Function on the UNB ISCX2012 Dataset compared with [25]

Overall accuracy: WELM with Polynomial Function 99.10%; SVM with Polynomial Function [25] 99.11%.

| Class | WELM Precision | WELM Recall | WELM F-score | SVM [25] Precision | SVM [25] Recall | SVM [25] F-score |
|---|---|---|---|---|---|---|
| SSH | 98.61% | 98.51% | 98.55% | 95.9% | 100% | 97.9% |
| Botnet | 0.00% | 0.00% | 0.00% | 18.2% | 100% | 30.8% |
| DoS | 36.50% | 37.78% | 35.33% | 60% | 25% | 35.3% |
| L2L | 94.47% | 91.98% | 93.18% | 95% | 94.9% | 95% |
| Normal | 99.63% | 99.67% | 99.65% | 99.6% | 99.5% | 99.5% |
Figure 5. The result comparison of WELM with Polynomial Function on the UNB ISCX2012 and the model presented in [25]
Based on the foregoing, the WELM model achieves superior results in the minor classes and
competitive results in the overall accuracy and the accuracy of the major classes. It increases the
ability to detect the most hazardous attacks.
5. CONCLUSION AND FUTURE WORKS
The development of IDS in computer networks is a challenge for researchers because, with the
growth of computer networks, new attacks appear constantly. IDS is a vital security tool, and the
daily increase in the number of attacks encourages its development. In this paper, a
method was proposed for detecting intrusions using machine learning (ML) tools that
consolidate stratified sampling and different cost-function schemes with the Extreme Learning
Machine (ELM) method to build competitive ID solutions that improve the performance of these
systems and deal with classes in the training set that contain many more samples than others in
the same training set. The proposed method obtained better results than previous works on the
accuracy paradox issue while preserving the accuracy improvement. In this way, the ID solution
is capable of maintaining good levels of accuracy as well as improving the detection of the
most dangerous classes. The WELM algorithm is a good competitor. The experiments that were
performed achieved competitive results in both overall accuracy and per-class F-score
on the UNB ISCX2012 dataset. In these experiments, the accuracy for the SSH, DoS and
Normal classes outperformed the SVM method. None of the open issues associated with this problem
has been solved completely, and all points remain open, although we
covered some of the ID points through this effort. In future work, we will start using a set of
one-class classification methods, which can be used in different manners: to solve the
unbalanced class problem, and to build novelty-detection and outlier-detection models. While the first
way contributes to solving the imbalanced classes, the others contribute to building anomaly models
which may improve the detection of zero-day attacks.
REFERENCES
[1] Mishra, P., Varadharajan, V., Tupakula, U., & Pilli, E. S. (2018). A detailed investigation and analysis
of using machine learning techniques for intrusion detection. IEEE Communications Surveys &
Tutorials, 21(1), 686-728.
[2] Cashell, B., Jackson, W. D., Jickling, M., & Webel, B. (2004). The economic impact of cyber-attacks.
Congressional Research Service Documents, CRS RL32331 (Washington DC).
[3] Bellovin, S. M. (2004, December). A look back at "Security problems in the TCP/IP protocol suite". In
20th Annual Computer Security Applications Conference (pp. 229-249). IEEE.
[4] Garcia-Teodoro, P., Diaz-Verdejo, J., Maciá-Fernández, G., & Vázquez, E. (2009). Anomaly-based
network intrusion detection: Techniques, systems and challenges. computers & security, 28(1-2), 18-
28.
[5] Ahmad, I., Basheri, M., Iqbal, M. J., & Rahim, A. (2018). Performance comparison of support vector
machine, random forest, and extreme learning machine for intrusion detection. IEEE Access, 6,
33789-33795.
[6] Moustafa, N., Turnbull, B., & Choo, K. K. R. (2018). An ensemble intrusion detection technique
based on proposed statistical flow features for protecting network traffic of internet of things. IEEE
Internet of Things Journal.
[7] Idhammad, M., Afdel, K., & Belouch, M. (2018). Semi-supervised machine learning approach for
DDoS detection. Applied Intelligence, 48(10), 3193-3208.
[8] A.-C. Enache and V. V. Patriciu, "Intrusions detection based on support vector machine optimized
with swarm intelligence," in Applied Computational Intelligence and Informatics (SACI), 2014 IEEE
9th International Symposium on, 2014.
[9] Ji, S. Y., Jeong, B. K., Choi, S., & Jeong, D. H. (2016). A multi-level intrusion detection method for
abnormal network behaviors. Journal of Network and Computer Applications, 62, 9-17.
[10] Alabdallah, A., & Awad, M. (2018). Using weighted Support Vector Machine to address the
imbalanced classes problem of Intrusion Detection System. KSII Transactions on Internet &
Information Systems, 12(10).
[11] Bains, J. K., Kaki, K. K., & Sharma, K. (2013). Intrusion Detection System with Multi-Layer using
Bayesian Networks. International Journal of Computer Applications, 67(5).
[12] Sharma, S., Gigras, Y., Chhikara, R., & Dhull, A. (2019). Analysis of NSL KDD Dataset Using
Classification Algorithms for Intrusion Detection System. Recent Patents on Engineering, 13(2), 142-
147
[13] M. H. Bhuyan, D. K. Bhattacharyya and J. K. Kalita, "Towards Generating Real-life Datasets for
Network Intrusion Detection.," IJ Network Security, vol. 17, pp. 683-701, 2015.
[14] A. Shiravi, H. Shiravi, M. Tavallaee, and A. A. Ghorbani, "Toward developing a systematic approach
to generate benchmark datasets for intrusion detection," Computers & Security, vol. 31, pp. 357-374,
2012.
[15] M. Tavallaee, E. Bagheri, W. Lu, and A.-A. Ghorbani, "A detailed analysis of the KDD CUP 99 data
set," in Proceedings of the Second IEEE Symposium on Computational Intelligence for Security and
Defence Applications 2009, 2009.
[16] Garcia-Teodoro, P., Diaz-Verdejo, J., Maciá-Fernández, G., & Vázquez, E. (2009). Anomaly-based
network intrusion detection: Techniques, systems and challenges. computers & security, 28(1-2), 18-
28.
[17] Buczak, A. L., & Guven, E. (2015). A survey of data mining and machine learning methods for cyber
security intrusion detection. IEEE Communications Surveys & Tutorials, 18(2), 1153-1176.
[18] An, X., Zhou, X., Lü, X., Lin, F., & Yang, L. (2018). Sample selected extreme learning machine
based intrusion detection in fog computing and MEC. Wireless Communications and Mobile
Computing, 2018.
[19] Aljawarneh, S., Aldwairi, M., & Yassein, M. B. (2018). Anomaly-based intrusion detection system
through feature selection analysis and building hybrid efficient model. Journal of Computational
Science, 25, 152-160.
[20] S. Anu and K. P. M. Kumar, "Hybrid Network Intrusion Detection for DoS Attacks," International
Journal of Advanced Research in Computer and Communication Engineering, vol. 5, no. 3, 2016.
[21] P. Laskov, P. Düssel, C. Schäfer and K. Rieck, "Learning intrusion detection: supervised or
unsupervised?," in International Conference on Image Analysis and Processing, 2005.
[22] Idhammad, M., Afdel, K., & Belouch, M. (2018). Semi-supervised machine learning approach for
DDoS detection. Applied Intelligence, 48(10), 3193-3208.
[23] Mighan, S. N., & Kahani, M. (2018, May). Deep Learning Based Latent Feature Extraction for
Intrusion Detection. In Electrical Engineering (ICEE), Iranian Conference on (pp. 1511-1516). IEEE.
[24] J. M. Fossaceca, T. A. Mazzuchi, and S. Sarkani, "MARK-ELM: Application of a novel Multiple
Kernel Learning framework for improving the robustness of Network Intrusion Detection," Expert
Systems with Applications, vol. 42, pp. 4062-4080, 2015.
[25] E. Nyakundi, "Using support vector machines in anomaly intrusion detection," University of Guelph,
Guelph, Ontario, Canada, 2015.
[26] Injadat, M., Salo, F., Nassif, A. B., Essex, A., & Shami, A. (2018, December). Bayesian Optimization
with Machine Learning Algorithms Towards Anomaly Detection. In 2018 IEEE Global
Communications Conference (GLOBECOM) (pp. 1-6). IEEE.
[27] Al Najada, H., Mahgoub, I., & Mohammed, I. (2018, November). Cyber Intrusion Prediction and
Taxonomy System Using Deep Learning and Distributed Big Data Processing. In 2018 IEEE
Symposium Series on Computational Intelligence (SSCI) (pp. 631-638). IEEE.
[28] C. C. Aggarwal, Data mining: the textbook, Springer, 2015.
AUTHORS
Mohammed Awad received the B.S. degree in Automation Engineering from
Palestine Polytechnic University in 2000, and the master's and Ph.D. degrees in
Computer Engineering from Granada University, Spain (both on scholarships
from the Spanish Government). From 2005 to 2006, he was a contract researcher at
Granada University in the research group Computer Engineering: Perspectives and
Applications. Since Feb. 2006, he has been an Assistant Professor in the Computer
Engineering Department, College of Engineering and Information Technology at
Arab American University. In 2010 he became an Associate Professor in Computer
Engineering, and in 2016 a Full Professor in Computer Engineering. He has
worked for more than 12 years at the Arab American University in academic positions, in parallel with
various academic administrative positions (Department Chairman and Dean Assistant, Dean of Scientific
Research and Editor-in-Chief, Journal of AAUJ). Through his research and educational experience, he has
developed a strong research record. His research interests include Artificial Intelligence, Neural Networks,
Function Approximation of Structure and Complex Systems, Clustering Algorithms, Optimization
Algorithms, and Time Series Prediction. He has won a number of awards and research grants.
Alaeddin Alabdallah received the B.S. degree in Computer Engineering from An-Najah
National University in 2006, and a master's degree in Computer Science from Arab
American University in February 2018. Since 2006, he has been a Teacher and
Research Assistant at the Computer Engineering Department at An-Najah National
University. His research interests include Artificial Intelligence, computer networks,
and Information Security.