Predicting cyberattacks with machine learning has become imperative as attacks have increased sharply and adversaries have grown stealthier and more sophisticated. To maintain situational awareness and achieve defence in depth, machine learning for threat prediction has become a prerequisite for cyber threat intelligence gathering. Existing approaches to mitigating malware attacks include spam filters, firewalls, and IDS/IPS configurations that detect attacks. However, threat actors are deploying adversarial machine learning techniques to exploit vulnerabilities. This paper explores the viability of using machine learning methods to predict malware attacks and builds a classifier that automatically detects and labels an event as “Has Detection” or “No Detection”. The purpose is to predict the probability of malware penetration and the extent of manipulation on the network nodes for cyber threat intelligence. To demonstrate the applicability of our work, we use a decision tree (DT) algorithm to learn from the dataset for evaluation. The dataset was obtained from the Microsoft malware threat prediction data on Kaggle. We identify probable cyberattacks on the smart grid and use attack scenarios to determine penetrations and manipulations. The results show that ML methods can be applied in a smart grid cyber supply chain environment to detect cyberattacks and predict future trends.
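The decision tree idea in the abstract above can be illustrated with a one-level tree (a decision stump) that picks the threshold minimising weighted Gini impurity. This is only a minimal sketch, not the paper's model: the feature `unsigned_drivers`, the toy values, and the helper names are hypothetical, and a real tree would split recursively over many features.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split `value <= threshold`."""
    n = len(values)
    best = (None, float("inf"))
    # Skip the largest value: splitting there would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best

def fit_stump(values, labels):
    """Build a classifier that predicts the majority label on each side of the split."""
    t, _ = best_split(values, labels)
    left = Counter(y for x, y in zip(values, labels) if x <= t).most_common(1)[0][0]
    right = Counter(y for x, y in zip(values, labels) if x > t).most_common(1)[0][0]
    return lambda x: left if x <= t else right

# Hypothetical toy data: one numeric feature per machine and its label.
unsigned_drivers = [0, 1, 1, 2, 5, 6, 7, 9]
labels = ["No Detection"] * 4 + ["Has Detection"] * 4

threshold, impurity = best_split(unsigned_drivers, labels)
classify = fit_stump(unsigned_drivers, labels)
print(threshold, impurity, classify(8))
```

On this toy data the two classes separate cleanly, so the chosen split has zero impurity; real telemetry would rarely be this tidy.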
Obfuscated computer virus detection using machine learning algorithm journalBEEI
Computer virus attacks are becoming increasingly advanced. New obfuscated viruses created by virus writers automatically generate a new form of the virus on every iteration and download. These constantly evolving viruses pose a significant threat to the information security of computer users, organizations and even governments. However, the signature-based detection technique used by conventional antivirus software on the market fails to identify them because signatures are unavailable. This research proposes an alternative to the traditional signature-based detection method and investigates the use of machine learning for obfuscated computer virus detection. In this work, text strings extracted from virus program code are used as features to generate a classifier model that can correctly classify obfuscated virus files. Text string features are used because they are informative and potentially use only a small amount of memory. Results show that unknown files can be correctly classified with 99.5% accuracy using an SMO classifier model. Thus, it is believed that current computer virus defense can be strengthened through a machine learning approach.
MACHINE LEARNING IN NETWORK SECURITY USING KNIME ANALYTICS IJNSA Journal
Machine learning has a growing effect on our everyday life. The field keeps expanding into new areas. Machine learning is an application of artificial intelligence that gives systems the capability to learn and improve automatically from experience without being explicitly programmed. Machine learning algorithms apply mathematical equations to analyze datasets and predict values based on them. In the field of cybersecurity, machine learning algorithms can be used to train and analyze Intrusion Detection Systems (IDSs) on security-related datasets. In this paper, we tested different machine learning algorithms to analyze the NSL-KDD dataset using KNIME analytics.
Cyber security is said to be one of the most closely watched topics, as it helps end users stay secure from cyber attacks. Cyber security models are crucial.
http://goo.gl/IwhtP2
Cyber security is a major concern worldwide. In response to frequent, daily cyber attacks, this journal article was written to inform readers about zero-day attack prediction.
Secure intrusion detection and countermeasure selection in virtual system usi... eSAT Publishing House
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
COMPARISON OF MALWARE CLASSIFICATION METHODS USING CONVOLUTIONAL NEURAL NETWO... IJNSA Journal
Malicious software is constantly being developed and improved, so detection and classification of malware is an ever-evolving problem. Since traditional malware detection techniques fail to detect new or unknown malware, machine learning algorithms have been used to overcome this disadvantage. We present a Convolutional Neural Network (CNN) for malware type classification based on API (Application Program Interface) calls. This research uses a database of 7107 instances of API call streams and 8 different malware types: Adware, Backdoor, Downloader, Dropper, Spyware, Trojan, Virus, Worm. We used a 1-Dimensional CNN by mapping API calls as categorical and term frequency-inverse document frequency (TF-IDF) vectors and compared the results to other classification techniques. The proposed 1-D CNN outperformed other classification techniques with 91% overall accuracy for both categorical and TF-IDF vectors.
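As an illustration of the TF-IDF mapping mentioned in the abstract above, the sketch below turns API call streams into TF-IDF weight dictionaries. It assumes one common definition (term frequency normalised by stream length, unsmoothed natural-log IDF); the paper may have used a different variant, and the API call names and streams are hypothetical.

```python
import math
from collections import Counter

def tfidf(documents):
    """Map each token sequence to a {token: tf-idf weight} dict.

    tf  = occurrences of the token in the sequence / sequence length
    idf = ln(number of sequences / number of sequences containing the token)
    """
    n_docs = len(documents)
    # Document frequency: count each token once per sequence.
    doc_freq = Counter(tok for doc in documents for tok in set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        vectors.append({
            tok: (cnt / len(doc)) * math.log(n_docs / doc_freq[tok])
            for tok, cnt in counts.items()
        })
    return vectors

# Hypothetical API call streams, one list per analysed sample.
streams = [
    ["CreateFile", "WriteFile", "CloseHandle"],
    ["CreateFile", "RegSetValue", "RegSetValue", "CloseHandle"],
    ["CreateFile", "Connect", "Send", "CloseHandle"],
]
vectors = tfidf(streams)
print(vectors[1])
```

Calls that appear in every stream (here `CreateFile` and `CloseHandle`) receive weight zero under this definition, while calls concentrated in one stream are weighted up, which is the property that makes TF-IDF useful as classifier input.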
A BAYESIAN CLASSIFICATION ON ASSET VULNERABILITY FOR REAL TIME REDUCTION OF F... IJNSA Journal
IT assets connected to the internet will encounter alien protocols, and a few parameters of the protocol process are exposed as vulnerabilities. Intrusion Detection Systems (IDS) are installed to alert on suspicious traffic or activity. An IDS issues false positive alerts if a behavior is construed as a partial attack pattern or if the IDS lacks knowledge of the environment. Continuous monitoring of alerts to determine whether an alert is a false positive is a major concern. In this paper we present the design of an external module for the IDS that identifies false positive alerts using an anomaly-based adaptive learning model. The novel feature of this design is that the system updates the behavior profile of assets and environment through an adaptive learning process. A mixture model is used for behavior modeling from reference data. The design of the detection and learning process is based on the normal behavior of the assets and of the environment. The anomaly alert identification algorithm is built on Sparse Markov Transducers (SMT) based probability. The whole process is presented using real-time data. The experimental results are validated and presented with reference to a lab environment.
Network Threat Characterization in Multiple Intrusion Perspectives using Data... IJNSA Journal
For effective security incident response on the network, a reputable approach must be in place in both the protected and unprotected regions of the network, because a compromise in the demilitarized zone could be a precursor to a threat inside the network. The increased complexity of present-day attacks and the vulnerability of systems are the motivations for this work. Past and present approaches to intrusion detection and prevention have neglected victim and attacker properties, even though an intrusion requires both an overt act by an attacker and a resulting manifestation observable by the intended victim. Therefore, this paper presents a threat characterization model for attacks from the victim and the attacker perspectives of intrusion using a data mining technique. The technique combines Frequent Temporal Sequence Association Mining and Fuzzy Logic. The Apriori Association Mining algorithm was used to mine temporal rule patterns from alert sequences, while a Fuzzy Control System was used to rate exploits. The results of the experiment show that accurate threat characterization in multiple intrusion perspectives can be achieved using Fuzzy Association Mining. The results also show that sequences of exploits can be used to rate threats and are motivated by victim properties and attacker objectives.
A Study and Comparative analysis of Conditional Random Fields for Intrusion d... IJORCS
Intrusion detection systems are an important component of the defensive measures protecting computer systems and networks from abuse. Intrusion detection plays a key role in computer security and is one of the prime areas of research. Due to the complex and dynamic nature of computer networks and hacking techniques, detecting malicious activities remains a challenging task for security experts; currently available defense systems suffer from low detection capability and a high number of false alarms. An intrusion detection system must reliably detect malicious activities in a network and must perform efficiently to cope with large amounts of network traffic. In this paper we study machine learning and data mining techniques for solving intrusion detection problems within computer networks, compare the various approaches with conditional random fields, and address the two issues of accuracy and efficiency using Conditional Random Fields and a Layered Approach.
Intrusion detection systems (IDS) are an effective approach for tackling network security problems. These systems play a key role in network security as they can detect different types of attacks in networks, including DoS, U2R, Probe and R2L. In addition, IDS are an increasingly key part of a system's defense. Various approaches to IDS are now in use, but they are unfortunately relatively ineffective. Data mining techniques and artificial intelligence play an important role in security services. In this paper we present a comparative study of three well-known intelligent algorithms: Radial Basis Functions (RBF), Multilayer Perceptrons (MLP) and Support Vector Machine (SVM). The main interest of this work is to benchmark the performance of these 3 intelligent algorithms, using a dataset of about 9,000 connections randomly chosen from KDD'99's 10% dataset. In addition, we investigate the algorithms' performance in terms of their attack classification accuracy. The simulation results are analyzed and then discussed. It has been observed that SVM with a linear kernel (Linear-SVM) performs better than MLP and RBF in terms of detection accuracy and processing speed.
User centric machine learning framework for cyber security operations center Venkat Projects
To ensure a company's cyber security, a SIEM (Security Information and Event Management) system is typically in place to consolidate the various preventive technologies and flag alerts for security events. Analysts in the security operations center (SOC) investigate the alerts to determine whether each one is truly malicious. However, the number of alerts is generally overwhelming, the majority are false positives, and the volume exceeds the SOC's capacity to handle them all. Because of this, potentially malicious attacks and compromised hosts may be missed. Machine learning is a possible approach to reducing the false positive rate and improving the productivity of SOC analysts. In this article, we develop a user-centric machine learning framework for the cyber security operations center in a real organizational context. We discuss the typical data sources in a SOC, their workflow, and how to process this data to build an effective machine learning system. The article is aimed at two groups of readers. The first group is data scientists and machine learning researchers who have no knowledge of the cyber security domain but want to build machine learning systems for security operations. The second group is cyber security practitioners who have deep knowledge and expertise in cyber security but no machine learning experience, and who would like to build such a system themselves. Throughout the paper, we use an account-based example to demonstrate the full steps from data collection, label creation, and feature engineering to machine learning algorithm selection and model performance evaluation, using the system built in the SOC production environment of Seyondike.
INTRUSION DETECTION USING FEATURE SELECTION AND MACHINE LEARNING ALGORITHM WI... ijcsit
Intrusion detection over the network is a critical issue in preventing illegitimate use by intruders. An intruder may enter a network, system or server by injecting malicious packets in order to steal, sniff, manipulate or corrupt useful and secret information; this process is referred to as intrusion, whereas the transmission of packets by an intruder over the network for the purpose of intrusion is referred to as an attack. With expanding networking technology, millions of servers communicate with each other, and this expansion continues every day. As a result, more and more intruders are drawn in, so a smart intrusion detection model is a primary requirement.
The essential features of the NSL-KDD data set are identified by analyzing feature selection methods. A hybrid algorithm is then built using the selected features, a machine learning approach, and an analysis of the basic network features in the data set. Finally, a model containing the rules for the network features is produced from the algorithm.
A hybrid misuse intrusion detection model is built to find attacks on the system and improve intrusion detection. Based on prior features, intrusions on the system can be detected without any previous learning. The model combines the advantages of feature selection and machine learning techniques with misuse detection.
Review of Intrusion and Anomaly Detection Techniques IJMER
Intrusion detection is the act of detecting actions that attempt to compromise the confidentiality, integrity or availability of a resource. With the tremendous growth of network-based services and sensitive information on networks, network security is becoming more important than ever. Intrusion poses a serious security threat in a large network environment. The increasing use of the internet has dramatically added to the growing number of threats that reside within it. Intrusion detection does not, in general, include prevention of intrusions. Nowadays network intrusion detection systems have become a standard component of security infrastructure. This review paper discusses various techniques that are already being used for intrusion detection.
Integrated Feature Extraction Approach Towards Detection of Polymorphic Malwa... CSCJournals
Some malware is sophisticated, using polymorphic techniques such as self-mutation and emulation-based analysis evasion. Most anti-malware techniques are overwhelmed by polymorphic malware threats that self-mutate into different variants at every attack. This research aims to contribute to the detection of malicious code, especially polymorphic malware, by utilizing advanced static and advanced dynamic analyses to extract more informative key features of a malware sample through code analysis, memory analysis and behavioral analysis. A correlation-based feature selection algorithm will be used to transform features, i.e. to filter and select optimal and relevant features. A machine learning technique called K-Nearest Neighbor (K-NN) will be used for classification and detection of polymorphic malware. Evaluation of results will be based on the following metrics: True Positive Rate (TPR), False Positive Rate (FPR) and the overall detection accuracy of experiments.
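The K-NN classification and the TPR/FPR metrics named in the abstract above can be sketched in a few lines of standard-library Python. This is a generic illustration under stated assumptions, not the authors' pipeline: the two-dimensional feature vectors, the labels, and `k=3` are hypothetical, and `tpr_fpr` assumes the test set contains at least one positive and one negative sample.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to `query` (Euclidean)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def tpr_fpr(y_true, y_pred, positive="malware"):
    """True positive rate TP/(TP+FN) and false positive rate FP/(FP+TN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp / (tp + fn), fp / (fp + tn)

# Hypothetical, already-scaled feature vectors (e.g. two selected features per sample).
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.3), (0.9, 0.8), (0.8, 0.9), (0.85, 0.7)]
train_y = ["benign"] * 3 + ["malware"] * 3

test_x = [(0.2, 0.2), (0.8, 0.8), (0.3, 0.25), (0.7, 0.9)]
test_y = ["benign", "malware", "benign", "malware"]

preds = [knn_predict(train_x, train_y, q) for q in test_x]
tpr, fpr = tpr_fpr(test_y, preds)
print(preds, tpr, fpr)
```

Reporting TPR and FPR together matters because overall accuracy alone can look high on imbalanced malware datasets even when many malicious samples are missed.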
Adversarial Attacks and Defenses in Malware Classification: A Survey CSCJournals
As malware continues to grow more sophisticated and more plentiful, traditional signature and heuristics-based defenses no longer cut it. Instead, the industry has recently turned to using machine learning for malicious file detection. The challenge with this approach is that machine learning itself comes with vulnerabilities and, if left unattended, presents a new attack surface for attackers to exploit.
In this paper we present a survey of research in the area of machine learning-based malware classifiers, the attacks they encounter, and the defensive measures available. We start by reviewing recent advances in malware classification, including the most important works using deep learning. We then discuss in detail the field of adversarial machine learning and conduct an exhaustive review of adversarial attacks and defenses in the field of malware classification.
Survey of network anomaly detection using markov chain ijcseit
Internet threats have increased recently. Our aim is to detect intrusions in the network concisely. Real-time issues such as DoS attacks on banks, companies, industries and organizations have increased significantly, and IDS have been used on both the server and host side. The major challenge is to effectively predict periods of threat and protect the server from unauthorized users. In this study, a novel probabilistic approach is proposed to detect network intrusions effectively. It uses a Markov chain for probabilistic modelling of abnormal events in network systems. The degree of abnormality of incoming data is assessed on the basis of the network states.
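One way to picture Markov-chain-based abnormality scoring is to estimate transition probabilities from normal state sequences and then score new sequences by their average negative log-likelihood. The sketch below is a generic illustration, not the method of the surveyed work: the state names, the training sequences, and the add-alpha smoothing are all assumptions.

```python
import math
from collections import defaultdict

def fit_transitions(sequences, states, alpha=1.0):
    """Estimate P(next | current) from normal sequences with add-alpha smoothing."""
    counts = {s: defaultdict(float) for s in states}
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[cur][nxt] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values()) + alpha * len(states)
        probs[s] = {t: (counts[s].get(t, 0.0) + alpha) / total for t in states}
    return probs

def abnormality(seq, probs):
    """Average negative log-likelihood per transition; higher means more unusual."""
    steps = list(zip(seq, seq[1:]))
    return -sum(math.log(probs[a][b]) for a, b in steps) / len(steps)

# Hypothetical network states and normal traffic sequences.
states = ["idle", "login", "transfer", "logout"]
normal = [
    ["idle", "login", "transfer", "logout", "idle"],
    ["idle", "login", "transfer", "transfer", "logout", "idle"],
]
probs = fit_transitions(normal, states)

typical = ["idle", "login", "transfer", "logout"]
odd = ["login", "login", "login", "idle"]  # repeated logins never seen in training
print(abnormality(typical, probs), abnormality(odd, probs))
```

Smoothing keeps unseen transitions from having zero probability, so an unfamiliar sequence gets a high but finite score; a threshold on that score would then decide when to raise an alert.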
Titles with Abstracts_2023-2024_Cyber Security.pdf info751436
Implementing a cybersecurity project can offer numerous advantages for organizations in today's digitally connected world. Here are some key benefits:
Protection against Cyber Threats: The primary goal of cybersecurity projects is to safeguard an organization's digital assets from various cyber threats such as malware, ransomware, phishing attacks, and more. This protection is crucial for maintaining the integrity, confidentiality, and availability of sensitive information.
Data Privacy Compliance: Many industries have specific regulations and compliance requirements regarding the protection of sensitive data. Implementing cybersecurity measures helps organizations adhere to these regulations, avoiding legal consequences and potential financial penalties.
Business Continuity: Cybersecurity projects often include strategies for disaster recovery and business continuity planning. In the event of a cyberattack or data breach, having a robust cybersecurity infrastructure in place can minimize downtime and ensure that critical business operations continue without significant disruption.
Risk Management: Cybersecurity projects help organizations identify, assess, and manage potential risks associated with their digital assets. This proactive approach allows businesses to make informed decisions about risk mitigation and prioritize resources effectively.
Customer Trust and Reputation: A strong cybersecurity posture can enhance customer trust and protect the reputation of an organization. Customers are more likely to engage with businesses that prioritize the security of their information, leading to increased brand loyalty.
Intellectual Property Protection: For many organizations, intellectual property (IP) is a valuable asset. Cybersecurity measures help protect intellectual property from theft or unauthorized access, ensuring that companies can maintain a competitive edge in the market.
Employee Awareness and Training: Cybersecurity projects often include employee training programs to raise awareness about cybersecurity threats and best practices. Well-informed employees are a crucial line of defense against social engineering attacks and other cyber threats.
Cost Savings: While implementing cybersecurity measures involves an initial investment, it can result in long-term cost savings. The financial impact of a data breach or cyberattack, including potential legal fees, reputation damage, and loss of business, can far exceed the cost of preventive cybersecurity measures.
Cyber Insurance Benefits: Having a robust cybersecurity infrastructure in place may make an organization more eligible for favorable terms and rates on cyber insurance policies, providing an additional layer of financial protection.
Adaptability to Emerging Threats: Cybersecurity projects are dynamic and adaptive, allowing organizations to stay ahead of evolving cyber threats.
DETECTION OF ATTACKS IN WIRELESS NETWORKS USING DATA MINING TECHNIQUES IAEME Publication
With the progressive increase in network applications and electronic devices (computers, mobile phones, Android devices, etc.), attack and intrusion detection is becoming a very challenging task in the area of cybercrime detection. In this context, most existing approaches to attack detection rely mainly on a finite set of attacks. However, these solutions are vulnerable: they fail to detect some attacks when the sources of information are ambiguous or imperfect. Few approaches have started investigating in this direction. Following this trend, this paper investigates the role of machine learning approaches (ANN, SVM) in classifying TCP connection traffic as normal or suspicious. However, using ANN or SVM individually is expensive. This paper proposes combining two classifiers: an artificial neural network (ANN) and a support vector machine (SVM). Additionally, the proposed solution allows the classification results to be visualized. The accuracy of the proposed solution has been compared with the results of other classifiers. Experiments have been conducted with different network connections selected from the NSL-KDD DARPA dataset. Empirical results show that combining ANN and SVM techniques for attack detection is a promising direction.
Malware Risk Analysis on the Campus Network with Bayesian Belief Network IJNSA Journal
A security network management system provides clear guidelines on risk evaluation and assessment for enterprise networks. Threat and risk assessment is conducted to safeguard enterprise network services and to maintain system confidentiality, integrity, and availability through effective control strategies. In this paper, building on our previous work on integrated information security management and on malware propagation in the campus network through mathematical modelling, we propose a Bayesian Belief Network with an inference level indicator that enables the decision maker to understand the risks posed and make appropriate mitigation decisions. We experimentally placed monitoring sensors on the campus network that give the threat alert priority levels and magnitude for the vulnerable information assets. These methods give a direction on the belief inferred from malware prevalence on the information security assets for better understanding.
Supervised Machine Learning Algorithms for Intrusion Detection.pptx ssuserf3a100
Intrusion detection systems that use supervised machine learning algorithms are considered among the most important tools in the field of information security. These systems analyze data and detect illegal activities and intrusions into networks and systems. They rely on machine learning techniques to classify data as either normal activity or an intrusion. They include training and testing phases, in which the algorithms are trained on a set of pre-labeled data to learn the normal pattern of the data and to distinguish between normal activities and intrusions. Many supervised machine learning algorithms are available for intrusion detection systems, such as Gaussian Naive Bayes, Decision Tree, Random Forest, Support Vector Machine, and Logistic Regression.
Security breaches and electronic intrusions targeting networks are among the biggest problems facing organizations today. To address this problem, intrusion detection systems (IDS) and their tools can be used to detect and prevent these threats. This file provides an introductory overview of the problem.
Security breaches
networks
Infiltration (IDS)
Algorithms
Machine learning
System call frequency analysis-based generative adversarial network model for... IJECEIAES
In today's digital age, mobile applications have become essential in connecting people from diverse domains. They play a crucial role in enabling communication, facilitating business transactions, and providing access to a range of services. Mobile communication is widespread due to its portability and ease of use, with the number of mobile devices projected to reach 18.22 billion by the end of 2025. However, this convenience comes at a cost, as cybercriminals are constantly looking for ways to exploit security vulnerabilities in mobile applications. Among the several varieties of malicious applications, zero-day malware is particularly dangerous since it cannot be removed by antivirus software. To detect zero-day Android malware, this paper introduces a novel approach based on generative adversarial networks (GANs), which generates new frequencies of feature vectors from system calls. In the proposed approach, the generator is fed with a mixture of real samples and noise and then trained to create new samples, while the discriminator model aims to classify these samples as either real or fake. We assess the performance of our model through different measures, including loss functions, the Frechet Inception distance, and the inception score evaluation metrics.
A BAYESIAN CLASSIFICATION ON ASSET VULNERABILITY FOR REAL TIME REDUCTION OF F...IJNSA Journal
IT assets connected on internetwill encounter alien protocols and few parameters of protocol process are exposed as vulnerabilities. Intrusion Detection Systems (IDS) are installed to alerton suspicious traffic or activity. IDS issuesfalse positives alerts, if any behavior construe for partial attack pattern or the IDS lacks environment knowledge. Continuous monitoring of alerts to evolve whether, an alert is false positive or not is a major concern. In this paper we present design of an external module to IDS,to identify false positive alertsbased on anomaly based adaptive learning model. The novel feature of this design is that the system updates behavior profile of assets and environment with adaptive learning process.A mixture model is used for behavior modeling from reference data. The design of the detection and learning process are based on normal behavior and of environment. The anomaly alert identification algorithm isbuiltonSparse Markov Transducers (SMT) based probability.The total process is presented using real-time data. The Experimental results are validated and presentedwith reference to lab environment.
Network Threat Characterization in Multiple Intrusion Perspectives using Data...IJNSA Journal
For effective security incident response on the network, a reputable approach must be in place in both the protected and unprotected regions of the network. This is because a compromise in the demilitarized zone could be a precursor to a threat inside the network. The increased complexity of attacks in present times and the vulnerability of systems are the motivations for this work. Past and present approaches to intrusion detection and prevention have neglected victim and attacker properties, even though an intrusion requires both an overt act by an attacker and a resulting manifestation observable by the intended victim. Therefore, this paper presents a threat characterization model for attacks from the victim and attacker perspectives of intrusion using a data mining technique. The data mining technique combines Frequent Temporal Sequence Association Mining and Fuzzy Logic. The Apriori Association Mining algorithm was used to mine temporal rule patterns from alert sequences, while a Fuzzy Control System was used to rate exploits. The results of the experiment show that accurate threat characterization in multiple intrusion perspectives can be achieved using Fuzzy Association Mining. The results also show that sequences of exploits can be used to rate threats and are motivated by victim properties and attacker objectives.
A Study and Comparative analysis of Conditional Random Fields for Intrusion d...IJORCS
Intrusion detection systems are an important component of defensive measures protecting computer systems and networks from abuse. Intrusion detection plays one of the key roles in computer security techniques and is one of the prime areas of research. Due to complex and dynamic nature of computer networks and hacking techniques, detecting malicious activities remains a challenging task for security experts, that is, currently available defense systems suffer from low detection capability and high number of false alarms. An intrusion detection system must reliably detect malicious activities in a network and must perform efficiently to cope with the large amount of network traffic. In this paper we study the Machine Learning and data mining techniques to solve Intrusion Detection problems within computer networks and compare the various approaches with conditional random fields and address these two issues of Accuracy and Efficiency using Conditional Random Fields and Layered Approach.
Intrusion detection systems (IDS) are an effective approach for tackling
network security problems. These systems play a key role in network
security because they can detect different types of attacks in networks,
including DoS, U2R, Probe and R2L. In addition, IDS are an increasingly
key part of a system's defense. Various approaches to IDS are now in use,
but they are unfortunately relatively ineffective. Data mining techniques
and artificial intelligence play an important role in security services.
In this paper we present a comparative study of three well-known
intelligent algorithms: Radial Basis Functions (RBF), Multilayer
Perceptrons (MLP) and Support Vector Machine (SVM). The main aim of this
work is to benchmark the performance of these three algorithms using a
dataset of about 9,000 connections randomly chosen from the KDD'99 10%
dataset. In addition, we investigate the algorithms' performance in terms
of their attack classification accuracy. The simulation results are
analyzed and discussed. We observe that SVM with a linear kernel
(Linear-SVM) performs better than MLP and RBF in terms of detection
accuracy and processing speed.
user centric machine learning framework for cyber security operations centerVenkat Projects
To ensure a company's cyber security, a SIEM (Security Information and Event Management) system is typically in place to consolidate events from the various preventive technologies and flag alerts for security events. Analysts in the security operations center (SOC) investigate the alerts to determine whether they are genuine. However, the majority of alerts are false positives, and their number exceeds the SOC's capacity to handle them all. Because of this, malicious attacks and compromised hosts may be missed. Machine learning is a possible approach to reducing the false positive rate and improving the productivity of SOC analysts. In this article, we create a user-centric machine learning framework for the cyber security operations center in a real organizational context. We discuss the typical data sources in a SOC, their workflow, and how to process this data to create an effective machine learning system. This article is aimed at two groups of readers. The first group is data scientists and machine learning researchers who have no knowledge of the cyber security field but want to develop machine learning systems for security operations. The second group is cyber security practitioners who have deep knowledge and expertise in cyber security but no machine learning experience and would like to build such a system themselves. Throughout the paper, we use the account as an example to demonstrate the full steps from data collection, label creation, feature engineering and machine learning algorithm selection to model performance evaluation, using the system built in the SOC production environment of Seyondike.
INTRUSION DETECTION USING FEATURE SELECTION AND MACHINE LEARNING ALGORITHM WI...ijcsit
Intrusion detection over the network is a critical issue in preventing illegitimate use by intruders. An
intruder may enter a network, system or server by injecting malicious packets in order to steal, sniff,
manipulate or corrupt useful and secret information. This process is referred to as intrusion, whereas the
transmission of packets by an intruder over the network for the purpose of intrusion is referred to as an
attack. With expanding networking technology, millions of servers communicate with each other, and this
expansion continues every day. As a result, more and more intruders are attracted, so a smart intrusion
detection model is a primary requirement.
The essential features of the NSL-KDD data set are identified by analyzing feature selection methods. A
hybrid algorithm is then built using the selected features, a machine learning approach and an analysis of
the basic network features in the data set. Finally, a model containing the rules for the network features
is produced from the algorithm.
A hybrid misuse intrusion detection model is built to find attacks on the system and improve intrusion
detection. Based on prior features, intrusions on the system can be detected without any previous learning.
This model combines the advantages of feature selection and machine learning techniques with misuse
detection.
Review of Intrusion and Anomaly Detection Techniques IJMER
Intrusion detection is the act of detecting actions that attempt to compromise the
confidentiality, integrity or availability of a resource. With the tremendous growth of network-based
services and sensitive information on networks, network security is becoming more important
than ever. Intrusion poses a serious security threat in a large network environment. The increasing use of
the internet has dramatically added to the growing number of threats that exist within it. Intrusion
detection does not, in general, include prevention of intrusions. Nowadays, network intrusion detection
systems have become a standard component of security infrastructure. This review paper
discusses various techniques that are already being used for intrusion detection.
Integrated Feature Extraction Approach Towards Detection of Polymorphic Malwa...CSCJournals
Some malware is made more sophisticated by polymorphic techniques such as self-mutation and emulation-based analysis evasion. Most anti-malware techniques are overwhelmed by polymorphic malware threats that self-mutate into different variants at every attack. This research aims to contribute to the detection of malicious code, especially polymorphic malware, by utilizing advanced static and advanced dynamic analyses to extract more informative key features of malware through code analysis, memory analysis and behavioral analysis. A correlation-based feature selection algorithm will be used to transform features, i.e. to filter and select optimal and relevant features. A machine learning technique called K-Nearest Neighbor (K-NN) will be used for classification and detection of polymorphic malware. Evaluation of results will be based on the following metrics: True Positive Rate (TPR), False Positive Rate (FPR) and the overall detection accuracy of experiments.
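The three evaluation metrics named in this abstract can be computed from a confusion matrix. The following is a minimal sketch, assuming binary labels where 1 means malware and 0 means benign; the function name `detection_metrics` and the sample labels are illustrative and do not come from the paper.

```python
def detection_metrics(y_true, y_pred, positive=1):
    """Return TPR, FPR and accuracy for a binary detection task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # malware correctly flagged
            else:
                fp += 1  # benign sample wrongly flagged
        else:
            if truth == positive:
                fn += 1  # malware missed
            else:
                tn += 1  # benign sample correctly passed
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return {"tpr": tpr, "fpr": fpr, "accuracy": accuracy}


# Hypothetical labels: 1 = malware, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

TPR measures how much of the malware is caught, while FPR measures how often benign files are flagged, so the two are usually reported together rather than relying on accuracy alone.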
Adversarial Attacks and Defenses in Malware Classification: A SurveyCSCJournals
As malware continues to grow more sophisticated and more plentiful - traditional signature and heuristics-based defenses no longer cut it. Instead, the industry has recently turned to using machine learning for malicious file detection. The challenge with this approach is that machine learning itself comes with vulnerabilities - and if left unattended presents a new attack surface for attackers to exploit.
In this paper we present a survey of research in the area of machine learning-based malware classifiers, the attacks they encounter, and the defensive measures available. We start by reviewing recent advances in malware classification, including the most important works using deep learning. We then discuss in detail the field of adversarial machine learning and conduct an exhaustive review of adversarial attacks and defenses in the field of malware classification.
Survey of network anomaly detection using markov chainijcseit
Internet threats have increased recently. Our aim is to detect intrusions in the network concisely.
Real-time issues such as DoS attacks on banks, companies, industries and organizations have increased
significantly, and IDS are used on both the server and host side. The major challenge is to effectively
predict periods of threat and protect the server from unauthorized users. In this study, a novel
probabilistic approach is proposed to detect network intrusions effectively. It uses a Markov chain for
probabilistic modelling of abnormal events in network systems. The degree of abnormality of the incoming
data is evaluated on the basis of the network states.
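One common way to turn a Markov chain into an abnormality score is sketched below. This is not the authors' method: it assumes that transition probabilities are estimated from sequences of normal network states and that the score is the average negative log-probability of the observed transitions. The state names, the smoothing constant and the function names are hypothetical.

```python
import math


def fit_transitions(sequences, states, smoothing=1.0):
    """Estimate a transition matrix from state sequences (additive smoothing)."""
    counts = {s: {t: smoothing for t in states} for s in states}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    probs = {}
    for s in states:
        total = sum(counts[s].values())
        probs[s] = {t: counts[s][t] / total for t in states}
    return probs


def abnormality(seq, probs):
    """Average negative log-probability per transition; higher is more unusual."""
    steps = list(zip(seq, seq[1:]))
    if not steps:
        return 0.0
    return -sum(math.log(probs[a][b]) for a, b in steps) / len(steps)


# Hypothetical network states observed during normal operation.
states = ["idle", "connect", "transfer"]
normal = [
    ["idle", "connect", "transfer", "idle"],
    ["idle", "connect", "transfer", "transfer", "idle"],
]
probs = fit_transitions(normal, states)

typical = abnormality(["idle", "connect", "transfer", "idle"], probs)
odd = abnormality(["connect", "connect", "connect", "connect"], probs)
print(typical, odd)
```

A sequence of repeated connection attempts scores higher than a sequence resembling the training data, which is the kind of signal a threshold-based detector could act on.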
Titles with Abstracts_2023-2024_Cyber Security.pdfinfo751436
Implementing a cybersecurity project can offer numerous advantages for organizations in today's digitally connected world. Here are some key benefits:
Protection against Cyber Threats: The primary goal of cybersecurity projects is to safeguard an organization's digital assets from various cyber threats such as malware, ransomware, phishing attacks, and more. This protection is crucial for maintaining the integrity, confidentiality, and availability of sensitive information.
Data Privacy Compliance: Many industries have specific regulations and compliance requirements regarding the protection of sensitive data. Implementing cybersecurity measures helps organizations adhere to these regulations, avoiding legal consequences and potential financial penalties.
Business Continuity: Cybersecurity projects often include strategies for disaster recovery and business continuity planning. In the event of a cyberattack or data breach, having a robust cybersecurity infrastructure in place can minimize downtime and ensure that critical business operations continue without significant disruption.
Risk Management: Cybersecurity projects help organizations identify, assess, and manage potential risks associated with their digital assets. This proactive approach allows businesses to make informed decisions about risk mitigation and prioritize resources effectively.
Customer Trust and Reputation: A strong cybersecurity posture can enhance customer trust and protect the reputation of an organization. Customers are more likely to engage with businesses that prioritize the security of their information, leading to increased brand loyalty.
Intellectual Property Protection: For many organizations, intellectual property (IP) is a valuable asset. Cybersecurity measures help protect intellectual property from theft or unauthorized access, ensuring that companies can maintain a competitive edge in the market.
Employee Awareness and Training: Cybersecurity projects often include employee training programs to raise awareness about cybersecurity threats and best practices. Well-informed employees are a crucial line of defense against social engineering attacks and other cyber threats.
Cost Savings: While implementing cybersecurity measures involves an initial investment, it can result in long-term cost savings. The financial impact of a data breach or cyberattack, including potential legal fees, reputation damage, and loss of business, can far exceed the cost of preventive cybersecurity measures.
Cyber Insurance Benefits: Having a robust cybersecurity infrastructure in place may make an organization more eligible for favorable terms and rates on cyber insurance policies, providing an additional layer of financial protection.
Adaptability to Emerging Threats: Cybersecurity projects are dynamic and adaptive, allowing organizations to stay ahead of evolving cyber threats.
DETECTION OF ATTACKS IN WIRELESS NETWORKS USING DATA MINING TECHNIQUESIAEME Publication
With the progressive increase in network applications and electronic devices (computers, mobile phones, Android devices, etc.), attack and intrusion detection is becoming a very challenging task in the area of cybercrime detection. In this context, most existing approaches to attack detection rely mainly on a finite set of attacks. However, these solutions are vulnerable: they fail to detect some attacks when the sources of information are ambiguous or imperfect. Few approaches have started investigating this direction. Following this trend, this paper investigates the role of machine learning approaches (ANN, SVM) in classifying TCP connection traffic as normal or suspicious. However, using ANN or SVM individually is an expensive technique. In this paper, we propose combining two classifiers, an artificial neural network (ANN) classifier and a support vector machine (SVM). Additionally, our proposed solution allows the obtained classification results to be visualized. The accuracy of the proposed solution has been compared with the results of other classifiers. Experiments have been conducted with different network connections selected from the NSL-KDD DARPA dataset. Empirical results show that combining ANN and SVM techniques for attack detection is a promising direction.
Malware Risk Analysis on the Campus Network with Bayesian Belief NetworkIJNSA Journal
A security network management system provides clear guidelines on risk evaluation and assessment for enterprise networks. Threat and risk assessment is conducted to safeguard enterprise network services and to maintain system confidentiality, integrity, and availability through effective control strategies. In this paper, building on our previous work analyzing integrated information security management and malware propagation on the campus network through mathematical modelling, we propose a Bayesian Belief Network with an inference level indicator that enables the decision maker to understand the risks posed and make appropriate mitigation decisions. We experimentally placed monitoring sensors on the campus network that give the threat alert priority levels and magnitude for the vulnerable information assets. These methods indicate the belief inferred from malware prevalence on the information security assets, for better understanding.
Supervised Machine Learning Algorithms for Intrusion Detection.pptxssuserf3a100
Intrusion detection systems using supervised machine learning algorithms are considered one of the most important tools in the field of information security. These systems analyze data and detect illegal activities and intrusions into networks and systems. They rely on machine learning techniques to classify data as either normal activity or an intrusion. They include training and testing phases, in which the algorithms are trained on a set of pre-labeled data to learn the normal pattern of the data and distinguish between normal activities and intrusions. Many supervised machine learning algorithms are available for intrusion detection systems, such as Gaussian Naive Bayes, Decision Tree, Random Forest, Support Vector Machine, and Logistic Regression.
The problem of security and electronic breaches targeting networks is one of the biggest problems facing organizations today. To solve this problem, intrusion detection systems (IDS) and their tools can be used to detect and prevent these threats. This file provides an introductory overview of this problem.
Security breaches
networks
Infiltration (IDS)
Algorithms
Machine learning
A Lightweight Method for Detecting Cyber Attacks in High-traffic Large Networ...IJCNCJournal
Protecting information systems is a difficult and long-term task. The size and traffic intensity of computer networks are diverse and no one protection solution is universal for all cases. A certain solution protects well in the campus network, but it is unlikely to protect well in the service provider's network. A key component of a cyber defence system is a network attack detector. This component needs to be designed to have a good way to scale detection capabilities with network size and traffic intensity beyond the size and intensity of a campus network. From this point of view, this paper aims to build a network attack detection method suitable for the scale of large and high-traffic networks based on machine learning models using clustering techniques and our proposed detection technique. The detection technique is different from outlier detection commonly used in clustering-based anomaly detection applications. The method was evaluated in cases using different feature extraction methods and different clustering algorithms. Experimental results on the NSL-KDD data set are positive with a detection accuracy of over 97%.
HYBRID ARCHITECTURE FOR DISTRIBUTED INTRUSION DETECTION SYSTEM IN WIRELESS NE...IJNSA Journal
Owing to the rapid growth of network applications, new kinds of network attacks are emerging
endlessly. It is therefore critical to protect networks from attackers, and intrusion detection
technology has become popular. This security concern must be articulated
right from the beginning of network design and deployment. Intrusion detection is the
process of identifying network activity that can lead to a compromise of security policy. A lot of work has
been done on the detection of intruders, but the solutions are not satisfactory. In this paper, we propose a
novel Distributed Intrusion Detection System using multiple agents in order to decrease false alarms and
manage misuse and anomaly detection.
Hyperparameters optimization XGBoost for network intrusion detection using CS...IAESIJAI
With the introduction of high-speed internet access, the demand for secure and dependable networks has grown. In recent years, network attacks have become more complex and intense, making security a vital component of organizational information systems. Network intrusion detection systems (NIDS) have become an essential detection technology for protecting data integrity and system availability against such attacks. NIDS is one of the best-known applications of machine learning in the security field, with machine learning algorithms constantly being developed to improve performance. This research focuses on detecting network intrusion anomalies using the hyperparameters optimization XGBoost (HO-XGB) algorithm with the Communications Security Establishment-The Canadian Institute for Cybersecurity-Intrusion Detection System2018 (CSE-CICIDS2018) dataset to get the best possible results. HO-XGB outperforms typical machine learning methods published in the literature, and the study shows that XGBoost outperforms other detection algorithms. We refined the HO-XGB model's hyperparameters, which included learning_rate, subsample, max_leaves, max_depth, gamma, colsample_bytree, min_child_weight, n_estimators, and reg_alpha. The experimental findings reveal that HO-XGB1 outperforms multiple parameter settings for intrusion detection, effectively optimizing XGBoost's hyperparameters.
Evaluation of network intrusion detection using markov chainIJCI JOURNAL
In day-to-day life, internet threats have increased significantly. There is a need to develop a model to
maintain the security of systems. One of the most effective techniques is the Intrusion Detection System
(IDS). The purpose of an intrusion detection system is to detect intrusions through security devices and
deal with them. In this paper, a mathematical approach is used to effectively predict and detect intrusion
in the network. We discuss a method combining two algorithms, 'K-Means + Apriori', which classifies normal
and abnormal activities in a computer network. The K-Means step partitions the training set into K clusters
using Euclidean distance and introduces an outlier factor; the Apriori algorithm is then applied to prune
the data by removing infrequent data from the database. Based on the defined states, the degree of
abnormality of incoming data is evaluated through an experiment using a sample of the DARPA2000 dataset,
and the method achieves high detection performance for the stages of an attack.
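The Apriori pruning step mentioned in this abstract can be illustrated with a short sketch. It assumes each record is a set of alert labels and uses a hypothetical minimum support count; the alert names and function names are invented for illustration, and the K-Means stage and outlier factor described by the authors are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, size):
    """Return itemsets of the given size whose support count is >= min_support."""
    counts = Counter()
    for t in transactions:
        for combo in combinations(sorted(set(t)), size):
            counts[combo] += 1
    return {items: n for items, n in counts.items() if n >= min_support}


def apriori_pairs(transactions, min_support):
    """One Apriori round: drop infrequent items, then count surviving pairs."""
    singles = frequent_itemsets(transactions, min_support, 1)
    keep = {item for (item,) in singles}
    # Prune: an infrequent item cannot be part of any frequent pair.
    pruned = [[i for i in t if i in keep] for t in transactions]
    return singles, frequent_itemsets(pruned, min_support, 2)


# Hypothetical alert records; labels are illustrative only.
alerts = [
    ["scan", "syn_flood", "login_fail"],
    ["scan", "syn_flood"],
    ["scan", "login_fail"],
    ["scan", "syn_flood", "dns_query"],
]
singles, pairs = apriori_pairs(alerts, min_support=2)
print(singles)
print(pairs)
```

Here `dns_query` appears only once, so it is removed before pairs are counted, which is the pruning of infrequent data that keeps the candidate set small.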
DDOS ATTACK DETECTION ON INTERNET OF THINGS USING UNSUPERVISED ALGORITHMSijfls
The increase in the deployment of IoT networks has improved the productivity of humans and organisations.
However, IoT networks are increasingly becoming platforms for launching DDoS attacks due to the inherently
weaker security and resource-constrained nature of IoT devices. This paper focusses on detecting DDoS
attacks in IoT networks by classifying incoming network packets on the transport layer as either
"Suspicious" or "Benign" using unsupervised machine learning algorithms. In this work, two deep
learning algorithms and two clustering algorithms were independently trained for mitigating DDoS
attacks. We lay emphasis on exploitation-based DDoS attacks, which include TCP SYN-Flood attacks and
UDP-Lag attacks. We use the Mirai, BASHLITE and CICDDoS2019 datasets to train the algorithms during
the experimentation phase. The accuracy score and normalized-mutual-information score are used to
quantify the classification performance of the four algorithms. Our results show that the autoencoder
performed best overall, with the highest accuracy across all the datasets.
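The normalized-mutual-information score used in this abstract compares cluster assignments with ground-truth labels without requiring the cluster ids to match the label names. The sketch below is one way to compute it, assuming normalization by the arithmetic mean of the two entropies (a common convention, but the abstract does not state which one the authors used); the labels are made up.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(truth, clusters):
    """NMI = I(truth; clusters) / mean(H(truth), H(clusters))."""
    n = len(truth)
    joint = Counter(zip(truth, clusters))
    truth_counts = Counter(truth)
    cluster_counts = Counter(clusters)
    mi = 0.0
    for (t, c), k in joint.items():
        mi += (k / n) * math.log(k * n / (truth_counts[t] * cluster_counts[c]))
    denom = (entropy(truth) + entropy(clusters)) / 2
    # Assumption: if both labelings have a single class, treat them as identical.
    return mi / denom if denom > 0 else 1.0


truth = ["benign", "benign", "suspicious", "suspicious"]
perfect = normalized_mutual_info(truth, [0, 0, 1, 1])
unrelated = normalized_mutual_info(truth, [0, 1, 0, 1])
print(perfect, unrelated)
```

A clustering that separates the two classes perfectly scores 1 regardless of which cluster id it assigns to which class, while a clustering independent of the labels scores 0.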
FORTIFICATION OF HYBRID INTRUSION DETECTION SYSTEM USING VARIANTS OF NEURAL ...IJNSA Journal
Intrusion Detection Systems (IDS) form a key part of system defence by identifying abnormal
activities happening in a computer system. In recent years, different soft computing based techniques
have been proposed for the development of IDS. However, intrusion detection is not yet a perfect
technology, which has given data mining an opportunity to make important contributions to the field.
In this paper we propose a new hybrid technique that uses data mining techniques such as fuzzy C-means
clustering, fuzzy neural network (neuro-fuzzy) and radial basis function (RBF) SVM to fortify the
intrusion detection system. The proposed technique has five major steps. The first step is relevance
analysis, after which the input data is clustered using fuzzy C-means clustering. Next, neuro-fuzzy
classifiers are trained, such that each data point is trained with the neuro-fuzzy classifier
associated with its cluster. Subsequently, a vector for SVM classification is formed, and in the last
step, classification using RBF-SVM is performed to detect whether an intrusion has happened. The data
set used is the KDD Cup 1999 dataset, and we use precision, recall, F-measure and accuracy as the
evaluation metrics. Our technique achieved better accuracy for all types of intrusions. The results
of the proposed technique are compared with other existing techniques, and these comparisons
demonstrate the effectiveness of our technique.
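The fuzzy C-means clustering step differs from hard clustering in that each data point receives a degree of membership in every cluster. The sketch below shows the standard membership update for a single point, assuming Euclidean distance and a fuzzifier m = 2; the cluster centers and the function name are hypothetical, and the center-update iteration and the later neuro-fuzzy and SVM stages are not shown.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy C-means membership of one point in each cluster (values sum to 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with a center belongs fully to that center.
    if any(d == 0 for d in dists):
        hits = [1.0 if d == 0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [1.0 / sum((di / dk) ** power for dk in dists) for di in dists]


# Hypothetical 2-D cluster centers.
centers = [(1.0, 0.0), (3.0, 0.0)]
u = fcm_memberships((0.0, 0.0), centers)
print(u)
```

The point lies three times closer to the first center, so with m = 2 it receives a membership of about 0.9 in the first cluster and about 0.1 in the second.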
Machine learning-based intrusion detection system for detecting web attacksIAESIJAI
The increasing use of smart devices results in a huge amount of data, which raises concerns about personal data, including health data and financial data. This data circulates on the network and can encounter network traffic at any time. This traffic can either be normal traffic or an intrusion created by hackers with the aim of injecting abnormal traffic into the network. Firewalls and traditional intrusion detection systems detect attacks based on signature patterns. However, this is not sufficient to detect advanced or unknown attacks. To detect different types of unknown attacks, the use of intelligent techniques is essential. In this paper, we analyse some machine learning techniques proposed in recent years. In this study, several classifications were made to detect anomalous behaviour in network traffic. The models were built and evaluated based on the Canadian Institute for Cybersecurity-intrusion detection systems dataset released in 2017 (CIC-IDS-2017), which includes both current and historical attacks. The experiments were conducted using decision tree, random forest, logistic regression, gaussian naïve bayes, adaptive boosting, and their ensemble approach. The models were evaluated using various evaluation metrics such as accuracy, precision, recall, F1-score, false positive rate, receiver operating characteristic curve, and calibration curve.
CLASSIFICATION PROCEDURES FOR INTRUSION DETECTION BASED ON KDD CUP 99 DATA SETIJNSA Journal
In network security framework, intrusion detection is one of a benchmark part and is a fundamental way to protect PC from many threads. The huge issue in intrusion detection is presented as a huge number of false alerts; this issue motivates several experts to discover the solution for minifying false alerts according to data mining that is a consideration as analysis procedure utilized in a large data e.g. KDD CUP 99. This paper presented various data mining classification for handling false alerts in intrusion detection as reviewed. According to the result of testing many procedure of data mining on KDD CUP 99 that is no individual procedure can reveal all attack class, with high accuracy and without false alerts. The best accuracy in Multilayer Perceptron is 92%; however, the best Training Time in Rule based model is 4 seconds . It is concluded that ,various procedures should be utilized to handle several of network attacks.
Classification of Malware Attacks Using Machine Learning In Decision Tree
1. Abel Yeboah-Ofori
International Journal of Security (IJS), Volume (11) : Issue (2) : 2020 10
Classification of Malware Attacks Using Machine Learning
In Decision Tree
Abel Yeboah-Ofori u0118547@uel.ac.uk
School of Architecture, Computing & Engineering yeboah007@hotmail.com
University of East London.
London, E16 2GA, UK
Abstract
Predicting cyberattacks using machine learning has become imperative since cyberattacks have
increased exponentially due to the stealthy and sophisticated nature of adversaries. To have
situational awareness and achieve defence in depth, using machine learning for threat prediction
has become a prerequisite for cyber threat intelligence gathering. Some approaches to mitigating
malware attacks include the use of spam filters, firewalls, and IDS/IPS configurations to detect
attacks. However, threat actors are deploying adversarial machine learning techniques to exploit
vulnerabilities. This paper explores the viability of using machine learning methods to predict
malware attacks and build a classifier to automatically detect and label an event as “Has
Detection” or “No Detection”. The purpose is to predict the probability of malware penetration and
the extent of manipulation on the network nodes for cyber threat intelligence. To demonstrate the
applicability of our work, we use a decision tree (DT) algorithm to learn from the dataset for evaluation.
The dataset was obtained from the Microsoft Malware Prediction challenge on Kaggle. We identify probable
cyberattacks on the smart grid and use attack scenarios to determine penetrations and manipulations.
The results show that ML methods can be applied in a smart grid cyber supply chain environment
to detect cyberattacks and predict future trends.
Keywords: Cyberattack, Malware, Machine Learning, Smart Grid, Decision Tree.
1. INTRODUCTION
The unpredictable nature of cyberattacks and the cascading effects of cybercrimes on the
business system have made it difficult for organizations to predict endpoint attacks. ML assists in
recognizing attack patterns, using datasets of previous attacks to predict future attack trends and
responses [1]. Endpoints are the third-party vendor systems, workstations, servers, handheld
mobile devices and AMI devices. Malware attacks have been intensified by the distributed nature of the smart
grid in supply chain systems. Adversaries are using cyberattacks such as cross site scripting, cross
site request forgery, session hijacking and remote access trojan attacks to commit cybercrimes
such as modification of software, manipulation of online services, manipulation of electronic
products, diversion of e-products and other security misconfigurations. Ford and Siraj 2015
highlighted different issues in the application of machine learning to cybersecurity, including
phishing detection, network intrusion detection, testing the security properties of protocols and smart energy
consumption profiling [2].
Machine learning techniques are applied in a cybersecurity environment for network
intrusion detection, malicious code detection, estimating the amount of suspicious transactions, electric
power fraud anomaly detection, substation location fraud detection and spam filtering for spear phishing
attacks, as well as for determining the probabilities of attacks. ML could also be used to detect anomalies
in HTTPS requests such as XXE, XSS and SSRF attacks in communication networks, authentication
bypass in password settings and SQL injection in a database system. Soska and Christin 2014
applied ML techniques to automatically detect vulnerable websites before they turn malicious [3].
Canali et al. 2014 applied ML techniques to assess the effectiveness of risk prediction based on
browsing behaviours [4]. Hinks et al. 2015 used ML techniques with various classification algorithms
to learn from a dataset and discriminate between power system disturbances and cyberattacks [1]. Mohasseb
et al. 2019 applied ML techniques to analyze a dataset from various organizations to improve
classification accuracy [5]. These works are important and contribute to detecting and predicting
cyberattacks using machine learning in the cybersecurity domain. However, there is limited
focus on smart grid vulnerability from a supply chain perspective, and specifically on threats relating
to inbound and outbound chain contexts that need adequate detection to improve smart grid
security control and decision making.
In this paper, we use ML techniques to learn from the dataset and build a classifier that automatically
detects and labels an event as Has Detection or No Detection. The rationale for choosing the DT
algorithm is that DT is one of the major supervised schemes for ML in network security. We use
a dataset from Microsoft Malware Prediction [6] for our work. To demonstrate the effectiveness of
our approach, we adopt the decision tree algorithm to evaluate our dataset based on the attack
classifications.
The main contribution of this paper is threefold. Firstly, we identify probable cyberattacks on the
smart grid and the vulnerable spots that could be exploited through penetration and
manipulation, based on the telemetry dataset. Secondly, we use attack scenarios to determine the
penetrations and manipulations for the threat predictions on the endpoint nodes. Finally, we
use ML techniques to learn from the dataset and use the DT algorithm to predict whether the endpoint
nodes can detect a cyberattack or not, labelled as Has Detection or No
Detection. The results show that decision tree methods in ML could be applied in
smart grid supply chain predictive analytics to detect cyberattacks and predict future trends.
The rest of the paper is structured as follows: Section 2 presents an overview of related works in
the domain of machine learning for smart grid supply chain security and the existing classification
algorithms. Section 3 considers our approach to evaluating the ML techniques used to learn from the dataset
and the classification algorithms for the smart grid supply chain, the CPS smart grid infrastructure,
the vulnerable spots and probable attack scenarios. Further, it discusses the data
representation, feature descriptions and extraction as well as the classification algorithm.
Section 4 presents the implementation of the machine learning simulation process and the performance
evaluation of the classifier, determines the average accuracy of the model and predicts the
probability of penetration on the endpoint nodes. Section 5 presents the results and analysis of
the DT that predicts whether or not a cyberattack was initiated and a cybercrime committed. Further, we
provide discussions of the several observations identified in the study. Finally, Section 6 presents
a conclusion of the study, comparisons with existing works, limitations and future works.
2. RELATED WORKS
This section reviews related works and the state of the art in machine learning
predictions for cybersecurity, decision tree classifications and how they relate to malware attacks on the CPS
environment. That includes identifying previous classification approaches and leveraging the
classifications of malware with the specific dataset and prediction task used. Sharmar et al. 2012
proposed an ML technique for detecting variants of known worms in real-time systems [7].
Tsai et al. 2009 presented a review of intrusion detection systems using ML techniques and
various classifiers in the intrusion detection domain [8]. Wang et al. 2014 performed an empirical
study of adversarial attacks against ML models in the context of detecting malicious
crowdsourcing systems [9]. Bilge et al. 2017 proposed RiskTeller, a system that predicts cyber
incidents by analyzing malicious files and infection records according to the endpoint protection
software installed to determine machines that are at risk [10]. Canali et al. 2014 performed a
correlation analysis on the effectiveness of risk prediction based on user browsing behaviour by
leveraging ML techniques to provide a model that can be used to estimate the risk class of a
given user [4]. Barros 2015 posits that decision trees, and induction methods in general, arose
in machine learning to avoid the knowledge acquisition bottleneck for expert systems [11]. Villano 2018
proposed a method for classifying internet logs using ML techniques through a correlation and
normalization process and evaluated a DT algorithm that could predict whether or not an attack occurs [12].
Soska & Christin 2014 proposed a complementary approach to automatically detect vulnerable
websites before they turn malicious by designing, implementing and evaluating a novel classification
system which predicts whether a given website could be compromised in future [3]. Hinks et al.
2014 proposed an ML technique for power system disturbance and cyberattack discrimination by
evaluating various ML methods to find an optimal algorithm whose classifiers accurately predict
disturbance discriminators and implications [1]. Yavanoglu et al. 2017 presented a review of
cybersecurity datasets for ML algorithms, covering the analysis of network traffic and detection of abnormalities
used for experiments and the evaluation methods considered as baseline classifiers for comparisons
[13].
2.1 Decision Trees
A decision tree is used as an ML method for classification and regression on large and complex
data in order to discover patterns. A DT is built as a tree structure that classifies instances of an attack
by testing the attributes of each malware attack from the root at the top down to the leaves. At each branch
the dataset is broken down into subsets that represent a choice of possible values for an attribute,
and each leaf represents a decision. DT is used in supervised ML for classification and
regression [11]. The DT inference process starts at the root and proceeds to the leaves. DT processes
include splitting, pruning and tree selection [14]. Splitting partitions the data into
subsets, pruning reduces the tree by turning some branch nodes into leaf
nodes, and tree selection involves finding the smallest tree that fits the data. Each attribute is
assigned a node, and each leaf holds the probable outcome or state. DT uses inductive inference
to arrive at a conclusion, taking the independent inputs as attributes and predicting the dependent
value. There are several DT algorithms, such as J48, C4.5, C5.0, CART
and others. We used the C5 method to identify which attribute was the root of the tree [15]. Figure 1
shows an example of a DT.
[Figure: a decision tree with nodes labelled Malware Detection, Unknown, Has Detection and No Detection, and annotations marking the root, branch, internal node and leaf nodes.]
FIGURE 1: Decision Tree.
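The root-to-leaf inference described above can be sketched in a few lines of Python. This is an illustration only, not the implementation used in this study: the toy tree, its attribute names (`Firewall`, `IsProtected`) and its branch values are hypothetical, and the `Unknown` fallback for unseen attribute values is an assumption.

```python
# Hypothetical toy tree: each internal node tests one attribute and each
# leaf is a class label. Attribute values and structure are illustrative.
TREE = {
    "attribute": "Firewall",
    "branches": {
        1: {
            "attribute": "IsProtected",
            "branches": {1: "No Detection", 0: "Has Detection"},
        },
        0: "Has Detection",
    },
}


def classify(node, instance, default="Unknown"):
    """Walk from the root to a leaf, following one branch per attribute test."""
    while isinstance(node, dict):
        value = instance.get(node["attribute"])
        if value not in node["branches"]:
            return default  # attribute value not covered by the tree
        node = node["branches"][value]
    return node


print(classify(TREE, {"Firewall": 1, "IsProtected": 0}))  # Has Detection
```

Each loop iteration corresponds to one internal node on the path, so the cost of a prediction is bounded by the depth of the tree.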
2.2 Decision Tree Selection Criteria
A decision tree can use various algorithms for inference to arrive at a conclusion; therefore, a
selection criterion with certain characteristics is required to address the classification
challenges. Challenges originating from the classification of the datasets
could be numerous, for instance, identifying attacks that are initiated through Intelligent
Electronic Devices on smart grid systems or classifying staff salaries based on qualifications,
skills and experience.
2.3 Rationale for Choosing Machine Learning and Decision Tree
Several algorithms have been used in ML, such as Naive Bayes, SVM, Random Forest and
Logistic Regression. However, our rationale for choosing DT algorithms in ML is that they provide
discrete outputs, as the factors are provided by attribute-value pairs for strategic management
decision making. E.g. Results: Pass or Fail. Cyberattack: Internal or External. Temperature: Hot
or Cold. Outcome: Positive or Negative.
The DT algorithm can identify attribute pairs that were not considered initially in the
classification, such as the source of attack, but can work without those attributes to
minimize inferred errors.
The DT algorithm can handle datasets that have errors in the attribute values and resolve
classification errors in the training and test phases, such as a false positive (FP) output
when network traffic is normal or a false negative (FN) when the network is under
cyberattack. The discrete probability outputs provide results that predict ‘True or False’,
‘Yes or No’ and ‘A or B’ outcomes.
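To make the false positive and false negative cases above concrete, the following minimal sketch tallies the four outcomes of a binary classifier. The function name `confusion_counts` and the choice of `Has Detection` as the positive class are assumptions made for illustration; the sample labels are invented.

```python
def confusion_counts(actual, predicted, positive="Has Detection"):
    """Tally true/false positives and negatives for a binary labelling."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted an attack: correct (TP) or a false alarm (FP).
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted normal: a missed attack (FN) or correct (TN).
            counts["FN" if a == positive else "TN"] += 1
    return counts


# Invented example labels, for illustration only.
actual = ["Has Detection", "No Detection", "Has Detection", "No Detection"]
predicted = ["Has Detection", "Has Detection", "No Detection", "No Detection"]
print(confusion_counts(actual, predicted))
```

A false positive here is normal traffic flagged as an attack, and a false negative is an attack that goes unflagged.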
3. APPROACH
This section considers our approach to evaluating the ML techniques used to learn from the dataset and the
classification algorithms for the smart grid supply chain. We discuss the smart grid infrastructure
and the vulnerable spots, attack scenarios and the ML approach. The rationale for the ML
approach using DT to predict an attack is to determine the causal relationships amongst the
cyberattacks on a smart grid supply chain system and attempt to predict the malware using
probability distribution methods. Then, based on the classification analysis, we evaluate the
predictive method with appropriate metrics to verify the organizational goal and security goal as
we seek to determine whether a specific cyber threat phenomenon is likely to appear in a similar
event. There are several algorithms for building decision trees, such as ID3, C4.5 and
others [15]. We discuss the ML methods used, as well as the approaches used for the malware
prediction.
3.1 CPS Smart Grid Infrastructure
The CPS smart grid infrastructure in Figure 2 integrates application and network systems using
Intelligent Electronic Devices (IEDs) (refer to IEC 61850 [16]). The application system uses the
IEDs, sensors, actuators and other communication devices for power generation, distribution
and transmission. The Supervisory Control and Data Acquisition (SCADA) system and Programmable
Logic Controllers (PLCs) establish communication protocols with the Remote Telemetric Units
(RTUs) for monitoring and gathering real-time data across various substations. The network
system provides interconnectivity between substations, automation systems and field devices
such as AMI and Home Energy Management Systems (HEMS) software [17] [19].
[Figure: smart grid system infrastructure. Labelled components: SCADA server, command center workstation, SCADA network, communication network, switchboard, CSC vendors' systems, Snort, firewalls, IED, router, substation server, third-party vendors, Syslog, Open PDC, workstation, WAN, ISP and Threat Actors 1 to 3, across the cyber-physical generation, transmission and distribution layers.]
FIGURE 2: Smart Grid System Infrastructure and Vulnerable Spots [19].
The adversary could cause cyberattacks (penetration) and cybercrimes (manipulation) on the
CPS. Cyberattacks such as remote access trojan, spear phishing, cross site scripting or session
hijacking could be launched on the intelligent devices and communication networks to penetrate firewalls, IDS/IPS
or the IEDs. After penetrating the system, the adversary could commit cybercrimes by
manipulating the system to cause resonance attacks, DDoS attacks, IP theft, ID theft and intellectual
property theft, as well as take command and control to monitor and control the core business
processes and operations. We include these attack scenarios in the analysis to determine the
validity of the penetration and manipulations in real time.
3.2 Attack Scenarios
We identify various attack scenarios for the study that will assist in the feature selection process,
as follows.
Network Attack: An XSS or session hijacking attack on the CSC network may provide
access to alter the smart metering system and change configurations using a distance
protection scheme to bypass controls in order to manipulate the software in the meter and
prevent the system from recording accurate purchases or billings.
Spyware Attack: The attacker could insert spyware or deploy a ransomware attack
remotely to shut the systems down when the antivirus is outdated and the software is
unpatched, and consequently affect the prepaid card settings and change the configurations
using a distance protection scheme so that the attacker can manipulate and prevent accurate
readings from valid purchases.
Ransomware Attack: The attacker could use reconnaissance and social engineering
tactics to gather intelligence and subsequently initiate a spear phishing attack on targeted
users to shut the system down until a ransom is paid.
Software Manipulation Attack: Most organizations fail to change the hard-coded
password after buying software off the shelf. The attacker could deploy session hijacking
techniques to exploit this vulnerability using advanced persistent threats and command &
control techniques to manipulate the system and consequently commit cybercrimes such
as intellectual property theft, ID theft and industrial espionage.
DDoS or Data Injection: The attacker deploys a DDoS attack that could consequently cause
voltage surges by inserting a rootkit into the OS server to cause a resonance attack on the
smart grid components, making the power system oscillate.
Island Hopping Attack: On the CSC systems, vendors are more susceptible to
cyberattacks, and the perpetrators are using RAT and island hopping attacks to gain
access to the major organizations in the supply chain.
Malware: The attacker could insert malware or spyware in software that is bought off
the shelf, which gives the developers access to the system whenever users are prompted to
update their software. That may cause software errors and subsequently lead to
application system manipulations.
3.3 Threat Prediction Scenarios
The threat prediction investigates two kinds of scenarios that will determine the
classification result. The scenarios use ML techniques to determine the cyberattack initiated and
the cybercrime committed based on the scenarios and the cyberattacks.
Scenario 1: What is the probability of penetration on the endpoint nodes?
Scenario 2: What is the extent of manipulation on the various endpoint nodes?
3.4 Analytical Approach
To determine the viability of using ML techniques to learn from the dataset for penetrations and
manipulations on the CPS, we used the DT algorithm and open source data from the Microsoft
Malware Prediction endpoint protection solutions website [6]. DT provides an efficient and
nonparametric method that can be applied to a classification or regression task. Further, we used
supervised learning to train and test on the dataset as it provides an accurate prediction of system
performance. Using DT hierarchical data structures for supervised learning, the input space
is split into local regions in order to predict the dependent variable for decision making [11].
3.5 Data Representation
The data represents the probability of a Microsoft Windows machine getting infected by various
families of malware, based on different properties of that machine. The telemetry data containing
threat reports were collected by Microsoft Windows Defender from various MS Windows operating
systems [6]. The properties and machine infections were generated by combining various user
activities in different organizations and vendors. The dataset we used contains 4000 entries. The DT
algorithm used determines the attributes that return the highest information gain that satisfies the four
uncertainty axioms in a confusion matrix and provides the degree of disorganization in the
dataset. Further, an entropy formula was used to determine the information gained and the
degree of uncertainty by separating the positive and negative rates as follows:
$$E = -a \log_2 a - b \log_2 b \tag{1}$$
where a and b are the positive and negative rates, respectively.
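As a worked illustration of equation (1), the sketch below computes the entropy of a set of labels and the information gain of splitting on one attribute. It is a generic textbook formulation, not code from this study; the function names and the four sample rows are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy in bits; for two classes this is equation (1)."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy before the split minus the weighted entropy after it."""
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / len(labels) * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Hypothetical rows: Firewall separates the classes perfectly, HasTpm does not.
rows = [
    {"Firewall": 1, "HasTpm": 1},
    {"Firewall": 1, "HasTpm": 0},
    {"Firewall": 0, "HasTpm": 1},
    {"Firewall": 0, "HasTpm": 0},
]
labels = ["No Detection", "No Detection", "Has Detection", "Has Detection"]
print(information_gain(rows, labels, "Firewall"))  # 1.0
print(information_gain(rows, labels, "HasTpm"))    # 0.0
```

With a = b = 0.5 the entropy is 1 bit, the maximum for two classes; the attribute with the highest gain would be chosen as the root.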
3.6 Feature Extraction
The feature extraction process involves removing irrelevant column names or duplicates in the
dataset to have unique values when training on the data. Columns with a high number of
duplicates are removed to clean the data. The command counts the 62 variables and
removes the irrelevant variables. The output prints 62-8 = 54 (8 columns removed). However, the
4000 entries were retained. The classifier is set to model the features based on their
importance as well as the F-score. The F-score, the harmonic mean of
precision and recall, was used for plotting the model.
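The two steps in this subsection can be sketched as follows. This is an assumption-laden illustration rather than the command used in the study: the rule for what counts as a column with too many duplicates (most frequent value covering more than 95% of rows) and the function names are hypothetical.

```python
from collections import Counter


def drop_duplicate_heavy_columns(rows, threshold=0.95):
    """Drop columns whose most frequent value exceeds `threshold` of the rows."""
    if not rows:
        return [], []
    dropped = []
    for column in rows[0]:
        top_count = Counter(row[column] for row in rows).most_common(1)[0][1]
        if top_count / len(rows) > threshold:
            dropped.append(column)
    kept = [{c: v for c, v in row.items() if c not in dropped} for row in rows]
    return kept, dropped


def f_score(tp, fp, fn):
    """Harmonic mean of precision and recall (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical rows: ProductName is constant, OsBuild varies per machine.
rows = [{"ProductName": "win8defender", "OsBuild": i} for i in range(20)]
kept, dropped = drop_duplicate_heavy_columns(rows)
print(dropped, len(kept))
```

Note that only columns are removed; the number of rows is unchanged, which matches the statement that all entries were retained.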
3.7 Classification Algorithm
The classification phase involves using the ML algorithm to test the dataset for prediction. In this
phase, the DT model was used to split the data for prediction to determine whether each endpoint node
can detect infections or not. We considered the C4.5 and C5 algorithms [15]. The training data is used
to build the DT model, and the test data is used to determine the dimensionalities of the dataset.
The rationale for choosing the DT algorithm is that DT is one of the major supervised schemes
for ML in network security. We train the ML algorithm using the training sets, then compare the
performance of the algorithm over the datasets.
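The train-then-test workflow can be sketched with a deliberately simplified model. The one-level tree (a decision stump) below is a stand-in for illustration and is not the C4.5 or C5 algorithm; the 80/20 split, the fixed seed and the synthetic data are all assumptions.

```python
import random
from collections import Counter, defaultdict


def train_test_split(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle the indices reproducibly and hold out a fraction for testing."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    pick = lambda seq, idx: [seq[i] for i in idx]
    return (pick(rows, train_idx), pick(labels, train_idx),
            pick(rows, test_idx), pick(labels, test_idx))


def fit_stump(rows, labels, attribute):
    """One-level tree: predict the majority label seen for each attribute value."""
    by_value = defaultdict(Counter)
    for row, label in zip(rows, labels):
        by_value[row[attribute]][label] += 1
    fallback = Counter(labels).most_common(1)[0][0]
    rules = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
    return lambda row: rules.get(row[attribute], fallback)


def accuracy(model, rows, labels):
    """Fraction of instances whose predicted label matches the true label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


# Synthetic, hypothetical data: the label depends only on the Firewall flag.
rows = [{"Firewall": i % 2} for i in range(100)]
labels = ["No Detection" if r["Firewall"] else "Has Detection" for r in rows]

X_train, y_train, X_test, y_test = train_test_split(rows, labels)
model = fit_stump(X_train, y_train, "Firewall")
print(len(X_train), len(X_test), accuracy(model, X_test, y_test))
```

Evaluating only on the held-out rows is what keeps the reported accuracy from simply reflecting memorised training data.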
4. IMPLEMENTATION
This section discusses the implementation of the machine learning simulation process. The
purpose of the study is to use the DT algorithm to predict a cyberattack and indicate it as Has
Detection or No Detection. The dataset and the machine malware infections were gathered by
Microsoft Defender endpoint protection [6]. Each entry in the dataset corresponds to a machine identifier and
provides a result as to whether the Microsoft endpoint can detect malware attacks
on the node. As discussed in Section 2, the DT algorithm learns from datasets to approximate
‘if then else’ decision rules and generate branches for the tree nodes and decision nodes. We
follow the process below to build the DT classifier for our prediction.
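As a side illustration of the ‘if then else’ rules mentioned above, the sketch below flattens a small tree into one rule per leaf. The tree, its attribute names and the rule wording are hypothetical and are not taken from the trained model in this study.

```python
# Hypothetical toy tree: internal nodes test an attribute, leaves are labels.
TREE = {
    "attribute": "Firewall",
    "branches": {
        1: {
            "attribute": "IsProtected",
            "branches": {1: "No Detection", 0: "Has Detection"},
        },
        0: "Has Detection",
    },
}


def tree_to_rules(node, conditions=()):
    """Return one IF ... THEN ... rule for every root-to-leaf path."""
    if not isinstance(node, dict):
        premise = " AND ".join(conditions) or "TRUE"
        return [f"IF {premise} THEN {node}"]
    rules = []
    for value, child in node["branches"].items():
        test = f"{node['attribute']} == {value!r}"
        rules.extend(tree_to_rules(child, conditions + (test,)))
    return rules


for rule in tree_to_rules(TREE):
    print(rule)
```

Each rule is the conjunction of the tests along one path, so a tree with three leaves yields exactly three rules.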
4.1 Description of Data
The dataset is about malware attacks on Microsoft endpoint systems, and such systems can be a
critical part of the overall business continuity of smart grid CSC systems [6]. The dataset was
designed to meet certain business constraints in relation to privacy and the time periods in which
the machine was running. CSC integrates various organizational systems for the business process and
information dissemination in the CPS environment. The dataset containing these properties and
the machine infections was generated by combining threat reports collected by Microsoft's
endpoint protection solution, Windows Defender. Each row in the dataset corresponds to a
machine uniquely identified by a Machine Identifier. Hence,
the dataset is relevant for our work as it was gathered from global machines that used Microsoft
Windows Defender. The rationale for using the dataset for our work is that it does not
represent Microsoft customers' machines only, as it has been sampled to include a much larger
proportion of machines with malware. Thus, we used the dataset to determine whether
there is Has Detection or No Detection on various network nodes for threat predictions. Below are
some of the features from the metadata that are relevant for our work [6].
MachineIdentifier - Individual machine ID
GeoNameIdentifier - ID for the geographic region a machine is located in
DefaultBrowsersIdentifier - ID for the machine's default browser
OrganizationIdentifier - ID for the organization the machine belongs in.
IsProtected - This is a calculated field derived from the Spynet Report's AV Products
field.
Processor - This is the process architecture of the installed operating system
HasTpm - True if the machine has a TPM
OsVer - Version of the current operating system
OsBuild - Build of the current operating system
Census_DeviceFamily - AKA DeviceClass. Indicates the type of device that an edition of
the OS is intended for, such as desktop or mobile
Firewall - This attribute is true (1) for Windows 8.1 and above if windows firewall is
enabled, as reported by the service.
4.2 Data Preparation
The dataset represents Microsoft malware prediction events collected from various families of
malware infections based on different properties of attacks. The Windows Defender tool was used to
generate the threat reports of the malware infections from various Microsoft endpoint protection
solutions [6]. The derived dataset has 4000 entries with 64 columns; each row is a metadata entry
for one machine, uniquely identified by a Machine Identifier. We used supervised learning on the
derived dataset, in which each instance is described by its attributes together with a label. The
rationale is to predict an outcome for future events: the variables of each instance are used to
determine whether a malware attack is labelled Has Detection or No Detection.
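The preparation step described above, separating each machine's attributes from its label, can be sketched with the Python standard library. The column name `HasDetections` for the target, the 1/0 encoding of the label, and the three in-memory rows are assumptions made for illustration; they stand in for the real file and are not taken from it.

```python
import csv
import io

# Tiny stand-in for the real file; all values are made up for illustration.
SAMPLE_CSV = """MachineIdentifier,HasTpm,OsBuild,Firewall,HasDetections
m001,1,17134,1,0
m002,1,16299,0,1
m003,0,10240,1,1
"""

LABEL_COLUMN = "HasDetections"    # assumed name of the target column
ID_COLUMN = "MachineIdentifier"   # row identifier, not used as a feature
LABEL_NAMES = {1: "Has Detection", 0: "No Detection"}  # assumed encoding


def load_rows(text):
    """Parse CSV text into a list of feature dicts X and a list of labels y."""
    X, y = [], []
    for row in csv.DictReader(io.StringIO(text)):
        y.append(int(row.pop(LABEL_COLUMN)))
        row.pop(ID_COLUMN)
        X.append({name: float(value) for name, value in row.items()})
    return X, y


X, y = load_rows(SAMPLE_CSV)
labels = [LABEL_NAMES[v] for v in y]

if __name__ == "__main__":
    for features, label in zip(X, labels):
        print(features, "->", label)
```

Dropping the identifier before training avoids the classifier memorising individual machines rather than learning from their properties.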
4.3 Feature Selection
The features of the dataset are partitioned into subsets of attacks, as indicated in
Table 1. The attack features indicate how the attack categories are grouped. Splitting by attack
category builds the classification model for the tree structure and breaks it down to represent
the attack features. Further, we pruned the tree to reduce its size by turning some
branch nodes into leaf nodes. For instance, we categorized the attacks based on the threat
descriptions in the table in order to fit the training data for the classifier and find the tree that
produces the lowest cross-validation error.
| Attack Category | Attack Features | Threat Descriptions for Probable Cause of Attack |
|---|---|---|
| 1 | XSS/Session Hijacking | Default Browser vulnerabilities and injecting code in the URL or website |
| 2-5 | Spyware/Ransomware | Outdated Antivirus/Patches that are not updated regularly |
| 6-7 | Spear Phishing | Use Reconnaissance to identify vulnerable spots and attach email with a virus |
| 8-9 | Session Hijacking | Exploit Unchanged Hard-Coded password in software bought off the shelf |
| 10-14 | Rootkit/DDoS | Attack on BIOS or attach a virus to a USB key to cascade when booting |
| 15-20 | RAT/Island Hopping | Attacks from Vendor systems to gain access to the organizational system |
| 21-28 | Ransomware/Malware | Exploiting outdated OS versions and encryptions especially TLS/SSL |
| 29-35 | Malware/Spyware | Packet injection and Resonance attacks |
| 36-38 | DDoS | Exploit IP Address Systems and Packet injections |
TABLE 1: Attack Category and Feature Descriptions.
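Choosing between a more heavily pruned and a less heavily pruned tree by cross-validation can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this paper: the number of folds, the two candidate models (a single-leaf tree and a one-split tree) and the toy data are all hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, validation) index pairs."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, val))
        start += size
    return folds


def cross_val_error(fit, X, y, k=5):
    """Mean validation error of `fit`, a callable that returns a predict function."""
    errors = []
    for train, val in k_fold_indices(len(X), k):
        predict = fit([X[i] for i in train], [y[i] for i in train])
        wrong = sum(predict(X[i]) != y[i] for i in val)
        errors.append(wrong / len(val))
    return sum(errors) / len(errors)


def select_best(candidates, X, y, k=5):
    """Return the candidate name with the lowest cross-validation error."""
    scores = {name: cross_val_error(fit, X, y, k) for name, fit in candidates.items()}
    return min(scores, key=scores.get), scores


def fit_single_leaf(X_train, y_train):
    """Fully pruned tree: one leaf that predicts the majority class."""
    majority = max(set(y_train), key=y_train.count)
    return lambda x: majority


def fit_one_split(X_train, y_train):
    """One-split tree: threshold halfway between the two class means."""
    zeros = [x for x, label in zip(X_train, y_train) if label == 0]
    ones = [x for x, label in zip(X_train, y_train) if label == 1]
    threshold = (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2
    return lambda x: 1 if x > threshold else 0


# Hypothetical one-feature data, for illustration only.
X_toy = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9, 0.15, 0.85]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
CANDIDATES = {"single_leaf": fit_single_leaf, "one_split": fit_one_split}

if __name__ == "__main__":
    print(select_best(CANDIDATES, X_toy, y_toy, k=5))
```

Pruning too aggressively (the single leaf) underfits, so its validation error is higher; the comparison is made on held-out folds so that a larger tree is not rewarded merely for memorising the training data.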
4.4 Performance Evaluation on the Classifier
The performance evaluation of the classifier determines the average accuracy of the model
when we run it on the integer values in the cells. The performance of the model is determined from
the following values: True Positive (TP), True Negative (TN), False Positive (FP) and False
Negative (FN). Further, the false positive rate (FPR) and false negative rate (FNR) are determined from these elements.
(2)
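The quantities named above can be computed from the four confusion-matrix counts as in the following Python sketch. It uses the standard textbook definitions of accuracy, precision, recall, FPR and FNR, which are assumed here rather than taken from this paper; the label vectors are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for a binary classification problem."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a rate is undefined."""
    return numerator / denominator if denominator else 0.0


def rates(tp, tn, fp, fn):
    """Standard rates derived from the confusion-matrix counts."""
    return {
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "precision": safe_div(tp, tp + fp),
        "recall": safe_div(tp, tp + fn),   # also called the true positive rate
        "fpr": safe_div(fp, fp + tn),      # false positive rate
        "fnr": safe_div(fn, fn + tp),      # false negative rate
    }


# Hypothetical labels: 1 = Has Detection, 0 = No Detection.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

if __name__ == "__main__":
    print(rates(*confusion_counts(y_true, y_pred)))
```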
5. RESULTS
In this section, we present the analysis of the threat predictions for the two
scenarios, based on the classification results. We also briefly discuss adversarial ML and how adversaries
use ML techniques to exploit vulnerabilities. As discussed in section 3.3, the scenarios use the
DT algorithm to predict the cyberattack initiated and the cybercrime committed.
5.1 Determine the Accuracy of the Threats
To predict the probability of an attack, we need to distinguish the known from the unknown
attacks. As listed in Table 1, known attacks include Malware, Spyware, Ransomware, RAT,
Cross-Site Scripting, Session Hijacking and Cross-Site Request Forgery. These are the known
attacks that could be identified. Unknown attacks, however, are the cybercrimes committed after the
attacks. Here, after gaining access using the known attacks, the attacker uses APT and C&C to
commit cybercrimes such as manipulation during development, and
altering and changing delivery channels. The extent of these cybercrime manipulations and their
cascading impact are unknown and unquantifiable.
Scenario 1: Predict the probability of penetration on the endpoint nodes
Determining the accuracy involves evaluating the threats and their impacts on the various
network nodes, in order to understand and provide cyber threat intelligence on the causes and
effects of cyberattacks on the organizational goal, the business process and the financial impact. Table 2
presents the performance accuracies of the DT classifier for each cyberattack on various
endpoints of the network. Using the confusion matrix, we determine the Precision (P), the
Recall (R) and the F-Score (F), where the F-Score is the harmonic mean of Precision and Recall. From the table, XSS/Session Hijacking, spear
phishing and RAT/Island Hopping attacks predicted a higher probability of penetration on the
endpoint nodes, with percentage scores of 82%, 75% and 75% respectively. However, the results
revealed that XSS and Session Hijacking are the most likely penetration methods to be deployed, based on
the predictions.
SCENARIO DT PREDICTIONS
ACCURACY 83% 100%
| CYBERATTACKS | P | R | F | RESULTS |
|---|---|---|---|---|
| XSS/Session Hijacking | 0.89 | 0.41 | 0.75 | 82% |
| Spyware/Ransomware | 0.89 | 0.58 | 0.85 | 87% |
| Spear Phishing | 0.81 | 0.37 | 0.71 | 75% |
| Session Hijacking | 0.71 | 0.39 | 0.64 | 65% |
| Rootkit/DDoS | 0.66 | 0.37 | 0.68 | 55% |
| RAT/Island Hopping | 0.67 | 0.30 | 0.74 | 68% |
| Ransomware/Malware | 0.89 | 0.55 | 0.71 | 85% |
| Malware/Spyware | 0.87 | 0.58 | 0.78 | 84% |
| DDoS | 0.78 | 0.36 | 0.65 | 66% |
TABLE 2: Predicting the Probability of Penetration.
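For reference, the conventional F-score (the harmonic mean of precision and recall) can be computed as in the sketch below, using the P and R pairs copied from Table 2. The formula is the standard F1 definition and is an assumption about how such a score is usually obtained; the F column in Table 2 may follow a different convention (for example, averaging across classes or folds), so values from this sketch should not be expected to reproduce it.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta = 1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (P, R) pairs copied from Table 2.
TABLE_2_PR = {
    "XSS/Session Hijacking": (0.89, 0.41),
    "Spyware/Ransomware": (0.89, 0.58),
    "Spear Phishing": (0.81, 0.37),
    "Session Hijacking": (0.71, 0.39),
    "Rootkit/DDoS": (0.66, 0.37),
    "RAT/Island Hopping": (0.67, 0.30),
    "Ransomware/Malware": (0.89, 0.55),
    "Malware/Spyware": (0.87, 0.58),
    "DDoS": (0.78, 0.36),
}

f1_by_attack = {name: round(f_score(p, r), 2) for name, (p, r) in TABLE_2_PR.items()}

if __name__ == "__main__":
    for name, value in f1_by_attack.items():
        print(f"{name}: F1 = {value}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, a low recall pulls the score down even when precision is high.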
Scenario 2: Predict the extent of manipulation on the various endpoint nodes
Predicting the extent of manipulation on the various endpoint nodes after penetration is very
challenging due to the invisibility, uncertainty and fuzzy nature of cybercrimes. Further,
determining the extent of cyberattack propagation and manipulation in an integrated network
environment poses a major challenge in the cybersecurity discipline. From Table 2, the results
indicate that:
Ransomware, Malware and Spyware predicted a higher probability of manipulation on
the endpoint nodes, with percentage scores of 87%, 85% and 84% respectively, after
determining the Precision and F-Score with a low Recall rate.
This indicates that the extent of manipulation in a given event could be high, with an
average accuracy of 85%.
The manipulations could result in cyberattacks such as industrial espionage, intellectual
property theft, Advanced Persistent Threat and Command and Control.
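As a quick arithmetic check, and assuming the 85% figure is the unweighted mean of the three percentage scores quoted above (the text does not state how it was obtained):

```latex
\[
\frac{87 + 85 + 84}{3} = \frac{256}{3} \approx 85.3\% \approx 85\%
\]
```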
6. DISCUSSION
Predicting cyberattacks in real time is challenging due to factors such as the type of OS being used,
system refresh rates, time zones, running updates and data in transit. Attacks such as
ransomware or malware may affect the system differently depending on the OS being used, the origin of
the attack and the time zone. Threat actors could also use adversarial machine learning
techniques to exploit vulnerabilities in ML threat predictions.
6.1 Adversarial Machine Learning
Adversarial machine learning is a technique used by an adversary to inject malicious input data
into the dataset during the training and testing phases in order to manipulate the classification algorithms of
the model. The technique can be used against supervised learning algorithms for cybersecurity
datasets to exploit vulnerabilities and compromise the performance of malware detection,
spam filters and IDS/IPS when predicting cyberattack trends and the
probability of fraudulent activities. The adversary could cause an increase in the false-positive
rate by inserting malicious samples in the test phase to generate wrong classification rates for
the sample data. Adversarial machine learning could also be used to manipulate
training data in order to violate security policy and to gain knowledge of threat intelligence, adversary
capabilities and the level of manipulation.
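One simple form of training-data manipulation is label-flipping poisoning, in which an adversary injects malicious-looking samples that are labelled benign. The sketch below illustrates the effect on a one-feature threshold classifier. The classifier, the single numeric "suspicion score" feature and all data points are hypothetical; it is a toy illustration of the idea, not an attack against the model in this paper.

```python
def fit_threshold(xs, ys):
    """Pick the cut point t (predict 1 when x > t) with the fewest training errors."""
    best_t, best_err = None, None
    for t in sorted(set(xs)):
        err = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if best_err is None or err < best_err:
            best_t, best_err = t, err
    return best_t


def error_rate(t, xs, ys):
    """Fraction of samples misclassified by the rule 'predict 1 when x > t'."""
    return sum((x > t) != bool(y) for x, y in zip(xs, ys)) / len(xs)


# Clean data: a suspicion score per sample, label 1 = malicious, 0 = benign.
clean_x = [1, 2, 3, 4, 5, 6, 7, 8]
clean_y = [0, 0, 0, 0, 1, 1, 1, 1]

# Poison: the adversary injects high-score samples mislabelled as benign.
poison_x = [5.2, 5.5, 6.2, 6.5, 7.2, 7.5]
poison_y = [0] * len(poison_x)

t_clean = fit_threshold(clean_x, clean_y)
t_poisoned = fit_threshold(clean_x + poison_x, clean_y + poison_y)

if __name__ == "__main__":
    print("clean threshold:", t_clean,
          "error on clean data:", error_rate(t_clean, clean_x, clean_y))
    print("poisoned threshold:", t_poisoned,
          "error on clean data:", error_rate(t_poisoned, clean_x, clean_y))
```

In this toy setting the poisoned model shifts its decision boundary upward, so several genuinely malicious samples are no longer flagged when it is evaluated on the clean data.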
6.2 Determining Processor Count for Vulnerable Operating System
The classification of the malware attack is built based on the type of operating system that is
being used by the organization. The OS determines the kind of antivirus that can be installed
and that could be exploited on each system, and whether it can detect malware attacks or not. An outdated
antivirus within a third-party system could easily be a point of failure if a malware attack is
initiated from there, leading to power loss, power surge, system error or power fluctuation issues.
Addressing downtime and uptime in the event of failure is critical for all the organizations that are
integrated on the supply chain. For instance, a redundant array of independent disks (RAID) uses
multiple hard drives in unique groupings and storage capacity mechanisms to produce a storage
solution that provides improved throughput, resistance and resilience. These drives rely on
antivirus updates and patches as trusted sources to prevent any compromise. Figure 3 shows
the processor count of the systems and how it determines the speed at which a malware
attack could occur, as well as the extent of propagation in the event of an attack. The X-axis
shows the processor count for the vulnerable OS. The Y-axis shows the number of OS
that are affected. From Figure 3, we observed that a processor count of 3000 affected
4.0 systems, indicating that the region identifier may have fewer systems with a higher probability
of penetration and manipulation. Thus, exploiting outdated OS versions and encryption,
especially TLS/SSL, raises antivirus protection and application interoperability issues on the
various network nodes. Users can be lured into a false sense of security by the threat actor
across different platforms to update the antivirus, which could lead to heuristic detections or false
positives.
FIGURE 3: Processor Count for Vulnerable Operating System.
6.3 Resolution Rate of Probable Ransomware Attack
Figure 4 displays the detection rate of the training dataset using the feature description of smart
screen usage in the CSC environment. The implication of our results is that attackers have
used malware attacks to penetrate the smart screen and insert spyware in the system that turns
on the camera in the smart screen monitor. With that, the attacker can see everything the victim
does and take command and control, leading to cybercrimes such as intellectual property theft
and industrial espionage; see the FLocker mobile ransomware attack on smart screens
(Duan, 2016) [20]. Further, the attacker could use social engineering tactics such as spear phishing
to launch ransomware attacks that deface the monitors and cascade to other smart screen systems
on the CSC.
Ransomware attacks could affect CSC system platforms that use multiple smart screen monitors
by infecting a single screen, then propagating to other monitors on the same network
nodes and locking the screens during run time. Section 4.1 describes the dataset and how each row
in the dataset corresponds to a machine uniquely identified by a Machine Identifier, gathered from
global machines that use Microsoft Windows Defender. The FLocker ransomware infects smart
screens and avoids detection because its code is constantly being rewritten to improve its routine
variants and meet changing trends. When launched, the malware identifies the country ID and the
machine ID, and activates depending on the motives and intents of the adversary. Figure 4
identifies the vertical resolution rates of the various systems and how the infections propagate
through the systems during run time. The Y-axis indicates the extent of vertical infections and the
X-axis indicates the resolution rates of the infected systems. Malware or ransomware that is
embedded directly into the requested web page in the attack could propagate to other systems.
FIGURE 4: Resolution Rate of Probable Ransomware Attack.
6.4 Decision Tree Predictions
The DT in Figure 5 depicts the results of the classifier that predicts a Windows machine's probability
of getting infected by the attacks listed in Table 1, based on various properties of that machine. The
properties used to generate the DT are SmartScreen, CountryIdentifier, AVProducts,
OSInstallTypeName, TotalPhysicalRam, OsBuildLab and OSWUAutoOptionsName. The Smart
screen represents workstations. The Country identifier represents the country in which the Windows
Operating System is located. According to a report by the Comptroller and Auditor General on the
investigation of the ‘WannaCry’ ransomware attack in 2017, the attack initially infected the NHS
system in the UK and then propagated to other countries across the world and infected various
systems [18]. The report indicated that the OS antivirus product was outdated, hence the attack.
The objective of the paper is to use ML techniques on a dataset to predict whether the system
can detect an attack and label that as Has Detection or No Detection. From the sample dataset
of 4000, we predict the probability of ransomware infection based on the type of OS
installed, which could lead to the vulnerability of the ransomware infecting the smart screen, as well
as the country in which the OS is installed and its version.
6.5 Gini Index Based Decision Tree
The Gini-index-based decision tree was used to calculate the Smart Screen Malware (M) Infection
Trend.
If the dataset (D) contains examples from n classes, then the Gini index, gini(D), is defined
as:
$$\mathrm{gini}(D) = 1 - \sum_{i=1}^{n} p_i^{2} \tag{3}$$
where p_i is the probability of an object being classified to a particular class i, such as the class that is infected.
If the dataset (D) is split on the root (R) into two subsets D_1 and D_2, the Gini index gini_R(D) is
expressed as:
$$\mathrm{gini}_R(D) = \frac{|D_1|}{|D|}\,\mathrm{gini}(D_1) + \frac{|D_2|}{|D|}\,\mathrm{gini}(D_2) \tag{4}$$
The reduction in impurity for the split in the dataset was calculated as:
$$\Delta\mathrm{gini}(R) = \mathrm{gini}(D) - \mathrm{gini}_R(D) \tag{5}$$
From the DT algorithm, we calculate the information gained after the malware (M) infection trend
test is applied on the smart screen for the classification. A weighted sum of Gini Indices was
calculated using the DT and generated the Has Detection and No Detection tree.
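The Gini index of a node, the weighted sum over a binary split and the resulting reduction in impurity can be sketched in Python as follows. The sketch assumes the standard definitions (Gini index as one minus the sum of squared class proportions, and children weighted by their share of the samples); the class counts in the example are hypothetical.

```python
def gini(counts):
    """Gini index of a node, given its per-class sample counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def gini_split(left_counts, right_counts):
    """Weighted sum of the two children's Gini indices for a binary split."""
    n_left, n_right = sum(left_counts), sum(right_counts)
    n = n_left + n_right
    return (n_left / n) * gini(left_counts) + (n_right / n) * gini(right_counts)


def gini_reduction(parent_counts, left_counts, right_counts):
    """Reduction in impurity achieved by the split."""
    return gini(parent_counts) - gini_split(left_counts, right_counts)


if __name__ == "__main__":
    # Hypothetical node with 50 samples per class, split into two children.
    parent, left, right = [50, 50], [40, 10], [10, 40]
    print("gini(parent) =", gini(parent))
    print("gini of split =", round(gini_split(left, right), 4))
    print("reduction =", round(gini_reduction(parent, left, right), 4))
```

A DT learner evaluates this reduction for each candidate split and keeps the one with the largest value, which is why a perfectly balanced node (Gini of 0.5 for two classes) leaves the most room for improvement.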
Figure 5 depicts the DT, showing the Gini index used to measure the probability that
an infection from a ransomware attack is wrongly classified. The DT root indicates a smart
screen rate of <= 6.5 with a split Gini of 0.5, indicating an equal distribution of the dataset. The
root of the tree has an initial dataset of 4000 as the sample size. The DT algorithm split the
samples into two sets: [1973, 2027]. From the analysis, 1973 were identified as Has Detection, hence
are not vulnerable to the attacks. However, 2027 were found to have No Detection, hence are
vulnerable to malware or ransomware attacks. The branch with Has Detection is indicated as
(True) and the other, with No Detection, is indicated as (False). The node identified as the
country identifier, with the class Has Detection, has the values of 2531 and 1648 from a
sample size of 3531. A sample size of 2955 has an antivirus product installed. However, the total
physical RAM has a No Detection value of 1458. The DT split the sample size further until the values
were at the threshold. The figure also shows the Gini index calculated and the information gained after the
DT test is applied.
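As a worked check of the root-node value, and assuming the standard two-class Gini definition applied to the reported counts [1973, 2027] out of 4000:

```latex
\[
\mathrm{gini}(D) = 1 - \left(\frac{1973}{4000}\right)^{2} - \left(\frac{2027}{4000}\right)^{2}
\approx 1 - 0.2433 - 0.2568 \approx 0.4999 \approx 0.5
\]
```

This agrees with the split Gini of 0.5 reported for the root, which is the maximum possible impurity for two classes.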
FIGURE 5: Decision Tree Predictions.
The results from Scenarios 1 and 2 provide cyber threat intelligence as to what could happen in
the event of a cyberattack without the classifications of the detection rates in Figure 5.
Scenario 1 predicted a higher probability of penetration attacks on the endpoint nodes,
after determining the Precision, Recall and F-Score, with
percentage scores of:
82% for XSS/Session Hijacking
75% for spear phishing
75% for RAT/Island Hopping attacks
Scenario 2 determined the extent of cyberattack propagation and manipulation in an integrated
smart grid network environment. The results show that cyberattacks such as Ransomware,
Malware and Spyware predicted a higher probability of attack propagation and manipulation to
other systems.
The results indicate that the extent of manipulation to other integrated network systems could
be high, with an average accuracy of 85% in a given event.
The extent of manipulation indicates the relevance of the classification of the
cyberattack. The threat intelligence indicates that it could result in cyberattacks such as
industrial espionage, intellectual property theft, Advanced Persistent Threat and
Command and Control.
6.6 Comparing Our Results with Existing Work
A significant amount of literature exists on machine learning techniques and classification
algorithms that learn datasets for performance accuracies in the cybersecurity domain and that have
considered threat predictions. In comparing our results to existing works, we considered works that
used ML methods from the cyberattack penetration and cybercrime manipulation perspectives to
detect attacks. Hink et al. 2014 considered an ML technique for power system disturbance and
cyberattack discrimination by evaluating various ML methods for accurate classification to predict
disturbance discriminators and implications [1]. Sharma et al. 2007 used ML to detect
variants of known worms in real time [7]. Bilge et al. 2017 applied ML techniques in a risk
teller system to predict cyber incidents by analyzing malicious files [10]. Canali et al. 2014 used an ML
method to perform a correlation analysis of the effectiveness of risk prediction based on user
browsing behaviour [4]. Villano 2018 applied an ML method for the correlation and normalization
process and evaluated a DT algorithm that could predict whether or not there is an attack [12]. Soska & Christin
2014 used an ML approach to automatically detect vulnerable websites [3]. Mohasseb et al.
2019 applied an ML approach for predictive analytics using SVM and Naïve Bayes algorithms for
evaluation accuracies [5].
Further, various DT algorithms, models and techniques have been implemented using various
datasets for building intrusion detection, anomaly detection and threat prediction systems. Pournouri et al.
(2017) proposed a cyber attack analysis using decision tree techniques to learn an open source
intelligence dataset for prediction and for improving cyber situational awareness [21]. Patel and
Prajapati (2018) proposed a study and analysis of decision tree based classification using ID3,
C4.5 and CART algorithms to learn a dataset to determine the best performance accuracy [22].
Moon et al (2017) proposed an intrusion detection system based on a decision tree using
analysis of attack behaviour information to detect the possibility of intrusion for preventing APT
attacks [23]. Sarker et al (2020), presented a machine learning intrusion detection system based
security model called “IntruDTree” that evaluated various algorithms on a dataset by ranking the
security features according to their importance then build a generalized tree for detecting
intrusions [24]. Das & Morris (2018) presented a survey of machine learning and data mining
methods for cybersecurity applications and analytics for intrusion detection and traffic
classifications in emails by evaluating the various classifications algorithms on a dataset for
performance accuracies [25]. Balogun & Jimoh (2015) proposed a hybrid of DT and KNN
algorithms to detect anomaly intrusions [26]. Malik et al (2018) used a hybrid of DT pruning and
BPSO algorithms for network intrusion detection [27]. Rai et al (2016) proposed C4.5 DT
algorithm to construct a model for intrusion detection [28]. Yeboah-Ofori & Boachie (2019)
presented malware attack predictive analytics using various ML classification algorithms in a
majority voting scheme for performance accuracies [29]. Ingre et al (2017) proposed a DT algorithm that
classifies an IDS dataset as normal or attack after learning and testing the dataset [30]. Relan
and Patil (2015) used a variant of C4.5 DT algorithm to implement an IDS by considering discrete
values for classifications [31].
However, none of these works explored the viability of using machine learning methods to predict
malware attacks and build a classifier to automatically detect and label an event as Has Detection
or No Detection in the smart grid supply chain domain, in order to predict the probability of penetration and
the extent of manipulation on the network system nodes for cyber threat intelligence and
situational awareness.
7. CONCLUSION
Our work focused on using ML to learn a dataset and used the DT algorithm to determine whether
the classifier can predict an attack and label the attack as Has Detection or No Detection. In this
paper, we have used a malware prediction dataset from a well-known source for learning.
We have used the DT algorithm to model the infections. Although other algorithms can perform
the same task, the DT can handle datasets that have errors in the attribute values and
resolve classification errors in the training and test phases. Based on our results, the classifier was
83% accurate, and we concluded that the supervised learning model performed well in our predictions.
Descriptions of objects may include attributes based on measurement or subjective judgement,
both of which might give rise to errors in the values of the attributes. Some of the objects in the
training set may even have been misclassified. Take, for instance, a malware attack classification
rule from a collection of cyberattack events. An attribute might test for the presence of
propagation of an attack, which might give a positive or negative reading at some point. However,
questions remain to be addressed as to what performance evaluation methods could provide
the best performance indicators for threat predictions and cyber threat intelligence gathering that
could provide security control mechanisms. There are limitations in our work, such as not comparing
other classification algorithms for predictive analytics, owing to the invisible nature of cyberattacks
and the cascading impacts on other system nodes.
Future Works
Future research will focus on applying ML techniques with various classification algorithms to learn
the dataset for anomaly detection and to predict cyberattack trends. The approach will help
determine the best performance metrics for cyber threat intelligence and for predicting future trends.
8. REFERENCES
[1] C. R. B. Hink, J. M. Beaver, M. A. Buckner, T. Morris, U. Adhikari, S. Pan. “Machine Learning
for Power System Disturbance and Cyber-attack Discrimination”. 7th International Symposium
on Resilient Control Systems. IEEE Xplore. 10.1109/ISRCS.2014.6900095. (2014).
[2] V. Ford. A. Siraj. “Application of Machine Learning in Cyber Security”. Conference Paper.
Computer Science Department. Tennessee Tech University. (2014).
[3] K. Soska, N. Christin. “Automatically Detecting Vulnerable Websites Before They Turn
Malicious”. In Proceedings of the 23rd USENIX Security Symposium. Carnegie Mellon
University. ISBN 978-1-931971-15-7 (2014).
[4] D. Canali, L. Bilge, D. Balzarotti. “On the Effectiveness of Risk Prediction Based on User
Browsing Behaviour”. ACM 978-1-4503-2800-5/14/06.
http://dx.doi.org/10.1145/2590296.2590347. (2014). [Accessed 20/04/2020].
[5] A. Mohasseb, B. Aziz, J. Jung, and J. Lee, “Predicting Cyber Security Incidents Using
Machine Learning Algorithms: A case study of Korean SMEs”. University of Portsmouth
Research Portal. (2019).
[6] Microsoft Malware Prediction. Research Prediction. (2019).
(https://www.kaggle.com/c/microsoft-malware-prediction/data). [Accessed 26/01/2020].
[7] O. Sharma, M. Girolami, J. Sventek. “Detecting Worm Variants using Machine Learning”.
DOI: 10.1145/1364654.1364657 (2007).
[8] C. Tsai, Y. Hsu, C. Lin, W. Lin. “Intrusion detection by machine learning: A review Expert
Systems with Applications”. 36.10, pp. 11994-12000, (2009).
[9] G. Wang. T. Wang. H. Zheng, B. Y. Zhao. “Man vs. Machine: Practical Adversarial Detection
of Malicious Crowdsourcing Workers”. In Proceedings of the 23rd USENIX Security
Symposium San Diego, CA, pp. 239–254, (2014).
[10] L. Bilge, Y. Han, M. Dell'Amico. “RiskTeller: Predicting the Risk of Cyber Incidents”. ACM
ISBN 978-1-4503-4946-8/17/10. https://doi.org/10.1145/3133956.3134022 CCS (2017).
[Accessed 14/12/2019].
[11] R. C. Barros, A. C. P. L. F. De Carvalho, A. A. Freitas. “Automatic Design of Decision-Tree
Induction Algorithms”. Springer Briefs in Computer Science, DOI 10.1007/978-3-319-14231-9_2.
(2015).
[12] E. G. V. Villano. “Classification of Logs Using Machine Learning”. Norwegian University of
Science and Technology. (2018).
[13] O. Yavanoglu, M. Aydos. “A Review of Cyber Security Dataset for Machine Learning
Algorithms”. International Conference on Big Data, IEEE Xplore. DOI:
10.1109/BigData.2017.8258167. (2018).
[14] A. Boschetti, L. Massaron. “Python Data Science Essentials”. 2nd Edition. UK.
ISBN 978-1-78646-213-8. (2016).
[15] J. R. Quinlan. “C4.5: Programs for Machine Learning”. 16, 2333-240 Department of
Computer, John Hopkins University, Baltimore. MD21218. (1994).
[16] W. Wang, Z. Lu, “Cyber Security in Smart Grid: Survey and Challenges”. Elsevier. (2013).
[17] A. Yeboah-Ofori, S. Islam. “Cyber Security Threat Modeling for Supply Chain Organizational
Environments”. Future Internet, 11, 63, doi: 10.3390/fi11030063, (2019).
[18] Comptroller and Auditor General: Investigation. “WannaCry Cyber-attack and The NHS”.
Department of Health. National Audit Office. UK (2017).
[19] A. Yeboah-Ofori, S. Islam, A. Brimicombe. “Detecting Cyber Supply Chain Attacks on Cyber
Physical Systems Using Bayesian Belief Network”. International Conference on Cyber
Security and Internet of Things. (2019). DOI 10.1109/ICSIoT47925.2019.00014.
[20] E. Duan. “FLocker Mobile Ransomware Crosses to Smart TV”. Trend Micro. Security
Intelligence Blog. (2016).
https://blog.trendmicro.com/trendlabs-security-intelligence/flocker-ransomware-crosses-smart-tv/
[Accessed 10/03/2020].
[21] S. Pournouri, B. Akhgar, P. S. Bayerl. “Cyber Attacks Analysis Using Decision Tree
Techniques for Improving Cyber Situational Awareness”. International Conference on Global
Security, Safety and Sustainability. Springer. Vol. 360. 2017.
DOI: 10.1007/978-3-319-51064-4_14.
[22] H. Patel, P. Prajapati. “Study and Analysis of Decision Tree Based Classification Algorithms”
International Journal of Computer Science and Engineering. 2018. DOI:
10.26438/ijcse/v6i10.7478.
[23] D. Moon, H. Im, I. Kim, J. H. Park. “DTB-IDS: An Intrusion Detection System Based on
Decision Tree Using Behavior Analysis for Preventing APT Attacks” Springer, The Journal of
Supercomputing 73 2881-2895. 2017. DOI: https://doi.org/10.1007/s11227-015-1604-8.
[24] I. H. Sarker, Y. B. Abushark, F. Alsolami, A. I. Khan. “IntruDTree: A Machine Learning Based
Cyber Security Intrusion Detection Systems” MDPI. Symmetry 12, 754,
doi:10.3390/sym12050754.
[25] R. Das, T. Morris. “Machine Learning in Cyber Security”. IEEE Xplore. International
Conference on Computer, Electronic and Communication Engineering. 2018.
DOI: 10.1109/ICCECE.2017.8526232.
[26] A. O. Balogun, R. G. Jimoh. “Anomaly Intrusion Detection Using in Hybrid of Decision Tree
And K-Nearest Neighbor”. Journal of Advances in Scientific Research & Application. 2015.
[27] A.J. Malik, F. A. Khan. “A Hybrid Technique Using Binary Particle Swarm Optimization and
Decision Tree Pruning for Network Intrusion Detection”. Cluster Computing. 21, 667–680.
2018. doi.org/10.1007/s10586-017-0971-8.
[28] K. Rai. M. S. Devi, A. Guleria. “Decision Tree Based Algorithm for Intrusion Detection”.
International Journal Advanced Networked Applications. Vol 7, Issue 04. Pages: 2828. 2016.
[29] A. Yeboah-Ofori, C. Boachie. “Malware Attack Predictive Analytics in a Cyber Supply Chain
Context Using Machine Learning”. IEEE Xplore. ICSIoT, pp. 66-77, 2019. doi:
10.1109/ICSIoT47925.2019.00019.
[30] B. Ingre, A. Yadav, A. K. Soni. “Decision Tree Based Intrusion Detection System for
NSL-KDD Dataset”. International Conference on Information and Communication Technology for
Intelligent Systems. 25–26, pp. 207–218. 2017.
[31] N. G. Relan. D. R. Patil. “Implementation of Network Intrusion Detection System Using
Variant of Decision Tree Algorithm”. IEEE Xplore. International Conference on Nascent
Technologies in the Engineering Field. pp. 1–5. 2015. DOI: 10.1109/ICNTE.2015.7029925.