Malicious software is abundant in a world of innumerable computer users, who constantly face these threats from sources such as the internet, local networks and portable drives. Malware ranges from low to high risk and can cause systems to malfunction, leak data or even crash. It may take the form of executable or system library files carrying viruses, worms or Trojans, all aimed at breaching the security of the system and compromising user privacy. Typically, anti-virus software is based on a signature definition system that is regularly updated from the internet and thus keeps track of known viruses. While this may be sufficient for home users, a security risk from a new virus could threaten an entire enterprise network. This paper proposes a new and more sophisticated antivirus engine that can not only scan files, but also build knowledge and flag files as potential viruses. This is done by extracting the system API calls made by various normal and harmful executables, and using machine learning algorithms to classify and hence rank files on a scale of security risk. While such a system is processor heavy, it is very effective when used centrally to protect an enterprise network, which may be more prone to such threats.
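The idea of ranking files by the API calls they make can be illustrated with a small sketch. The abstract does not say which learning algorithm or feature encoding is used, so the Bernoulli naive Bayes model, the function names and the tiny training set below are all assumptions chosen for illustration; the API names are ordinary Windows calls used only as example features.

```python
import math
from collections import Counter

def train(samples):
    """Fit per-class API-call counts from (api_calls, label) pairs.

    label is "malicious" or "benign". This is an illustrative model,
    not the engine described in the abstract.
    """
    counts = {"malicious": Counter(), "benign": Counter()}
    docs = Counter()
    vocab = set()
    for calls, label in samples:
        docs[label] += 1
        for call in set(calls):          # presence/absence, not frequency
            counts[label][call] += 1
            vocab.add(call)
    return counts, docs, vocab

def risk_score(calls, model):
    """Return P(malicious | observed API calls) as a value in (0, 1)."""
    counts, docs, vocab = model
    total = sum(docs.values())
    log_odds = math.log(docs["malicious"] / total) - math.log(docs["benign"] / total)
    present = set(calls)
    for call in vocab:
        # Laplace-smoothed probability that a file of each class uses this call
        p_mal = (counts["malicious"][call] + 1) / (docs["malicious"] + 2)
        p_ben = (counts["benign"][call] + 1) / (docs["benign"] + 2)
        if call in present:
            log_odds += math.log(p_mal / p_ben)
        else:
            log_odds += math.log((1 - p_mal) / (1 - p_ben))
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical training data: API-call sets extracted from labelled executables.
TRAINING = [
    (["CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"], "malicious"),
    (["WriteProcessMemory", "RegSetValueExA", "CreateRemoteThread"], "malicious"),
    (["CreateFileW", "ReadFile", "CloseHandle"], "benign"),
    (["CreateFileW", "WriteFile", "CloseHandle"], "benign"),
]

def rank_files(files, model):
    """Sort {file_name: api_calls} by descending risk score."""
    return sorted(files, key=lambda name: risk_score(files[name], model), reverse=True)
```

Calling `rank_files` on a dictionary of unseen files returns their names ordered from most to least suspicious, which mirrors the "scale of security risk" the abstract mentions. A real system would need far more training data and a careful extraction step.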
Antivirus software uses various methods to detect and remove malicious computer programs like viruses, worms, and trojans. Signature-based detection searches files for known malicious patterns, but cannot find new threats. Heuristics use generic signatures and behavior analysis to identify variants of known malware. As viruses evolve over time through mutations and refinements, antivirus software must be frequently updated to recognize new strains.
A SURVEY ON MALWARE DETECTION AND ANALYSIS TOOLS (IJNSA Journal)
This document summarizes a survey paper on malware detection and analysis tools. It provides an overview of different types of malware like viruses, worms, Trojans, rootkits, spyware and keyloggers. It describes techniques for malware analysis, including static analysis which examines code without execution, and dynamic analysis which analyzes behavior during execution. It also lists some limitations of static analysis and the need for dynamic analysis. Finally, it discusses various tools available for malware detection, analysis, reverse engineering and debugging.
IRJET- Zombie - Venomous File: Analysis using Legitimate Signature for Securi... (IRJET Journal)
The document discusses a proposed method for detecting viruses and malware that evade existing antivirus software. It uses a combination of analyzing files with VirusTotal's database of known threats and applying natural language processing techniques like suffix trees and TF-IDF to identify malicious patterns in files. An evaluation shows the proposed method can detect viruses that existing antivirus and VirusTotal miss, achieving a 97% accuracy rate in testing.
An analysis of how antivirus methodologies are utilized in protecting compute... (UltraUploader)
This document discusses different methodologies used by antivirus software to detect and protect against malicious code. It describes three main categories of antivirus scanning: signature detection through file scanning, heuristics scanning, and general decryption scanning. Signature detection involves comparing files to known virus signatures in a database. Heuristics scanning evaluates patterns of behavior to detect abnormal application activity. General decryption scanning is used to detect encrypted or polymorphic viruses.
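Signature detection, the first of the three categories above, can be sketched in a few lines. The signature names and byte patterns below are made up for illustration and do not correspond to any real virus database; production scanners also use much more efficient multi-pattern matching than this linear check.

```python
# Hypothetical signature database: name -> byte pattern (illustrative only).
SIGNATURES = {
    "Example.Dropper.A": bytes.fromhex("deadbeef00c0ffee"),
    "Example.Worm.B": b"hypothetical-worm-marker",
}

def scan_bytes(data, signatures=SIGNATURES):
    """Return the sorted names of all signatures found in the given bytes."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)

def scan_file(path, signatures=SIGNATURES):
    """Read a file in binary mode and scan its full contents."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read(), signatures)
```

Because the check is an exact byte match, a file whose body has been encrypted or mutated would not match, which is why the summary lists heuristics and decryption scanning as separate categories.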
Basic survey on malware analysis, tools and techniques (ijcsa)
The term malware stands for malicious software. It is a program installed on a system without the
knowledge of the system's owner. It is typically installed by a third party with the intention of stealing
private data from the system or simply playing pranks. This in turn threatens the computer's security,
at a time when computers are used in day-to-day life for necessities such as education,
communication, hospitals, banking and entertainment. Various traditional techniques are used to detect
and defend against malware, such as Antivirus Scanners (AVS) and firewalls. But today malware writers are one
step ahead of malware detectors. Day by day they write new malware, which becomes a great
challenge for malware detectors. This paper focuses on a basic study of malware and the various detection
techniques that can be used to detect it.
Optimised malware detection in digital forensics (IJNSA Journal)
On the Internet, malware is one of the most serious threats to system security. Most complex issues and
problems on any systems are caused by malware and spam. Networks and systems can be accessed and
compromised by malware known as botnets, which compromise other systems through a coordinated
attack. Such malware uses anti-forensic techniques to avoid detection and investigation. To protect systems
from the malicious activity of this malware, a new framework is required that aims to develop an optimised
technique for malware detection. Hence, this paper demonstrates new approaches to perform malware
analysis in forensic investigations and discusses how such a framework may be developed.
Detection and prevention of keylogger spyware attacks (IAEME Publication)
This document summarizes a proposed method for detecting and preventing keylogger spyware attacks. Keylogger spyware poses a serious threat by recording keyboard keystrokes to steal sensitive information like passwords and account numbers. The proposed method uses a detection and prevention system to identify keyloggers and remove them from infected systems. It aims to protect systems from this type of malware in a network. The document provides an overview of different types of malware like adware, spyware, and keyloggers, and describes how keylogger spyware works by logging keystrokes and transmitting the stolen data to malicious users.
EXTERNAL - Whitepaper - How 3 Cyber Threats Transform Incident Response 081516 (Yasser Mohammed)
This document discusses how three cyber threats - targeted attacks, system exploits, and data theft - are transforming incident response. It provides three case studies:
1) Operation Aurora targeted Google and other companies through a multi-stage attack using custom malware. Cyberforensics tools could have helped identify compromised systems and collect evidence.
2) The Zeus botnet exploits systems by infecting them and forwarding login credentials. Regular scans using cyberforensics tools can establish a baseline and detect any anomalies to address risks.
3) Data loss or theft of regulated/sensitive data from laptops or compromised websites can result in lost revenue and reputation damage. Cyberforensics tools can help find and wipe such data from unauthorized
A trust system based on multi level virus detection (UltraUploader)
This document summarizes a research paper that proposes a new multi-level virus detection system (MDS). The MDS uses three levels of protection: 1) A smart memory monitor that detects virus behavior in real-time, 2) A file checker that analyzes batch files for virus-like code, and 3) An integrity checker that stores file signatures to detect modifications where viruses typically infect. The system was tested and able to detect virus activity through monitoring, file analysis, and integrity checking at different levels simultaneously. The paper concludes the MDS approach provides improved virus detection over single-method systems.
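The third level, the integrity checker, stores a signature per file and later detects modifications. A minimal sketch of that idea follows, assuming SHA-256 digests as the stored signature; the summary does not say which signature the MDS actually uses, and the function names are hypothetical.

```python
import hashlib

def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def snapshot(paths):
    """Record a baseline mapping of path -> digest for the given files."""
    return {str(path): file_digest(path) for path in paths}

def modified_files(baseline, paths):
    """Return the paths whose current digest differs from the baseline.

    Files that were not in the baseline are also reported, since their
    stored digest is missing.
    """
    return [str(path) for path in paths if baseline.get(str(path)) != file_digest(path)]
```

A baseline would be taken while the system is known to be clean, and `modified_files` run periodically afterwards. The baseline itself has to be protected from tampering, which this sketch does not address.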
Malware Risk Analysis on the Campus Network with Bayesian Belief Network (IJNSA Journal)
This document discusses using a Bayesian Belief Network (BBN) to analyze malware risk on a university campus network. It begins by introducing the campus network monitoring tools and SIR epidemiological model used to model malware propagation. It then provides background on BBN principles, including defining nodes, conditional probabilities, and using the network to compute joint probabilities. The document proposes applying a BBN to assess malware prevalence risk by relating threat, vulnerability, and cost impact on network assets. It aims to provide understandable risk assessments to inform decision making.
Integrated Feature Extraction Approach Towards Detection of Polymorphic Malwa... (CSCJournals)
Some malware are sophisticated with polymorphic techniques such as self-mutation and emulation based analysis evasion. Most anti-malware techniques are overwhelmed by the polymorphic malware threats that self-mutate with different variants at every attack. This research aims to contribute to the detection of malicious codes, especially polymorphic malware by utilizing advanced static and advanced dynamic analyses for extraction of more informative key features of a malware through code analysis, memory analysis and behavioral analysis. Correlation based feature selection algorithm will be used to transform features; i.e. filtering and selecting optimal and relevant features. A machine learning technique called K-Nearest Neighbor (K-NN) will be used for classification and detection of polymorphic malware. Evaluation of results will be based on the following measurement metrics-True Positive Rate (TPR), False Positive Rate (FPR) and the overall detection accuracy of experiments.
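The classification and evaluation steps named above can be sketched as follows. This is only an illustration of K-NN voting and of the TPR, FPR and accuracy metrics; the feature vectors are made-up numbers, Euclidean distance is an assumption, and the feature extraction and correlation-based selection stages are not shown.

```python
import math
from collections import Counter

def knn_predict(training, features, k=3):
    """Classify one feature vector by majority vote of its k nearest neighbours.

    training is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training, key=lambda item: math.dist(item[0], features))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def detection_rates(y_true, y_pred, positive=1):
    """Return (TPR, FPR, accuracy) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    accuracy = (tp + tn) / len(pairs)
    return tpr, fpr, accuracy

# Hypothetical two-dimensional feature vectors: label 1 = malware, 0 = benign.
TRAINING = [
    ((0.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 0.0), 0),
    ((5.0, 5.0), 1), ((5.0, 6.0), 1), ((6.0, 5.0), 1),
]
```

For example, `knn_predict(TRAINING, (5.5, 5.5))` votes among the three malware points and returns 1, while a point near the origin is labelled 0.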
This document provides an audit program to evaluate the effectiveness of Norton Antivirus 2005 software running on Windows XP. It begins with researching the software's results on third-party antivirus testing sites. The audit program then consists of 7 checklist items to test configurations like automatic definition updates, scanning of internet downloads, emails and attachments, all file types, and compressed files. Conducting this audit would verify Norton 2005 is properly configured and able to detect current viruses and malware.
Self Evolving Antivirus Based on Neuro-Fuzzy Inference System (IJRES Journal)
With today’s world filled with information and data, it is very important to know which information or data is harmless and which is harmful. Everyone from cellular phone users to big MNCs and server companies requires a security system that is as competent and adaptive as the ever-updating and evolving viruses and malware it faces. The paper discusses the development and implementation of a new idea: an adaptive anti-virus based on ANFIS logic, intended to keep up with the speed at which viruses update and evolve.
The document defines various terms related to computer security and viruses. It provides definitions for terms like 3G, adware, anti-virus databases, anti-virus engines, anti-virus updates, application programming interfaces, archive files, attack signatures, backdoor Trojans, bandwidth, batch files, behavioral analysis, binary code, and browser hijackers. The document serves as a glossary of security-related technical terms.
Adware is software that may be installed on a client machine to display advertisements to the
user of that machine, with or without the user's consent. Adware can pose an unrecoverable threat to the security
and privacy of computer users, as the number of malicious adware programs is increasing. The paper presents an
adware detection approach based on the application of data mining to disassembled code. The approach aims at
an accurate adware detection algorithm built from an adware data set and machine learning techniques. In this paper, we
disassemble binary files, generate instruction sequences and pass this data through different data mining and
machine learning algorithms for feature extraction and feature reduction to detect malicious adware. The
system then accurately detects both novel and known adware instances, even though the binary difference between
adware and legitimate software is usually small.
Keywords — Data Mining; Adware Detection; Binary Classification; Static Analysis; Disassembly;
Instruction Sequences
Classification of Malware Attacks Using Machine Learning In Decision Tree (CSCJournals)
Predicting cyberattacks using machine learning has become imperative since cyberattacks have increased exponentially due to the stealthy and sophisticated nature of adversaries. To have situational awareness and achieve defence in depth, using machine learning for threat prediction has become a prerequisite for cyber threat intelligence gathering. Some approaches to mitigating malware attacks include the use of spam filters, firewalls, and IDS/IPS configurations to detect attacks. However, threat actors are deploying adversarial machine learning techniques to exploit vulnerabilities. This paper explores the viability of using machine learning methods to predict malware attacks and build a classifier to automatically detect and label an event as “Has Detection” or “No Detection”. The purpose is to predict the probability of malware penetration and the extent of manipulation on the network nodes for cyber threat intelligence. To demonstrate the applicability of our work, we use a decision tree (DT) algorithm to learn from the dataset for evaluation. The dataset was obtained from the Microsoft Malware threat prediction page on Kaggle. We identify probable cyberattacks on the smart grid and use attack scenarios to determine penetrations and manipulations. The results show that ML methods can be applied in a smart grid cyber supply chain environment to detect cyberattacks and predict future trends.
Malware analysis on android using supervised machine learning techniques (Md. Shohel Rana)
In recent years, the growth of malware has prompted widespread research on malware analysis and detection on Android devices. Android, a mobile operating system with more than one billion active users and a high market impact, has inspired the expansion of malware by cybercriminals. Android implements a distinct architecture and security controls to address the problems caused by malware, such as a unique user ID (UID) for each application, system permissions, and its distribution platform, Google Play. There are numerous ways to breach those defenses, and the complexity of creating a new solution grows as cybercriminals improve their skills in developing malware. A community of developers and researchers has been developing alternatives aimed at improving the level of safety; numerous machine learning algorithms have already been proposed or applied to classify or cluster malware, along with analysis techniques, frameworks, sandboxes, and systems security. One of the most promising techniques is the implementation of artificial intelligence solutions for malware analysis. In this paper, we evaluate numerous supervised machine learning algorithms by implementing a static analysis framework to make predictions for detecting malware on Android.
Malicious codes (malcodes) are self-replicating
malware and a major security threat in a network environment.
Timely detection and system alert flags are essential to
prevent malcodes from spreading rapidly in the network. The difficulty
in detecting malcodes is that they evolve over time. Although
signature-based tools are generally used to secure systems,
signature-based malcode detectors fail to recognize obfuscated
and previously unseen malcode executables. Automatic signature
generation systems have likewise been used to address the issue
of malcodes, yet much work is still required for good detection.
Based on the behavior of malcodes, a behavioral approach is
required for such detection. Specifically, we require a dynamic
analysis and behavior rule-base system that distinguishes
malcodes without erroneously blocking legitimate traffic or increasing
false alarms. This paper proposes and discusses an approach
using machine learning and Indicators of Compromise (IOC) to
analyze intrusions in a network, to identify the cause of the attack
and to provide future detection. It proposes the use of a
behavioral malware analysis framework to analyze intrusion data,
apply a clustering algorithm to the analyzed data and generate IOCs
from the clustered data for IOCRule, which will be implemented
in the Snort Intrusion Detection System (IDS) for malicious code
detection.
This document describes a proposed soft-computing system for identifying computer viruses using genetic algorithms, fuzzy logic, and neural networks. It discusses computer viruses and their types/characteristics. Seven key parameters for virus identification are identified: decline in system speed, numerous errors, persistent logoffs, disabled security software, abnormal internet behavior, obvious desktop changes, and continuous hard drive noise. The system would use fuzzy logic to set boundaries for the vague virus identification parameters. Genetic algorithms would optimize the fuzzy sets. Neural networks would provide self-learning abilities. Together this soft-computing approach aims to precisely and optimally recognize computer viruses.
This document proposes an email worm vaccine architecture that uses behavior-based anomaly detection to intercept incoming emails and scan attachments in virtual machines to detect malicious software. The system includes a virtual machine cluster to open attachments safely, a host-based intrusion detection system to monitor for dangerous behaviors, and an email-aware mail transfer agent to classify messages and communicate with the detection system. The implementation demonstrates detecting malware using parallel virtual machines while maintaining a low false positive rate.
This document discusses various types of program and system threats including Trojan horses, trapdoors, buffer overflows, worms, viruses, and denial of service attacks. A Trojan horse masquerades as legitimate software to gain unauthorized access. Trapdoors are secret vulnerabilities built into programs by designers. Buffer overflows occur when more data is input than a program expects, potentially allowing code execution. Worms self-replicate to spread while viruses require host files or human action. Examples like the Morris worm and Love Bug virus are provided. Protection involves antivirus software and safe computing practices. The key differences between worms and viruses are also outlined.
This document summarizes a proposed network attack alerting system that aims to reduce redundant alerts from intrusion detection systems (IDS). The system uses both network-based and host-based IDS to detect attacks launched using the Backtrack penetration testing tool on a virtual network environment. Well-known open source IDS tools from the Security Onion distribution are used to generate alerts. The system builds a database of alerts and defines rules to eliminate duplicate alerts for the same attack based on attributes like source/destination IP and port. It also establishes a severity classification scheme using threshold values of alerts and time to help administrators prioritize responses.
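The duplicate-elimination rule described above can be sketched as a pass over time-ordered alerts. The field names, the inclusion of a signature field in the key and the 60-second window are assumptions for illustration; the summary only states that attributes such as source/destination IP and port are used.

```python
def deduplicate(alerts, window=60):
    """Keep the first alert of each burst and drop repeats.

    alerts is a list of dicts with the (hypothetical) keys
    time, src_ip, dst_ip, dst_port and signature. An alert is treated as a
    duplicate when an alert with the same key was seen at most `window`
    seconds earlier. The timer resets on every repeat, so a continuous
    burst collapses into a single kept alert.
    """
    last_seen = {}
    kept = []
    for alert in sorted(alerts, key=lambda a: a["time"]):
        key = (alert["src_ip"], alert["dst_ip"], alert["dst_port"], alert["signature"])
        previous = last_seen.get(key)
        if previous is None or alert["time"] - previous > window:
            kept.append(alert)
        last_seen[key] = alert["time"]
    return kept
```

Whether the timer should reset on each repeat or only on kept alerts is a design choice: resetting suppresses long scans more aggressively, at the cost of staying silent while an attack continues.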
NETWORK INTRUSION DETECTION AND COUNTERMEASURE SELECTION IN VIRTUAL NETWORK (... (ijsptm)
Intrusion into a network or a system is a problem today as the trend of successful network attacks continues to
rise. Intruders can exploit vulnerabilities of a network system to gain access in order to deploy a virus
or malware, or to launch an attack such as Denial of Service (DoS). In this work, a frequency-based Intrusion Detection
System (IDS) is proposed to detect DoS attacks. The frequency data is extracted from the time-series data
created by the traffic flow using the Discrete Fourier Transform (DFT). An algorithm is developed for
anomaly-based intrusion detection with fewer false alarms, which further detects known and unknown attack
signatures in a network. The frequency of the traffic data of the virus or malware would be inconsistent with
the frequency of the legitimate traffic data. A Centralized Traffic Analyzer Intrusion Detection System
called CTA-IDS is introduced to further detect inside attackers in a network. The strategy is effective in
detecting abnormal content in the traffic data as information passes from one node to another, and
also detects known attack signatures and unknown attacks. This approach is tested by running artificial
network intrusion data in simulated networks using the Network Simulator 2 (NS2) software.
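The step of extracting frequency data from a traffic time series with the DFT can be illustrated as follows. This is a naive O(n^2) transform written for clarity, not the paper's algorithm; the sample series are synthetic, and comparing dominant frequency bins is only one assumed way of spotting traffic whose frequency is inconsistent with legitimate traffic.

```python
import cmath
import math

def dft_magnitudes(series):
    """Return normalised DFT magnitudes for bins 0 .. n//2 of a real series."""
    n = len(series)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(series))) / n
        for k in range(n // 2 + 1)
    ]

def dominant_frequency(series):
    """Return the non-zero frequency bin with the largest magnitude."""
    magnitudes = dft_magnitudes(series)
    return max(range(1, len(magnitudes)), key=lambda k: magnitudes[k])

# Synthetic packets-per-interval series (illustrative only): legitimate
# traffic oscillates slowly, while the flood alternates every interval.
LEGITIMATE = [10 + 5 * math.cos(2 * math.pi * 4 * t / 32) for t in range(32)]
FLOOD = [10 + 5 * math.cos(math.pi * t) for t in range(32)]
```

Here the legitimate series peaks at bin 4 and the flood series at bin 16, so a detector that knows the usual dominant bin could flag the second series. For real traffic volumes an FFT implementation would replace the naive transform.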
Secure intrusion detection and countermeasure selection in virtual system usi... (eSAT Publishing House)
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
This document discusses mobile malware and how to protect against it. It begins by defining malware and listing common types. It then provides statistics on the distribution of mobile operating systems and malware detections. The document outlines sources of mobile malware infections and discusses why mobile devices contain sensitive information. It recommends implementing mobile device management to centrally manage devices and deploy security policies. Examples of recent mobile threats are also described. The document concludes by recommending security best practices like using antivirus software, updating devices, and educating users.
Abstract: The exponential growth of the internet and new technology has put today's world in a hectic situation, with both positive and negative aspects. Cybercriminals operate on the dark net using numerous techniques, which leads to cybercrime. Cyber threats like malware attempt to infiltrate a computer or mobile device, offline or over the internet or online chat, and anyone can be a potential target. Malware, also known as malicious software, is often used by cybercriminals to achieve their goals by tracking internet activity, capturing sensitive information, or blocking computer access. Reverse engineering is one of the best means of prevention and a powerful tool in the fight against cyber attacks. Most people in the cyber world see it as a black hat technique, used to steal data and intellectual property. But in the hands of cybersecurity experts, reverse engineering dons the white hat of the hero. It looks at a program from the outside in, often by a third party that had no hand in writing the code, and allows those who practice it to understand how a given program or system works when no source code is available. Reverse engineering accomplishes several tasks related to cybersecurity: finding system vulnerabilities, researching malware, and analyzing the complexity of restoring core software algorithms, which can further protect against theft. It is hard to hack certain software.
Keywords: Malware, threat, vulnerability, detection, reverse engineering, analysis.
Title: Malware analysis and detection using reverse Engineering
Author: B.Rashmitha, J. Alwina Beauty Angelin, E.R. Ramesh
International Journal of Computer Science and Information Technology Research
ISSN 2348-1196 (print), ISSN 2348-120X (online)
Vol. 10, Issue 2, Month: April 2022 - June 2022
Page: (1-4)
Published Date: 01-April-2022
Research Publish Journals
Available at: www.researchpublish.com
The full research paper can be downloaded directly at the link below:
https://www.researchpublish.com/papers/malware-analysis-and-detection-using-reverse-engineering
Academia Link: https://www.academia.edu/76069664/Malware_analysis_and_detection_using_reverse_Engineering_Available_at_www_researchpublish_com_journal_name_International_Journal_of_Computer_Science_and_Information_Technology_Research
This document discusses using data mining techniques to detect spyware. It begins by defining spyware and artificial intelligence. It then discusses three AI approaches that have been applied to spyware detection: heuristic technology, neural network technology, and data mining techniques. It focuses on using breadth-first search (BFS) within a data mining approach. The document finds that data mining techniques achieve an overall accuracy of 90.5% in detecting spyware, performing better than traditional signature-based or heuristic-based methods.
Utilization Data Mining to Detect Spyware (IOSR Journals)
This document discusses using data mining techniques to detect spyware. It begins by defining spyware and artificial intelligence. It then discusses three AI approaches that have been applied to spyware detection: heuristic technology, neural network technology, and data mining techniques. It focuses on using breadth-first search (BFS) within a data mining approach. The document finds that data mining techniques perform better than traditional signature-based or heuristic-based detection methods, achieving an overall accuracy of 90.5% at detecting spyware using BFS algorithms.
Malware is a worldwide pandemic. It is designed to damage computer systems without
the knowledge of the owner using the system. Software from reputable vendors may also contain
malicious code that affects the system or leaks information to remote servers. Malware includes
computer viruses, spyware, dishonest adware, rootkits, Trojans, dialers, etc. Malware detectors are
the primary tools in defense against malware. The quality of such a detector is determined by the
techniques it uses. It is therefore imperative that we study malware detection techniques and
understand their strengths and limitations. This survey examines different types of malware and
malware detection methods.
EXTERNAL - Whitepaper - How 3 Cyber ThreatsTransform Incident Response 081516Yasser Mohammed
This document discusses how three cyber threats - targeted attacks, system exploits, and data theft - are transforming incident response. It provides three case studies:
1) Operation Aurora targeted Google and other companies through a multi-stage attack using custom malware. Cyberforensics tools could have helped identify compromised systems and collect evidence.
2) The Zeus botnet exploits systems by infecting them and forwarding login credentials. Regular scans using cyberforensics tools can establish a baseline and detect any anomalies to address risks.
3) Data loss or theft of regulated/sensitive data from laptops or compromised websites can result in lost revenue and reputation damage. Cyberforensics tools can help find and wipe such data from unauthorized
A trust system based on multi level virus detectionUltraUploader
This document summarizes a research paper that proposes a new multi-level virus detection system (MDS). The MDS uses three levels of protection: 1) A smart memory monitor that detects virus behavior in real-time, 2) A file checker that analyzes batch files for virus-like code, and 3) An integrity checker that stores file signatures to detect modifications where viruses typically infect. The system was tested and able to detect virus activity through monitoring, file analysis, and integrity checking at different levels simultaneously. The paper concludes the MDS approach provides improved virus detection over single-method systems.
Malware Risk Analysis on the Campus Network with Bayesian Belief NetworkIJNSA Journal
This document discusses using a Bayesian Belief Network (BBN) to analyze malware risk on a university campus network. It begins by introducing the campus network monitoring tools and SIR epidemiological model used to model malware propagation. It then provides background on BBN principles, including defining nodes, conditional probabilities, and using the network to compute joint probabilities. The document proposes applying a BBN to assess malware prevalence risk by relating threat, vulnerability, and cost impact on network assets. It aims to provide understandable risk assessments to inform decision making.
Integrated Feature Extraction Approach Towards Detection of Polymorphic Malwa...CSCJournals
Some malware is sophisticated, using polymorphic techniques such as self-mutation and evasion of emulation-based analysis. Most anti-malware techniques are overwhelmed by polymorphic malware threats that self-mutate into different variants at every attack. This research aims to contribute to the detection of malicious code, especially polymorphic malware, by utilizing advanced static and advanced dynamic analyses to extract more informative key features of malware through code analysis, memory analysis and behavioral analysis. A correlation-based feature selection algorithm will be used to transform features, i.e. to filter and select optimal and relevant features. A machine learning technique called K-Nearest Neighbor (K-NN) will be used for classification and detection of polymorphic malware. Evaluation of results will be based on the following metrics: True Positive Rate (TPR), False Positive Rate (FPR) and the overall detection accuracy of experiments.
This document provides an audit program to evaluate the effectiveness of Norton Antivirus 2005 software running on Windows XP. It begins with researching the software's results on third-party antivirus testing sites. The audit program then consists of 7 checklist items to test configurations like automatic definition updates, scanning of internet downloads, emails and attachments, all file types, and compressed files. Conducting this audit would verify Norton 2005 is properly configured and able to detect current viruses and malware.
Self Evolving Antivirus Based on Neuro-Fuzzy Inference SystemIJRES Journal
With today’s world filled with information and data, it is very important to know which information or data is harmless and which is harmful. Everything from cellular phones to big MNCs and server companies requires a security system that is as competent and adaptive as the ever-updating and evolving viruses or malware it faces. The paper discusses the development and implementation of a new idea: an adaptive anti-virus based on ANFIS logic, a system that can keep up with the speed at which viruses update and evolve.
The document defines various terms related to computer security and viruses. It provides definitions for terms like 3G, adware, anti-virus databases, anti-virus engines, anti-virus updates, application programming interfaces, archive files, attack signatures, backdoor Trojans, bandwidth, batch files, behavioral analysis, binary code, and browser hijackers. The document serves as a glossary of security-related technical terms.
Adware is software that may be installed on a client machine to display advertisements to the
user of that machine, with or without the user's consent. Adware can pose an unrecoverable threat to the security
and privacy of computer users, as the number of malicious adware programs is increasing. The paper presents an
adware detection approach based on the application of data mining to disassembled code. It aims at
an accurate adware detection algorithm built on an adware data set and machine learning techniques. In this paper, we
disassemble binary files, generate instruction sequences, and pass this data through different data mining and
machine learning algorithms for feature extraction and feature reduction to detect malicious adware. The
system then accurately detects both novel and known adware instances, even though the binary difference between
adware and legitimate software is usually small.
Keywords — Data Mining; Adware Detection; Binary Classification; Static Analysis; Disassembly;
Instruction Sequences
Classification of Malware Attacks Using Machine Learning In Decision TreeCSCJournals
Predicting cyberattacks using machine learning has become imperative since cyberattacks have increased exponentially due to the stealthy and sophisticated nature of adversaries. To have situational awareness and achieve defence in depth, using machine learning for threat prediction has become a prerequisite for cyber threat intelligence gathering. Some approaches to mitigating malware attacks include the use of spam filters, firewalls, and IDS/IPS configurations to detect attacks. However, threat actors are deploying adversarial machine learning techniques to exploit vulnerabilities. This paper explores the viability of using machine learning methods to predict malware attacks and build a classifier to automatically detect and label an event as “Has Detection” or “No Detection”. The purpose is to predict the probability of malware penetration and the extent of manipulation on the network nodes for cyber threat intelligence. To demonstrate the applicability of our work, we use a decision tree (DT) algorithm to learn from the dataset for evaluation. The dataset was taken from the Microsoft Malware threat prediction data on the Kaggle website. We identify probable cyberattacks on the smart grid and use attack scenarios to determine penetrations and manipulations. The results show that ML methods can be applied in a smart grid cyber supply chain environment to detect cyberattacks and predict future trends.
Malware analysis on android using supervised machine learning techniquesMd. Shohel Rana
In recent years, the growth of malware has prompted widespread research on malware analysis and detection for Android devices. Android, a mobile operating system with more than one billion active users and a high market impact, has inspired the expansion of malware by cybercriminals. Android implements a distinct architecture and security controls to address the problems caused by malware, such as a unique user ID (UID) for each application, system permissions, and its distribution platform, Google Play. There are numerous ways to breach these defenses, and the complexity of creating new solutions grows as cybercriminals advance their malware development skills. A community of developers and researchers has been developing alternatives aimed at improving the level of safety, and numerous machine learning algorithms have already been proposed or applied to classify or cluster malware, alongside analysis techniques, frameworks, sandboxes, and systems security. One of the most promising techniques is the implementation of artificial intelligence solutions for malware analysis. In this paper, we evaluate numerous supervised machine learning algorithms by implementing a static analysis framework to make predictions for detecting malware on Android.
Malicious codes (malcodes) are self-replicating
malware and a major security threat in a network environment.
Timely detection and system alert flags are essential to
prevent malcodes from spreading rapidly in the network. The difficulty
in detecting malcodes is that they evolve over time. Although
signature-based tools are generally used to secure systems,
signature-based malcode detectors fail to recognize obfuscated
and previously unseen malcode executables. Automatic signature
generation systems have likewise been used to address the issue
of malcodes, yet much work is still required for good detection.
Because of the behavioral nature of malcodes, a behavior-based approach is
required for such detection. Specifically, we require a dynamic
analysis and behavior rule-based system that distinguishes
malcodes without erroneously blocking legitimate traffic or increasing
false alarms. This paper proposes and discusses an approach
using machine learning and Indicators of Compromise (IOC) to
analyze intrusions in a network, to identify the cause of the attack,
and to support future detection. It proposes the use of a
behavior malware analysis framework to analyze intrusion data,
apply a clustering algorithm to the analyzed data, and generate IOCs
from the clustered data for IOCRule, which will be implemented
in the Snort Intrusion Detection System (IDS) for malicious code
detection.
This document describes a proposed soft-computing system for identifying computer viruses using genetic algorithms, fuzzy logic, and neural networks. It discusses computer viruses and their types/characteristics. Seven key parameters for virus identification are identified: decline in system speed, numerous errors, persistent logoffs, disabled security software, abnormal internet behavior, obvious desktop changes, and continuous hard drive noise. The system would use fuzzy logic to set boundaries for the vague virus identification parameters. Genetic algorithms would optimize the fuzzy sets. Neural networks would provide self-learning abilities. Together this soft-computing approach aims to precisely and optimally recognize computer viruses.
This document proposes an email worm vaccine architecture that uses behavior-based anomaly detection to intercept incoming emails and scan attachments in virtual machines to detect malicious software. The system includes a virtual machine cluster to open attachments safely, a host-based intrusion detection system to monitor for dangerous behaviors, and an email-aware mail transfer agent to classify messages and communicate with the detection system. The implementation demonstrates detecting malware using parallel virtual machines while maintaining a low false positive rate.
This document discusses various types of program and system threats including Trojan horses, trapdoors, buffer overflows, worms, viruses, and denial of service attacks. A Trojan horse masquerades as legitimate software to gain unauthorized access. Trapdoors are secret vulnerabilities built into programs by designers. Buffer overflows occur when more data is input than a program expects, potentially allowing code execution. Worms self-replicate to spread while viruses require host files or human action. Examples like the Morris worm and Love Bug virus are provided. Protection involves antivirus software and safe computing practices. The key differences between worms and viruses are also outlined.
This document summarizes a proposed network attack alerting system that aims to reduce redundant alerts from intrusion detection systems (IDS). The system uses both network-based and host-based IDS to detect attacks launched using the Backtrack penetration testing tool on a virtual network environment. Well-known open source IDS tools from the Security Onion distribution are used to generate alerts. The system builds a database of alerts and defines rules to eliminate duplicate alerts for the same attack based on attributes like source/destination IP and port. It also establishes a severity classification scheme using threshold values of alerts and time to help administrators prioritize responses.
NETWORK INTRUSION DETECTION AND COUNTERMEASURE SELECTION IN VIRTUAL NETWORK (...ijsptm
Intrusion in a network or a system is a problem today as the trend of successful network attacks continues to
rise. Intruders can exploit vulnerabilities of a network system to gain access in order to deploy a virus
or malware, or to mount an attack such as a Denial of Service (DOS) attack. In this work, a frequency-based Intrusion Detection
System (IDS) is proposed to detect DOS attacks. The frequency data is extracted from the time-series data
created by the traffic flow using the Discrete Fourier Transform (DFT). An algorithm is developed for
anomaly-based intrusion detection with fewer false alarms, which detects both known and unknown attack
signatures in a network. The frequency of the traffic data of the virus or malware would be inconsistent with
the frequency of the legitimate traffic data. A Centralized Traffic Analyzer Intrusion Detection System
called CTA-IDS is introduced to further detect inside attackers in a network. The strategy is effective in
detecting abnormal content in the traffic data as information passes from one node to another and
also detects known attack signatures and unknown attacks. This approach is tested by running artificial
network intrusion data in simulated networks using the Network Simulator2 (NS2) software.
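The frequency extraction described in this summary can be illustrated with a minimal sketch. It is not the CTA-IDS implementation: the function names, the sample traffic counts, and the idea of reading off the strongest non-DC bin are assumptions made only to show how a DFT turns a traffic time series into frequency data.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return the magnitude of every DFT bin of a real-valued series."""
    n = len(samples)
    mags = []
    for k in range(n):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc))
    return mags


def dominant_bin(samples):
    """Index of the strongest non-DC bin in the first half of the spectrum."""
    mags = dft_magnitudes(samples)
    return max(range(1, len(samples) // 2 + 1), key=lambda k: mags[k])


# Hypothetical packets-per-interval counts (illustrative data only).
steady = [10.0] * 16                      # constant flow, no periodic component
periodic = [10.0 + 8.0 * math.cos(2 * math.pi * 3 * t / 16)
            for t in range(16)]           # flow oscillating 3 times per window

steady_peak = max(dft_magnitudes(steady)[1:])   # close to zero
periodic_peak_bin = dominant_bin(periodic)      # the bin of the oscillation
```

A detector built on this idea would compare the spectrum of observed traffic with the spectrum of known legitimate traffic; how that comparison and its thresholds are defined is left open here.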
Secure intrusion detection and countermeasure selection in virtual system usi...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
This document discusses mobile malware and how to protect against it. It begins by defining malware and listing common types. It then provides statistics on the distribution of mobile operating systems and malware detections. The document outlines sources of mobile malware infections and discusses why mobile devices contain sensitive information. It recommends implementing mobile device management to centrally manage devices and deploy security policies. Examples of recent mobile threats are also described. The document concludes by recommending security best practices like using antivirus software, updating devices, and educating users.
Abstract: The exponential growth of the internet and new technology has put today's world in a hectic situation, with both positive and negative aspects. Cybercriminals operate on the dark net using numerous techniques, which leads to cybercrime. Cyber threats like malware attempt to infiltrate a computer or mobile device, whether offline, over the internet, or through online chat, and anyone can be a potential target. Malware, also known as malicious software, is often used by cybercriminals to achieve their goals by tracking internet activity, capturing sensitive information, or blocking computer access. Reverse engineering is one of the best means of prevention and a powerful tool in the fight against cyber attacks. Most people in the cyber world see it as a black hat activity, used to steal data and intellectual property. But in the hands of cybersecurity experts, reverse engineering dons the white hat of the hero. It looks at a program from the outside in, often by a third party that had no hand in writing the code, and allows those who practice it to understand how a given program or system works when no source code is available. Reverse engineering accomplishes several tasks related to cybersecurity: finding system vulnerabilities, researching malware, and analyzing the complexity of restoring core software algorithms, which can further protect against theft. It is hard to hack certain software.
Keywords: Malware, threat, vulnerability, detection, reverse engineering, analysis.
Title: Malware analysis and detection using reverse Engineering
Author: B.Rashmitha, J. Alwina Beauty Angelin, E.R. Ramesh
International Journal of Computer Science and Information Technology Research
ISSN 2348-1196 (print), ISSN 2348-120X (online)
Vol. 10, Issue 2, Month: April 2022 - June 2022
Page: (1-4)
Published Date: 01-April-2022
Research Publish Journals
Available at: www.researchpublish.com
The full research paper can be downloaded directly at the link below:
https://www.researchpublish.com/papers/malware-analysis-and-detection-using-reverse-engineering
Academia Link: https://www.academia.edu/76069664/Malware_analysis_and_detection_using_reverse_Engineering_Available_at_www_researchpublish_com_journal_name_International_Journal_of_Computer_Science_and_Information_Technology_Research
Utilization Data Mining to Detect Spyware IOSR Journals
This document discusses using data mining techniques to detect spyware. It begins by defining spyware and artificial intelligence. It then discusses three AI approaches that have been applied to spyware detection: heuristic technology, neural network technology, and data mining techniques. It focuses on using breadth-first search (BFS) within a data mining approach. The document finds that data mining techniques perform better than traditional signature-based or heuristic-based detection methods, achieving an overall accuracy of 90.5% at detecting spyware using BFS algorithms.
Viruses & Malware: Effects On Enterprise NetworksDiane M. Metcalf
The document discusses viruses and malware, focusing on three key areas: detection, disinfection, and related costs for enterprise networks. It describes popular methods of malware infection like exploits, social engineering, rogue infections, peer-to-peer file sharing, emails, and USB devices. It also discusses different types of malware like metamorphic and polymorphic malware, and how they avoid detection through techniques like obfuscation. Current detection methods include signature-based analysis, file emulation, and file analysis, as well as emerging approaches like traffic analysis and vulnerability scanning. Disinfection includes removing malware through specific tools, real-time scanners, and cloud-based technologies. The document outlines how to quantify direct and indirect costs of
If hacking is understood as any unconventional way of interacting with a system, it is easy to conclude that an enormous number of people have hacked or tried to hack someone or something. The article, a result of the author's research, analyses hacking from different points of view, including the hacker's and the defender's. It discusses questions such as: Who are the hackers? Why do people hack? It also covers legal aspects of hacking and some economic issues connected with it. At the end, some questions about victim protection are discussed, together with the weaknesses that hackers can use for their own protection. The aim of the article is to make readers familiar with the possible risks of hacker attacks on mobile phones and of possible attacks in the announced flood of internet of things (hereafter IoT) devices.
Classification of Malware based on Data Mining Approachijsrd.com
This document discusses a system called the Intelligent Malware Detection System (IMDS) that uses data mining techniques to classify malware. The IMDS uses a PE parser to extract API execution sequences from Windows portable executable files. It then applies an OOA mining algorithm called OOA_Fast_FP-Growth to generate association rules from the API sequences to classify files as malware or benign. Experimental results showed the IMDS outperformed other classification techniques and anti-virus software in detecting malware.
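The rule-based classification idea in this summary can be illustrated with the two basic quantities behind association rules, support and confidence. The sketch below is not OOA_Fast_FP-Growth and not the IMDS code. The helper names and the four labeled samples are hypothetical, and each file is treated as an unordered set of imported API names for simplicity.

```python
def support(samples, api_set):
    """Fraction of all samples whose API calls include every call in api_set."""
    api_set = set(api_set)
    hits = sum(1 for apis, _ in samples if api_set <= apis)
    return hits / len(samples)


def confidence(samples, api_set, label):
    """Among samples containing api_set, the fraction carrying the given label."""
    api_set = set(api_set)
    labels = [lab for apis, lab in samples if api_set <= apis]
    if not labels:
        return 0.0
    return labels.count(label) / len(labels)


# Hypothetical training samples: (set of API calls, label). Illustrative only.
samples = [
    ({"OpenProcess", "WriteProcessMemory", "CreateRemoteThread"}, "malware"),
    ({"OpenProcess", "WriteProcessMemory", "CreateRemoteThread", "CreateFileW"}, "malware"),
    ({"CreateFileW", "ReadFile"}, "benign"),
    ({"OpenProcess", "CreateFileW", "ReadFile"}, "benign"),
]

rule = {"WriteProcessMemory", "CreateRemoteThread"}
rule_support = support(samples, rule)
rule_confidence = confidence(samples, rule, "malware")
```

A rule such as "files calling both `WriteProcessMemory` and `CreateRemoteThread` are malware" would be kept only if its support and confidence exceed chosen thresholds; the thresholds themselves are a design decision not specified in the summary.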
CS266 Software Reverse Engineering (SRE)
Identifying, Monitoring, and Reporting Malware
Teodoro (Ted) Cipresso, teodoro.cipresso@sjsu.edu
Department of Computer Science
San José State University
Spring 2015
A STATIC MALWARE DETECTION SYSTEM USING DATA MINING METHODSijaia
This document presents a static malware detection system using data mining techniques. The system extracts raw features from Windows Portable Executable (PE) files including PE header information, DLLs, and API functions. It then selects important features using Information Gain and reduces dimensions using Principal Component Analysis. Three classifiers (SVM, J48, Naive Bayes) are trained on the transformed feature vectors to classify files as malicious or benign. When evaluated on a dataset of over 247,000 files, the system achieved a detection rate of 99.6%.
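The feature selection step mentioned in this summary, Information Gain, can be sketched for a single feature as follows. This is a generic textbook computation under assumed toy data, not the system's code: the function names and the four-sample labels are illustrative assumptions.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on one feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical data: 1 = malicious, 0 = benign; each feature marks whether a
# given API function is imported (1) or not (0).
labels = [1, 1, 0, 0]
informative_feature = [1, 1, 0, 0]   # perfectly separates the classes
useless_feature = [1, 0, 1, 0]       # tells nothing about the class

gain_informative = information_gain(informative_feature, labels)
gain_useless = information_gain(useless_feature, labels)
```

Features would be ranked by this score and only the top-ranked ones passed on to dimensionality reduction and the classifiers.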
Advanced Threats in the Enterprise: Finding an Evil in the HaystackEMC
This white paper describes the current advanced threat landscape, shortcomings of anti-virus, and how RSA ECAT fills the gap and helps organizations detect advanced malware.
Obfuscated computer virus detection using machine learning algorithmjournalBEEI
Nowadays, computer virus attacks are becoming very advanced. New obfuscated computer viruses created by virus writers automatically generate a new form of the virus for every iteration and download. These constantly evolving viruses pose a significant threat to the information security of computer users, organizations and even governments. However, the signature-based detection technique used by conventional anti-virus software on the market fails to identify them because signatures are unavailable. This research proposes an alternative to the traditional signature-based detection method and investigates the use of machine learning for obfuscated computer virus detection. In this work, text strings extracted from virus program code are used as the features to generate a suitable classifier model that can correctly classify obfuscated virus files. Text string features are used because they are informative and potentially use only a small amount of memory. Results show that unknown files can be correctly classified with 99.5% accuracy using an SMO classifier model. Thus, it is believed that current computer virus defenses can be strengthened through a machine learning approach.
Optimised Malware Detection in Digital Forensics IJNSA Journal
This summarizes a research paper that proposes developing a new framework to optimize malware detection in digital forensics investigations. The paper discusses challenges with existing detection methods, such as signature-based approaches requiring extensive manual analysis. Through a market research survey of forensics professionals, the paper finds weaknesses in current skills, tools, and accuracy rates. Most respondents agreed a new customized detection tool is needed that employs both dynamic and static analysis methods. The proposed framework aims to address these issues to more effectively detect and analyze malware.
This document summarizes a research paper that proposes an "anti-drive" device with an in-built antivirus to protect USB drives from viruses. The anti-drive would scan for viruses when connected to a computer and delete any infected files. This would prevent data on the USB drive from getting corrupted even when connected to an infected computer system. The paper outlines the objectives, literature review on existing antivirus systems, methodology using batch programming and virus signatures, implementation details, and results analysis. Future work could include increasing storage space, using more efficient antivirus chips, and modifying the circuit to make the drive cheaper and more reliable.
This document presents a novel malware clustering system based on kernel data structures. It introduces a data-centric malware defense architecture (DMDA) that models and detects malware behavior based on properties of kernel data objects targeted during attacks. The architecture consists of an external monitor that observes a target OS kernel to map dynamic kernel objects and identify memory access patterns specific to malware attacks in order to generate malware signatures and detect and cluster malware. It aims to complement traditional code-centric malware detection approaches by focusing on the manipulation of kernel data.
Novel Malware Clustering System Based on Kernel Data Structureiosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Investigation of Malware and Forensic Tools on Internet IJECEIAES
Malware is an application that is harmful to your forensic information. Basically, malware analysis is the process of analysing the behaviour of malicious code and then creating signatures to detect and defend against it. Malware, such as Trojan horses, worms and spyware, severely threatens forensic security. This research observed that although malware and its variants may vary a lot in their content signatures, they share some behaviour features at a higher level which are more precise in revealing the real intent of the malware. This paper investigates the various techniques of malware behaviour extraction and analysis. In addition, we discuss the implications of malware analysis tools for malware detection based on various techniques.
Looking to understand how hackers and other attackers use cyber technology to attack your network and your executives? This slide set provides an overview and details the anatomy of a cyber attack, and the strategies you can use to manage and mitigate risk.
The document describes a proposed integrated honeypot system that aims to detect zero-day attacks, SSH attacks, and keylogger-spyware attacks. The system uses honeypots deployed in virtual machines to log attack behaviors. A separate detection framework then analyzes the honeypot logs to generate new signatures for intrusion detection and prevention systems like Snort. The integrated honeypot includes features for logging details of the targeted attacks. The system is meant to help update defenses against new attack patterns.
Malware Detection Module using Machine Learning Algorithms to Assist in Centralized Security in Enterprise Networks
1. International Journal of Network Security & Its Applications (IJNSA), Vol.4, No.1, January 2012
DOI : 10.5121/ijnsa.2012.4106 61
Malware Detection Module using Machine
Learning Algorithms to Assist in Centralized
Security in Enterprise Networks
Priyank Singhal
Student, Computer Engineering
Sardar Patel Institute of Technology
University of Mumbai
Mumbai, India
Nataasha Raul
Research Scholar
Sardar Patel Institute of Technology
University of Mumbai
Mumbai, India
Abstract
Malicious software is abundant in a world of innumerable computer users, who are constantly faced with
these threats from various sources like the internet, local networks and portable drives. Malware is
potentially low to high risk and can cause systems to function incorrectly, steal data and even crash.
Malware may be executable or system library files in the form of viruses, worms, Trojans, all aimed at
breaching the security of the system and compromising user privacy. Typically, anti-virus software is
based on a signature definition system which is regularly updated from the internet and thus keeps track of
known viruses. While this may be sufficient for home users, a security risk from a new virus could
threaten an entire enterprise network.
This paper proposes a new and more sophisticated antivirus engine that can not only scan files, but also
build knowledge and detect files as potential viruses. This is done by extracting system API calls made by
various normal and harmful executables, and using machine learning algorithms to classify and, hence,
rank files on a scale of security risk. While such a system is processor heavy, it is very effective when
used centrally to protect an enterprise network, which may be more prone to such threats.
Keywords: Malware detection, virus, data mining, Information gain, random forest, machine
learning, classification, enterprise, network, security.
1. Introduction
Malware, short for malicious software, consists of programming (code, scripts, active content,
and other software) designed to disrupt or deny operation, gather information that leads to loss
of privacy or exploitation, gain unauthorized access to system resources, and exhibit other abusive
behaviour [1]. It is a general term used to describe a variety of forms of hostile, intrusive, or
annoying software or program code. Software is considered to be malware based on the
perceived intent of the creator rather than any particular features. Malware includes computer
viruses, worms, Trojan horses, spyware, dishonest adware, crime-ware, most rootkits, and other
malicious and unwanted software or programs [2].
In 2008, Symantec published a report stating that "the release rate of malicious code and other
unwanted programs may be exceeding that of legitimate software applications." According to
F-Secure, "As much malware [was] produced in 2007 as in the previous 20 years altogether" [3].
While these figures may mean nothing to the average home user, they are alarming, keeping
in mind the financial implications of such threats to enterprises in case such threats penetrate
and compromise the large volumes of data stored and transacted upon.
Since the rise of widespread Internet access, malicious software has been designed for profit,
for example through forced advertising. For instance, since 2003, the majority of widespread viruses
and worms have been designed to take control of users' computers for black-market
exploitation. Another category of malware, spyware, consists of programs designed to monitor users' web
browsing and steal private information. Spyware programs do not spread like viruses; instead, they
are installed by exploiting security holes or are packaged with user-installed software, such as
peer-to-peer applications [4] [5].
Clearly, there is an urgent need not just to find a suitable method to detect infected files,
but also to build a smart engine that can detect new viruses by studying the structure of system
calls made by malware.
2. Current Antivirus Software
Antivirus software is used to prevent, detect, and remove malware, including but not limited to
computer viruses, computer worms, Trojan horses, spyware and adware. A variety of strategies
are typically employed by anti-virus engines. Signature-based detection involves searching
for known patterns of data within executable code. However, it is possible for a computer to be
infected with a new virus for which no signature exists [6]. To counter such "zero-day" threats,
heuristics can be used to identify new viruses or variants of existing viruses by looking for
known malicious code. Some antivirus products can also make predictions by executing files in a
sandbox and analysing the results.
Often, antivirus software can impair a computer's performance. Because it runs at the highly
trusted kernel level of the operating system, any incorrect decision it makes may lead to a security breach.
If the antivirus software employs heuristic detection, success depends on achieving the right
balance between false positives and false negatives. Today, malware is no longer limited to
executable files; powerful macros in Microsoft Word could also present a security risk.
Traditionally, antivirus software relied heavily upon signatures to identify malware. However,
because of newer kinds of malware, signature-based approaches are no longer effective [7].
Although standard antivirus software can effectively contain virus outbreaks, for large enterprises any
breach could be potentially fatal. Virus makers are employing "oligomorphic", "polymorphic"
and "metamorphic" viruses, which encrypt parts of themselves or modify themselves as a
method of disguise, so as not to match virus signatures in the dictionary [8].
Studies in 2007 showed that the effectiveness of antivirus software had decreased drastically,
particularly against unknown or zero-day attacks. Detection rates dropped from 40-50% in
2006 to 20-30% in 2007. The problem is magnified by the changing intent of virus makers.
Independent testing on all the major virus scanners consistently shows that none provide 100%
virus detection. The best ones provided as high as 99.6% detection, while the lowest provided
only 81.8% in tests conducted in February 2010 [25]. All virus scanners produce false positive
results as well, identifying benign files as malware.
3. Our Approach
As we have seen, current antivirus engine techniques are not optimal for detecting viruses in
real time. They may be useful in controlling viruses once they have infected systems, but by then
the damage may already be fatal for enterprises [9] [10]. This research is thus aimed at a central solution that works at the
firewall level of the enterprise network. The complete system diagram is shown in Figure 1 and
our process diagram is shown in Figure 2.
Figure 1: Network Diagram of the entire system
Figure 2: Process Diagram of our System
Portable Executable (PE)
This format is a file format for executables, object code and DLLs, used in 32-bit and 64-bit
versions of Windows operating systems [23]. The term "portable" refers to the format's
versatility in numerous environments of operating system software architecture. The PE format
is a data structure that encapsulates the information necessary for the Windows OS loader to
manage the wrapped executable code. This includes dynamic library references for linking, API
export and import tables, resource management data and thread-local storage data. On NT
operating systems, the PE format is used for EXE, DLL, SYS (device driver), and other file
types.
A PE file consists of a number of headers and sections that tell the dynamic linker how to map
the file into memory. An executable image consists of several different regions, each of which
requires different memory protection. One of these regions holds the Import Address Table (IAT), which is used as a
lookup table when the application is calling a function in a different module. Because a
compiled program cannot know the memory location of the libraries it depends upon, an
indirect jump is required whenever an API call is made. As the dynamic linker loads modules
and joins them together, it writes actual addresses into the IAT slots, so that they point to the
memory locations of the corresponding library functions.
In our research, we extracted the PE header from numerous infected and normal executables
and, using the IAT, extracted the API calls they import and stored them in a data repository for mining [11] [12]. We
then derived the Information Gain (IG) for each function.
Algorithm for Information Gain:
The entropy of a variable X is defined as:
$$H(X) = -\sum_{i} P(x_i) \log_2 P(x_i)$$
where P(x_i) is given by:
$$P(x_i) = \frac{\text{Number of PE files with } x_i \text{ as a certain API}}{\text{Total number of PE files}}$$
The entropy of X after observing values of another variable Y is defined as:
$$H(X \mid Y) = -\sum_{j} P(y_j) \sum_{i} P(x_i \mid y_j) \log_2 P(x_i \mid y_j)$$
The amount by which the entropy of X decreases reflects the additional information about X
provided by Y and is called the information gain, given by:
$$IG(X \mid Y) = H(X) - H(X \mid Y)$$
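As an illustration, the information gain of a single API function could be computed roughly as follows. This is a minimal sketch rather than the code used in this research: it assumes the standard Shannon entropy with a base-2 logarithm, treats X as the presence (1) or absence (0) of one API in each executable and Y as the class label (1 for infected, 0 for normal), and the sample vectors are invented for the example.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(xs, ys):
    """H(X | Y): entropy of xs within each group of equal ys, weighted by group size."""
    n = len(xs)
    total = 0.0
    for y, count in Counter(ys).items():
        subset = [x for x, yy in zip(xs, ys) if yy == y]
        total += (count / n) * entropy(subset)
    return total


def information_gain(xs, ys):
    """IG(X | Y) = H(X) - H(X | Y)."""
    return entropy(xs) - conditional_entropy(xs, ys)


# Invented example: 8 executables, 4 infected (1) and 4 normal (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical API that appears only in the infected files.
suspicious_api = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical API that appears equally often in both classes.
common_api = [0, 1, 0, 1, 0, 1, 0, 1]

print(information_gain(suspicious_api, labels))
print(information_gain(common_api, labels))
```

An API whose presence tracks the class label exactly receives the maximum gain for a balanced set (1 bit), while an API that is equally common in both classes receives a gain of zero.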
Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the
design and development of algorithms that allow computers to evolve behaviours based on
empirical data, such as from sensor data or databases [14]. A learner can take advantage of data
to capture characteristics of interest of their unknown underlying probability distribution. Data
can be seen as examples that illustrate relations between observed variables. A major focus of
machine learning research is to automatically learn to recognize complex patterns and make
intelligent decisions based on data [15].
Further, we apply the Random Forest Algorithm (RFA) [16], a machine learning
classification algorithm, to construct the classifier that detects malware. A Random Forest is a
classifier composed of a collection of decision tree predictors. Each individual tree is
trained on a partial, independently sampled set of instances selected from the complete training
set. The predicted output class of a classified instance is the most frequent class output of the
individual trees [17] [18].
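The two ideas in this description, training each tree on an independently sampled subset and taking the most frequent class as the output, can be sketched as below. This is a deliberately simplified illustration and not the classifier used in our experiments: each "tree" is only a one-level decision stump over binary API-presence features, the split criterion (Gini impurity) and the random feature subset per tree are assumptions borrowed from common Random Forest practice, and the feature names and data are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels, default=0):
    """Most frequent label, or `default` when the list is empty."""
    return Counter(labels).most_common(1)[0][0] if labels else default


def train_stump(rows, labels, feature_ids):
    """Choose the binary feature with the lowest weighted Gini impurity.

    Returns (feature index, label if feature == 0, label if feature == 1).
    """
    fallback = majority(labels)
    best = None
    for f in feature_ids:
        left = [y for x, y in zip(rows, labels) if x[f] == 0]
        right = [y for x, y in zip(rows, labels) if x[f] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[0]:
            best = (score, f, majority(left, fallback), majority(right, fallback))
    return best[1], best[2], best[3]


def train_forest(rows, labels, n_trees=25, n_features=2, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature subset."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feats = rng.sample(range(len(rows[0])), n_features)
        forest.append(
            train_stump([rows[i] for i in idx], [labels[i] for i in idx], feats)
        )
    return forest


def predict(forest, x):
    """Predicted class = most frequent class among the individual trees."""
    return majority([l1 if x[f] == 1 else l0 for f, l0, l1 in forest])


# Hypothetical features: [CreateRemoteThread, WriteProcessMemory, MessageBoxA]
rows = [[1, 1, 0], [1, 1, 1], [1, 0, 0], [1, 1, 1],
        [0, 0, 1], [0, 0, 0], [0, 1, 1], [0, 0, 0]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = infected, 0 = normal

forest = train_forest(rows, labels)
print(predict(forest, [1, 1, 0]), predict(forest, [0, 0, 1]))
```

A full Random Forest grows deeper trees and re-samples the candidate features at every split; the stump version keeps the example short while preserving the sampling and voting steps.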
4. Obtained Results
To determine whether our method can provide successful results, we extracted data from over
5000 executables. These were a combination of normal and infected files [19] [22] [24].
The first step was to create a hash map of all the executables and functions (Figure 3). After
that, the information gain algorithm was used to choose only the top 80% of the functions (Figure
4), which are most likely to be present in harmful files [20]. The Information Gain is further
corrected by using this formula:
This formula helps in correcting the error by adding or subtracting the average value from the
information gain value calculated. This is similar to the error correction using a standard
deviation.
The results of the same are shown below:
Figure 3: Hash Map of EXEs and API Functions
After running the information gain algorithm, these are the top functions:
Figure 4: Information Gain values of API Functions
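The hash map and the top-80% selection described in this section might look like the following sketch. It is illustrative only: the file names, API sets and labels are hypothetical, the map is a plain dictionary from executable name to its set of imported functions, the information gain uses the uncorrected definition with a base-2 logarithm, and ties are broken alphabetically, which is an arbitrary choice.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def information_gain(xs, ys):
    """IG(X | Y) = H(X) - H(X | Y) for two equally long sequences."""
    n = len(xs)
    cond = 0.0
    for y, count in Counter(ys).items():
        cond += (count / n) * entropy([x for x, yy in zip(xs, ys) if yy == y])
    return entropy(xs) - cond


def rank_apis(imports_by_exe, label_by_exe):
    """Score every API by information gain and return (ranked names, scores)."""
    exes = sorted(imports_by_exe)
    labels = [label_by_exe[e] for e in exes]
    apis = sorted(set().union(*imports_by_exe.values()))
    scores = {}
    for api in apis:
        # One column of the executable-by-function matrix: 1 if imported, else 0.
        column = [1 if api in imports_by_exe[e] else 0 for e in exes]
        scores[api] = information_gain(column, labels)
    ranked = sorted(apis, key=lambda a: (-scores[a], a))
    return ranked, scores


def top_fraction(ranked, fraction=0.8):
    """Keep the highest-ranked share of the functions (at least one)."""
    return ranked[:max(1, math.ceil(len(ranked) * fraction))]


# Hypothetical hash map: executable name -> set of imported API functions.
imports_by_exe = {
    "benign_a.exe": {"CreateFileW", "ReadFile", "MessageBoxA"},
    "benign_b.exe": {"CreateFileW", "MessageBoxA"},
    "mal_a.exe": {"CreateFileW", "WriteProcessMemory", "CreateRemoteThread"},
    "mal_b.exe": {"CreateFileW", "CreateRemoteThread", "ReadFile"},
}
label_by_exe = {"benign_a.exe": 0, "benign_b.exe": 0, "mal_a.exe": 1, "mal_b.exe": 1}

ranked, scores = rank_apis(imports_by_exe, label_by_exe)
selected = top_fraction(ranked, 0.8)
for api in selected:
    print(f"{api}: {scores[api]:.4f}")
```

Note that information gain alone measures how strongly a function separates the two classes, not which class it points to; a function found only in normal files scores as highly as one found only in infected files.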
Using this data, we ran the Random Forest Algorithm, yielding the following results:

Total Instances                     4500
Correctly Classified Instances      4470    99.5556 %
Incorrectly Classified Instances      30     0.4444 %

Table 1: Experiment Results (TP: true positive rate, FP: false positive rate, DR: detection rate, ACY: accuracy)

Algorithm               TP       FP       DR      ACY
Decision Tree           0.9      0.1      90 %    90 %
Naive Bayes             0.95     0.05     95 %    95 %
Random Forest           0.97     0.03     97 %    97 %
Our Proposed Method     0.996    0.003    99 %    98 %
5. Conclusion
In this research, we have proposed a malware detection module based on advanced data mining
and machine learning. While such a method may not be suitable for home users, being very
processor heavy, it can be implemented at the enterprise gateway level to act as a central
antivirus engine that supplements the antivirus software present on end-user computers. It will not only
easily detect known viruses, but also act as a knowledge base that can detect newer forms of harmful
files. Although it is a costly model requiring costly infrastructure, it can help protect invaluable
enterprise data from security threats and prevent immense financial damage.
References
[1] http://www.us-cert.gov/control_systems/pdf/undirected_attack0905.pdf
[2] "Defining Malware: FAQ". http://technet.microsoft.com. Retrieved 2009-09-10.
[3] F-Secure Corporation (December 4, 2007). "F-Secure Reports Amount of Malware Grew by 100%
during 2007". Press release. Retrieved 2007-12-11.
[4] History of Viruses. http://csrc.nist.gov/publications/nistir/threats/subsubsection3_3_1_1.html
[5] Landesman, Mary (2009). "What is a Virus Signature?" Retrieved 2009-06-18.
[6] Christodorescu, M., Jha, S., 2003. Static analysis of executables to detect malicious patterns. In: Proceedings of the 12th USENIX Security Symposium. Washington. pp. 105-120.
[7] Filiol, E., 2005. Computer Viruses: from Theory to Applications. New York, Springer, ISBN 10: 2-287-23939-1.
[8] Filiol, E., Jacob, G., Liard, M.L., 2007. Evaluation methodology and theoretical model for antiviral behavioral detection strategies. J. Comput. 3, pp. 27–37.
[9] H. Witten and E. Frank, 2005. Data mining: Practical machine learning tools with Java implementations. Morgan Kaufmann, ISBN-10: 0120884070.
[10] J. Kolter and M. Maloof, 2004. Learning to detect malicious executables in the wild. In Proceedings of KDD'04, pp. 470-478.
[11] J. Wang, P. Deng, Y. Fan, L. Jaw, and Y. Liu, 2003. Virus detection using data mining techniques. In Proceedings of IEEE International Conference on Data Mining.
[12] Kephart, J., Arnold, W., 1994. Automatic extraction of computer virus signatures. In: Proceedings of 4th Virus Bulletin International Conference, pp. 178–184.
[13] L. Adleman, 1990. An abstract theory of computer viruses (invited talk). CRYPTO '88: Proceedings on Advances in Cryptology, New York, USA. Springer, pp. 354–374.
[14] Lee, T., Mody, J., 2006. Behavioral classification. In: Proceedings of European Institute for Computer Antivirus Research (EICAR) Conference.
[15] Lo, R., Levitt, K., Olsson, R., 1995. Mcf: A malicious code filter. Comput. Secur. 14, pp. 541–566.
[16] M. Schultz, E. Eskin, and E. Zadok, 2001. Data mining methods for detection of new malicious executables. In Security and Privacy Proceedings IEEE Symposium, pp. 38-49.
[17] McGraw, G., Morrisett, G., 2002. Attacking malicious code, report to the infosec research council. IEEE Software. pp. 33–41.
[18] P. Szor, 2005. The Art of Computer Virus Research and Defense. New Jersey, Addison Wesley for Symantec Press. ISBN-10: 0321304543.
[19] Rabek, J., Khazan, R., Lewandowski, S., Cunningham, R., 2003. Detection of injected, dynamically generated, and obfuscated malicious code. In: Proceedings of the 2003 ACM Workshop on Rapid Malcode, pp. 76–82.
[20] S. Hashemi, Y. Yang, D. Zabihzadeh, and M. Kangavari, 2008. Detecting intrusion transactions in databases using data item dependencies and anomaly analysis. Expert Systems, 25, 5, pp. 460–473. DOI: 10.1111/j.1468-0394.2008.00467.x
[21] Sung, A., Xu, J., Chavez, P., Mukkamala, S., 2004. Static analyzer of vicious executables (save). In: Proceedings of the 20th Annual Computer Security Applications Conference. IEEE Computer Society Press, ISBN 0-7695-2252-1, pp. 326-334.
[22] Virus dataset, Available from: http://virussign.com/
[23] Y. Ye, D. Wang, T. Li, and D. Ye, 2008. An intelligent pe-malware detection system based on association mining. In Journal in Computer Virology, 4, 4, pp. 323–334. DOI 10.1007/s11416-008-0082-4.
[24] Zakorzhevsky, 2011. Monthly Malware Statistics. Available from:
http://www.securelist.com/en/analysis/204792182/Monthly_Malware_Statistics_June_2011.
[25] Dan Goodin (December 21, 2007). "Anti-virus protection gets worse". Channel Register.
Retrieved 2011-02-24.