This document discusses obfuscation and mutations in malware. It begins by defining obfuscation as deliberately creating code that is difficult for humans to understand. It then describes various obfuscation techniques used in malware including dead code insertion, register reassignment, and subroutine reordering. The document also discusses different types of malware like viruses, worms, and trojans. It classifies malware into first and second generation and describes techniques used in second generation malware like encryption, oligomorphism, polymorphism, and metamorphism. The document concludes by explaining various malware detection methods such as signature-based, behavior-based, heuristic, and hybrid approaches.
Obfuscation And Detection Of Malware Variants
1. IIITMK
Indian Institute of Information Technology And Management-Kerala
Obfuscation And Mutations In Malware
Student Name: KADARI SHIVRAJ
Course : M.Sc. Cyber Security- IIIrd Semester
Roll No: 20
2. IIITMK
• Obfuscation:-
• Obfuscation is the deliberate act of creating obfuscated code, i.e. source or machine code that is difficult for humans to understand.
OBFUSCATION TECHNIQUES:-
i. Dead-code insertion is a simple technique that adds ineffective instructions to a program to change its appearance while keeping its behavior. An example of such an instruction is nop (no operation).
ii. Register reassignment is another simple technique that switches the registers used from generation to generation while keeping the program code and its behavior the same.
iii. Subroutine reordering obfuscates the original code by changing the order of its subroutines at random. This technique can generate n! different variants, where n is the number of subroutines. For example, Win32/Ghost had ten subroutines, leading to 10! = 3628800 different generations.
3. IIITMK
iv. Instruction substitution evolves the original code by replacing some instructions with equivalent ones. For example, xor can be replaced with sub (both can be used to zero a register), and mov can be replaced with push/pop.
v. Code transposition reorders the instruction sequence of the original code without affecting its behavior. There are two ways to achieve this.
a) The first method randomly shuffles the instructions and then recovers the original execution order by inserting unconditional branches or jumps.
b) The second method creates new generations by choosing and reordering independent instructions that have no impact on one another.
vi. In code integration, introduced by the Win95/Zmist malware (called Zmist), the malware knits itself into the code of its target program. To apply this technique, Zmist first decompiles its target program into manageable objects, seamlessly adds itself between them, and reassembles the integrated code into a new generation.
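As a rough illustration of dead-code insertion and subroutine reordering, the sketch below counts subroutine orderings and inserts a nop into a toy instruction list. The instruction strings and the helpers `count_orderings`, `insert_dead_code` and `strip_nops` are hypothetical names chosen for this example; nothing here operates on real machine code.

```python
import math
import random
from itertools import permutations

def count_orderings(n_subroutines: int) -> int:
    """Number of distinct subroutine orderings: n!."""
    return math.factorial(n_subroutines)

def insert_dead_code(instructions, rng):
    """Return a copy with one 'nop' placed at a random position (toy model)."""
    pos = rng.randrange(len(instructions) + 1)
    return instructions[:pos] + ["nop"] + instructions[pos:]

def strip_nops(instructions):
    """Normalise a toy listing by dropping nops, as a detector might."""
    return [ins for ins in instructions if ins != "nop"]

original = ["mov eax, 1", "add eax, 2", "ret"]
variant = insert_dead_code(original, random.Random(0))

print(count_orderings(10))                       # 3628800
print(len(list(permutations(["A", "B", "C"]))))  # 6
print(variant != original, strip_nops(variant) == original)  # True True
```

The variant looks different from the original listing, yet removing the ineffective instruction recovers it exactly, which is why dead-code insertion alone is regarded as a simple technique.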
4. IIITMK
Malware: Malware is a program with malicious intent that has the potential to harm, without the user's consent, the machine on which it executes or the network over which it communicates.
●The term payload refers to the action that a malicious program is designed to perform on the infected machine.
Malware is broadly classified as first generation and second generation:-
●In first-generation malware, the structure does not change. In second-generation malware, the internal structure changes in every variant while the actions stay the same.
●Based on how the variants are created, second-generation malware is further classified as Encrypted, Oligomorphic, Polymorphic and Metamorphic malware.
5. IIITMK
A) First Generation Malware:-
●Virus: A virus is a self-replicating program that attaches itself to host programs and propagates when an infected program executes, i.e. it requires a host to propagate.
●Worm: A malicious program that uses a network to send copies of itself to other systems is usually called a computer worm.
●Trojan horse: Like viruses, Trojan horses hide their malicious intent inside host programs that may look useful, or at least harmless, to an unsuspecting user.
●Back-door: A back-door is a computer program designed to bypass local security policies in order to give external entities remote control over a machine or a network.
●Spyware: The term spyware usually refers to malicious programs designed to monitor users’ actions in order to collect private information and send it to an external entity over the Internet.
6. IIITMK
B) Second Generation Malware:-
1) Encrypted Malware: Encryption was the first concealment technique used to create second-generation malware.
• It consists of two parts: the encrypted body and the decryption code.
• Usually the body is XORed with a key to make it difficult to detect.
• For each infection, encrypted malware makes the body unique by using a different key, which hides the signature.
2) Oligomorphic Malware: The shortcomings of encrypted malware, whose decryption code stays the same in every infection, led to the development of different concealment techniques. In oligomorphic malware, the decryptor is mutated from one variant to the next.
●The simplest way to create oligomorphic malware is to provide a set of different decryptors rather than one.
●It can still be detected with signature-based techniques by creating a signature for every decryptor.
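To see why a different key per infection hides a fixed byte signature in an encrypted body, the following sketch XORs a harmless placeholder string with two keys. The body bytes, the keys, the scanner pattern and the helper `xor_bytes` are assumptions made for illustration, and a single-byte key is assumed for simplicity.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte with a single-byte key (XOR is its own inverse)."""
    return bytes(b ^ key for b in data)

body = b"HARMLESS-PLACEHOLDER-BODY"   # stand-in for the plain body
signature = b"PLACEHOLDER"            # byte pattern a scanner might look for

enc_a = xor_bytes(body, 0x5A)
enc_b = xor_bytes(body, 0xC3)

print(signature in body)                        # True: plain body matches
print(signature in enc_a, signature in enc_b)   # False False
print(enc_a != enc_b)                           # True: each key gives a different body
print(xor_bytes(enc_a, 0x5A) == body)           # True: decrypting restores the body
```

The pattern no longer appears in either encrypted copy, so a scanner has to target the part that stays readable, which is the decryption code.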
7. IIITMK
3) Polymorphic Malware: In polymorphic malware, millions of decryptors can be generated by changing instructions in each new variant of the malware to avoid signature-based detection.
●Polymorphic malware is created using the obfuscation techniques (dead-code insertion, register reassignment, subroutine reordering, instruction substitution, code transposition, code integration, etc.).
4) Metamorphic Malware: Metamorphic malware is body-polymorphic, i.e. instead of generating a new decryptor, a new instance (body) is created without changing its actions.
●e.g. Phalcon/Skism Mass
8. IIITMK
MALWARE DETECTION METHODS
1) Signature based methods:-
●A signature is a unique feature of each file, something like a fingerprint of an executable.
●These methods use patterns extracted from known malware to identify it, and they are more efficient and faster than the other methods.
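A minimal sketch of signature matching, assuming a signature is either a byte pattern or a whole-file hash. The signature names, patterns and sample bytes below are made up for illustration and do not correspond to real malware.

```python
import hashlib

# Hypothetical signature database: name -> byte pattern.
PATTERN_SIGNATURES = {
    "Demo.PatternA": bytes.fromhex("deadbeef"),
    "Demo.PatternB": b"EVIL_MARKER",
}

def file_fingerprint(data: bytes) -> str:
    """Whole-file fingerprint: SHA-256 hex digest of the contents."""
    return hashlib.sha256(data).hexdigest()

def scan(data: bytes, hash_blacklist: set) -> list:
    """Return the names of all signatures that match the given bytes."""
    hits = [name for name, pat in PATTERN_SIGNATURES.items() if pat in data]
    if file_fingerprint(data) in hash_blacklist:
        hits.append("Demo.KnownHash")
    return hits

known_bad = b"header" + bytes.fromhex("deadbeef") + b"tail"
blacklist = {file_fingerprint(known_bad)}

print(scan(known_bad, blacklist))            # ['Demo.PatternA', 'Demo.KnownHash']
print(scan(b"ordinary content", blacklist))  # []
```

Both checks are simple lookups, which is what makes this approach fast, but any variant that changes the matched bytes escapes it.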
2) Behaviour based methods:-
●In this method, programs with the same behaviour are grouped together, and a single behaviour signature is used to identify the various samples of that malware.
●It consists of:
a) Data Collector: This component collects dynamic/static information about the executable.
b) Interpreter: This component converts the raw information collected by the data collector into intermediate representations.
c) Matcher: This component compares the intermediate representation with the behaviour signatures.
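The three components can be sketched as a small pipeline. Everything here is hypothetical: the trace format, the event names and the behaviour signature are invented to show how the pieces connect, and a real data collector would obtain its trace from a sandbox or from static analysis rather than a hard-coded list.

```python
def collect(sample_trace):
    """Data collector: here the trace is already given as raw event strings."""
    return list(sample_trace)

def interpret(raw_events):
    """Interpreter: reduce raw events to an ordered list of action names."""
    return [event.split("(")[0].strip().lower() for event in raw_events]

def is_subsequence(pattern, actions):
    """True if pattern occurs in actions in order (not necessarily adjacent)."""
    remaining = iter(actions)
    return all(step in remaining for step in pattern)

def match(actions, behaviour_signatures):
    """Matcher: names of the behaviour signatures found in the action list."""
    return [name for name, pattern in behaviour_signatures.items()
            if is_subsequence(pattern, actions)]

# Hypothetical behaviour signature: a program that reads and re-writes a file.
SIGNATURES = {"Demo.SelfCopy": ["openfile", "readfile", "writefile"]}

trace = ["OpenFile(self.exe)", "ReadFile(self.exe)", "Sleep(10)",
         "WriteFile(copy.exe)"]
print(match(interpret(collect(trace)), SIGNATURES))          # ['Demo.SelfCopy']
print(match(interpret(collect(["Sleep(10)"])), SIGNATURES))  # []
```

Because the matcher looks at the order of actions rather than at bytes, the extra `Sleep` call in the trace does not prevent the match.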
9. IIITMK
3) HEURISTIC METHODS:-
●Heuristic malware detection methods use data mining and machine learning techniques to learn the behaviour of an executable file.
●For example, in a first attempt, Naïve Bayes and Multi Naïve Bayes were employed by Schultz et al. to classify malware and benign files.
●Naïve Bayes is a simple but surprisingly powerful algorithm for predictive modelling.
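A minimal Naïve Bayes sketch over binary features, written from scratch to show the idea rather than to reproduce the setup of Schultz et al. The feature names and the four training samples are invented, and the use of Laplace smoothing is an assumption of this sketch.

```python
import math
from collections import defaultdict

def train(samples):
    """samples: list of (set_of_features, label). Returns model statistics."""
    class_counts = defaultdict(int)
    feature_counts = defaultdict(lambda: defaultdict(int))
    vocabulary = set()
    for features, label in samples:
        class_counts[label] += 1
        for f in features:
            feature_counts[label][f] += 1
            vocabulary.add(f)
    return class_counts, feature_counts, vocabulary

def predict(model, features):
    """Return the label with the highest log-posterior (Bernoulli model)."""
    class_counts, feature_counts, vocabulary = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior of the class
        for f in vocabulary:
            # Laplace-smoothed P(feature present | class)
            p = (feature_counts[label][f] + 1) / (n + 2)
            score += math.log(p if f in features else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: each sample is the set of API names it imports.
samples = [
    ({"WriteProcessMemory", "CreateRemoteThread"}, "malware"),
    ({"CreateRemoteThread", "RegSetValue"}, "malware"),
    ({"CreateWindow", "ReadFile"}, "benign"),
    ({"ReadFile", "RegSetValue"}, "benign"),
]
model = train(samples)
print(predict(model, {"CreateRemoteThread", "WriteProcessMemory"}))  # malware
print(predict(model, {"CreateWindow", "ReadFile"}))                  # benign
```

The classifier treats every feature as independent given the class, which is the "naïve" assumption that keeps training down to simple counting.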
10. IIITMK
a) API/System calls:-
●Almost all programs use application programming interface (API) calls to send their requests to the operating system.
●API call sequences are one of the most attractive features for reflecting the behaviour of a piece of code such as malware.
b) OpCode:-
●An OpCode is the part of a machine language instruction that identifies the operation to be executed.
●The most significant research on OpCodes has been done by Bilar, who showed that single OpCodes can be used as a feature in malware detection.
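As one way to turn OpCodes into features, the sketch below computes the relative frequency of each mnemonic in a disassembly listing. The listing is a made-up example and the parsing assumes one instruction per line with the mnemonic first; this is not Bilar's procedure, only a simplified illustration of opcode frequency as a feature.

```python
from collections import Counter

def opcode_frequencies(listing: str) -> dict:
    """Map each mnemonic to its share of all instructions in the listing."""
    opcodes = [line.split()[0].lower()
               for line in listing.splitlines() if line.strip()]
    counts = Counter(opcodes)
    total = sum(counts.values())
    return {op: n / total for op, n in counts.items()}

# Hypothetical four-instruction listing.
listing = """
mov eax, 1
push eax
mov ebx, eax
call helper
"""
freqs = opcode_frequencies(listing)
print(freqs["mov"])    # 0.5
print(sorted(freqs))   # ['call', 'mov', 'push']
```

The resulting dictionary can serve as a feature vector for a classifier, with one dimension per mnemonic.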
11. IIITMK
c) N-Grams:-
●N-Grams are all substrings of length N of a larger string.
●For example, the string “VIRUS” can be segmented into the 3-grams “VIR”, “IRU”, “RUS”.
●Tesauro et al. were the first to use N-Grams as a feature for malware detection. They used N-Grams to detect boot sector viruses using Artificial Neural Networks (ANN).
d) Control flow graph:-
●A CFG is a directed graph in which each node represents a statement of the program and each edge represents control flow between statements (i.e. what happens after what).
●Zhao proposed a detection method for PE files based on features of the control flow graph. First, he created a CFG for each executable file. Then, he used features extracted from the CFG as the training data.
●These features are information about nodes, edges and subgraphs.
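The 3-gram segmentation of “VIRUS” given for N-Grams can be reproduced with a sliding window. The helper name `ngrams` is an assumption of this sketch; the same function also works on byte strings, which is the form an executable takes.

```python
def ngrams(sequence, n):
    """Return all contiguous length-n windows of a string or bytes object."""
    if n <= 0:
        raise ValueError("n must be positive")
    return [sequence[i:i + n] for i in range(len(sequence) - n + 1)]

print(ngrams("VIRUS", 3))           # ['VIR', 'IRU', 'RUS']
print(ngrams(b"\x4d\x5a\x90", 2))   # [b'MZ', b'Z\x90']
```

A string of length L yields L - N + 1 N-Grams, so the feature count grows with file size and some form of feature selection is usually needed.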
12. IIITMK
(CFG contd.)
●After feature selection, data mining algorithms such as Decision Tree, Bagging and Random Forest were used for classification based on these features.
e) Hybrid Features:-
●A hybrid feature is a combination of two features, e.g. CFG and API calls.
●Eskandari et al. used a simple CFG and API calls to detect metamorphic malware.
●The CFG was used to understand the semantics of the malware.
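Both the CFG-based method and the hybrid approach draw features from a control flow graph. As a simplified sketch, the code below stores a CFG as an adjacency mapping and derives a few node and edge features. The graph and the chosen features are assumptions for illustration and are not the feature sets used by Zhao or by Eskandari et al.

```python
def cfg_features(cfg: dict) -> dict:
    """Simple structural features of a CFG given as node -> successor list."""
    nodes = set(cfg) | {s for succs in cfg.values() for s in succs}
    edges = [(src, dst) for src, succs in cfg.items() for dst in succs]
    out_degree = {n: len(cfg.get(n, [])) for n in nodes}
    return {
        "nodes": len(nodes),
        "edges": len(edges),
        "branch_nodes": sum(1 for d in out_degree.values() if d > 1),
        "exit_nodes": sum(1 for d in out_degree.values() if d == 0),
    }

# Hypothetical CFG: an if/else that rejoins before the exit.
cfg = {
    "entry": ["then", "else"],
    "then": ["join"],
    "else": ["join"],
    "join": ["exit"],
}
print(cfg_features(cfg))
# {'nodes': 5, 'edges': 5, 'branch_nodes': 1, 'exit_nodes': 1}
```

A feature dictionary like this, computed per executable, is the kind of input that a classifier such as a decision tree or random forest could be trained on.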
13. IIITMK
References:-
1) Issa Traore, Shahid Alam and Ibrahim Sogukpinar, "Current Trends and the Future of Metamorphic Malware Detection".
2) Ilsun You and Kangbin Yim, "Malware Obfuscation Techniques: A Brief Survey".
3) Marco Gaudesi, Andrea Marcelli, Ernesto Sanchez, Giovanni Squillero and Alberto Tonda, "Malware Obfuscation through Evolutionary Packers".
4) Sachin Jain, "Malware Obfuscator for Malicious Executables".
5) Wei Wang, "Virus Obfuscation".
6) Zahra Bazrafshan, Hashem Hashemi, Seyed Mehdi Hazrati Fard and Ali Hamzeh, "A Survey on Heuristic Malware Detection Techniques".
7) Mila Dalla Preda, "Code Obfuscation and Malware Detection by Abstract Interpretation".