The document discusses using supervised machine learning for malware detection. It aims to classify files as malicious or not malicious by analyzing their PE headers with algorithms like ExtraTreeClassifier and RandomForestClassifier. The process involves extracting features from a dataset of PE files, using the classifiers to optimize and partition the data, then training the random forest model to classify files. Machine learning can effectively analyze malware and help build better antivirus solutions to detect threats in real-time.
Supervised Learning in Malware Detection
1. Supervised Learning in Cybersecurity
Ramkrushna M.
Assistant Professor
International Institute of Information Technology, I²IT, P-14, Rajiv Gandhi Infotech Park, Hinjawadi Phase 1, Pune - 411 057
Phone - +91 20 22933441/2/3 | Website - www.isquareit.edu.in | Email - info@isquareit.edu.in
2. Contents
Introduction, Motivation, Objectives, Flow Process, Implementation, Applications
3. Introduction
Cybersecurity:
Cybersecurity, also called information security, rests on three primary principles: the confidentiality, integrity, and availability of information (the CIA triad). It comprises a set of ethical tools, risk management techniques, and best practices created to protect networks, devices, programs, and data from unauthorized access.
Malware:
Malware is software created to harm a computer, server, client, or computer network. Examples of malware include computer viruses, worms, Trojan horses, ransomware, and spyware.
4. Motivation
• As technology advances, it becomes ever more important to protect information and data from intruders (black hats).
• Data is central to any infrastructure, so it is necessary to safeguard it from theft or any kind of tampering.
• This is where cybersecurity comes into the picture: it protects us from malicious activity.
5. Objectives
• To analyze malware with the help of ML and PE header files.
• To classify whether a file is malicious or not.
• To build a malware detection application with the help of ExtraTreeClassifier, RandomForestClassifier, and PE header files.
6. Flow Process
Malware detection using PE headers:
Start → Dataset (PE files) → ExtraTreeClassifier → RandomForest → Output
7. Implementation Steps:
• To apply machine learning to malware analysis, we use tools, PE header files, and machine learning algorithms such as the extra tree classifier and random forest.
• We chose ExtraTreeClassifier and random forest over other ML algorithms such as gradient boosting.
• The main advantage we see in the random forest classifier over gradient boosting is that a random forest builds a multitude of independent decision trees and combines their votes. Adding more trees generally makes the classification more stable, although the gain levels off beyond a certain number of trees.
• Input files → PE headers
• ML algorithm → ExtraTreesClassifier / random forest
• Find the accuracy ratio
• Classify the malware
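The "find the accuracy ratio" step can be illustrated with a minimal sketch in plain Python. The slides do not say how accuracy was computed, so the label convention (1 = malicious, 0 = legitimate), the function names, and the sample labels below are all assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives (1 = malicious, 0 = legitimate)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1  # malware correctly flagged
        elif pred == 1 and truth == 0:
            fp += 1  # legitimate file wrongly flagged
        elif pred == 0 and truth == 0:
            tn += 1  # legitimate file correctly passed
        else:
            fn += 1  # malware missed
    return tp, fp, tn, fn


def _ratio(numerator, denominator):
    """Divide safely, returning 0.0 when there is nothing to divide by."""
    return numerator / denominator if denominator else 0.0


def accuracy(y_true, y_pred):
    """Fraction of files classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return _ratio(tp + tn, tp + fp + tn + fn)


def false_positive_rate(y_true, y_pred):
    """Fraction of legitimate files wrongly flagged as malicious."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return _ratio(fp, fp + tn)


# Hypothetical labels for eight files, not taken from the slides.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
print(f"accuracy = {accuracy(y_true, y_pred):.2f}")                        # accuracy = 0.75
print(f"false positive rate = {false_positive_rate(y_true, y_pred):.2f}")  # false positive rate = 0.25
```

Accuracy alone can hide a high rate of missed malware when most files in the dataset are legitimate, which is why the sketch also reports the false positive rate separately.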
8. Portable Executable (PE) File Format
• The Portable Executable (PE) format is a file format for executables, object code, and DLLs, used in 32-bit and 64-bit versions of Windows operating systems.
• The PE format defines how the Windows operating system loads and executes code, and it stores the essential data needed to run a program. It is derived from the Microsoft Common Object File Format (COFF).
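To make the idea of "PE header features" concrete, here is a minimal sketch that reads a few COFF file-header fields using only Python's standard `struct` module. The slides do not say which tool or which fields were used, so the function names and the choice of fields are assumptions; the byte offsets follow the published PE layout (the "MZ" DOS header, the `e_lfanew` pointer at offset 0x3C, the "PE\0\0" signature, then the 20-byte COFF header).

```python
import struct

COFF_FORMAT = "<HHIIIHH"  # little-endian, 20 bytes in total


def parse_pe_header(data: bytes) -> dict:
    """Extract a few COFF file-header fields from the raw bytes of a PE file."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew: 4-byte offset of the PE signature, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    coff_start = pe_offset + 4
    if len(data) < coff_start + struct.calcsize(COFF_FORMAT):
        raise ValueError("truncated COFF header")
    (machine, sections, timestamp, _symbol_table, _symbol_count,
     optional_size, characteristics) = struct.unpack_from(COFF_FORMAT, data, coff_start)
    return {
        "Machine": machine,
        "NumberOfSections": sections,
        "TimeDateStamp": timestamp,
        "SizeOfOptionalHeader": optional_size,
        "Characteristics": characteristics,
    }


def build_minimal_header() -> bytes:
    """Build synthetic header bytes for the demo (not a runnable executable)."""
    dos = bytearray(0x40)
    dos[:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after the DOS header
    coff = struct.pack(COFF_FORMAT, 0x14C, 3, 0, 0, 0, 0xE0, 0x0102)
    return bytes(dos) + b"PE\0\0" + coff


features = parse_pe_header(build_minimal_header())
print(features["Machine"] == 0x14C)   # True (0x14C is the x86 machine type)
print(features["NumberOfSections"])   # 3
```

Each returned value is a plain integer, so the dictionary can be turned directly into one row of a feature table for a classifier.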
9. Step by Step Classification :
• Dataset (PE files): PE is the file format for DLLs, executables, and object code on Windows.
• ExtraTreeClassifier: used to optimize the dataset, i.e. to split (partition) it into legitimate and non-legitimate samples.
• RandomForestClassifier: a classification method that constructs a multitude of decision trees at training time.
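The three stages above can be sketched end to end as follows. This is an illustration under several assumptions, not the author's implementation: it assumes the scikit-learn library, uses its `ExtraTreesClassifier` to rank and select features (one common reading of "optimizing the dataset"), and replaces the real PE dataset with synthetic data from `make_classification`, where label 1 stands in for malicious files. The function name `run_pipeline` and all parameter values are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import train_test_split


def run_pipeline(random_state: int = 0):
    """Select features with extra trees, then classify with a random forest."""
    # Synthetic stand-in for a table of PE header features (assumption).
    X, y = make_classification(
        n_samples=1000, n_features=30, n_informative=8,
        n_redundant=2, random_state=random_state,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state,
    )

    # Stage 1: keep only features whose importance is above the mean importance
    # (the default threshold of SelectFromModel for tree-based estimators).
    selector = SelectFromModel(
        ExtraTreesClassifier(n_estimators=100, random_state=random_state)
    )
    selector.fit(X_train, y_train)
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    # Stage 2: train the random forest on the reduced feature set.
    forest = RandomForestClassifier(n_estimators=100, random_state=random_state)
    forest.fit(X_train_sel, y_train)

    # Stage 3: report accuracy on held-out samples.
    return X_train_sel.shape[1], forest.score(X_test_sel, y_test)


n_selected, test_accuracy = run_pipeline()
print(f"features kept: {n_selected} of 30")
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Fitting the selector on the training split only keeps information from the held-out files out of the feature ranking, so the reported accuracy is not inflated by the selection step.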
10. Applications:
Machine learning can play a major role in the cybersecurity field. Here are some applications of machine learning to malware:
• ML applied to malware analysis helps us analyze and classify the different types of malware.
• ML together with neural networks can help us identify and classify malware in real time, so that precautions can be taken before it spreads or causes harm.
• ML applied to malware detection will help us build better antivirus software that gives better protection to IT infrastructure and to people.
11. Thank You
Ramkrushna M.
Assistant Professor
http://www.isquareit.edu.in/