Project Report On
A Machine Learning Framework for Domain
Generation Algorithm (DGA)-Based
Malware Detection
Submitted in partial fulfillment of the requirements of the degree of
Fourth Year of Engineering in Information Technology
Submitted by
Harsh Dobariya 714
Akshay Kalapgar 731
Mohit Kamble 732
Siddhesh Parab 750
Guided by
Prof. Abhay E. Patil
DEPARTMENT OF INFORMATION TECHNOLOGY
UNIVERSITY OF MUMBAI
2020 - 2021
DEPARTMENT OF INFORMATION TECHNOLOGY
CERTIFICATE
Date:
This is to certify that, the project work embodied in this report entitled, “A Machine
Learning Framework for Domain Generation Algorithm (DGA)-Based Malware Detection”
submitted by “Harsh Dobariya bearing Roll No. 714”, “Akshay Kalapgar bearing Roll No.
731”, “Mohit Kamble bearing Roll No. 732”, “Siddhesh Parab bearing Roll No. 750” for the
award of Fourth Year Of Engineering (B.E.) degree, is a work carried out by them under my
guidance and supervision within the institute. The work described in this project report is
carried out by the concerned students and has not been submitted for the award of any other
degree of the University of Mumbai.
Further, it is to certify that the students were regular during the academic year 2020-21
and have worked under the guidance of concerned faculty until the submission of this project
work at MCT’s Rajiv Gandhi Institute of Technology, Mumbai.
Prof. Abhay E. Patil
Project Guide
Dr. Sunil B. Wankhade Dr. Sanjay U. Bokade
Head of Department Principal
CERTIFICATE OF APPROVAL
This project report is entitled
A Machine Learning Framework for Domain Generation Algorithm
(DGA)-Based Malware Detection
Submitted by:
Harsh Dobariya 714
Akshay Kalapgar 731
Mohit Kamble 732
Siddhesh Parab 750
In partial fulfillment of the requirements of the degree of Fourth Year in
Bachelor of Engineering in Information Technology is approved.
Internal Examiner
External Examiner
SEAL OF
INSTITUTE
Declaration
I declare that this written submission represents my ideas in my own words and where
others' ideas or words have been included, I have adequately cited and referenced the original
sources. I also declare that I have adhered to all principles of academic honesty and integrity and
have not misrepresented or fabricated or falsified any idea/data/fact/source in my submission. I
understand that any violation of the above will be cause for disciplinary action by the Institute
and can also evoke penal action from the sources which have thus not been properly cited or from
whom proper permission has not been taken when needed.
ROLL NO.    NAME                SIGNATURE
714         HARSH DOBARIYA
731         AKSHAY KALAPGAR
732         MOHIT KAMBLE
750         SIDDHESH PARAB
Date:
Place:
Acknowledgement
With all reverence, we take the opportunity to express our deep sense of gratitude and
wholehearted indebtedness to our respected guide, Prof. A.E. Patil, Department of Information
Technology, Rajiv Gandhi Institute of Technology, Mumbai. From the day of conception of
this project, his active involvement and motivating guidance on a day-to-day basis have
made it possible for us to complete this challenging work in time.
We would like to express a deep sense of gratitude to our respected Head of the
Department, Dr. Sunil B. Wankhade, who went out of his way to help us in all genuine cases
during the course of this project. We wish to express our sincere thanks to Dr. Sanjay
Bokade, Principal, Rajiv Gandhi Institute of Technology, Mumbai, and would specifically like to
acknowledge his guidance, encouragement and inspiration throughout our
academics.
We would like to thank all the staff of the Information Technology Department, who
continuously supported and motivated us during our work. Also, we would like to thank our
colleagues for their continuous support and motivation during the project work. Finally, we
would like to express our gratitude to our families for their eternal belief in us. We would not be
where we are today without their support and encouragement.
HARSH DOBARIYA
AKSHAY KALAPGAR
MOHIT KAMBLE
SIDDHESH PARAB
Date:
Place:
Abstract
Attackers usually use a Command and Control (C2) server to manipulate the communication. In
order to perform an attack, threat actors often employ a Domain Generation Algorithm (DGA),
which can allow malware to communicate with C2 by generating a variety of network locations.
Traditional malware control methods, such as blacklisting, are insufficient to handle DGA threats.
In this paper, we propose a machine learning framework for identifying and detecting DGA
domains to alleviate the threat. We collect real-time threat data from the real-life traffic over a one-
year period. We also propose a deep learning model to classify a large number of DGA domains.
The proposed machine learning framework consists of a two-level model and a prediction model. In
the two-level model, we first classify the DGA domains apart from normal domains and then use
the clustering method to identify the algorithms that generate those DGA domains. In the prediction
model, a time-series model is constructed to predict incoming domain features based on the Hidden
Markov Model (HMM). Furthermore, we build a Deep Neural Network (DNN) model to enhance
the proposed machine learning framework by handling the huge dataset we gradually collected. Our
extensive experimental results demonstrate the accuracy of the proposed framework and the DNN
model. To be precise, we achieve an accuracy of 95.89% for the classification in the framework and
97.79% in the DNN model, 92.45% for the second-level clustering, and 95.21% for the HMM
prediction in the framework.
Keywords: Domain Generation Algorithm (DGA), Command and Control (C2), Malware Detection,
Machine Learning, Clustering, Hidden Markov Model (HMM), Deep Neural Network (DNN).
Table of Contents
Chapters Title of the Chapters
Chapter 1 Introduction
Chapter 2 Aim and Objectives
Chapter 3 Literature Survey
Chapter 4 Existing System
Chapter 5 Problem Statement
Chapter 6 Proposed System
Chapter 7 Methodology
Chapter 8 Details of Hardware and Software
Chapter 9 Implementation
Chapter 10 Advantages & Disadvantages
Chapter 11 Scope
Chapter 12 References
Table of Figures
Fig No. List of Figures
1. Diagrammatical Representation of working of the system
2. Flowchart of the System
3. Use Case Diagram of the System
4. Activity Diagram of the System
5. DFD Diagram of the System
6. Class Diagram of the System
7. Software Testing Lifecycle
Chapter 1.
Introduction
Malware attackers attempt to infiltrate layers of protection and defensive solutions, resulting
in threats to a computer network and its assets. Anti-malware software has been widely used in
enterprises for a long time, since it can provide some level of security on computer networks and
systems by detecting and mitigating malware attacks. However, many anti-malware solutions typically
rely on static string matching, hashing schemes, or network communication whitelisting.
These solutions are too simple to counter sophisticated malware attacks, which purposely integrate
evasive techniques to hide communication channels and bypass most detection schemes.
This issue poses a serious threat to the security of an enterprise and is a grand
challenge that needs to be addressed.
In this paper, we first propose a machine learning framework to classify and detect DGA
malware and develop a DNN model to classify the large datasets of DGA domains that we
gradually collected. We then experimentally evaluate the proposed framework through a
comparison of various machine learning approaches and a deep learning model. Specifically, our
machine learning framework consists of the following main components:
• A dynamic blacklist, which consists of a pattern filter.
• A two-level machine learning model: the first-level classification and the second-level
clustering. To identify DGA domains, we first use various classification models to separate
DGA domains from normal domains. Then, we apply a clustering method to group the
domains sequenced by each DGA.
• A time series prediction model: we propose a Hidden Markov Model (HMM) to predict
incoming DGA domain features in order to better identify the DGA domains.
The general goal of our machine learning framework is to determine which algorithm is
employed so that our proposed framework can prevent future communications from the C2.
Furthermore, we have gradually collected data for over one year and have obtained a
large number of datasets from real traffic. To analyze these data, we also propose a deep learning
approach for large dataset classification. We first build a DNN model and then compare it with our
machine learning models. The comparison results provide a useful guideline for our future study
of DGA detection and prediction. In our future research, we will also apply deep learning to
clustering and prediction, which are out of the scope of this paper.
Chapter 2.
Aim & Objectives
Aim:-
To solve the problem of detecting DGA sequences using machine learning techniques derived from
observations in a network.
Objectives:-
The objectives of the system are:
1. Use the DBSCAN algorithm with the extracted domain features to calculate the domain
distance and to group together the domains that are generated by the same DGA
according to their domain feature difference.
2. Distinguish the training stage of each model from its prediction stage.
3. Use a neural network in which the nodes in each layer are fully connected to the nodes in the
next layer, trained so that it does not miss local minima, even though it then takes a long time to converge.
Chapter 3.
Literature Survey
❖ Literature survey in a tabular format for better understanding.

Title: A Machine Learning Framework for Domain Generation Algorithm-Based Malware Detection
Authors: Yi Li, Kaiqi Xiong, Tommy Chin, Chengbin Hu. Institute of Electrical and Electronics Engineers, 2019.
Advantages: In the second-level clustering we apply the DBSCAN algorithm. Only the DGA domains obtained from the first-level classification will be used for clustering.
Disadvantages: Research problem is to accurately identify and cluster domains that originate from known DGA-based techniques where we target to develop a security approach that autonomously mitigates network communications to unknown threats in a sequence.
Result: Domain Generation Algorithm (DGA) is used.

Title: Learning and Classification of Malware Behavior
Authors: Konrad Rieck, Thorsten Holz, Carsten Willems, Patrick Düssel, Pavel Laskov. Kluwer Academic Publishers, Dordrecht (2002).
Advantages: Results show that 70% of malware instances not identified by anti-virus software can be correctly classified by their approach.
Disadvantages: Proposed machine learning framework aims to solve the problem of detecting DGA sequences using machine learning techniques derived from observations in a network.
Result: Normalized Compression Distance; Benign Executable.

Title: An SDN based framework for guaranteeing security and performance in information-centric cloud networks
Authors: U. Ghosh, P. Chatterjee, D. Tosh, S. Shetty, K. Xiong, and C. Kamhou. 11th IEEE International Conference on Cloud Computing (IEEE Cloud), 2017.
Advantages: The framework has the ability to dynamically compute the routing path to guarantee security and performance of the network.
Disadvantages: Queries not matching the knowledge are stored in a backlog of the software.
Result: SDN-based framework and information-centric services.

Title: A two-hashing table multiple string pattern matching algorithm
Authors: C. Khancome, V. Boonjing, and P. Chanvarasuth. Tenth International Conference on Information Technology: New Generations (ITNG). IEEE, 2013.
Advantages: The attempting times were less than those of the traditional algorithms, especially in the case of a very long minimum pattern length.
Disadvantages: The lengthy processing time when directly extended to multiple string pattern matching.
Result: Executes multiple string pattern matching algorithm.
Chapter 4.
Existing System
Threat model: A DGA has to function under multiple conditions in a network environment, such as
filtering by a firewall that blocks the communication, and an empty cell in an Internet
domain that results in an NXDOMAIN error.
Each HMM data record represents a series of domain observations. First, sequences of
domain names are processed by a feature extractor, and each of the resulting feature vectors is used as a
training record.
Then, similar sequences are clustered as a group of DGA domain names with certain
outcomes. After the training process, if a sequence does not have an HMM sequence representation
(that is, it appears in the test data but not in the training data), the HMM model generates the
future predicted results. Otherwise, we use the existing HMM sequence representation.
Disadvantages of Existing System:
1. A firewall blocks the communication, and an empty cell in an Internet domain results in
an NXDOMAIN error.
2. Queries not matching the knowledge are stored in a backlog of the software.
Chapter 5.
Problem Statement
For the malware to communicate correctly with the appropriate domain, a threat actor must
register each respective domain name in the sequence to maintain the C2, or risk the loss of a node
in the distribution.
Our research problem is to accurately identify and cluster domains that originate from
known DGA-based techniques, with the target of developing a security approach that autonomously
mitigates network communications to unknown threats in a sequence.
Chapter 6.
Proposed System
In our proposed system, domains are extracted from DGAs. The machine learning framework
encompasses multiple feature extraction techniques and models to classify DGA domains
from normal domains, cluster the DGA domains, and predict a DGA domain.
A deep learning model handles large datasets. Multiple online sources, found through simple
Google searches, provide example code for DGA construction.
Online threat intelligence feeds give an approach to examining current and live threats in a real-
world environment.
The accuracy of the proposed approach is measured using real-time active malicious domains
derived from DGAs on the public Internet.
The data is structured in CSV format, containing domain names, originating
malware, and DGA membership, with a daily file size of approximately 110 MB.
We propose a machine learning framework that consists of three important steps, as shown
in Fig. 6.1.
We first take the DNS queries with the payload as the input.
Fig. 6.1: Diagrammatical Representation of working of the system
Advantages of Proposed System:
1. The system targets malware that uses a Domain Generation Algorithm (DGA), which allows
malware to generate numerous domain names until it finds its corresponding C&C server.
2. A DGA is highly resilient to detection systems and reverse engineering, while allowing the C&C
server to have several redundant domain names.
Chapter 7.
Methodology
We will develop this project using Python and web technology.
1. Filtering packet data:-
To filter packet data we use pyshark, which captures network packets.
We store the packet information in pcap format.
By reading each packet we filter the data and obtain the domain name.
The packet flow is also obtained from this.
If an extracted domain name is found in the blacklist, we stop the further steps for that domain.
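The filtering step above can be sketched as follows. This is a minimal sketch under stated assumptions, not the project's actual code: the function names, the pcap path and the blacklist entry are hypothetical, and `domains_from_pcap` assumes that pyshark and tshark are installed and that DNS queries expose the `qry_name` field.

```python
def normalize(domain):
    """Lower-case a DNS name and drop surrounding spaces and a trailing dot."""
    return domain.strip().lower().rstrip(".")


def domains_from_pcap(pcap_path):
    """Yield queried domain names from a pcap file (needs pyshark and tshark)."""
    import pyshark  # imported lazily so the rest of the sketch runs without it

    capture = pyshark.FileCapture(pcap_path, display_filter="dns")
    try:
        for packet in capture:
            # qry_name is the DNS query name field as exposed by tshark
            name = getattr(packet.dns, "qry_name", None)
            if name:
                yield normalize(str(name))
    finally:
        capture.close()


def split_by_blacklist(domains, blacklist):
    """Separate blacklisted domains from those that still need
    feature extraction and classification."""
    known, unknown = [], []
    for domain in domains:
        name = normalize(domain)
        (known if name in blacklist else unknown).append(name)
    return known, unknown


# Hypothetical blacklist entry and hand-written input instead of a real capture.
blacklist = {"nxgbdtnvrfker.ru"}
known, unknown = split_by_blacklist(
    ["NXGBDTNVRFKER.ru.", "easypdfcombine.com"], blacklist
)
print("blacklisted:", known)
print("to classify:", unknown)
```

With a real capture, `split_by_blacklist(domains_from_pcap("capture.pcap"), blacklist)` would be the equivalent call, where `capture.pcap` is a placeholder file name.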
2. Feature extraction work:-
Using Python code, we will calculate the following features:
• Length:- the length of the domain name.
• Meaningful Word Ratio:- a dictionary of meaningful words is maintained, and the output is
obtained by dividing the length of the meaningful words found by the length of the domain name.
• Percentage of Numerical Characters:- the share of numeric characters in the domain name.
• Pronounceability Score:- calculated from the frequency of the text in the domain name.
• Percentage of the Length of the Longest Meaningful String (LMS):- the length of the longest
meaningful word divided by the length of the domain name.
• Levenshtein Edit Distance:- the minimum number of single-character edits
between the current domain and its previous domain (its predecessor) in a stream of DNS
queries received by the server.
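Several of the features listed above can be sketched in plain Python. The word list `WORDS`, the function names and the greedy word-matching strategy are assumptions made for illustration; a real system would load a full dictionary. The pronounceability score is left out because the text does not define it precisely.

```python
# Hypothetical mini-dictionary; a real system would load a full word list.
WORDS = {"easy", "pdf", "combine", "mail", "shop"}


def label_of(domain):
    """Return the part of the domain before the first dot, lower-cased."""
    return domain.lower().split(".")[0]


def numeric_percentage(domain):
    """Percentage of digits in the domain label."""
    name = label_of(domain)
    return 100.0 * sum(ch.isdigit() for ch in name) / len(name) if name else 0.0


def longest_meaningful_string_pct(domain, words=WORDS):
    """Length of the longest dictionary word in the label, as a percentage."""
    name = label_of(domain)
    longest = max((len(w) for w in words if w in name), default=0)
    return 100.0 * longest / len(name) if name else 0.0


def meaningful_word_ratio(domain, words=WORDS):
    """Share of the label covered by dictionary words (greedy, longest first)."""
    name = label_of(domain)
    covered, i = 0, 0
    while i < len(name):
        match = max((w for w in words if name.startswith(w, i)), key=len, default=None)
        if match:
            covered += len(match)
            i += len(match)
        else:
            i += 1
    return covered / len(name) if name else 0.0


def levenshtein(a, b):
    """Dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def extract_features(domain, previous_domain):
    """Feature record for one domain and its predecessor in the DNS stream."""
    return {
        "length": len(label_of(domain)),
        "meaningful_word_ratio": meaningful_word_ratio(domain),
        "numeric_percentage": numeric_percentage(domain),
        "lms_percentage": longest_meaningful_string_pct(domain),
        "edit_distance": levenshtein(previous_domain.lower(), domain.lower()),
    }


print(extract_features("easypdfcombine.com", "nxgbdtnvrfker.ru"))
```

Each record is a flat dictionary, so it can be turned directly into a feature vector for the classifiers described in the next step.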
3. Machine learning classification:-
The following algorithms will be applied to the features obtained above.
• Decision tree:-
It calculates entropy and information gain to generate its output, but it has the problem of
overfitting. We will generate the model with the selected features.
• ANN:- Artificial neural network.
Here we define an input layer, a hidden layer and an output layer. Then we calculate the
output from the features.
• SVM:- Support vector machine.
It is a good binary classifier. We will train it with the features and a model will be generated.
We are using the scikit-learn (sklearn) Python library.
• Multiple logistic regression:- The logistic model (or logit model) is used to model the
probability of a certain class or event existing, such as pass/fail, win/lose, alive/dead or
healthy/sick.
• Naive Bayes:- It calculates the probability of a certain class occurring. The model will be
generated and stored using pickle.
• Random forest:- Random forest avoids the overfitting problem. The model will be generated
and stored using pickle.
For each of these machine learning models, the following will be calculated:
i. Precision
ii. Recall
iii. F1 score
iv. Accuracy
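The four evaluation measures above can be computed from the true and predicted labels as sketched below. The labels and function names are made up for illustration and are not results from the project. scikit-learn offers equivalent functions in `sklearn.metrics` (`precision_score`, `recall_score`, `f1_score`, `accuracy_score`); the plain-Python version only makes the arithmetic explicit.

```python
def confusion_counts(y_true, y_pred, positive="dga"):
    """Count true/false positives and negatives for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive="dga"):
    """Return precision, recall, F1 score and accuracy as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / len(y_true) if y_true else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical labels for eight domains (not measured data).
y_true = ["dga", "dga", "dga", "normal", "normal", "normal", "normal", "dga"]
y_pred = ["dga", "dga", "normal", "normal", "normal", "dga", "normal", "dga"]
report = evaluate(y_true, y_pred)
print(report)
```

Running the same function over the predictions of each trained model gives a like-for-like comparison, which is what the later "check high accuracy of algo and use it" step relies on.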
• Clustering:-
DBSCAN is used, which also detects outliers.
Outliers are specific entries in the dataset that differ from the other points and do not play a
vital role in classification.
In statistics, an outlier is an observation point that is distant from other observations.
Here, domain names will be clustered into groups such as:
i. Cryptolocker, e.g. nxgbdtnvrfker.ru
ii. TOVAR, e.g. gppwkpxyremp.net
iii. Dyre, e.g. q2aa41a5b31294e5e6f28d1adcf48a54b.tk
iv. Normal domain, e.g. easypdfcombine.com
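To illustrate how DBSCAN groups similar domains and marks outliers, here is a minimal pure-Python sketch of the algorithm. The feature vectors, `eps` and `min_pts` are hypothetical values chosen for the example; in practice a library implementation such as `sklearn.cluster.DBSCAN` would likely be used, with features scaled to comparable ranges.

```python
import math

NOISE = -1  # label given to outliers


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (including idx)."""
    return [j for j, q in enumerate(points) if euclidean(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is a core point: keep expanding
    return labels


# Hypothetical (length, numeric percentage) vectors for seven domains.
features = [(12, 0), (12, 2), (13, 1), (33, 45), (34, 47), (33, 44), (60, 90)]
labels = dbscan(features, eps=3.0, min_pts=2)
print(labels)
```

The last vector has no neighbour within `eps`, so it is labelled as noise, which corresponds to the outlier handling described above.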
4. Time series prediction:-
We use every domain cluster to train a separate HMM model. Each HMM data record
represents a series of domain observations. First, a sequence of domain names is processed by
a feature extractor and each of the resulting feature vectors is used as a training record. Then, similar
sequences are clustered as a group of DGA domain names with certain outcomes.
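As an illustration of how a trained HMM can score a sequence of observations and predict the next one, the sketch below implements the forward algorithm for a small discrete HMM. All probabilities, the two hidden states and the two observation symbols are invented for the example; training the parameters from each cluster (for instance with Baum-Welch) is not shown.

```python
def forward(observations, start, trans, emit):
    """Forward algorithm: unnormalised state probabilities after the last
    observation; their sum is the likelihood of the whole sequence."""
    n_states = len(start)
    alpha = [start[s] * emit[s][observations[0]] for s in range(n_states)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs]
            for s in range(n_states)
        ]
    return alpha


def predict_next(observations, start, trans, emit):
    """Probability of each observation symbol at the next time step."""
    n_states = len(start)
    alpha = forward(observations, start, trans, emit)
    total = sum(alpha)
    state_now = [a / total for a in alpha]
    state_next = [
        sum(state_now[p] * trans[p][s] for p in range(n_states))
        for s in range(n_states)
    ]
    n_symbols = len(emit[0])
    return [
        sum(state_next[s] * emit[s][o] for s in range(n_states))
        for o in range(n_symbols)
    ]


# Hypothetical model: 2 hidden states, 2 symbols
# (0 = low share of numeric characters, 1 = high share).
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
sequence = [0, 0, 1]

likelihood = sum(forward(sequence, start, trans, emit))
next_distribution = predict_next(sequence, start, trans, emit)
print("sequence likelihood:", likelihood)
print("next-symbol distribution:", next_distribution)
```

With one such model per domain cluster, a new sequence could be scored against every model and assigned to the cluster whose model gives the highest likelihood.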
Chapter 8.
Details of Hardware and Software
Hardware Requirements:
1. Processor: Intel Core i3 or more.
2. RAM: 4GB or more.
3. Hard disk: 250 GB or more.
Software Requirements:
1. Operating System: Windows 7, 8, or 10.
2. Python.
3. Anaconda.
4. Spyder, Jupyter Notebook, Flask.
5. MySQL.
Chapter 9.
Implementation
Flow chart of the system:
Fig. 9.1 - Flowchart of the System
1. START
2. Get domain names using packet filtering
3. Known domain? (decision with Yes and No branches)
4. Extract domain features
5. Apply machine learning classification algorithm
6. Generate the machine learning model and compute its precision, recall, F1 score and accuracy
7. Detect DGA or normal
8. Final output
9. STOP
Use case of the system:
Fig. 9.2 - Use Case Diagram of the System
Actors: User, System
Use cases: Register, login, getdomainnames, featureextraction(), applyMLalgos(), storeinblacklist,
Train data, Test data, classification, Generate model, timeseriesprediction
Activity Diagram of the system:
Fig. 9.3 - Activity Diagram of the System
Data-flow diagram (DFD):
DFD level 0:
Entities: USER, SERVER
Process: Machine learning for malware detection
Flows: Request, Response
Data store: DATABASE
DFD level 1:
Processes: Get domain names using packet filtering; Feature extraction; Apply machine learning
algorithms; Detect DGA or normal
Data store: Trained models
DFD level 2:
Processes: Get domain names using packet filtering (using pyshark to capture network packets);
Feature extraction (extracting length, word ratio, LMS, percentage of numeric characters, etc.);
Apply machine learning algorithms; Check high accuracy of algo and use it; Detect DGA or normal
Trained models: SVM trained model; ANN trained model; Decision tree trained model; Multiple
logistic regression model; Naïve Bayes model; Random forest model
Class Diagram:
User
Attributes: -userName (String), -password (String), -name (String), -age (int), -gender (String),
-mobile (number), -dob (Date)
Operations: Register(), Login(), getdomainnames(), featureextraction(), Generatemodel(),
applyMLalgos(), detectDGA()
System
Attributes: -filteredpackets
Operations: Getdynmicdomainname(), Storeinblacklist(), Classification(), Timeseriesprediction(),
Trainmodel(), Testmodel(), Classify()
Decision tree, CNN, SVM, Naïve Bayes, Random forest (one class each)
Operations: DetectDGA(), Accuracy(), Precision, Recall()
TESTING
Software testing is an investigation conducted to provide stakeholders with information
about the quality of the product or service under test. Software testing can also provide an
objective, independent view of the software to allow the business to appreciate and
understand the risks of software implementation. Test techniques include the process of
executing a program or application with the intent of finding software bugs (errors or other
defects).
Software testing involves the execution of a software component or system component to
evaluate one or more properties of interest. In general, these properties indicate the extent
to which the component or system under test:
❖ Meets the requirements that guided its design and development,
❖ Responds correctly to all kinds of inputs,
❖ Performs its function within an acceptable time,
❖ Is sufficiently usable,
❖ Can be installed and run in its intended environments, and
❖ Achieves the general result its stakeholders desire.
As the number of possible tests for even simple software components is practically infinite, all
software testing uses some strategy to select tests that are feasible for the available time and
resources. As a result, software testing typically (but not exclusively) attempts to execute a program
or application with the intent of finding software bugs (errors or other defects). Testing
is an iterative process: when one bug is fixed, it can illuminate other, deeper bugs, or can even
create new ones. Software testing can provide objective, independent information about the
quality of software and the risk of its failure to users and/or sponsors. Software testing can be
conducted as soon as executable software (even if partially complete) exists. The overall
approach to software development often determines when and how the testing is conducted. For
example, in a phased process, most testing occurs after the system requirements have been
defined and then implemented in testable programs. In contrast, under an Agile approach,
requirements, programming, and testing are often done concurrently.
❖ LEVELS OF TESTING:
In order to uncover the errors present in different phases, we have the concept of levels of testing.
The basic levels of testing are:-
✓ Unit testing.
✓ Integration testing.
✓ Regression testing.
✓ System testing.
✓ Validation testing.
SOFTWARE TESTING LIFE CYCLE :
Fig 9.7 - Software Testing Lifecycle
UNIT TESTING
In computer programming, unit testing is a software testing method by which individual units of
source code, sets of one or more computer program modules together with associated control
data, usage procedures, and operating procedures, are tested to determine whether they are fit for
use. Intuitively, one can view a unit as the smallest testable part of an application. In procedural
programming, a unit could be an entire module, but it is more commonly an individual function
or procedure. In object-oriented programming, a unit is often an entire interface, such as a class,
but could be an individual method. Unit tests are short code fragments created by programmers or
occasionally by white box testers during the development process. Unit testing forms the basis for
component testing.
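As a small illustration of unit testing in this project's setting, the sketch below tests one feature function with Python's built-in `unittest` module. The function `numeric_percentage` and the test cases are hypothetical examples, not the project's actual test suite.

```python
import unittest


def numeric_percentage(domain):
    """Hypothetical unit under test: percentage of digits in the first label."""
    name = domain.lower().split(".")[0]
    return 100.0 * sum(ch.isdigit() for ch in name) / len(name) if name else 0.0


class NumericPercentageTest(unittest.TestCase):
    def test_no_digits(self):
        self.assertEqual(numeric_percentage("easypdfcombine.com"), 0.0)

    def test_half_digits(self):
        self.assertAlmostEqual(numeric_percentage("a1b2.net"), 50.0)

    def test_empty_label(self):
        # An empty label must not cause a division by zero.
        self.assertEqual(numeric_percentage(".com"), 0.0)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(NumericPercentageTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
print("all tests passed:", result.wasSuccessful())
```

Each test exercises one behaviour of a single function, which matches the view of a unit as the smallest testable part of the application.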
INTEGRATION TESTING
Integration testing is any type of software testing that seeks to verify the interfaces between
components against software design. Software components may be integrated in an iterative way
or all together (“big bang”). Normally the former is considered a better practice since it allows
interface issues to be located more quickly and fixed. Integration testing works to expose defects
in the interfaces and interaction between integrated components (modules).
REGRESSION TESTING
Regression testing focuses on finding defects after a major code change has occurred.
Specifically, it seeks to uncover software regressions, such as degraded or lost features, including old
bugs that have come back. Such regressions occur whenever software functionality that was
previously working correctly stops working as intended. Typically, regressions occur as an
unintended consequence of program changes, when the newly developed part of the software
collides with the previously existing code. Common methods of regression testing include
rerunning previous sets of test cases and checking whether previously fixed faults have
re-emerged.
SYSTEM TESTING
System testing of software or hardware is testing conducted on a complete, integrated system to
evaluate the system’s compliance with its specified requirements. System testing falls within the
scope of black box testing, and as such, should require no knowledge of the inner design of the
code or logic.
VALIDATION TESTING
Validation testing ensures that the product actually meets the client’s needs. It can also be defined
as demonstrating that the product fulfills its intended use when deployed in an appropriate
environment.
Chapter 10.
Advantages & Disadvantages
ADVANTAGES:
1. Domain Generation Algorithm (DGA), which allows malware to generate numerous domain
names until it finds its corresponding C&C server.
2. It is highly resilient to detection systems and reverse engineering, while allowing the C&C
server to have several redundant domain names.
3. Machine learning algorithms help to improve the accuracy of the results.
DISADVANTAGES:
1. The proposed machine learning framework aims to solve the problem of detecting DGA
sequences using machine learning techniques derived from observations in a network.
2. Queries not matching the knowledge are stored in a backlog of the software.
3. Processing time is lengthy when the approach is directly extended to multiple string pattern
matching.
Chapter 11
Scope
• The most common method used by many antivirus groups to detect malicious URLs is
the blacklist method.
• Blacklists are essentially a database of URLs that have been confirmed to be malicious in
the past. This project is useful because it helps to prevent malicious activity in the cyber
world.
Future Modification:
• In future, it is intended to improve the system performance based on the dataset.
• New techniques will also be used to get more accurate results.
Chapter 12
References
1. Yi Li, Kaiqi Xiong, Tommy Chin, Chengbin Hu, “A Machine Learning Framework for
Domain Generation Algorithm-Based Malware Detection” IEEE Access (Volume: 7), 31
January 2019
2. T. Chin, K. Xiong and M. Rahouti, "SDN-based kernel modular countermeasure for
intrusion detection", Proc. 13rd EAI Int. Conf. Secur. Privacy Commun. Netw., pp. 270-
290, 2019.
3. Suthathira Vanitha N., Professor, Department of EEE, Knowledge Institute of Technology,
Tamil Nadu, India, A novel approach in identification of blood group using laser
technology, International Journal of Research in Engineering and Technology (2014).
Available: esatjournals.net/ijret/2014v03/i23/IJRET20140323005.pdf
4. Jose Fernandes, Sara Pimenta, Student Member, IEEE, Filomena O. Soares, Senior
Member, IEEE and Graca Minas, Senior Member, IEEE, (2012), A Complete Blood
Typing Device for Automatic Agglutination Detection Based on Absorption
Spectrophotometry, IEEE Transactions On Instrumentation And Measurement.
5. Nazia Fathima S.M (2013) Classification of blood type by microscopic color images,
International Journal of Machine Learning and Computing. Vol. 3, No. 4.
Available: www.ijmlc.org/papers/342-L472.pdf

More Related Content

What's hot

Computer Graphics 471 Project Report Final
Computer Graphics 471 Project Report FinalComputer Graphics 471 Project Report Final
Computer Graphics 471 Project Report FinalAli Ahmed
 
Python Summer Internship
Python Summer InternshipPython Summer Internship
Python Summer InternshipAtul Kumar
 
Emotion recognition
Emotion recognitionEmotion recognition
Emotion recognitionMadhusudhan G
 
Emotion detection using cnn.pptx
Emotion detection using cnn.pptxEmotion detection using cnn.pptx
Emotion detection using cnn.pptxRADO7900
 
Deep Learning A-Z™: Artificial Neural Networks (ANN) - Module 1
Deep Learning A-Z™: Artificial Neural Networks (ANN) - Module 1Deep Learning A-Z™: Artificial Neural Networks (ANN) - Module 1
Deep Learning A-Z™: Artificial Neural Networks (ANN) - Module 1Kirill Eremenko
 
ENEL Electricity Topology Network on Neo4j Graph DB
ENEL Electricity Topology Network on Neo4j Graph DBENEL Electricity Topology Network on Neo4j Graph DB
ENEL Electricity Topology Network on Neo4j Graph DBNeo4j
 
SENTIMENT ANALYSIS-AN OBJECTIVE VIEW
SENTIMENT ANALYSIS-AN OBJECTIVE VIEWSENTIMENT ANALYSIS-AN OBJECTIVE VIEW
SENTIMENT ANALYSIS-AN OBJECTIVE VIEWJournal For Research
 
Gender and Age Detection using OpenCV.pptx
Gender and Age Detection using OpenCV.pptxGender and Age Detection using OpenCV.pptx
Gender and Age Detection using OpenCV.pptxSakshiVishwakarma12
 
A real life case study of using Agile and PRINCE2 together - AgilePM
A real life case study of using Agile and PRINCE2 together - AgilePMA real life case study of using Agile and PRINCE2 together - AgilePM
A real life case study of using Agile and PRINCE2 together - AgilePMTraining Bytesize
 
Software Project Management | An Overview of the Software Project Management
Software Project Management | An Overview of the Software Project ManagementSoftware Project Management | An Overview of the Software Project Management
Software Project Management | An Overview of the Software Project ManagementAhsan Rahim
 
Image Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine LearningImage Captioning Generator using Deep Machine Learning
Image Captioning Generator using Deep Machine Learningijtsrd
 
Android messaging application proposal
Android messaging application proposalAndroid messaging application proposal
Android messaging application proposalincisive jovial
 
News portal
News portalNews portal
News portalArman Ahmed
 
AUTOMATED FOOTBALL MANAGEMENT SYSTEM
AUTOMATED FOOTBALL MANAGEMENT SYSTEMAUTOMATED FOOTBALL MANAGEMENT SYSTEM
AUTOMATED FOOTBALL MANAGEMENT SYSTEMAbhishek Kumar
 
Project Progress Tracking - Presentation
Project Progress Tracking - Presentation Project Progress Tracking - Presentation
Project Progress Tracking - Presentation Pallav Shah
 
Speech emotion recognition from audio converted
Speech emotion recognition from audio convertedSpeech emotion recognition from audio converted
Speech emotion recognition from audio converteddataalcott
 
Project Planning | Project Plan In Excel With Gantt Chart | Project Managemen...
Project Planning | Project Plan In Excel With Gantt Chart | Project Managemen...Project Planning | Project Plan In Excel With Gantt Chart | Project Managemen...
Project Planning | Project Plan In Excel With Gantt Chart | Project Managemen...Simplilearn
 
Internship Presentation 1 Web Developer
Internship Presentation 1 Web DeveloperInternship Presentation 1 Web Developer
Internship Presentation 1 Web DeveloperHemant Sarthak
 

Dga final year project report Akshay Kalapgar

  • 1. Project Report On A Machine Learning Framework for Domain Generation Algorithm (DGA)-Based Malware Detection. Submitted in partial fulfillment of the requirements of the degree of Fourth Year of Engineering in Information Technology. Submitted by Harsh Dobariya (714), Akshay Kalapgar (731), Mohit Kamble (732), Siddhesh Parab (750). Guided by Prof. Abhay E. Patil. DEPARTMENT OF INFORMATION TECHNOLOGY, UNIVERSITY OF MUMBAI, 2020 - 2021
  • 2. DEPARTMENT OF INFORMATION TECHNOLOGY CERTIFICATE Date: This is to certify that the project work embodied in this report entitled “A Machine Learning Framework for Domain Generation Algorithm (DGA)-Based Malware Detection”, submitted by “Harsh Dobariya bearing Roll No. 714”, “Akshay Kalapgar bearing Roll No. 731”, “Mohit Kamble bearing Roll No. 732” and “Siddhesh Parab bearing Roll No. 750” for the award of the Fourth Year of Engineering (B.E.) degree, is a work carried out by them under my guidance and supervision within the institute. The work described in this project report was carried out by the concerned students and has not been submitted for the award of any other degree of the University of Mumbai. Further, it is to certify that the students were regular during the academic year 2020-21 and have worked under the guidance of the concerned faculty until the submission of this project work at MCT’s Rajiv Gandhi Institute of Technology, Mumbai. Prof. Abhay E. Patil (Project Guide), Dr. Sunil B. Wankhade (Head of Department), Dr. Sanjay U. Bokade (Principal)
  • 3. CERTIFICATE OF APPROVAL This project report entitled A Machine Learning Framework for Domain Generation Algorithm (DGA)-Based Malware Detection, submitted by Harsh Dobariya (714), Akshay Kalapgar (731), Mohit Kamble (732) and Siddhesh Parab (750) in partial fulfillment of the requirements of the degree of Fourth Year in Bachelor of Engineering in Information Technology, is approved. Internal Examiner. External Examiner. SEAL OF INSTITUTE
  • 4. Declaration I declare that this written submission represents my ideas in my own words and where others' ideas or words have been included, I have adequately cited and referenced the original sources. I also declare that I have adhered to all principles of academic honesty and integrity and have not misrepresented or fabricated or falsified any idea/data/fact/source in my submission. I understand that any violation of the above will be cause for disciplinary action by the Institute and can also evoke penal action from the sources which have thus not been properly cited or from whom proper permission has not been taken when needed.
    ROLL NO.   NAME               SIGNATURE
    714        HARSH DOBARIYA
    731        AKSHAY KALAPGAR
    732        MOHIT KAMBLE
    750        SIDDHESH PARAB
    Date: Place:
  • 5. Acknowledgement With all reverence, we take the opportunity to express our deep sense of gratitude and wholehearted indebtedness to our respected guide, Prof. A.E. Patil, Department of Information Technology, Rajiv Gandhi Institute of Technology, Mumbai. From the day of conception of this project, his active involvement and motivating guidance on a day-to-day basis have made it possible for us to complete this challenging work in time. We would like to express a deep sense of gratitude to our respected Head of the Department, Dr. Sunil B. Wankhade, who went all the way out to help us in all genuine cases during the course of this project. We wish to express our sincere thanks to Dr. Sanjay Bokade, Principal, Rajiv Gandhi Institute of Technology, Mumbai, and would like to specifically acknowledge him for giving guidance, encouragement and inspiration throughout the academics. We would like to thank all the staff of the Information Technology Department, who continuously supported and motivated us during our work. Also, we would like to thank our colleagues for their continuous support and motivation during the project work. Finally, we would like to express our gratitude to our families for their eternal belief in us. We would not be where we are today without their support and encouragement. HARSH DOBARIYA, AKSHAY KALAPGAR, MOHIT KAMBLE, SIDDHESH PARAB. Date: Place:
  • 6. Abstract Attackers usually use a Command and Control (C2) server to manipulate the communication. In order to perform an attack, threat actors often employ a Domain Generation Algorithm (DGA), which allows malware to communicate with the C2 server by generating a variety of network locations. Traditional malware control methods, such as blacklisting, are insufficient to handle DGA threats. In this paper, we propose a machine learning framework for identifying and detecting DGA domains to alleviate the threat. We collect real-time threat data from real-life traffic over a one-year period. We also propose a deep learning model to classify a large number of DGA domains. The proposed machine learning framework consists of a two-level model and a prediction model. In the two-level model, we first separate the DGA domains from normal domains and then use a clustering method to identify the algorithms that generate those DGA domains. In the prediction model, a time-series model based on the Hidden Markov Model (HMM) is constructed to predict incoming domain features. Furthermore, we build a Deep Neural Network (DNN) model to enhance the proposed machine learning framework by handling the huge dataset we gradually collected. Our extensive experimental results demonstrate the accuracy of the proposed framework and the DNN model. To be precise, we achieve an accuracy of 95.89% for the classification in the framework and 97.79% in the DNN model, 92.45% for the second-level clustering, and 95.21% for the HMM prediction in the framework.
    Keywords: Domain Generation Algorithm (DGA), Command and Control (C2), Malware Detection, Machine Learning, Clustering, Hidden Markov Model (HMM), Deep Neural Network (DNN).
  • 7. Table of Contents
    Chapter 1. Introduction
    Chapter 2. Aim and Objectives
    Chapter 3. Literature Survey
    Chapter 4. Existing System
    Chapter 5. Problem Statement
    Chapter 6. Proposed System
    Chapter 7. Methodology
    Chapter 8. Details of Hardware and Software
    Chapter 9. Implementation
    Chapter 10. Advantages & Disadvantages
    Chapter 11. Scope
    Chapter 12. References
    Table of Figures
    Fig. 1. Diagrammatical Representation of working of the system
    Fig. 2. Flowchart of the System
    Fig. 3. Use Case Diagram of the System
    Fig. 4. Activity Diagram of the System
    Fig. 5. DFD Diagram of the System
    Fig. 6. Class Diagram of the System
    Fig. 7. Software Testing Lifecycle
  • 8. Chapter 1. Introduction
    Malware attackers attempt to infiltrate layers of protection and defensive solutions, resulting in threats to a computer network and its assets. Anti-malware software has been widely used in enterprises for a long time, since it can provide some level of security on computer networks and systems to detect and mitigate malware attacks. However, many anti-malware solutions typically rely on static string matching, hashing schemes, or whitelisting of network communication. These solutions are too simple to handle sophisticated malware attacks, which can hide communication channels and bypass most detection schemes by purposely integrating evasive techniques. The issue poses a serious threat to the security of an enterprise, and it is also a grand challenge that needs to be addressed.
    In this paper, we first propose a machine learning framework to classify and detect DGA malware and develop a DNN model to classify the large datasets of DGA domains that we gradually collected. We then experimentally evaluate the proposed framework through a comparison of various machine learning approaches and a deep learning model. Specifically, our machine learning framework consists of the following main components:
      • A dynamic blacklist, which consists of a pattern filter.
      • A two-level machine learning model: the first-level classification and the second-level clustering. To identify DGA domains, we first use various classification models to separate DGA domains from normal domains. Then, we apply the clustering method to group domains sequenced by the DGA.
      • A time series prediction model: we propose a Hidden Markov Model (HMM) to predict incoming DGA domain features in order to better identify the DGA domains.
    The general goal of our machine learning framework is to determine which algorithm is employed, so that the framework can prevent future communications from the C2.
    Furthermore, we have gradually collected data for over one year and have obtained a large number of datasets from real traffic. To analyze these data, we also propose a deep learning approach for large dataset classification. We first build a DNN model and then compare it with our machine learning models. The comparison results provide a useful guideline for our future study of DGA detection and prediction. In our future research, we will also apply deep learning to clustering and prediction, which are out of the scope of this paper.
  • 9. Chapter 2. Aim & Objectives
    Aim: To solve the problem of detecting DGA sequences using machine learning techniques derived from observations in a network.
    Objectives: The objectives of the system development are:
      1. In the DBSCAN algorithm, use the extracted domain features to calculate the domain distance, and group the domains that are generated by the same DGA together according to their domain feature difference.
      2. Separate the model into training and prediction stages.
      3. The nodes in each layer are fully connected to the nodes in the next layer. [...] will not miss any local minima, but it will take a long time to converge.
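The first objective, grouping domains by the distance between their feature vectors, can be illustrated with a minimal pure-Python sketch of DBSCAN. This is not the project's implementation: the feature vectors (domain length, share of numeric characters), the Euclidean distance, and the `eps` and `min_pts` values are all assumptions chosen only to make the example small.

```python
import math

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; -1 marks noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbours(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE            # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            more = neighbours(j)
            if len(more) >= min_pts:     # j is a core point: keep expanding
                queue.extend(more)
        cluster += 1
    return labels

# Hypothetical feature vectors: (domain length, share of numeric characters).
features = [(12, 0.0), (13, 0.0), (12, 0.1),
            (30, 0.5), (31, 0.5), (30, 0.6),
            (60, 0.9)]
labels = dbscan(features, eps=2.0, min_pts=2)
print(labels)
```

In this toy input the first three and the next three vectors form two groups, and the last vector is left as noise. In practice the features would likely need scaling so that one feature (here the length) does not dominate the distance.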
  • 10. Chapter 3. Literature Survey
    ❖ Literature survey in a tabular format for better understanding.
    1. Title: A Machine Learning Framework for Domain Generation Algorithm-Based Malware Detection
       Authors: Yi Li, Kaiqi Xiong, Tommy Chin, Chengbin Hu. Institute of Electrical and Electronics Engineers, 2019.
       Advantages: In the second-level clustering we apply the DBSCAN algorithm. Only the DGA domains obtained from the first-level classification are used for clustering.
       Disadvantages: The research problem is to accurately identify and cluster domains that originate from known DGA-based techniques, with the target of developing a security approach that autonomously mitigates network communications to unknown threats in a sequence.
       Result: Domain Generation Algorithm (DGA) is used.
    2. Title: Learning and Classification of Malware Behavior
       Authors: Konrad Rieck, Thorsten Holz, Carsten Willems, Patrick Düssel, Pavel Laskov. Kluwer Academic Publishers, Dordrecht (2002).
       Advantages: Results show that 70% of malware instances not identified by anti-virus software can be correctly classified by the approach.
       Disadvantages: The proposed machine learning framework aims to solve the problem of detecting DGA sequences using machine learning techniques derived from observations in a network.
       Result: Normalized Compression Distance; Benign Executable.
    3. Title: An SDN based framework for guaranteeing security and performance in information-centric cloud networks
       Authors: U. Ghosh, P. Chatterjee, D. Tosh, S. Shetty, K. Xiong, and C. Kamhou. 11th IEEE International Conference on Cloud Computing (IEEE Cloud), 2017.
       Advantages: The framework has the ability to dynamically compute the routing path to guarantee security and performance of the network.
       Disadvantages: Queries not matching the knowledge are stored in a backlog of the software.
       Result: SDN-based framework and information-centric services.
    4. Title: A two-hashing table multiple string pattern matching algorithm
       Authors: C. Khancome, V. Boonjing, and P. Chanvarasuth. Tenth International Conference on Information Technology: New Generations (ITNG). IEEE, 2013.
       Advantages: The attempting times were less than those of the traditional algorithms, especially in the case of a very long minimum pattern length.
       Disadvantages: Processing time is lengthy when directly extended to multiple string pattern matching.
       Result: Executes a multiple string pattern matching algorithm.
  • 11. Chapter 4. Existing System
    Threat model: A DGA has to function under multiple conditions in a network environment: filtering by a firewall that protects the communication, and an empty cell in an Internet domain that results in an NXDOMAIN error.
    Each HMM data record represents a series of domain observations. First, sequences of domain names are processed by a feature extractor, and each of the resulting feature vectors is used as a training record. Then, similar sequences are clustered as a group of DGA domain names with certain outcomes. After the training process, if a sequence does not have an HMM sequence representation (that is, it is present in the test data but not in the training data), the HMM model generates the future predicted results. Otherwise, we use the existing HMM sequence representation.
    Disadvantages of Existing System:
      1. The firewall protects the communication, and an empty cell in an Internet domain results in an NXDOMAIN (non-existent domain) error.
      2. Queries not matching the knowledge are stored in a backlog of the software.
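As a rough illustration of how an HMM can turn a sequence of domain observations into a prediction for the next one, the sketch below runs the standard forward algorithm and then projects one step ahead. Everything concrete here is an assumption made for the example: the two hidden states, the discretised observation symbols ("short", "long"), and all probability values are hypothetical and not taken from the report.

```python
def forward(obs, start, trans, emit):
    """Forward algorithm: return P(current state | observations so far)."""
    states = list(start)
    alpha = {s: start[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit[s][o] * sum(alpha[p] * trans[p][s] for p in states)
                 for s in states}
    total = sum(alpha.values())
    return {s: a / total for s, a in alpha.items()}

def predict_next(obs, start, trans, emit):
    """Return the probability of each observation symbol at the next step."""
    states = list(start)
    belief = forward(obs, start, trans, emit)
    # Move the state belief one step forward through the transition matrix.
    nxt = {s: sum(belief[p] * trans[p][s] for p in states) for s in states}
    symbols = list(next(iter(emit.values())))
    return {o: sum(nxt[s] * emit[s][o] for s in states) for o in symbols}

# Hypothetical model: hidden state of the traffic source, and a discretised
# domain-length feature as the observation.
start = {"benign": 0.5, "dga": 0.5}
trans = {"benign": {"benign": 0.9, "dga": 0.1},
         "dga":    {"benign": 0.1, "dga": 0.9}}
emit = {"benign": {"short": 0.8, "long": 0.2},
        "dga":    {"short": 0.2, "long": 0.8}}

prediction = predict_next(["long", "long", "long"], start, trans, emit)
print(prediction)
```

With these made-up parameters, a run of "long" observations shifts the belief towards the "dga" state, so "long" becomes the more likely next observation.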
  • 12. 5 Chapter 5. Problem Statement For the malware to communicate correctly with the appropriate domain, a threat actor must register each respective domain name in the sequence to maintain the C2, or risk the loss of a node in the distribution. Our research problem is to accurately identify and cluster domains that originate from known DGA-based techniques, with the aim of developing a security approach that autonomously mitigates network communications to unknown threats in a sequence.
  • 13. 6 Chapter 6. Proposed System
    In our proposed system, domains are extracted from DGAs. The machine learning framework encompasses multiple feature extraction techniques and models to classify DGA domains from normal domains, cluster the DGA domains, and predict a DGA domain. A deep learning model is used to handle large datasets. Multiple online sources, found through simple Google searches, provide example code for constructing a DGA. Online threat intelligence feeds give an approach to examining current, live threats in a real-world environment. The accuracy of the proposed approach is measured using real-time, active malicious domains derived from DGAs on the public Internet. The data is structured as a CSV file of domain names, originating malware, and DGA membership, with a daily file size of approximately 110 MB. We propose a machine learning framework that consists of three important steps, as shown in the figure below. We first have the DNS queries with the payload as the input.
    Fig. 6.1: Diagrammatical Representation of working of the system
    Advantages of Proposed System:
    1. It targets the Domain Generation Algorithm (DGA), which allows malware to generate numerous domain names until it finds its corresponding C&C server.
    2. A DGA is highly resilient to detection systems and reverse engineering, while allowing the C&C server to have several redundant domain names.
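To make concrete what "generating numerous domain names" means, the following is a toy sketch of a date-seeded generator. It is not the algorithm of any real malware family and is not taken from this report; the function name `toy_dga`, the seed string, and the reserved `.example` suffix are assumptions chosen for illustration. It only shows why the malware and its operator can derive the same list independently: both hash the same seed and date.

```python
import hashlib
from datetime import date


def toy_dga(seed: str, day: date, count: int = 5, length: int = 12,
            tld: str = ".example") -> list:
    """Return `count` deterministic pseudo-random names for the given day."""
    if not 1 <= length <= 32:
        raise ValueError("length must be between 1 and 32 (SHA-256 yields 32 bytes)")
    domains = []
    for i in range(count):
        # Same seed + same date + same index always yields the same digest
        digest = hashlib.sha256(f"{seed}-{day.isoformat()}-{i}".encode()).digest()
        # Map each digest byte onto a lowercase letter
        label = "".join(chr(ord("a") + b % 26) for b in digest[:length])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga("demo-seed", date(2021, 1, 1)):
        print(name)
```

Names produced this way contain no dictionary words, which is the property that features such as the meaningful word ratio are intended to pick up.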
  • 14. 7 Chapter 7. Methodology
    We will develop this project using Python and web technology.
    1. Filtering packet data: To filter packet data we use pyshark, which captures network packets. We will store the packet information in pcap format. By reading the packets we will filter the data and obtain the domain name. The packet flow is also obtained from this. If the extracted domain name is found in the blacklist, we will stop further steps.
    2. Feature extraction: Using Python, we will calculate the following features:
    • Length: the length of the domain name.
    • Meaningful Word Ratio: a dictionary of meaningful words is maintained; the ratio is the length of the meaningful words found in the domain name divided by the length of the domain name.
    • Percentage of Numerical Characters: the proportion of numeric characters in the domain name.
    • Pronounceability Score: calculated from the frequency of the text in the domain name.
    • Percentage of the Length of the Longest Meaningful String (LMS): the length of the longest meaningful word divided by the length of the domain name.
    • Levenshtein Edit Distance: the minimum number of single-character edits between the current domain and the previous domain in a stream of DNS queries received by the server, that is, between a domain and its predecessor.
    3. Machine learning classification: The following algorithms will be applied to the features obtained above.
    • Decision tree: it calculates entropy and information gain to generate its output, but it is prone to overfitting. We will generate the model with the selected features.
    • ANN: an artificial neural network with an input layer, a hidden layer, and an output layer; the output is calculated from the features.
    • SVM: support vector machine.
  • 15. 8 SVM is a good binary classifier. We will train it on the features and generate the model, using the scikit-learn (sklearn) Python library.
    • Multiple logistic regression: the logistic model (or logit model) is used to model the probability of a certain class or event, such as pass/fail, win/lose, alive/dead, or healthy/sick.
    • Naive Bayes: it calculates the probability of a certain class occurring. The model will be generated and stored using pickle.
    • Random forest: it reduces the overfitting problem; the model will be generated and stored using pickle.
    For every machine learning model generated, the following will be calculated: i. precision ii. recall iii. F1 score iv. accuracy.
    • Clustering: DBSCAN is used for outlier detection. Outliers are entries in the dataset that differ from the other points and do not play a vital role in classification. In statistics, an outlier is an observation point that is distant from other observations. Here, domain names will be clustered into the following groups:
      i. CryptoLocker, e.g. nxgbdtnvrfker.ru
      ii. TOVAR, e.g. gppwkpxyremp.net
      iii. Dyre, e.g. q2aa41a5b31294e5e6f28d1adcf48a54b.tk
      iv. Normal domain, e.g. easypdfcombine.com
    4. Time series prediction: We use every domain cluster to train a separate HMM model. Each HMM data record represents a series of domain observations. First, a sequence of domain names is processed by a feature extractor, and each resulting feature vector is used as a training record. Then, similar sequences are clustered as a group of DGA domain names with certain outcomes.
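The clustering step (DBSCAN grouping domains and flagging outliers) can be sketched in plain Python so that the example stays self-contained. In practice a library implementation such as scikit-learn's would more likely be used. The values of `eps` and `min_pts` and the two-dimensional sample points are assumptions for illustration, not settings from this report.

```python
import math

NOISE = -1


def region_query(points, idx, eps):
    """Indices of all points within eps of points[idx] (the point itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[idx], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1) for outliers."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels


if __name__ == "__main__":
    # Hypothetical feature vectors, e.g. (length, scaled numeric percentage)
    sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(sample, eps=1.5, min_pts=3))
```

Points labelled `-1` belong to no dense region; these are the outliers that the methodology excludes from the family clusters.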
  • 16. 9 Chapter 8. Details of Hardware and Software
    Hardware Requirements:
    1. Processor: Intel Core i3 or higher.
    2. RAM: 4 GB or more.
    3. Hard disk: 250 GB or more.
    Software Requirements:
    1. Operating System: Windows 7, 8, or 10.
    2. Python.
    3. Anaconda.
    4. Spyder, Jupyter Notebook, Flask.
    5. MySQL.
  • 17. 10 Chapter 9. Implementation
    Flow chart of the system: Fig. 9.1 - Flowchart of the System
    Flowchart steps: START; get domain names using packet filtering; decision "Known domain?" (Yes / No); extract domain features; apply machine learning classification algorithm; generate the machine learning model and its precision, recall, F1 score, and accuracy; detect DGA or normal; final output; STOP.
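The flowchart reports precision, recall, F1 score, and accuracy for each model. As a reminder of how these four numbers are derived from predictions, here is a small sketch using the standard definitions. The function name and the convention that label `1` means "DGA" are assumptions made for this example.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute precision, recall, F1 and accuracy for a binary classifier."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against division by zero when a class is never predicted or present
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


if __name__ == "__main__":
    truth = [1, 1, 1, 0, 0, 0, 1, 0]      # 1 = DGA, 0 = normal (assumed encoding)
    predicted = [1, 1, 0, 0, 0, 1, 1, 0]
    print(classification_metrics(truth, predicted))
```

In this toy run there are 3 true positives, 1 false positive, 1 false negative, and 3 true negatives, so all four metrics come out to 0.75.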
  • 18. 11 Use case of the system: Fig. 9.2 - Use Case Diagram of the System
    Actors: User, System.
    Use cases: Register; login; getdomainnames; featureextraction(); applyMLalgos(); storeinblacklist; Train data; Test data; classification; Generate model; timeseriesprediction.
  • 19. 12 Activity Diagram of the system: Fig. 9.3 - Activity Diagram of the System
  • 20. 13 Data-flow diagram (DFD):
    DFD level 0: USER and SERVER exchange Request and Response flows with the process "Machine learning for malware detection".
    DFD level 1: DATABASE; Get domain names using packet filtering; Feature Extraction; Apply machine learning algorithms; Detect DGA or normal; Trained models.
  • 21. 14 DFD level 2:
    Get domain names using packet filtering: capture network packets using pyshark.
    Feature Extraction: extract length, word ratio, LMS, percentage of numeric characters, etc.
    Apply machine learning algorithms: SVM trained model; ANN trained model; Decision tree trained model; Multiple logistic regression model; Naïve Bayes model; Random forest model.
    Check which algorithm has the highest accuracy and use it.
    Detect DGA or normal.
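The feature-extraction stage named in this diagram (length, word ratio, LMS, percentage of numeric characters) together with the Levenshtein edit distance might be sketched as follows. Several choices here are assumptions rather than details given in the report: the tiny `WORDS` dictionary, the greedy longest-match scan for meaningful words, and the decision to measure only the label before the first dot. The pronounceability score is left out because the report does not define it precisely enough to implement.

```python
WORDS = {"easy", "pdf", "combine", "mail", "shop"}  # hypothetical tiny dictionary


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions or substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def meaningful_substrings(label, words, min_len=3):
    """Greedy left-to-right scan taking the longest dictionary word at each position."""
    found, i = [], 0
    while i < len(label):
        match = None
        for j in range(len(label), i + min_len - 1, -1):
            if label[i:j] in words:
                match = label[i:j]
                break
        if match:
            found.append(match)
            i += len(match)
        else:
            i += 1
    return found


def first_label(domain: str) -> str:
    """Assumption: features are computed on the label before the first dot."""
    return domain.strip().lower().split(".")[0]


def extract_features(domain, previous=None, words=WORDS):
    label = first_label(domain)
    if not label:
        raise ValueError("empty domain label")
    found = meaningful_substrings(label, words)
    n = len(label)
    return {
        "length": n,
        "meaningful_word_ratio": sum(len(w) for w in found) / n,
        "pct_numeric": sum(ch.isdigit() for ch in label) / n,  # ratio in [0, 1]
        "lms_ratio": max((len(w) for w in found), default=0) / n,
        "edit_distance": (None if previous is None
                          else levenshtein(label, first_label(previous))),
    }


if __name__ == "__main__":
    print(extract_features("easypdfcombine.com", previous="nxgbdtnvrfker.ru"))
    print(extract_features("q2aa41a5b31294e5e6f28d1adcf48a54b.tk"))
```

A real dictionary would be far larger, and a production version would need a proper public-suffix parser to isolate the registrable part of the domain.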
  • 22. 15 Class Diagram:
    User
      Attributes: -userName (String), -password (String), -name (String), -age (int), -gender (String), -mobile (number), -dob (Date)
      Methods: Register(), Login(), getdomainnames(), featureextraction(), Generatemodel(), applyMLalgos(), detectDGA()
    System
      Attributes: -filteredpackets
      Methods: Getdynmicdomainname(), Storeinblacklist(), Classification(), Timeseriesprediction(), Trainmodel(), Testmodel(), Classify()
    Decision tree: DetectDGA(), Accuracy(), Precision, Recall()
    CNN: DetectDGA(), Accuracy(), Precision, Recall()
    SVM: DetectDGA(), Accuracy(), Precision, Recall()
    Naïve bayes: DetectDGA(), Accuracy(), Precision, Recall()
    Random forest: DetectDGA(), Accuracy(), Precision, Recall()
  • 23. 16 TESTING
    Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation. Test techniques include the process of executing a program or application with the intent of finding software bugs (errors or other defects). Software testing involves the execution of a software component or system component to evaluate one or more properties of interest. In general, these properties indicate the extent to which the component or system under test:
    ❖ Meets the requirements that guided its design and development,
    ❖ Responds correctly to all kinds of inputs,
    ❖ Performs its function within an acceptable time,
    ❖ Is sufficiently usable,
    ❖ Can be installed and run in its intended environments, and
    ❖ Achieves the general result its stakeholders desire.
    As the number of possible tests for even simple software components is practically infinite, all software testing uses some strategy to select tests that are feasible for the available time and resources. As a result, software testing typically (but not exclusively) attempts to execute a program or application with the intent of finding software bugs (errors or other defects). Testing is an iterative process: when one bug is fixed, it can illuminate other, deeper bugs, or can even create new ones. Software testing can provide objective, independent information about the quality of software and the risk of its failure to users and/or sponsors. Software testing can be conducted as soon as executable software (even if partially complete) exists. The overall approach to software development often determines when and how the testing is conducted.
For example, in a phased process, most testing occurs after the system requirements have been
  • 24. 17 defined and then implemented in testable programs. In contrast, under an Agile approach, requirements, programming, and testing are often done concurrently.
    ❖ LEVELS OF TESTING: In order to uncover the errors present in different phases, we have the concept of levels of testing. The basic levels of testing are:
    ✓ Unit testing.
    ✓ Integration testing.
    ✓ Regression testing.
    ✓ System testing.
    ✓ Validation testing.
    SOFTWARE TESTING LIFE CYCLE: Fig 9.7 - Software Testing Lifecycle
  • 25. 18 UNIT TESTING
    In computer programming, unit testing is a software testing method by which individual units of source code, sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures, are tested to determine whether they are fit for use. Intuitively, one can view a unit as the smallest testable part of an application. In procedural programming, a unit could be an entire module, but it is more commonly an individual function or procedure. In object-oriented programming, a unit is often an entire interface, such as a class, but could be an individual method. Unit tests are short code fragments created by programmers, or occasionally by white box testers, during the development process. Unit testing forms the basis for component testing.
    INTEGRATION TESTING
    Integration testing is any type of software testing that seeks to verify the interfaces between components against a software design. Software components may be integrated in an iterative way or all together ("big bang"). Normally the former is considered better practice, since it allows interface issues to be located and fixed more quickly. Integration testing works to expose defects in the interfaces and interactions between integrated components (modules).
    REGRESSION TESTING
    Regression testing focuses on finding defects after a major code change has occurred. Specifically, it seeks to uncover software regressions, such as degraded or lost features, including old bugs that have come back. Such regressions occur whenever software functionality that was previously working correctly stops working as intended. Typically, regressions occur as an unintended consequence of program changes, when the newly developed part of the software collides with the previously existing code. Common methods of regression testing include rerunning previous sets of test cases and checking whether previously fixed faults have re-emerged.
  • 26. 19 SYSTEM TESTING
    System testing of software or hardware is testing conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements. System testing falls within the scope of black box testing and, as such, should require no knowledge of the inner design of the code or logic.
    VALIDATION TESTING
    Validation testing ensures that the product actually meets the client's needs. It can also be defined as demonstrating that the product fulfills its intended use when deployed in an appropriate environment.
  • 27. 20 Chapter 10. Advantages & Disadvantages
    ADVANTAGES:
    1. It targets the Domain Generation Algorithm (DGA), which allows malware to generate numerous domain names until it finds its corresponding C&C server.
    2. A DGA is highly resilient to detection systems and reverse engineering, while allowing the C&C server to have several redundant domain names.
    3. The ML algorithms help to improve the accuracy of the result.
    DISADVANTAGES:
    1. The proposed machine learning framework aims to solve the problem of detecting DGA sequences using machine learning techniques derived from observations in a network.
    2. Queries not matching the knowledge are stored in a backlog of the software.
    3. Processing time is lengthy when the approach is directly extended to multiple string pattern matching.
  • 28. 21 Chapter 11 Scope
    • The most common method used by many antivirus groups to detect malicious URLs is the blacklist method.
    • Blacklists are essentially a database of URLs that have been confirmed to be malicious in the past.
    This project is useful because it helps to prevent malicious activity in the cyber world.
    Future Modification:
    • In future, it is intended to improve the system performance based on the dataset.
    • New techniques will also be used to get more accurate results.
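The blacklist method described above amounts to a membership test against known entries. The sketch below is a minimal in-memory version; the class name, the normalisation rules (lowercasing and stripping a trailing dot), and the sample entries are assumptions for illustration, and a deployed system would presumably back this with a database table instead.

```python
def normalize(domain: str) -> str:
    """Canonical form used for lookups: trimmed, lowercase, no trailing dot."""
    return domain.strip().lower().rstrip(".")


class Blacklist:
    """In-memory set of domains previously confirmed as malicious."""

    def __init__(self, domains=()):
        self._domains = {normalize(d) for d in domains}

    def add(self, domain: str) -> None:
        self._domains.add(normalize(domain))

    def __contains__(self, domain: str) -> bool:
        return normalize(domain) in self._domains

    def __len__(self) -> int:
        return len(self._domains)


if __name__ == "__main__":
    known = Blacklist(["nxgbdtnvrfker.ru", "gppwkpxyremp.net"])
    print("NXGBDTNVRFKER.RU." in known)   # known entry, different spelling
    print("easypdfcombine.com" in known)  # benign domain
    print("zzqkfjwpxlma.example" in known)  # unseen name: a blacklist cannot flag it
```

The last lookup shows the limitation that motivates the classifier: a freshly generated domain has never been confirmed malicious, so a blacklist alone will miss it.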
  • 29. 22 Chapter 12 References
    1. Yi Li, Kaiqi Xiong, Tommy Chin, Chengbin Hu, "A Machine Learning Framework for Domain Generation Algorithm-Based Malware Detection", IEEE Access, vol. 7, 31 January 2019.
    2. T. Chin, K. Xiong, M. Rahouti, "SDN-based kernel modular countermeasure for intrusion detection", Proc. 13th EAI Int. Conf. Secur. Privacy Commun. Netw., pp. 270-290, 2019.
    3. Suthathira Vanitha N., "A novel approach in identification of blood group using laser technology", International Journal of Research in Engineering and Technology, 2014. Available: esatjournals.net/ijret/2014v03/i23/IJRET20140323005.pdf
    4. Jose Fernandes, Sara Pimenta, Filomena O. Soares, Graca Minas, "A Complete Blood Typing Device for Automatic Agglutination Detection Based on Absorption Spectrophotometry", IEEE Transactions on Instrumentation and Measurement, 2012.
    5. Nazia Fathima S.M., "Classification of blood type by microscopic color images", International Journal of Machine Learning and Computing, vol. 3, no. 4, 2013. Available: www.ijmlc.org/papers/342-L472.pdf