International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1410
DDOS DETECTION SYSTEM USING C4.5 DECISION TREE ALGORITHM
Santosh Kumar Pydipalli1, Srikanth Kasthuri1, Jinu S1
1 Jr.Telecom Officer, Bharath Sanchar Nigam Limited, Bangalore
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - In today’s digital world, cyber attacks became
much severe and intelligent,causingenormousdamage.With
the evolution of IoT, securing the digital infrastructure is
much essential toavoidanysystemcollapse.Duetoenhanced
bandwidth availability and with advent of new technologies,
hackers becamemorecapabletothrownewchallenges.DDoS
(Distributed Denial of Service) attack is much recurrent
nowadays, that inflicts serious damage and affects the
network /application performance. As hackers are finding
new ways to attack the systems, zero day vulnerabilities can
cost dearly. By the time, patchappliedforsecurityexploit,the
systems may get compromised by attackers .So it’s essential
to develop intelligence to detect and mitigate complex
security attacks. In the present paper, we proposed a model
that can combine the strengths of both signaturebasedDDoS
detection and anomaly based DDoS detection. We designeda
machine learning model that can learn from attack patterns
and alert the security analyst. In this model, DDoS attack
sample is extracted from Intrusion Detection Evaluation
Dataset (CICIDS2017) and divided the sample into training
and test data. We used Weka 3.8.2, a java based data mining
and machine learning tool for building the model. Pre-
processing is done with Weka supervised attribute filter and
classification of training set is performed using J48 / C4.5
decision tree algorithm, with 10-fold cross-validation. The
accuracy of Machine learning model is validated using test
dataset. This model, coupled with signature detection
techniques, can perform automatic and effective
differentiation between benign traffic and DDoS flooding.
Key Words: DDoS, Weka, C 4.5, Decision Tree.
1. INTRODUCTION
There are numerous attacks in present day world on the
network and server resources. Distributed Denial of Service
(DDoS) attacks are attacks targeting the availability of
network and Server resources there by causing service
obstruction to the legitimate users and performance
degradation of resources. These attacks can easily be
launched on machines present within the network or on the
machines present in the different network. Based on the
attacker and victim present in a network they can be
classified into 2 types.
Inbound /Outbound attacks: These are the attacks which
originate in a network are targeting a particular user/users
present in a completely different network.
Cross bound attacks: These are theattackswhichoriginate
in a network and are targeting a user/users present in the
same network.
Distributed DoS attacks (DDoS):
Denial of service attacks (DOS) will prevent the legitimate
user from accessing resources. Basically these DoS attacks
are classified mainly into 2 types Network level attacks and
Application level attacks. Network Level attackswill stop the
genuine user/users from accessing network resources and
Application level attacks will disable the users from
accessing the services by consuming the server resources.
These attacks are wide spread in present day digital world.
In DDoS attacks instead of using a single machine the
attacker uses multiple systems to flood the bandwidth or
resources of a targeted system, generally one or more web
servers.
During a DDoS attack, Hosts or Bots comingfromdistributed
sources overwhelm the target withillegitimatetrafficsothat
the available resources (of Server or any Network) cannot
respond to genuine clients. These DDoS attacks mainly
classified as Volume Based attacks, Protocol attacks,
application layer attacks, computational attacks,
Vulnerability attacks etc.
1.1 Volumetric DDoS attacks:
Volumetric DDoS attacks are designed to saturate and
overwhelm network resources, circuits etc by brute force.
According to M/s. Arbor Networks, 65% of DDoS attacksare
volumetric in nature.
Common attacks:DNSAmplification,NTPAmplification, UDP
Flood.
1.1.1 DNS Amplification:
A DNS amplification attack is a most advanced type of DDoS
attack. In this type of attack, attacker will send large
amounts of incoming data to a server by exploiting the
vulnerabilities of DNS server there by makingtheserverand
its resources inaccessible.
In this attack a large number of DNS request are send by
spoofing the IP address to one or more DNS servers.
Depending on the configuration these DNS servers will start
responding to those IP addresses from which the request is
originated. This will ultimately load the victim’s servers.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1411
1.1.2 NTP Amplification:
These attacks work in similar to DNS attacks. In this the
attacker will target the NTP servers and exploit the
vulnerabilities in them to generate huge amounts of traffic.
NTP is one of the network protocols, this is used by
connected machines in the internet for synchronization of
their clocks.
In most basic type of NTP amplification attack, the attacker
will send “get monlist” request repeatedly to a NTP server
while spoofing the IP address of machine. The NTP server
will respond to this request by sendingthelisttothespoofed
IP address. This response is much larger in size than the
request, generating huge amounts of traffic towards the
target machine and ultimately leading to degradation of
services to genuine requests.
1.1.3 UDP Flood:
It’s a type of denial of service attack inwhichhugenumber of
UDP packets are send with the aimof shatteringthatdevice’s
ability to respond and process there by causing denial of
service to genuine traffic.
1.2 Protocol attacks:
These type of attacks that causes a target to be inaccessible
by exploiting vulnerabilities in the Layer 3 and Layer 4
protocol stack. Common types of Protocol attacks are SYN
flood and Ping of Death.
1.2.1 SYN Flood:
This is a type of denial of service attack. In this attacker will
send SYN (Synchronization) packets to every port on the
server using fake IP addresses. The server will respond with
SYN/ACK packets from each port. This will consume the
resources of server and making itunresponsivetolegitimate
traffic.
1.2.2 Ping of Death:
It’s a type of denial of service attack in which an attacker
sends an ICMP packet larger than the maximum allowable
size, this can crash the machine, or freeze the services
offered by the machine. It’s used to makethedeviceunstable
by intentionally sending large sized packets to the target
device over an IPV4 network.
1.3 Application Layer DDoS attacks:
These attacks that exploit vulnerability in Layer 7 Protocol
stack. These are the most sophisticated ones and very
difficult to identify and mitigate.
Common types of Application layer attacks are HTTP Flood
attacks and attack on DNS services.
1.3.1 HTTP Flood:
HTTP flood attack is a volumetric DDoS attack basically
destined to shatter a target server with HTTP requests. This
will consume the available server resources making it
unresponsive/unreachable to actual genuine requests.
This is a layer7 DDoS attack in which the attacker uses
typical HTTP GET/POSTmethodstofetchinformationfroma
server or application. Unlike other DDoS attacks these
attacks does not uses malformed packetsorIPspoofing.This
makes it difficult to detect these attacks by advanced
detection systems. This is because this traffic characteristic
resembles genuine traffic pattern.
1.3.2 DNS Flood:
The attacker targets multiple Domain Name System (DNS)
servers in a particular area, thereby hampering the
resolution of resource records of that area orsub-areas.DNS
flood attacks normally uses the high bandwidthconnectivity
of bots to make the DNS servers respond to a large number
of DNS requests. In a multi level DNS flood attack, theSource
IP address will be a spoofed one, so that the large number of
DNS replies will be sent to that spoofed IP address, which is
the intended target.
1.4 Conventional Intrusion Detection Techniques:
1.4.1 Signature Based Detection Technique:
This detection is known as misuse technique. This will
compare the "known patterns” of detrimental activity with
the already present signatures stored in the database. This
technique is very accurateandit’sonlywork withtheKnown
attacks. This technique is very fast in comparison to other
detection techniques because all it has to do is to look up for
list of known signatures of attacks and if it finds a match it
will report this. A commonly used tool in signaturedetection
is SNORT tool. On the negative side if someone starts a new
attack there will not be any protection because it will not
find a match with the signatures already present.Inthiscase
similar to Anti-virus, once a new attack is recorded the
database needs to be updated.
1.4.2. Anomaly Based Detection Technique:
Unlike, Signature based detection, Anomaly based detection
doesn’t have a information regarding well known
attacks. This technique make use of the current event
pattern & determine the anomalies inthatpattern. Shannon-
Wiener’s index theory analyzes random data patterns and
point outs uncertainty in the current pattern. Entropy is the
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1412
measure of abnormality and randomness. If the samples
taken to analyze are from a single group, then entropy is
observed to be lesser in comparison with patterns taken
from multiple groups. Headers present in the sampled data
are analyzed to determine theIPandportsbeforecomputing
their entropy. The Detection systems aremadeinsucha way
that it detects DDOS attack, in case the pattern under
observation exceeds the entropy by a fixed threshold.
2. RELATED WORKS
Machine learning techniques are frequently used for
anomaly detection. They have received considerable
attention among the intrusion detection researchers to
address the weaknesses of knowledge based detection
techniques. Another experiment developed on three
intrusion detection models based onMulti-LayerPerceptron
(MLP), C 4.5, and SVM classifiers showed that C 4.5 is the
best method in terms of detection accuracy and minimum
training time [2]; it achieved the accuracy rate of (99.05%).
For this reason, we choose the C4.5 algorithm to detect the
DDoS attacks in our proposed model. We compared the
output accuracy with Naivebayes algorithm.
The model proposed using C 4.5 decision tree algorithm
can provide encouraging results, when combined with
signature based DDOS detection systems.
C 4.5 Algorithm:
C 4.5 is extension of ID3 algorithm. It tries to find
simple and small decision trees. C 4.5 builds decision trees
from a set of training data on basis of information entropy.
Entropy ) = -
Iterating over all possible values of ), the
conditional probability:
It uses the fact that each attribute of the data can be
used to make a decision that splits the data into smaller
subsets. C4.5 examines the normalized information gain
ratio that results from choosing an attribute for splitting the
data.
The attributewiththehighestinformationgainratio
is the one used to make the decision.[4]
Given a learning set S and a non class attribute X,
the Gain Ratio is defined as:
Figure1: Combined model for DDOS Detection
2.1. Input DDoS attack Data
The input DDoS sample is taken from Intrusion
Detection Evaluation Dataset (CICIDS2017)[1]. It contains
benign and the most up-to-date common attacks, which
resembles the true real-world data. The data is captured
over a period of five days with diversified attacks. Low Orbit
Ion Canon (LOIC) tool is used for DDoS attack generation.
The incidents are more than 2 million, making the dataset
comprehensive.
As this dataset is huge, we choose random subset from
the available data. The experiment is repeated with random
input, so that the results are not biased.
2.2. Pre-processing and supervised filtering
The input dataset contains a total of 78 attributes. Pre-
processing is done in Weka with supervised attribute filter.
This will ensure greater accuracy and less processing time
for building the model.
Key Chosen attributes of dataset:
 Maximum Packet length in Forward Direction
 Total Length of packet in Forward direction
 Total Length of packet in Backward direction
 Average packet size
 Destination port
As a standard practice, training and test datasets are
divided in 70:30 ratio. This dataset is taken randomly
from original data and experiment is repeated to
validate the consistency.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1413
2.3. Classification Using Model
We have used C4.5 algorithm for building this
model, as it proved to be more efficient in detection of
DDoS attacks. C4.5 algorithm is used to generate
a decision tree developed by Ross Quinlan.
J48 is an open source Java implementation of the
C4.5 algorithm in Weka. 10-fold Cross validation is done
and the trained model is validated with test dataset.
Figure 2. Decision Tree using C 4.5 Algorithm
Figure3. Weka Output for C4.5 model training dataset
Figure4. Weka Output for C4.5 model test dataset
Confusion Matrix:
A confusion matrix is the summary of prediction
results on a classification problem. The numberofcorrect
and incorrect predictions are summarized with count
values and broken down by each class.
Confusion matrix is shown in Table 1 which is the
basis for checking the accuracy and credibility of the
proposed model.
Tabel1 .Confusion matrix for C 4.5 Algorithm Model
Type of Dataset DDOS Packets BENIGN Packets
Training
Dataset
17064 11
9 12915
Test Dataset 7386 3
6 5604
To confirm the accuracy, we conducted the same
experiment with NaiveBayes classification algorithm. A
comparison chart is made as follows:
Tabel2. Comparison Chart between NavieBayes
and C 4.5
Parameter C 4.5 NaiveBayes
Accuracy 99.93% 91.64%
Kappa Statistic 0.9988 0.8255
Mean Absolute
Error
0.009 0.081
Root Mean
Square Error
0.0228 0.2813
Time taken to
build model
0.4 sec 0.06 sec
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072
© 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1414
Area under ROC, C.4.5 Model
Area under ROC, Naviebayes Model
3. CONCLUSION AND FUTURE SCOPE
C 4.5 DecisionTreealgorithmprovided99.93%accuracy
in differentiating DDoS attack and benign traffic. Even
though, the build time is more compared to Naviebayes, the
accuracy level is improved significantly with C4.5 decision
tree algorithm. The attribute filtration plays a major role for
high accuracy and increased build speed. Combination of
signature plus anomaly based model can increase the
reliability in DDoS detection. This model may be used for
identification of bot participating in DDoS attack so that
those IP addresses could be blocked.
Finally, as a future work, we are planning to develop a
prototype system based on deep learning so that, better
detection capabilities can be inherited by processing huge
real-time streaming data.
REFERENCES
[1]. Iman Sharafaldin, Arash Habibi Lashkari and Ali A.
Ghorbani., “Toward Generating a New Intrusion Detection
Dataset and Intrusion Traffic Characterization”, 4th
International Conference on Information Systems Security
and Privacy (ICISSP), Purtogal, January 2018
[2] Ismanto, Heru & Wardoyo, Retantyo., Comparison of
running time between C4.5 and k-nearest neighbor (k-NN)
algorithm on deciding mainstay area clustering.
International Journal of Advances in Intelligent Informatics.
2. 1. 10.26555/ijain.v2i1.49.
[3] Marwane Zekri, Said El Kafhali, Noureddine Aboutabit
and Youssef Saadi., “DDoS Attack Detection using Machine
Learning Techniques in Cloud Computing Environments”,
3rd International Conference of Cloud Computing
Technologies and Applications (CloudTech), October 2017
[4]. Wei Wang, 2, Sylvain Gombault, Thomas Guyet.,
“Towards fast detecting intrusions: using key attributes of
network traffic”, Third International Conference onInternet
Monitoring and Protection, pp.86-91, 2008
[5]. Agrawal, S. and Rajput, R. S., “Denial of Services Attack
Detection using Random Forest Classifier with Information
Gain”, International Journal ofEngineeringDevelopment and
Research, September 2017

IRJET- DDOS Detection System using C4.5 Decision Tree Algorithm

  • 1.
    International Research Journalof Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1410 DDOS DETECTION SYSTEM USING C4.5 DECISION TREE ALGORITHM Santosh Kumar Pydipalli1, Srikanth Kasthuri1, Jinu S1 1 Jr.Telecom Officer, Bharath Sanchar Nigam Limited, Bangalore ---------------------------------------------------------------------***---------------------------------------------------------------------- Abstract - In today’s digital world, cyber attacks became much severe and intelligent,causingenormousdamage.With the evolution of IoT, securing the digital infrastructure is much essential toavoidanysystemcollapse.Duetoenhanced bandwidth availability and with advent of new technologies, hackers becamemorecapabletothrownewchallenges.DDoS (Distributed Denial of Service) attack is much recurrent nowadays, that inflicts serious damage and affects the network /application performance. As hackers are finding new ways to attack the systems, zero day vulnerabilities can cost dearly. By the time, patchappliedforsecurityexploit,the systems may get compromised by attackers .So it’s essential to develop intelligence to detect and mitigate complex security attacks. In the present paper, we proposed a model that can combine the strengths of both signaturebasedDDoS detection and anomaly based DDoS detection. We designeda machine learning model that can learn from attack patterns and alert the security analyst. In this model, DDoS attack sample is extracted from Intrusion Detection Evaluation Dataset (CICIDS2017) and divided the sample into training and test data. We used Weka 3.8.2, a java based data mining and machine learning tool for building the model. Pre- processing is done with Weka supervised attribute filter and classification of training set is performed using J48 / C4.5 decision tree algorithm, with 10-fold cross-validation. The accuracy of Machine learning model is validated using test dataset. This model, coupled with signature detection techniques, can perform automatic and effective differentiation between benign traffic and DDoS flooding. Key Words: DDoS, Weka, C 4.5, Decision Tree. 1. INTRODUCTION There are numerous attacks in present day world on the network and server resources. Distributed Denial of Service (DDoS) attacks are attacks targeting the availability of network and Server resources there by causing service obstruction to the legitimate users and performance degradation of resources. These attacks can easily be launched on machines present within the network or on the machines present in the different network. Based on the attacker and victim present in a network they can be classified into 2 types. Inbound /Outbound attacks: These are the attacks which originate in a network are targeting a particular user/users present in a completely different network. Cross bound attacks: These are theattackswhichoriginate in a network and are targeting a user/users present in the same network. Distributed DoS attacks (DDoS): Denial of service attacks (DOS) will prevent the legitimate user from accessing resources. Basically these DoS attacks are classified mainly into 2 types Network level attacks and Application level attacks. Network Level attackswill stop the genuine user/users from accessing network resources and Application level attacks will disable the users from accessing the services by consuming the server resources. These attacks are wide spread in present day digital world. In DDoS attacks instead of using a single machine the attacker uses multiple systems to flood the bandwidth or resources of a targeted system, generally one or more web servers. During a DDoS attack, Hosts or Bots comingfromdistributed sources overwhelm the target withillegitimatetrafficsothat the available resources (of Server or any Network) cannot respond to genuine clients. These DDoS attacks mainly classified as Volume Based attacks, Protocol attacks, application layer attacks, computational attacks, Vulnerability attacks etc. 1.1 Volumetric DDoS attacks: Volumetric DDoS attacks are designed to saturate and overwhelm network resources, circuits etc by brute force. According to M/s. Arbor Networks, 65% of DDoS attacksare volumetric in nature. Common attacks:DNSAmplification,NTPAmplification, UDP Flood. 1.1.1 DNS Amplification: A DNS amplification attack is a most advanced type of DDoS attack. In this type of attack, attacker will send large amounts of incoming data to a server by exploiting the vulnerabilities of DNS server there by makingtheserverand its resources inaccessible. In this attack a large number of DNS request are send by spoofing the IP address to one or more DNS servers. Depending on the configuration these DNS servers will start responding to those IP addresses from which the request is originated. This will ultimately load the victim’s servers.
  • 2.
    International Research Journalof Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1411 1.1.2 NTP Amplification: These attacks work in similar to DNS attacks. In this the attacker will target the NTP servers and exploit the vulnerabilities in them to generate huge amounts of traffic. NTP is one of the network protocols, this is used by connected machines in the internet for synchronization of their clocks. In most basic type of NTP amplification attack, the attacker will send “get monlist” request repeatedly to a NTP server while spoofing the IP address of machine. The NTP server will respond to this request by sendingthelisttothespoofed IP address. This response is much larger in size than the request, generating huge amounts of traffic towards the target machine and ultimately leading to degradation of services to genuine requests. 1.1.3 UDP Flood: It’s a type of denial of service attack inwhichhugenumber of UDP packets are send with the aimof shatteringthatdevice’s ability to respond and process there by causing denial of service to genuine traffic. 1.2 Protocol attacks: These type of attacks that causes a target to be inaccessible by exploiting vulnerabilities in the Layer 3 and Layer 4 protocol stack. Common types of Protocol attacks are SYN flood and Ping of Death. 1.2.1 SYN Flood: This is a type of denial of service attack. In this attacker will send SYN (Synchronization) packets to every port on the server using fake IP addresses. The server will respond with SYN/ACK packets from each port. This will consume the resources of server and making itunresponsivetolegitimate traffic. 1.2.2 Ping of Death: It’s a type of denial of service attack in which an attacker sends an ICMP packet larger than the maximum allowable size, this can crash the machine, or freeze the services offered by the machine. It’s used to makethedeviceunstable by intentionally sending large sized packets to the target device over an IPV4 network. 1.3 Application Layer DDoS attacks: These attacks that exploit vulnerability in Layer 7 Protocol stack. These are the most sophisticated ones and very difficult to identify and mitigate. Common types of Application layer attacks are HTTP Flood attacks and attack on DNS services. 1.3.1 HTTP Flood: HTTP flood attack is a volumetric DDoS attack basically destined to shatter a target server with HTTP requests. This will consume the available server resources making it unresponsive/unreachable to actual genuine requests. This is a layer7 DDoS attack in which the attacker uses typical HTTP GET/POSTmethodstofetchinformationfroma server or application. Unlike other DDoS attacks these attacks does not uses malformed packetsorIPspoofing.This makes it difficult to detect these attacks by advanced detection systems. This is because this traffic characteristic resembles genuine traffic pattern. 1.3.2 DNS Flood: The attacker targets multiple Domain Name System (DNS) servers in a particular area, thereby hampering the resolution of resource records of that area orsub-areas.DNS flood attacks normally uses the high bandwidthconnectivity of bots to make the DNS servers respond to a large number of DNS requests. In a multi level DNS flood attack, theSource IP address will be a spoofed one, so that the large number of DNS replies will be sent to that spoofed IP address, which is the intended target. 1.4 Conventional Intrusion Detection Techniques: 1.4.1 Signature Based Detection Technique: This detection is known as misuse technique. This will compare the "known patterns” of detrimental activity with the already present signatures stored in the database. This technique is very accurateandit’sonlywork withtheKnown attacks. This technique is very fast in comparison to other detection techniques because all it has to do is to look up for list of known signatures of attacks and if it finds a match it will report this. A commonly used tool in signaturedetection is SNORT tool. On the negative side if someone starts a new attack there will not be any protection because it will not find a match with the signatures already present.Inthiscase similar to Anti-virus, once a new attack is recorded the database needs to be updated. 1.4.2. Anomaly Based Detection Technique: Unlike, Signature based detection, Anomaly based detection doesn’t have a information regarding well known attacks. This technique make use of the current event pattern & determine the anomalies inthatpattern. Shannon- Wiener’s index theory analyzes random data patterns and point outs uncertainty in the current pattern. Entropy is the
  • 3.
    International Research Journalof Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1412 measure of abnormality and randomness. If the samples taken to analyze are from a single group, then entropy is observed to be lesser in comparison with patterns taken from multiple groups. Headers present in the sampled data are analyzed to determine theIPandportsbeforecomputing their entropy. The Detection systems aremadeinsucha way that it detects DDOS attack, in case the pattern under observation exceeds the entropy by a fixed threshold. 2. RELATED WORKS Machine learning techniques are frequently used for anomaly detection. They have received considerable attention among the intrusion detection researchers to address the weaknesses of knowledge based detection techniques. Another experiment developed on three intrusion detection models based onMulti-LayerPerceptron (MLP), C 4.5, and SVM classifiers showed that C 4.5 is the best method in terms of detection accuracy and minimum training time [2]; it achieved the accuracy rate of (99.05%). For this reason, we choose the C4.5 algorithm to detect the DDoS attacks in our proposed model. We compared the output accuracy with Naivebayes algorithm. The model proposed using C 4.5 decision tree algorithm can provide encouraging results, when combined with signature based DDOS detection systems. C 4.5 Algorithm: C 4.5 is extension of ID3 algorithm. It tries to find simple and small decision trees. C 4.5 builds decision trees from a set of training data on basis of information entropy. Entropy ) = - Iterating over all possible values of ), the conditional probability: It uses the fact that each attribute of the data can be used to make a decision that splits the data into smaller subsets. C4.5 examines the normalized information gain ratio that results from choosing an attribute for splitting the data. The attributewiththehighestinformationgainratio is the one used to make the decision.[4] Given a learning set S and a non class attribute X, the Gain Ratio is defined as: Figure1: Combined model for DDOS Detection 2.1. Input DDoS attack Data The input DDoS sample is taken from Intrusion Detection Evaluation Dataset (CICIDS2017)[1]. It contains benign and the most up-to-date common attacks, which resembles the true real-world data. The data is captured over a period of five days with diversified attacks. Low Orbit Ion Canon (LOIC) tool is used for DDoS attack generation. The incidents are more than 2 million, making the dataset comprehensive. As this dataset is huge, we choose random subset from the available data. The experiment is repeated with random input, so that the results are not biased. 2.2. Pre-processing and supervised filtering The input dataset contains a total of 78 attributes. Pre- processing is done in Weka with supervised attribute filter. This will ensure greater accuracy and less processing time for building the model. Key Chosen attributes of dataset:  Maximum Packet length in Forward Direction  Total Length of packet in Forward direction  Total Length of packet in Backward direction  Average packet size  Destination port As a standard practice, training and test datasets are divided in 70:30 ratio. This dataset is taken randomly from original data and experiment is repeated to validate the consistency.
  • 4.
    International Research Journalof Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1413 2.3. Classification Using Model We have used C4.5 algorithm for building this model, as it proved to be more efficient in detection of DDoS attacks. C4.5 algorithm is used to generate a decision tree developed by Ross Quinlan. J48 is an open source Java implementation of the C4.5 algorithm in Weka. 10-fold Cross validation is done and the trained model is validated with test dataset. Figure 2. Decision Tree using C 4.5 Algorithm Figure3. Weka Output for C4.5 model training dataset Figure4. Weka Output for C4.5 model test dataset Confusion Matrix: A confusion matrix is the summary of prediction results on a classification problem. The numberofcorrect and incorrect predictions are summarized with count values and broken down by each class. Confusion matrix is shown in Table 1 which is the basis for checking the accuracy and credibility of the proposed model. Tabel1 .Confusion matrix for C 4.5 Algorithm Model Type of Dataset DDOS Packets BENIGN Packets Training Dataset 17064 11 9 12915 Test Dataset 7386 3 6 5604 To confirm the accuracy, we conducted the same experiment with NaiveBayes classification algorithm. A comparison chart is made as follows: Tabel2. Comparison Chart between NavieBayes and C 4.5 Parameter C 4.5 NaiveBayes Accuracy 99.93% 91.64% Kappa Statistic 0.9988 0.8255 Mean Absolute Error 0.009 0.081 Root Mean Square Error 0.0228 0.2813 Time taken to build model 0.4 sec 0.06 sec
  • 5.
    International Research Journalof Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 05 Issue: 12 | Dec 2018 www.irjet.net p-ISSN: 2395-0072 © 2018, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 1414 Area under ROC, C.4.5 Model Area under ROC, Naviebayes Model 3. CONCLUSION AND FUTURE SCOPE C 4.5 DecisionTreealgorithmprovided99.93%accuracy in differentiating DDoS attack and benign traffic. Even though, the build time is more compared to Naviebayes, the accuracy level is improved significantly with C4.5 decision tree algorithm. The attribute filtration plays a major role for high accuracy and increased build speed. Combination of signature plus anomaly based model can increase the reliability in DDoS detection. This model may be used for identification of bot participating in DDoS attack so that those IP addresses could be blocked. Finally, as a future work, we are planning to develop a prototype system based on deep learning so that, better detection capabilities can be inherited by processing huge real-time streaming data. REFERENCES [1]. Iman Sharafaldin, Arash Habibi Lashkari and Ali A. Ghorbani., “Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization”, 4th International Conference on Information Systems Security and Privacy (ICISSP), Purtogal, January 2018 [2] Ismanto, Heru & Wardoyo, Retantyo., Comparison of running time between C4.5 and k-nearest neighbor (k-NN) algorithm on deciding mainstay area clustering. International Journal of Advances in Intelligent Informatics. 2. 1. 10.26555/ijain.v2i1.49. [3] Marwane Zekri, Said El Kafhali, Noureddine Aboutabit and Youssef Saadi., “DDoS Attack Detection using Machine Learning Techniques in Cloud Computing Environments”, 3rd International Conference of Cloud Computing Technologies and Applications (CloudTech), October 2017 [4]. Wei Wang, 2, Sylvain Gombault, Thomas Guyet., “Towards fast detecting intrusions: using key attributes of network traffic”, Third International Conference onInternet Monitoring and Protection, pp.86-91, 2008 [5]. Agrawal, S. and Rajput, R. S., “Denial of Services Attack Detection using Random Forest Classifier with Information Gain”, International Journal ofEngineeringDevelopment and Research, September 2017