A Network Behavior Analysis Method to Detect
Reverse Remote Access Trojan
Hongyu Zhu
State Grid Hunan Electric Power
Corporation Research Institute,
State Grid Corporation of China
Changsha, China
602983840@qq.com
Zhexiang Wu
State Grid Zhejiang Yongkang
Electric Power Corporation,
State Grid Corporation of China
Yongkang, China
wuzx@zj.sgcc.com.cn
Jianwei Tian
State Grid Hunan Electric Power
Corporation Research Institute,
State Grid Corporation of China
Changsha, China
tianjw@lm.sgcc.com.cn
Zheng Tian
State Grid Hunan Electric Power
Corporation Research Institute,
State Grid Corporation of China
Changsha, China
tianz@lm.sgcc.com.cn
Hong Qiao
State Grid Hunan Electric Power
Corporation Research Institute,
State Grid Corporation of China
Changsha, China
qiaoh@lm.sgcc.com.cn
Xi Li
State Grid Hunan Electric Power
Corporation Research Institute,
State Grid Corporation of China
Changsha, China
lix@lm.sgcc.com.cn
Shengshen Chen
State Grid Hunan Electric Power
Corporation Research Institute,
State Grid Corporation of China
Changsha, China
chenss@lm.sgcc.com.cn
Abstract-Remote Access Trojan (RAT) reverse connections
are covert and malicious; they are established to steal private
data or to operate the compromised host under a hacker's
command. To detect reverse RATs effectively, a network
behavior-based method is introduced in this paper. We first
summarize a typical RAT network communication pattern. Then
four uncorrelated network behavior features are extracted from
every TCP session as the input of the detection model. Six
supervised classification algorithms are applied to a real
network traffic data set to distinguish RAT sessions from
legitimate sessions. Besides detection accuracy, AUC is also
used, because the number of RAT sessions is much smaller than
the number of normal sessions and AUC is suitable for
evaluating performance on such an imbalanced problem. The
detection accuracies of all tested algorithms are higher than
0.92. The AUC values of Random Forest, SVM and Logistic
Regression are higher than 0.94, which shows their ability to
handle an imbalanced data set. Compared to related work, the
proposed method is effective at detecting RATs whose
connections are encrypted, and can distinguish RAT sessions
from similar normal sessions, such as P2P or cloud application
sessions.
Keywords-Network Security, Trojan Detection, Network
behavior, Machine learning
I. INTRODUCTION
Remote Access Trojan (RAT) has become a destructive
threat to almost every enterprise, due to its ability to steal
confidential data and execute malicious commands under the
control of hackers [1]. A RAT consists of two separate parts,
which communicate with each other through the Internet.
The client part is secretly installed on compromised computers
through phishing e-mail or USB drives, and remotely receives
the hacker's commands. The server part is on the hacker's side
and sends commands to the compromised computers. The
direction in which the RAT session is established divides RATs
into two categories: forward RAT and reverse RAT. In a forward
RAT connection, the hacker's side initiates the connection to a
port opened on the compromised computer, whereas in a reverse
RAT connection, the compromised computer initiates an outbound
connection to a port opened on the hacker's side.
978-1-5386-6565-7/18/$31.00 ©2018 IEEE
Port filtering policies have been widely used on firewalls and
switches to block forward RAT connections [2-3]. However,
it is impossible to identify reverse RAT traffic by port
filtering, because it behaves just like a normal outbound
connection, such as visiting a website. Deep Packet Inspection
(DPI) is another type of detection method, which compares the
payload of network packets with a Trojan feature library [2][4].
But DPI cannot deal with encrypted RAT traffic, and it has
high computational complexity. Host-based Trojan detection
requires installing a program on every target computer, and some
RAT clients are able to hide themselves or stay inactive when
they discover that detection software is scanning them, which
makes host-based RAT detection prone to failure [5].
In this paper, we develop a RAT detection method based
on network behavior analysis. This method is efficient and
works well on encrypted RAT traffic. The remainder of the
paper is organized as follows. Related work on RAT detection
is discussed in Section II. Section III describes the typical
RAT communication procedure and its characteristics. Section IV
introduces the model of our detection method and its
performance test results. Section V concludes the paper.
II. RELATED WORK
Since Trojans are among the most common attack tools
used by hackers, the rule sets of Intrusion Detection Systems
(IDS) and Intrusion Protection Systems (IPS) include a large
number of rules to detect Trojans [6]. Although this is a simple
way to detect Trojans, the rapidly growing Trojan rule library
degrades the time performance of the IDS/IPS, and the approach
fails when it encounters encrypted Trojan traffic.
Network-behavior-based RAT detection has been studied by
several network security researchers. Li [7] extracts
6 network behavior features and uses clustering algorithms to
detect Trojans in network traffic, but normal P2P services are
found to strongly affect the performance of this detection
method. Dan [8] mainly focuses on the first few packets of a
TCP session and applies classification algorithms. However,
the chosen network behavior features need to be optimized,
since only 2 of the 7 features extracted by Dan are
uncorrelated. Besides, the early stage contains too few packets
to provide sufficient features in a real network environment.
In [5], 927 cross-layer network features are analysed by 3
different classification algorithms to detect malware sessions.
This solid work targets the detection of dozens of types of
malware, which makes it too complicated for RAT detection.
III. OVERVIEW OF RAT COMMUNICATION
A typical communication mechanism of reverse RAT is
described in this part. A complete RAT session starts with a
successful TCP handshake and ends with a TCP FIN or RST
process. After the TCP connection has been established, the RAT
goes through a period called the early stage, in which the
interval between two adjacent packets is less than a threshold
of t second(s) [8]. The early stage is when the server part and
client part of the RAT exchange some basic information. After
that, some small packets carrying the hacker's commands are
sent to the client side. In response, the client sends several
packets back, and these packets are relatively larger than the
command packets because they may carry confidential data.
During idle time, when the hacker gives no instructions,
keep-alive heartbeat packets may be exchanged between client
and server to tell each other that they are still connected.
However, not every RAT session behaves exactly like the
typical communication pattern described. A few RATs may
not send heartbeat packets, or may have more inbound bytes than
outbound bytes. Some normal applications may behave like RATs
in some ways. For example, P2P and cloud sessions also tend to
use the PSH flag, and remote desktop services may send
heartbeat packets. Thus, none of these features alone can
distinguish a RAT session from a legitimate session, and
finding a way to integrate these features is the key to solving
the RAT detection problem.
IV. REVERSE RAT DETECTION METHOD
Our RAT detection method is described in this section. We
first pick out four network behavior attributes that best
represent the differences between RAT and normal sessions.
Then we apply different machine learning models to learn the
mapping from the four attributes to whether a session is RAT
or not. A labeled data set consisting of real RAT and normal
network traffic is used to train and test our models.
A. Feature Selection and Extraction
We extract four features from every complete TCP session:
• out-in-bytes-ratio: the ratio between average outbound
bytes and inbound bytes. It's a positive continuous value.
• PSH-flag-ratio: the ratio between the number of packets
with the PSH flag and the number of packets in the
session. It's a positive continuous value.
• early-stage-packet-number: the number of packets in the
session's early stage, with threshold t set to 1 second.
It's a positive integer.
• heartbeat-flag: the flag indicating whether a session has
heartbeat packets. It's a Boolean with a value of 0 or 1.
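The two ratio features in the list above could be computed along the following lines. This is a minimal sketch rather than the authors' implementation: the `Packet` record and its field names are hypothetical, the ratio is taken over total payload bytes (outbound_byte / inbound_byte), and mapping a session with no inbound payload to infinity is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    """One TCP packet record; field names are illustrative, not from the paper."""
    outbound: bool      # True if sent by the monitored internal host
    payload_len: int    # TCP payload length in bytes
    psh: bool           # True if the PSH flag is set


def out_in_bytes_ratio(packets):
    """Outbound payload bytes divided by inbound payload bytes."""
    out_bytes = sum(p.payload_len for p in packets if p.outbound)
    in_bytes = sum(p.payload_len for p in packets if not p.outbound)
    # Assumption: a session without inbound payload maps to infinity.
    return out_bytes / in_bytes if in_bytes else float("inf")


def psh_flag_ratio(packets):
    """Share of packets in the session that carry the PSH flag."""
    if not packets:
        return 0.0
    return sum(p.psh for p in packets) / len(packets)


session = [
    Packet(outbound=False, payload_len=20, psh=True),    # small command
    Packet(outbound=True, payload_len=300, psh=True),    # larger reply
    Packet(outbound=True, payload_len=100, psh=False),
    Packet(outbound=False, payload_len=0, psh=False),    # bare ACK
]
print(out_in_bytes_ratio(session))  # 400 / 20 = 20.0
print(psh_flag_ratio(session))      # 2 / 4 = 0.5
```

In this toy session the host sends far more payload than it receives, which is the direction the method expects for a RAT session.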
To obtain the above four features, the basic information we
need to collect for every TCP packet is listed in Table I.
To get the early-stage-packet-number, we calculate the
time interval between every two adjacent packets until the time
interval is greater than the threshold t (1 second here). A
RAT's heartbeat starts with a fixed-length packet sent by the
client and a fixed answering packet sent by the server, and
this pattern repeats several times. In this paper we set the
number of repetitions to 3. The pseudo code of the heartbeat
packet detection algorithm is shown in Fig. 2. The overall
network behavior feature extraction process is shown in Fig. 3.
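The two procedures described above might be sketched as follows. This is an illustration under stated assumptions, not the algorithm of Fig. 2: the function names are hypothetical, the early stage is assumed to be counted from the first packet of the list, and the heartbeat check assumes the list holds only payload-carrying packets as `(outbound, payload_len)` tuples, with the client's packet being outbound.

```python
def early_stage_packet_number(timestamps, t=1.0):
    """Count packets before the first inter-packet gap greater than t seconds."""
    if not timestamps:
        return 0
    count = 1
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur - prev > t:
            break
        count += 1
    return count


def has_heartbeat(packets, repeats=3):
    """Return 1 if a fixed request/answer pair repeats `repeats` times in a row.

    `packets` is a list of (outbound, payload_len) tuples in arrival order.
    """
    window = 2 * repeats
    for i in range(len(packets) - window + 1):
        chunk = packets[i:i + window]
        requests, answers = chunk[0::2], chunk[1::2]
        if (requests[0][0] and not answers[0][0]      # client out, server in
                and requests[0][1] > 0 and answers[0][1] > 0
                and all(p == requests[0] for p in requests)
                and all(p == answers[0] for p in answers)):
            return 1
    return 0


timestamps = [0.0, 0.1, 0.3, 0.9, 2.5, 2.6]
print(early_stage_packet_number(timestamps))  # 4: the gap 0.9 -> 2.5 exceeds 1 s

payload_packets = [(True, 500), (False, 40)] + [(True, 16), (False, 16)] * 3
print(has_heartbeat(payload_packets))         # 1
```

Requiring identical direction and length within each half of the window is one simple reading of "fixed length"; a real capture would also need zero-payload ACKs filtered out first.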
Packet info    Meaning
Src_ip         Source IP address
Src_port       Source TCP port
Dst_ip         Destination IP address
Dst_port       Destination TCP port
Timestamp      Packet timestamp
Flags          TCP flags including SYN/ACK/FIN/RST/PSH
Payload_len    Payload length in bytes
[Figure: session timeline with the phases TCP SYN handshake, early stage, data transmission and heartbeat packets; horizontal axis: time; vertical axis: payload length; one side shows the RAT server side (hacker), the other the RAT client side (normal user).]
Fig. 1. Typical RAT session communication process
There are some network behavior differences between
legitimate and RAT sessions. Legitimate sessions tend to
transport data as soon and as much as possible once the TCP
handshake has finished, so most legitimate sessions don't have
any early stage. Most RAT sessions transport more outbound data
than inbound data in order to send out confidential
information, while normal sessions often behave in the opposite
way. The PSH flag in the TCP header informs the receiving host
that the data should be pushed up to the receiving application
immediately. The proportion of packets carrying the PSH flag is
very likely to be higher in RAT sessions than in legitimate
sessions, because hackers want their packets to be handled with
higher priority. Moreover, normal sessions usually don't have
heartbeat packets.
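For readers working from raw captures, the PSH flag can be read from the control bits of the TCP header. The sketch below assumes the flags arrive as a single integer byte; the helper name `decode_flags` is hypothetical, while the bit values are those defined by the TCP specification.

```python
# Control bits of the TCP header (low-order bits of the flags byte).
TCP_FLAGS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04, "PSH": 0x08, "ACK": 0x10}


def decode_flags(flags_byte):
    """Return the set of flag names that are set in a TCP flags byte."""
    return {name for name, bit in TCP_FLAGS.items() if flags_byte & bit}


print(sorted(decode_flags(0x18)))   # ['ACK', 'PSH'], a typical data segment
print(sorted(decode_flags(0x12)))   # ['ACK', 'SYN'], second handshake step
print("PSH" in decode_flags(0x10))  # False, a bare ACK
```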
TABLE I. PACKET INFORMATION TO BE EXTRACTED
TABLE II. STATISTICAL FEATURES OF THE COLLECTED DATA SET
Based on previous studies [2][4-5] related to RAT detection,
six different machine learning algorithms are chosen for this
classification problem: kNN (k Nearest Neighbor),
Naive Bayes, Logistic Regression, SVM (Support Vector
Machine), Random Forest and Decision Tree. All algorithms
are implemented with the Python scikit-learn library.
We use 10-fold cross validation to split the data for
training and testing in all tests, and we use the accuracy and
Area Under the Curve (AUC) metrics to evaluate the performance
of our algorithms:
• Accuracy: Accuracy is a basic score in classifier
evaluation, which is calculated by equation (2):
C. Machine Learning Model Training and Evaluation
So far, the reverse RAT detection problem has become a
binary classification problem, taking the above four features
{x1, x2, x3, x4} as input and y ∈ {0, 1} (1 for RAT and 0 for
normal session) as output (1).
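The 10-fold cross validation used for training and testing can be illustrated with a small index-splitting sketch. The paper does not say whether its folds were stratified; stratification, the function name `stratified_k_fold` and the fixed seed are assumptions made here so that every fold keeps the same RAT/normal proportion. The class sizes are those reported for the data set.

```python
import random


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with each class spread evenly over k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)      # deal indices round-robin
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


labels = [1] * 370 + [0] * 2190           # 1 = RAT, 0 = normal session
splits = list(stratified_k_fold(labels))
train_idx, test_idx = splits[0]
print(len(splits))                         # 10
print(len(train_idx), len(test_idx))       # 2304 256
print(sum(labels[i] for i in test_idx))    # 37 RAT sessions in the test fold
```

Each sample appears in exactly one test fold, so averaging a metric over the ten folds uses every session once for testing.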
Feature                                      Reverse RAT sessions   Legitimate sessions
Average early stage packet number            475                    845
Average outbound TCP payload length (byte)   78.74                  146.8
Average inbound TCP payload length (byte)    94.24                  866.32
Percentage of sessions having heartbeat      24.3%                  0.91%
[Figure: pseudocode of Algorithm 1, Heartbeat Detection Model. Input: packet information list. Output: heartbeat flag.]
Fig. 2. Heartbeat packets detection algorithm
[Figure: flowchart with the steps: calculate the packet interval time; set early_stage_end_flag = True; PSH_packet_number = PSH_packet_number + PSH_flag_value; is this the last packet of the session?; set current packet to be the next packet in the list; calculate out-in-bytes-ratio = (outbound_byte / inbound_byte); calculate PSH-flag-ratio = (PSH_packet_number / session_packet_number).]
Fig. 3. Network behavior feature extraction process
B. Data set
To apply supervised machine learning algorithms, we
collect 370 real RAT traffic samples from open source
communities¹, covering the RAT types gh0st, Remcos, Nanocore,
Adwind and NetSupport Manager from 2016 to 2018. Around 30% of
the collected RAT traffic is encrypted. As for normal
sessions, we collect the traffic information of 2190 sessions
from our company network, covering the application types
E-mail, QQ, web browsing, P2P and cloud service.
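The class sizes above already show why accuracy alone is a weak yardstick for this data set. The short calculation below uses only the two session counts from the text; the "predict every session as normal" classifier is a hypothetical baseline introduced for illustration.

```python
rat_sessions = 370
normal_sessions = 2190
total = rat_sessions + normal_sessions

rat_share = rat_sessions / total
# A classifier that labels every session as normal is right on all normal sessions.
baseline_accuracy = normal_sessions / total

print(total)                                                       # 2560
print(f"RAT share: {rat_share:.3f}")                               # RAT share: 0.145
print(f"All-normal baseline accuracy: {baseline_accuracy:.3f}")    # 0.855
```

Such a baseline detects no RAT session at all yet scores about 0.855 accuracy, so reported accuracies should be read against that floor.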
The statistical features in Table II are calculated from the
collected data set, and they reveal some differences
between reverse RAT and legitimate sessions. The average
early stage packet number of legitimate sessions is much
greater than that of RAT sessions. The gap between outbound
and inbound TCP payload length is much bigger for legitimate
sessions than for RAT sessions. And heartbeat packets are more
likely to exist in RAT sessions than in normal ones.
$$\text{Accuracy} = \frac{\text{Correctly Classified Sample Number}}{\text{Total Sample Number}} \tag{2}$$
• AUC: The RAT detection task is clearly imbalanced,
since the number of RAT sessions is much smaller than the
number of normal sessions in practice. AUC is known to be
a reliable measure for imbalanced data sets [4]. The
Receiver Operating Characteristic (ROC) curve is created by
plotting the true positive rate (TPR) against the false
positive rate (FPR) at various threshold settings in
binary classification. AUC is the area under the ROC curve.
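Both metrics can be written in a few lines of plain Python. This is a sketch for illustration, not the scikit-learn code used in the paper: the function names and the toy scores are assumptions. The AUC is computed through its pairwise-ranking form, which equals the area under the ROC curve: the fraction of (positive, negative) pairs in which the positive sample gets the higher score, with ties counted as one half.

```python
def accuracy(y_true, y_pred):
    """Correctly classified samples divided by the total number of samples."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking of positives and negatives."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 0, 0, 0, 1, 0]                  # 1 = RAT, 0 = normal (toy labels)
scores = [0.9, 0.2, 0.6, 0.1, 0.5, 0.3]      # toy classifier scores
y_pred = [int(s >= 0.5) for s in scores]     # threshold at 0.5

print(round(accuracy(y_true, y_pred), 3))    # 0.833
print(auc(y_true, scores))                   # 0.875
print(auc(y_true, [0.0] * len(y_true)))      # 0.5 for a constant score
```

A constant score, which is what a "label everything normal" classifier effectively produces, yields an AUC of 0.5 however skewed the classes are, which is why AUC complements accuracy here.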
As seen in Fig. 4, all six algorithms achieve an accuracy
higher than 0.92 and an AUC no lower than 0.87, which verifies
that machine learning is a feasible solution for reverse RAT
detection. Random Forest with 10 trees, SVM with a linear
kernel and Logistic Regression have an average AUC of 0.954,
which is higher than that of 4NN, Gaussian Naive Bayes and
Decision Tree. This indicates that the first 3 algorithms are
more capable of handling an imbalanced data set. Meanwhile, the
decision tree algorithm may not be well suited to imbalanced
tasks, since it gets the lowest AUC of 0.87. Random Forest
achieves an accuracy of 0.957 and an AUC of 0.979, making it
the best solution for reverse RAT detection among all six
algorithms.
¹ www.malware-traffic-analysis.net, www.contagiodump.blogspot.com,
www.capture.blogspot.com
[Figure: chart titled "Accuracy and AUC Value", with one accuracy series and one AUC series for the tested algorithms.]
Fig. 4. Accuracy and AUC value of test algorithms
V. CONCLUSIONS
In this paper, we introduce a reverse RAT detection method
based on network behavior features and machine learning
algorithms. Instead of inspecting the payload of network
traffic, our approach uses only 4 features extracted from TCP
headers, making it efficient enough to detect RATs in real
time. The proposed method is mainly based on the fact that
reverse RAT sessions are more likely than normal sessions to
have short early stages, heartbeat packets and PSH flags, and
to send out more data. Machine learning is able to solve this
binary classification problem according to our tests on real
data. Random Forest performs the best with an accuracy of 0.957
and an AUC of 0.979, followed by the SVM and Logistic
Regression algorithms. Thus, our approach can detect
unencrypted and encrypted RAT sessions accurately and
efficiently.
REFERENCES
[1] Michael, A., Sean, M., Christopher, C., & Aaron, L. (2016) Hacking
Exposed Malware & Rootkits: Security Secrets and Solutions, 2nd edn,
McGraw-Hill Education, New York.
[2] Li, W., Liu, H., & Zhang, X. (2016) A network data security analysis
method based on DPI technology. 2016 7th IEEE International
Conference on Software Engineering and Service Science (ICSESS),
97-976.
[3] Zhu, H., Tian, Z., & Xue, H. Practice of Automatic Monitoring Tool for
Boundary Port of Electric Power Information Network. Hunan Electric
Power, 2017, (37):49-52.
[4] Reham, T., Nada, M., & Ayman, M. (2017) A survey on deep packet
inspection. Proceedings of ICCES 2017 12th International Conference
on Computer Engineering and Systems, 188-197.
[5] Dmitri, B., Bracha, S., Lior, R., & Ariel, B. (2015) Unknown Malware
Detection Using Network Traffic Classification. 2015 IEEE Conference
on Communications and Network Security, 134-142.
[6] Elias, R. & Xenofontas, D. (2014) IDS Alert Correlation in the Wild
With EDGe. IEEE Journal on Selected Areas in Communications,
1933-1946.
[7] Shicong, L., Xiaochun, Y., Yongzheng, Z., Yi, P., & Tao, Y. (2012) A
Novel Approach of Detecting Trojan Based on Network Behavior
Analysis. 2012 IEEE 14th International Conference on Communication
Technology, 513-518.
[8] Dan, J., & Kazumasa, O. (2015) An Approach to Detect Remote Access
Trojan in the Early Stage of Communication. 2015 IEEE 29th
International Conference on Advanced Information Networking and
Applications, 706-713.

Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 

Recently uploaded (20)

New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 

A network behavior analysis method to detect reverse Remote Access Trojans (RATs)
This paper describes a method to detect RATs by analyzing network behavior.

  • 1. A Network Behavior Analysis Method to Detect Reverse Remote Access Trojan

Hongyu Zhu, State Grid Hunan Electric Power Corporation Research Institute, State Grid Corporation of China, Changsha, China, 602983840@qq.com
Zhexiang Wu, State Grid Zhejiang Yongkang Electric Power Corporation, State Grid Corporation of China, Yongkang, China, wuzx@zj.sgcc.com.cn
Jianwei Tian, State Grid Hunan Electric Power Corporation Research Institute, State Grid Corporation of China, Changsha, China, tianjw@lm.sgcc.com.cn
Zheng Tian, State Grid Hunan Electric Power Corporation Research Institute, State Grid Corporation of China, Changsha, China, tianz@lm.sgcc.com.cn
Hong Qiao, State Grid Hunan Electric Power Corporation Research Institute, State Grid Corporation of China, Changsha, China, qiaoh@lm.sgcc.com.cn
Xi Li, State Grid Hunan Electric Power Corporation Research Institute, State Grid Corporation of China, Changsha, China, lix@lm.sgcc.com.cn
Shengshen Chen, State Grid Hunan Electric Power Corporation Research Institute, State Grid Corporation of China, Changsha, China, chenss@lm.sgcc.com.cn

Abstract: Remote Access Trojan (RAT) reverse connections are secret and malicious; they are established to steal private data or to operate the compromised host under a hacker's command. To detect reverse RATs effectively, a network behavior-based method is introduced in this paper. We first summarize a typical RAT network communication pattern. Then four uncorrelated network behavior features are extracted from every TCP session as the detection model input. Six supervised classification algorithms are applied to a real network traffic data set to distinguish RAT sessions from legitimate sessions. Besides detection accuracy, AUC is also used, because RAT sessions are far fewer than normal sessions and AUC is suitable for evaluating performance on such an imbalanced problem. The detection accuracies of all tested algorithms are higher than 0.92.
The AUC values of Random Forest, SVM and Logistic Regression are higher than 0.94, which shows their ability to handle imbalanced data sets. Compared to related work, the proposed method is effective at detecting RATs with encrypted connections, and can distinguish RAT sessions from similar normal sessions, such as P2P or cloud application sessions.

Keywords: Network Security, Trojan Detection, Network Behavior, Machine Learning

978-1-5386-6565-7/18/$31.00 ©2018 IEEE

I. INTRODUCTION

Remote Access Trojan (RAT) has become a destructive threat to almost every enterprise, due to its ability to steal confidential data and execute malicious commands under the control of hackers [1]. A RAT consists of two separate parts that communicate with each other through the Internet. The client part is secretly installed on compromised computers through phishing e-mail or a USB drive, and remotely receives the hacker's commands. The server part is on the hacker's side and sends commands to the compromised computers. The direction of the RAT session classifies RATs into two categories: forward RAT and reverse RAT. In a normal forward RAT connection, a client connects to a server through the server's open port, but in the case of a reverse RAT connection, the client opens the port that the server connects to.

Port filtering policy has been widely used on firewalls and switches to block forward RAT connections [2-3]. However, it is impossible to distinguish reverse RAT traffic based on port filtering, because it behaves just like a normal outbound connection, such as visiting a website. Deep Packet Inspection (DPI) is another type of detection method, which compares the payload of network packets with a Trojan feature library [2][4]. But DPI cannot deal with encrypted RAT traffic, and DPI has high computational complexity.
Host-based Trojan detection requires installing a program on every target computer, and some RAT clients are able to hide themselves or stay inactive when they discover that detection software is scanning them, which makes host-based RAT detection prone to failure [5].

In this paper, we developed a RAT detection method based on network behavior analysis. This method is efficient and works well on encrypted RAT traffic. The remainder of the paper is organized as follows: related work on RAT detection is discussed in Section II. Section III describes the typical RAT communication procedure and its characteristics. In Section IV, the model of our detection method and its performance test results are introduced. Section V concludes the paper.

II. RELATED WORK

Since the Trojan is one of the most common attack tools used by hackers, the rule sets of Intrusion Detection Systems (IDS) and Intrusion Prevention Systems (IPS) include a large number of rules to detect Trojans [6]. Although this is a simple way to detect Trojans, the rapidly growing scale of the Trojan rule library reduces the time performance of the IDS/IPS, and the rules become invalid when they encounter encrypted Trojan traffic.

  • 2. The network-behavior-based RAT detection method has been studied by several network security researchers. Li [7] extracts 6 network behavior features and uses clustering algorithms to detect Trojans from network traffic. But normal P2P service is found to have great influence on the performance of this detection method. Dan [8] mainly focuses on the first few packets of a TCP session and applies classification algorithms. However, the chosen network behavior features need to be optimized, since only 2 of the 7 features extracted by Dan are uncorrelated. Besides, the number of packets in the early stage is too small to provide sufficient features in a real network environment. In [5], 927 cross-layer network features are analysed by 3 different classification algorithms to detect malware sessions. This solid work is targeted at detecting dozens of types of malware, making it too complicated for RAT detection.

III. OVERVIEW OF RAT COMMUNICATION

A typical communication mechanism of reverse RAT is described in this part. A complete RAT session starts with a successful TCP handshake and ends with a TCP FIN or RST process. After the TCP connection has been established, the RAT goes through a period called the early stage, in which the interval time between two adjacent packets is less than the threshold of t second(s) [8]. The early stage is when the server part and client part of the RAT exchange some basic information. After that, some small packets carrying the hacker's commands are sent to the client side. In response, the client sends several packets back, and these packets are relatively larger than the command packets because they may carry confidential data. During the idle time when the hacker gives no instruction, keep-alive heartbeat packets may occur between client and server to tell each other they are still connected.

However, not every RAT session acts exactly the same as the described typical communication pattern.
A few RATs may not send heartbeat packets, or may have more inbound bytes than outbound bytes. Some normal applications may also act like a RAT in some ways. For example, P2P and cloud sessions also prefer to use the PSH flag, and remote desktop services may send heartbeat packets. Thus, none of these features alone can distinguish a RAT session from a legitimate session, and finding a way to integrate them is the key to solving the RAT detection problem.

IV. REVERSE RAT DETECTION METHOD

Our RAT detection method is described in this section. We first pick out four network behavior attributes that best represent the differences between RAT and normal sessions. Then we apply different machine learning models to learn the mapping between the four attributes and whether a session is a RAT session or not. A labeled data set consisting of real RAT and normal network traffic is used to train and test our models.

A. Feature Selection and Extraction

We extract four features from every complete TCP session:

• out-in-bytes-ratio: the ratio between the average outbound bytes and inbound bytes. It is a positive continuous value.
• PSH-flag-ratio: the ratio between the number of packets with the PSH flag and the number of packets in the session. It is a positive continuous value.
• early-stage-packet-number: the number of packets in the session's early stage, with threshold t set to 1 second. It is a positive integer.
• heartbeat-flag: the flag of whether a session has heartbeat packets. It is a Boolean with a value of 0 or 1.

To obtain the above four features, the basic information we need to collect from every TCP packet is listed in Table I. To get the early-stage-packet-number, we calculate the time interval between every two adjacent packets until the time interval is greater than the threshold t (1 second here). A RAT's heartbeat starts with a fixed-length packet sent by the client and a fixed answering packet sent by the server, and this pattern repeats several times. In this paper we set the number of repeats to 3.
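The heartbeat rule just described (a fixed-length client packet answered by a fixed-length server packet, repeated 3 times) can be illustrated with a minimal Python sketch. This is not the authors' Algorithm 1: the `(direction, payload_len)` packet layout, the function name `has_heartbeat`, and the choice to skip empty-payload packets are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical packet layout: (direction, payload_len).
# "out" = sent by the RAT client, "in" = sent by the RAT server.
Packet = Tuple[str, int]


def has_heartbeat(packets: List[Packet], repeats: int = 3) -> bool:
    """Return True if an identical out/in payload-length pair occurs
    `repeats` times back to back (assumed reading of the heartbeat rule)."""
    # Assumption: packets without payload (pure ACKs) are ignored.
    data = [p for p in packets if p[1] > 0]
    prev_pair: Optional[Tuple[int, int]] = None
    run = 0
    i = 0
    while i + 1 < len(data):
        first, second = data[i], data[i + 1]
        if first[0] == "out" and second[0] == "in":
            pair = (first[1], second[1])
            # Extend the run only if both lengths match the previous pair.
            run = run + 1 if pair == prev_pair else 1
            prev_pair = pair
            if run >= repeats:
                return True
            i += 2
        else:
            # Not a client-request / server-answer pair: reset the run.
            prev_pair = None
            run = 0
            i += 1
    return False


session = [("out", 100), ("in", 500)] + [("out", 16), ("in", 16)] * 3
print(has_heartbeat(session))
```

The returned Boolean would play the role of the heartbeat-flag feature. A stricter variant could also require roughly constant time gaps between the repeated pairs, which the text above does not specify.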
The pseudo code of the heartbeat packet detection algorithm is shown in Fig. 2. The overall network behavior feature extraction process is shown in Fig. 3.

TABLE I. PACKET INFORMATION TO BE EXTRACTED

Packet info    Meaning
Src_ip         Source IP address
Src_port       Source TCP port
Dst_ip         Destination IP address
Dst_port       Destination TCP port
Timestamp      Packet timestamp
flags          TCP flags including SYN/ACK/FIN/RST/PSH
Payload_len    Payload length in bytes

[Fig. 1. Typical RAT session communication process: payload length over time between the RAT server side (hacker) and the RAT client side (normal user), through the TCP SYN handshake, early stage, data transmission, and heartbeat packets.]

There are some network behavior differences between legitimate and RAT sessions. Legitimate sessions tend to transport data as soon and as much as possible once the TCP handshake has finished, so most legitimate sessions do not have any early stage. Most RAT sessions transport more outbound data than inbound data in order to send out confidential information, while normal sessions often behave in the opposite way. The PSH flag in the TCP header informs the receiving host that the data should be pushed up to the receiving application immediately. The rate of packets containing the PSH flag is very likely to be higher in RAT sessions than in legitimate sessions, because hackers always want their packets to have higher priority. Moreover, normal sessions usually do not have heartbeat packets.
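As a rough illustration of how the first three features could be computed from the per-packet fields of Table I, consider the sketch below. The `Packet` class, its field names, and `extract_features` are hypothetical; the sketch also assumes that the packet list starts after the TCP handshake, that outbound means "sent from `client_ip`", that the ratio is taken over total payload bytes, and that the first packet counts toward the early stage.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Packet:
    # Hypothetical subset of the Table I fields.
    src_ip: str
    timestamp: float  # seconds
    psh: bool         # True if the PSH flag is set
    payload_len: int  # bytes


def extract_features(packets: List[Packet], client_ip: str,
                     t: float = 1.0) -> Tuple[float, float, int]:
    """Return (out-in-bytes-ratio, PSH-flag-ratio, early-stage-packet-number)."""
    if not packets:
        raise ValueError("a session needs at least one packet")

    outbound = sum(p.payload_len for p in packets if p.src_ip == client_ip)
    inbound = sum(p.payload_len for p in packets if p.src_ip != client_ip)
    # Assumption: a session with no inbound payload gets an infinite ratio.
    out_in_bytes_ratio = outbound / inbound if inbound else float("inf")

    psh_flag_ratio = sum(1 for p in packets if p.psh) / len(packets)

    # Count packets until the first gap between adjacent packets exceeds t.
    early_stage_packet_number = 1
    for prev, cur in zip(packets, packets[1:]):
        if cur.timestamp - prev.timestamp > t:
            break
        early_stage_packet_number += 1

    return out_in_bytes_ratio, psh_flag_ratio, early_stage_packet_number


demo = [
    Packet("10.0.0.5", 0.0, True, 100),
    Packet("203.0.113.9", 0.2, False, 50),
    Packet("10.0.0.5", 0.5, True, 300),
    Packet("203.0.113.9", 2.0, False, 50),
    Packet("10.0.0.5", 2.1, False, 0),
]
print(extract_features(demo, client_ip="10.0.0.5"))
```

Only header-level fields are used here, which is consistent with the paper's point that the payload itself never needs to be inspected.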
  • 3. B. Data set

To apply supervised machine learning algorithms, we collect 370 real RAT traffic captures from the open source community¹, including the RAT types gh0st, Remcos, Nanocore, Adwind and NetSupport Manager, from the years 2016 to 2018. Around 30% of the collected RAT traffic is encrypted. As for normal sessions, we collect traffic information for 2190 sessions from our company network, covering the application types E-mail, QQ, web browsing, P2P and cloud service.

The statistical features in Table II are calculated from the collected data set, from which we can tell some differences between reverse RAT and legitimate sessions. The average early stage packet number of legitimate sessions is much greater than that of RAT sessions. The gap between outbound and inbound TCP payload length is much bigger for legitimate sessions than for RAT sessions. And heartbeat packets are more likely to exist in RAT sessions than in normal ones.

TABLE II. STATISTICAL FEATURES OF THE COLLECTED DATA SET

Feature                                       Reverse RAT sessions    Legitimate sessions
Average early stage packet number             475                     845
Average outbound TCP payload length (byte)    78.74                   146.8
Average inbound TCP payload length (byte)     94.24                   866.32
Percentage of sessions having heartbeat       24.3%                   0.91%

[Fig. 2. Heartbeat packets detection algorithm (Algorithm 1, Heartbeat Detection Model). Input: packet info list. Output: heartbeat flag.]

[Fig. 3. Network behavior feature extraction process. Recoverable flowchart steps: calculate the packet interval time; set early_stage_end_flag = True; PSH_packet_number = PSH_packet_number + PSH_flag_value; set current packet to be the next packet in the list; is this the last packet of the session?; calculate out-in-bytes-ratio = (outbound_byte / inbound_byte); calculate PSH-flag-ratio = (PSH_packet_number / session_packet_number).]

C. Machine Learning Model Training and Evaluation

So far, the reverse RAT detection problem has become a binary classification problem, taking the above four features {x1, x2, x3, x4} as input and y ∈ {0, 1} (1 for a RAT session and 0 for a normal session) as output (1).

Based on previous studies [2][4-5] related to RAT detection, six different machine learning algorithms are chosen for this classification problem: kNN (k Nearest Neighbor), Naive Bayes, Logistic Regression, SVM (Support Vector Machine), Random Forest and decision tree. All algorithms are implemented with the Python scikit-learn library. We use 10-fold cross validation to split the data for training and testing in all tests, and we use the accuracy and Area under the Curve (AUC) metrics to evaluate the performance of our algorithms.

• Accuracy: Accuracy is a basic score in classifier evaluation, which is calculated by equation (2):

\[ \text{Accuracy} = \frac{\text{Correctly Classified Sample Number}}{\text{Total Sample Number}} \tag{2} \]

• AUC: The RAT detection task is obviously imbalanced, since the number of RAT sessions is much smaller than the number of normal sessions in practice. AUC is known to be a reliable measure for imbalanced data sets [4]. The Receiver Operating Characteristic (ROC) curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings in binary classification. AUC is the area under the ROC curve.
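To make the two metrics concrete, the following pure-Python sketch computes accuracy as in equation (2) and computes AUC through its standard rank interpretation: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half, which equals the area under the ROC curve. The function names and the toy labels are illustrative assumptions, not data from the paper.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Correctly classified sample number / total sample number."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy imbalanced case: 9 normal sessions, 1 RAT session, and a trivial
# classifier that labels everything as normal with a constant score.
y_true = [0] * 9 + [1]
print(accuracy(y_true, [0] * 10))   # accuracy looks good
print(roc_auc(y_true, [0.0] * 10))  # AUC shows no ranking ability
```

In the toy case the trivial classifier reaches an accuracy of 0.9 but an AUC of only 0.5, which is the reason accuracy alone is a weak measure on imbalanced data.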
As seen in Figure 4, all six algorithms achieve accuracy higher than 0.92 and AUC of at least 0.87, which verifies that machine learning is a feasible solution for reverse RAT detection. Random Forest with 10 trees, SVM with a linear kernel and Logistic Regression have an average AUC of 0.954, which is higher than that of 4NN, Gaussian Naive Bayes and decision tree. This indicates that the first 3 algorithms are more capable of handling imbalanced data sets. Meanwhile, the decision tree algorithm may not be well suited to the imbalanced task, since it gets the lowest AUC of 0.87. Random Forest gets an accuracy of 0.957 and an AUC of 0.979, making it the optimal solution for reverse RAT detection among all six algorithms.

¹ www.malware-traffic-analysis.net, www.contagiodump.blogspot.com, www.capture.blogspot.com
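A comparison of this kind could be set up in scikit-learn roughly as follows. This is a sketch under stated assumptions, not the authors' code: the feature vectors are synthetic values invented here, the class sizes are arbitrary, and only the settings named in the text (10-fold cross validation, 10 trees, linear kernel, k = 4, Gaussian Naive Bayes) come from the paper. All other hyperparameters are library defaults or illustrative choices.

```python
import random

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = random.Random(0)


def synthetic_session(is_rat: int) -> list:
    """Invented vector: [out-in-bytes-ratio, PSH-flag-ratio,
    early-stage-packet-number, heartbeat-flag]. Not real traffic."""
    if is_rat:
        return [rng.uniform(0.5, 3.0), rng.uniform(0.4, 0.9),
                rng.randint(2, 8), int(rng.random() < 0.3)]
    return [rng.uniform(0.05, 1.0), rng.uniform(0.1, 0.6),
            rng.randint(0, 12), int(rng.random() < 0.01)]


# Imbalanced toy data set: 40 "RAT" and 240 "normal" sessions.
y = [1] * 40 + [0] * 240
X = [synthetic_session(label) for label in y]

models = {
    "kNN (k=4)": KNeighborsClassifier(n_neighbors=4),
    "Gaussian Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM (linear kernel)": SVC(kernel="linear"),
    "Random Forest (10 trees)": RandomForestClassifier(n_estimators=10, random_state=0),
    "Decision tree": DecisionTreeClassifier(random_state=0),
}

# Stratified folds keep some RAT sessions in every test split.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=("accuracy", "roc_auc"))
    results[name] = (scores["test_accuracy"].mean(), scores["test_roc_auc"].mean())

for name, (acc, auc) in results.items():
    print(f"{name}: accuracy={acc:.3f}, AUC={auc:.3f}")
```

Because the data here is synthetic, the printed scores say nothing about real RAT traffic; the sketch only shows how accuracy and AUC can be collected per algorithm under the same cross-validation split.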
  • 4. [Fig. 4. Accuracy and AUC value of test algorithms.]

V. CONCLUSIONS

In this paper, we introduce a reverse RAT detection method based on network behavior features and machine learning algorithms. Instead of inspecting the payload of network traffic, our approach uses only 4 features extracted from TCP headers, making it efficient enough to detect RATs in real time. The proposed method is mainly based on the fact that reverse RAT sessions are more likely than normal sessions to have short early stages, heartbeat packets and PSH flags, and to send out more data. Machine learning is able to solve this binary classification problem according to our test on real data. Random Forest performs the best, with an accuracy of 0.957 and an AUC of 0.979, and the performances of the SVM and Logistic Regression algorithms follow. Thus, our approach can detect unencrypted and encrypted RAT sessions accurately and efficiently.

REFERENCES

[1] Michael, A., Sean, M., Christopher, C., & Aaron, L. (2016) Hacking Exposed Malware & Rootkits: Security Secrets and Solutions, 2nd edn, McGraw-Hill Education, New York.
[2] Li, W., Liu, H., & Zhang, X. (2016) A network data security analysis method based on DPI technology. 2016 7th IEEE International Conference on Software Engineering and Service Science (ICSESS), 97-976.
[3] Zhu, H., Tian, Z., & Xue, H. Practice of Automatic Monitoring Tool for Boundary Port of Electric Power Information Network. Hunan Electric Power, 2017, (37):49-52.
[4] Reham, T., Nada, M., & Ayman, M. (2017) A survey on deep packet inspection. Proceedings of ICCES 2017 12th International Conference on Computer Engineering and Systems, 188-197.
[5] Dmitri, B., Bracha, S., Lior, R., & Ariel, B. (2015) Unknown Malware Detection Using Network Traffic Classification. 2015 IEEE Conference on Communications and Network Security, 134-142.
[6] Elias, R. & Xenofontas, D. (2014) IDS Alert Correlation in the Wild With EDGe. IEEE Journal on Selected Areas in Communications, 1933-1946.
[7] Shicong, L., Xiaochun, Y., Yongzheng, Z., Yi, P., & Tao, Y. (2012) A Novel Approach of Detecting Trojan Based on Network Behavior Analysis. 2012 IEEE 14th International Conference on Communication Technology, 513-518.
[8] Dan, J., & Kazumasa, O. (2015) An Approach to Detect Remote Access Trojan in the Early Stage of Communication. 2015 IEEE 29th International Conference on Advanced Information Networking and Applications, 706-713.