The Internet of Things (IoT) has become an integral part of everyday life. According to IDC, the number of IoT devices may grow exponentially, up to a trillion in the near future. The inherent vulnerabilities of this cyberspace expose it to a range of serious cyber-attacks, so the security of IoT systems has become a prime concern for consumers and businesses. To make IoT security systems more reliable, a better, real-time approach is required, and a real-time dataset is essential for IoT traffic analysis. In this paper, an experimental testbed is devised to generate a real-time dataset from IoT botnet traffic in which each bot carries several possible attacks. In addition, an extensive comparative study of the proposed dataset and existing datasets is carried out using popular Machine Learning (ML) techniques to show its relevance in real-time scenarios.
1. An Efficient Framework for Detection & Classification of IoT BotNet
A DISSERTATION
Submitted in partial fulfilment of the
requirements for the award of the degree
of
MASTER OF TECHNOLOGY
in
COMPUTER SCIENCE AND ENGINEERING
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
GRAPHIC ERA DEEMED TO BE UNIVERSITY,
DEHRADUN – 248002 (INDIA)
July, 2022
Under Supervision of
Prof. Dr. SANTOSH KUMAR
Asst. Prof. Mr. UMANG GARG
by
MAURYA SANDEEP MRITYUNJAY
(EN. NO. GE- 20161983)
2. Table of Contents
1) Introduction
2) Literature Review
3) Problem Statement and Objectives
4) Generation of Testbed for IoT Botnet detection
5) Analysis of IoT Botnet
6) Analyzing dataset using ML for IoT botnet detection
7) Conclusion and Scope for Future Work
8) References
9) Results and Discussion
3. Introduction
An IoT botnet is a network of Internet of Things (IoT) devices, typically routers,
that have been infected by malware and have fallen under the control of malicious
actors, or botmasters.
IoT botnets are known for launching DoS and DDoS attacks on target entities to
disrupt their operations and services. Various emerging IoT botnets are shown in
Figure 1.
Figure 1: IoT BotNet
4. Key points for Analysis
Identifying Data Assets: The value of every IoT device is built on data and on how that data is managed,
e.g., user IDs and passwords. Each data asset has security properties, i.e., confidentiality, integrity, and
authenticity.
Identifying Threats: A threat compromises the security properties of a data asset and uses it for unauthorized
purposes. Evaluating each data asset yields a list of potential threats. For example, if the confidentiality of
credentials is compromised, unauthorized actors can use them to gain access to the network.
Security Objectives: These are defined at the application level. Some security objectives can be implemented as
Trusted Applications (TAs). Once the threats are known, they can be categorized as Impersonation, MITM,
Firmware Abuse, Tamper, etc.
Requirements: At this point, the analysis provides a logically connected model built from the three key points
above. From this model, a list of requirements or features for a secure IoT environment can be compiled. It
can also be used as solution implementation criteria for the IoT device application.
6. Literature Review (Detection)
| References | Technique | Attacks | Contribution | Result | Limitation |
|---|---|---|---|---|---|
| [1] | ML, DL | DDoS | Creates a practical sandbox (V-Sandbox) for dynamic analysis of IoT botnets. | Supports multiple CPU architectures, C&C server connections, and shared libraries. | Only a limited number of datasets were chosen. |
| [2] | Brute-force dictionary-based technique | DDoS | Compares different botnets with the Mirai botnet. | A working view of Mirai helps in tackling the attacks. | Only a temporary solution (reboot) is given. |
| [3] | AI, ML, DL | Adversarial | Federated learning techniques to detect malware in IoT devices. | Centralized performance in a privacy-preserving manner. | Analysis has yet to be conducted on the unsupervised scenario. |
| [4] | Black-box | Sniffing, DDoS & MITM | Recommends certain ideas to make the devices more secure. | Use of strong and unique passwords hashed with SHA-512 for better security. | Only a limited number and variety of IoT devices were chosen for testing. |
7. Literature Review (Cont…)
| References | Technique | Attacks | Contribution | Result | Limitation |
|---|---|---|---|---|---|
| [5] | ML/DL, K-Medoids | DDoS, Fuzzers, Backdoors, Ransomware | A new IoT botnet attack dataset, UNSW-NB15, has been used to check the effectiveness of IoT-based NIDS. | Results were obtained using scatter search and DL methods at an accuracy of 100%. | The analysis covers some datasets and not real-time scenarios. |
| [6] | Binary code obfuscation technique | NA | The technique helps to hide data locations and prevent them from being attacked. | The technique is further developed to make the malware efficient in static analyses. | The technique is less effective where novel detection mechanisms are required. |
| [7] | DL | DDoS | An empirical evaluation has been performed with real traffic data. | The best results are obtained on IoT devices that have almost the same functionality. | Autoencoders fall short where the functionality of IoT devices is not the same. |
| [8] | Statistical learning | DDoS/DoS | The developed framework represents network data and improves botnet classification. | The framework is more reliable in exploring concealed malicious activity. | The analysis uses a statistical approach and would need to be ML/DL based to be more effective. |
9. Literature Review (Analysis)
| References | Approaches | Contribution | Result | Limitation |
|---|---|---|---|---|
| [1] | Static analysis | Proposed a new set of features related to accessing resources on the target mobile device. | The URL feature set plays the key role in Android botnet detection using RFE. | The dataset is small and contains few varieties of botnet. |
| [2] | Static analysis | Proposed a framework to classify botnets using botnet-unique patterns and used features. | Experimental results show that the SVM classifier provides 99.06% accuracy. | The proposed approach emphasizes only two features. |
| [3] | Static analysis | The best-performing ML model is determined by accuracy and confusion matrix on three malware datasets from three different periods. | The best performance was from XGBoost, at 97.87% and 97.50% accuracy. | The dataset chosen was simulation based and contains lesser-known malware families. |
| [4] | Hybrid analysis | Proposed the DroidDetector model, which uses features from static and dynamic analysis of Android apps and characterizes malware using DL techniques. | DroidDetector achieves 96.76% detection accuracy, outperforming traditional ML techniques. | More real-time training samples should be chosen to improve the accuracy of the proposed model. |
10. Literature Review (Cont…)
| References | Approaches | Contribution | Result | Limitation |
|---|---|---|---|---|
| [5] | Dynamic analysis | Proposed a host-based approach using ML to detect mobile botnets with features derived from system calls. | High performance (84%) was achieved in multiple metrics across multiple ML algorithms. | Requires a rooted device in order to use the strace tool. |
| [6] | Static analysis | Proposed an efficient malware detection system based on deep learning. | The proposed approach can detect new malware samples with higher accuracy and reduced FP rates. | The false negative rate is high for achieving the optimal solution. |
| [7] | Dynamic analysis | Proposed Malbert, a pre-trained DL-based method to detect malicious Windows software through dynamic analysis. | Malbert achieves a 99.9% detection rate and a detection rate exceeding 98% under different robustness tests. | The results are on API-based datasets and need to be cloud based for remote users. |
| [8] | Static analysis | Proposed MOCDroid, which generates a classifier based on specific behaviours defined by third-party call groups. | MOCDroid achieves an accuracy of 95.15% with a false positive rate of 1.69%. | The dataset used for evaluation is small, and other clustering methods should also be used. |
11. Problem Statement and Objectives
Problem Statement
The main objective of this thesis is to develop an efficient framework for
the detection and classification of IoT botnet traffic.
The BYOB botnet used in the experiments is analyzed and then dissected
further to determine its function and origin.
The problem statement is broken down into the objectives listed on the
next slides.
12. Objectives
Obj_1
To create and implement a scenario for IoT BotNet
Phases:
Phase 1: Build Your Own BotNet
Phase 2: C&C server (attacker)
Phase 3: IoT Devices (victim)
Phase 4: Dataset Analysis
Steps:
• Send updates to IoT devices (attacker accesses the server, i.e., the C&C server)
• Access
• Generate
• Attack on server
• Receive updates (victim)
• Monitor traffic
• Generate RI_BoT dataset
• Classify and detect IoT BotNet using Machine Learning models
14. Obj_2
To analyze traffic using ML model for classification
and detection of IoT BotNet.
Three datasets are used for classification and comparative analysis:
1. RI_BoT (Our Newly Generated)
2. BoT_IoT
3. UNSW_NB15
15. Obj_3
To analyze BYOB for the clarification and analysis of IoT
BoT traffic using several analysis tools.
16. Goals of IoT Botnet Analysis
IoT botnet analysis
Dissecting the botnet to understand:
• How it works
• How to identify it
• How to defeat or eliminate it
After a botnet is found, you need to know:
• Did an attacker implant a rootkit or Trojan on your systems?
• Is the attacker really gone?
• What did the attacker steal or add?
• How did the attack get in? (Root-cause analysis)
17. Analysis Techniques
Antivirus Scanning
A botnet can easily change its signature and fool antivirus tools and defenders.
VirusTotal is convenient, but using it may alert attackers that they have been caught.
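One common precaution, assumed here rather than taken from the slides, is to compute a hash of the sample locally and search for that hash instead of uploading the file itself. A minimal Python sketch follows; the function name `sha256_of_file` is hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The resulting digest can be pasted into a search box, which avoids submitting the binary itself.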
18. Analysis Techniques
PE-analysis
Our botnet sample is a 32-bit PE executable that masquerades as a Microsoft
Visual C++ file.
Because the sample is obfuscated, the actual file description is not
revealed.
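Whether a sample really is a 32-bit PE can be checked from its headers. The sketch below is an illustration in Python, not the tool used in the dissertation: it reads the PE header offset stored at 0x3C, checks the "PE" signature, and decodes the COFF Machine field and the optional-header magic. The function name `describe_pe` and the returned keys are hypothetical.

```python
# Well-known PE constants: Machine values and optional-header magic numbers
MACHINE_NAMES = {0x014C: "x86 (32-bit)", 0x8664: "x64 (64-bit)"}
OPTIONAL_MAGIC = {0x10B: "PE32", 0x20B: "PE32+"}


def describe_pe(data):
    """Return the machine type and optional-header format of a PE image."""
    if data[:2] != b"MZ" or 0x40 > len(data):
        raise ValueError("missing MZ header")
    # e_lfanew: little-endian offset of the PE signature, stored at 0x3C
    pe_offset = int.from_bytes(data[0x3C:0x40], "little")
    if pe_offset + 26 > len(data):
        raise ValueError("file too short for PE headers")
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # Machine is the first COFF field, right after the 4-byte signature
    machine = int.from_bytes(data[pe_offset + 4:pe_offset + 6], "little")
    # The optional header follows the 20-byte COFF header
    magic = int.from_bytes(data[pe_offset + 24:pe_offset + 26], "little")
    return {
        "machine": MACHINE_NAMES.get(machine, hex(machine)),
        "optional_header": OPTIONAL_MAGIC.get(magic, hex(magic)),
    }
```

For a 32-bit executable the expected result is an x86 machine type with a PE32 optional header.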
19. Analysis Techniques
CFF Explorer- Imported Directory
CFF Explorer supports the analysis of the data stored inside our botnet's PE file.
As the figure shows, kernel32.dll lets the software make Win32 API calls for tasks
such as I/O, memory allocation, and accepting keyboard input.
ws2_32.dll handles network access.
msvcp60.dll is a C/C++ runtime library that performs tasks such as string
manipulation for built-in software.
advapi32.dll provides security calls as well as registry manipulation functions.
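The mapping from imported libraries to capabilities can be written as a small lookup for quick triage. This is only a sketch: the descriptions paraphrase this slide, and the names `DLL_CAPABILITIES` and `summarize_imports` are hypothetical.

```python
# Capability notes paraphrased from the slide; not an exhaustive catalogue
DLL_CAPABILITIES = {
    "kernel32.dll": "Win32 API calls: I/O, memory allocation, keyboard input",
    "ws2_32.dll": "network access",
    "msvcp60.dll": "C/C++ runtime tasks such as string manipulation",
    "advapi32.dll": "security calls and registry manipulation",
}


def summarize_imports(dll_names):
    """Map each imported DLL name to a short capability note."""
    summary = {}
    for name in dll_names:
        key = name.strip().lower()  # import names are case-insensitive
        summary[key] = DLL_CAPABILITIES.get(key, "not covered here")
    return summary
```

Feeding it the DLL names listed by an import viewer gives a quick picture of what the sample is able to do.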
20. Analysis Techniques
PEView- Import Address Table
According to the CFF Explorer import directory, the most significant library is advapi32.dll.
As the figure shows, our botnet sample creates Windows services, performs queries,
and saves new information in registry entries.
21. Analysis Techniques
PEiD
PEiD is a user-friendly tool that identifies PE packers, cryptors, and
compilers in exe files.
PEiD includes the KANAL (Krypto Analyzer) add-on, which searches the program
for references to cryptographic modules (relevant in the context of
ransomware).
22. Analysis Techniques
PowerShell- Strings
Windows PowerShell was used to run a basic strings search on our botnet
sample.
The most significant results are:
Microsoft Enhanced RSA and AES
cmd.exe /c "%s"
The BYOB botnet encrypts data with Microsoft Enhanced RSA and AES
and executes instructions using cmd.exe.
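The same kind of strings search can be sketched in Python instead of PowerShell. This is a swapped-in illustration, not the command used in the dissertation: it collects runs of printable ASCII bytes of a minimum length, and the function name `extract_ascii_strings` is hypothetical.

```python
import re


def extract_ascii_strings(data, min_length=4):
    """Return runs of printable ASCII characters found in a byte string."""
    # 0x20-0x7e covers space through tilde, the printable ASCII range
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]
```

Typical use is `extract_ascii_strings(open(path, "rb").read())`, followed by a search of the output for provider names, commands, or URLs.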
23. Analysis Techniques
PEStudio
This tool is useful for verifying the file format of the botnet, i.e., the exe format.
An exe file always starts with the bytes "4D 5A" in hexadecimal.
The entropy score helps determine whether our malicious botnet sample is
compressed or packed. Entropy ranges from 0 to 8, and values close to 8 suggest
compressed or packed content.
PEStudio shows that we have to unpack our malicious botnet sample in order to
obtain some important indicators of intrusion.
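Both checks on this slide are easy to reproduce. The sketch below, an illustration rather than PEStudio's own implementation, tests for the "4D 5A" signature and computes the Shannon entropy of the bytes in bits per byte; the function names are hypothetical.

```python
import math
from collections import Counter


def has_mz_header(data):
    """True if the data starts with the bytes 4D 5A ("MZ")."""
    return data[:2] == b"\x4d\x5a"


def shannon_entropy(data):
    """Shannon entropy in bits per byte; 0 for uniform data, 8 at most."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A file of identical bytes scores 0, while data in which every byte value is equally likely scores 8, which is why packed or encrypted samples sit near the top of the range.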
24. Analyzing dataset using ML for IoT botnet
detection
CLASSICAL MODELS
1. Logistic Regression
2. Decision Tree
3. Support Vector Machine
4. Neural Network
ENSEMBLE LEARNING
5. Gradient Boosting
6. XGBoost
7. CatBoost
25. Result and Discussion
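The following performance metrics are used to evaluate the ML models on the three
datasets, where TP, FP, and FN denote true positives, false positives, and false negatives:
$$\text{Accuracy} = \frac{\text{Number of correctly predicted observations}}{\text{Total number of observations}}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}$$
$$\text{F1 Score} = 2 \times \frac{\text{Recall} \times \text{Precision}}{\text{Recall} + \text{Precision}}$$
ROC Curve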
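Accuracy, precision, recall, and F1 score can all be computed from the four confusion-matrix counts. The sketch below assumes binary labels (attack vs. normal) and treats accuracy as correctly predicted observations over all observations; the function name `classification_metrics` and the zero-division fallback of 0.0 are choices made for this example.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

For example, `classification_metrics(tp=8, fp=2, tn=85, fn=5)` gives an accuracy of 0.93 and a precision of 0.8.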
26. Evaluation
We evaluated our new RI_BoT dataset and compared it with the best-performing
existing datasets, UNSW_NB15 and BoT_IoT.
The results of applying the Machine Learning models to each dataset are
summarized on the next slides.
28. Figure 2: Comparison Analysis for Logistic Regression
Figure 3: Comparison Analysis for Decision Tree
Figure 4: Comparison Analysis for SVM
Figure 5: Comparison Analysis for Neural Network
29. Figure 6: Comparison Analysis for Gradient Boosting
Figure 7: Comparison Analysis for XGBoost
Figure 8: Comparison Analysis for CatBoost
30. Comparison Analysis
The following ROC curves show the best-performing model for each dataset:
1. RI_BoT dataset
2. BoT-IoT dataset
3. UNSW_NB15 dataset
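For readers unfamiliar with how an ROC curve is built, the sketch below sweeps the decision threshold over a model's scores, records the (false positive rate, true positive rate) points, and integrates them with the trapezoidal rule to get the area under the curve. It is a plain-Python illustration with hypothetical function names, assuming labels of 1 for attack and 0 for normal traffic; it is not the plotting code used for the figures.

```python
from itertools import groupby


def roc_points(labels, scores):
    """Return (fpr, tpr) points obtained by lowering the score threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    # Samples that share a score are counted together before a point is added
    for _, group in groupby(ranked, key=lambda pair: pair[0]):
        for _, label in group:
            if label == 1:
                tp += 1
            else:
                fp += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a curve of (x, y) points using the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
```

An area of 1.0 means every attack sample is scored above every normal sample, while 0.5 corresponds to random ranking.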
31. Conclusion and Scope for Future work
Conclusion
This dissertation presents a new dataset, named RI_BoT, which
incorporates the real-time traffic of sensors and actuators. The
dataset was developed using a realistic testbed and has been
tested using different ML models.
The models have also been tested on existing datasets, BoT_IoT
and UNSW_NB15, which were developed using normal and attack
traffic.
A comparative analysis across several evaluation metrics is
presented in the results section.
In addition, the IoT botnet has been analyzed using ML as well as
analysis tools to gain in-depth knowledge of the source and
impact of the IoT botnet.
32. Conclusion and Scope for Future work
Scope for Future Work
In the future, we plan to develop a hybrid model using deep
learning techniques to evaluate the reliability of the dataset
and its performance measures.
Further, a hybrid model that combines static and dynamic analysis
could also be introduced, which may provide better performance in
detecting emerging botnets such as Mozi and other zero-day attacks.
Blockchain technologies could also be applied to analyze these
botnets as well as zero-day attacks.
33. Published Paper
[1] S. Maurya, S. Kumar, U. Garg, and M. Kumar, “An Efficient Framework for Detection and
Classification of IoT Botnet Traffic,” ECS Sensors Plus, vol. 1, no. 2. The Electrochemical Society,
p. 026401, Jun. 01, 2022. doi: 10.1149/2754-2726/ac7abc.
Accepted Paper
[1] S. Kumar, A. Gueroudji, V. Tripathi, S. Maurya, and Manoj. K, “An Efficient Approach for Intrusion
Detection Using System Call Traces,” accepted at The 4th International Conference on
Communication and Information Processing (ICCIP), Jun. 27, 2022. (Scopus indexed)
34. References
1) H.-V. Le and Q.-D. Ngo, “V-Sandbox for Dynamic Analysis IoT Botnet,” IEEE Access, vol. 8. Institute of Electrical and
Electronics Engineers (IEEE), pp. 145768–145786, 2020. doi: 10.1109/access.2020.3014891.
2) C. Kolias, G. Kambourakis, A. Stavrou, and J. Voas, “DDoS in the IoT: Mirai and Other Botnets,” Computer, vol. 50, no. 7.
Institute of Electrical and Electronics Engineers (IEEE), pp. 80–84, 2017. doi: 10.1109/mc.2017.201.
3) V. Rey, P. M. Sánchez Sánchez, A. Huertas Celdrán, and G. Bovet, “Federated learning for malware detection in IoT
devices,” Computer Networks, vol. 204. Elsevier BV, p. 108693, Feb. 2022. doi: 10.1016/j.comnet.2021.108693.
4) O. Shwartz, Y. Mathov, M. Bohadana, Y. Elovici, and Y. Oren, “Reverse Engineering IoT Devices: Effective Techniques
and Methods,” IEEE Internet of Things Journal, vol. 5, no. 6. Institute of Electrical and Electronics Engineers (IEEE), pp.
4965–4976, Dec. 2018. doi: 10.1109/jiot.2018.2875240.
5) M. Panda, A. A. A. Mousa, and A. E. Hassanien, “Developing an Efficient Feature Engineering and Machine Learning
Model for Detecting IoT-Botnet Cyber Attacks,” IEEE Access, vol. 9. Institute of Electrical and Electronics Engineers
(IEEE), pp. 91038–91052, 2021. doi: 10.1109/access.2021.3092054.
6) A. Moser, C. Kruegel, and E. Kirda, “Limits of Static Analysis for Malware Detection,” Twenty-Third Annual Computer
Security Applications Conference (ACSAC 2007). IEEE, Dec. 2007. doi: 10.1109/acsac.2007.21.
7) Y. Meidan et al., “N-BaIoT—Network-Based Detection of IoT Botnet Attacks Using Deep Autoencoders,” IEEE Pervasive
Computing, vol. 17, no. 3. Institute of Electrical and Electronics Engineers (IEEE), pp. 12–22, Jul. 2018. doi:
10.1109/mprv.2018.03367731.
8) J. Ashraf et al., “IoTBoT-IDS: A novel statistical learning-enabled botnet detection framework for protecting networks of
smart cities,” Sustainable Cities and Society, vol. 72. Elsevier BV, p. 103041, Sep. 2021. doi: 10.1016/j.scs.2021.103041.
Fig 1- https://www.trendmicro.com/vinfo/us/security/definition/iot-botnet
Fig 9- https://www.zdnet.com/article/breach-clean-up-cost-linkedin-nearly-1-million-another-2-3-million-in-upgrades/
35. References
1) W. Hijawi, J. Alqatawna, A. M. Al-Zoubi, M. A. Hassonah, and H. Faris, “Android botnet detection using machine learning
models based on a comprehensive static analysis approach,” Journal of Information Security and Applications, vol. 58.
Elsevier BV, p. 102735, May 2021. doi: 10.1016/j.jisa.2020.102735.
2) G. Kirubavathi and R. Anitha, “Structural analysis and detection of android botnets using machine learning techniques,”
International Journal of Information Security, vol. 17, no. 2. Springer Science and Business Media LLC, pp. 153–167, Feb.
01, 2017. doi: 10.1007/s10207-017-0363-3.
3) R. Kumar and G. Subbiah, “Zero-Day Malware Detection and Effective Malware Analysis Using Shapley Ensemble
Boosting and Bagging Approach,” Sensors, vol. 22, no. 7. MDPI AG, p. 2798, Apr. 06, 2022. doi: 10.3390/s22072798.
4) Z. Yuan, Y. Lu, and Y. Xue, “Droiddetector: android malware characterization and detection using deep learning,”
Tsinghua Science and Technology, vol. 21, no. 1. Tsinghua University Press, pp. 114–123, Feb. 2016. doi:
10.1109/tst.2016.7399288.
5) V. G. T. D. Costa, S. B. Junior, R. S. Miani, J. J. P. C. Rodrigues, B. B. Zarpelão, “Mobile botnets detection based on
machine learning over system calls,” International Journal of Security and Networks, vol. 14, no. 2. Inderscience
Publishers, p. 103, 2019. doi: 10.1504/ijsn.2019.100092.
6) J. Hemalatha, S. Roseline, S. Geetha, S. Kadry, and R. Damaševičius, “An Efficient DenseNet-Based Deep Learning
Model for Malware Detection,” Entropy, vol. 23, no. 3. MDPI AG, p. 344, Mar. 15, 2021. doi: 10.3390/e23030344.
7) Z. Xu, X. Fang, and G. Yang, “Malbert: A novel pre-training method for malware detection,” Computers & Security,
vol. 111. Elsevier BV, p. 102458, Dec. 2021. doi: 10.1016/j.cose.2021.102458.
8) A. Martín, H. D. Menéndez, and D. Camacho, “MOCDroid: multi-objective evolutionary classifier for Android malware
detection,” Soft Computing, vol. 21, no. 24. Springer Science and Business Media LLC, pp. 7405–7415, Jul. 25, 2016. doi:
10.1007/s00500-016-2283-y.