Ista presentation-apache spark
1. Detecting Malicious Domain Names
using Deep Learning Approaches at
Scale
Vinayakumar R¹, K.P Soman¹ and Prabaharan Poornachandran²
¹Centre for Computational Engineering and Networking (CEN), Amrita School of
Engineering, Coimbatore, Amrita Vishwa Vidyapeetham,
Amrita University, India.
²Center for Cyber Security Systems and Networks, Amrita School of Engineering,
Amritapuri, Amrita Vishwa Vidyapeetham,
Amrita University, India.
2. Outline
• Introduction
• Background information / Related works
• Proposed Method – Deep Learning
• Description of the data set and Results
• Summary
• Future Work
• References
3. Introduction
• Threats to computer security are constantly
evolving and attack networks and the Internet
all the time.
• A new approach is required that can handle and
analyze massive amounts of logs from diverse
sources such as network packets, Domain Name
System (DNS) logs, proxy logs, system logs, etc.
• The Domain Name System (DNS) is one of the
vital elements of the Internet. Because of its
importance, it has been a frequent target of
attacks.
4. Background information / Related works
• Blacklisting is the most commonly used approach to block
malicious domain names [1].
• It fails completely at detecting malicious domains
generated by domain generation algorithms (DGAs).
• DGAs periodically generate pseudo-random domain names,
which the malware uses to connect to a command and control
(C2C) server. The pseudo-random domain names are generated
from a seed. A seed is a combination of numbers, letters,
date/time and other information.
• Machine learning methods with feature engineering have been
used to detect DGA-based malware.
• Deep learning is a newer field of machine learning that can
learn feature representations directly from raw domain names
given as input [2].
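To illustrate why a static blacklist struggles against DGAs, here is a minimal sketch of a toy, date-seeded generator. It is not one of the DGA families used in this work: the function name `toy_dga`, the seed string, the hashing scheme and the `.com` suffix are all assumptions made for illustration.

```python
import hashlib
from datetime import date, timedelta
from typing import List


def toy_dga(seed: str, day: date, count: int = 5, length: int = 12) -> List[str]:
    """Generate `count` pseudo-random domain names from a seed and a date.

    Hypothetical toy generator for illustration only.
    """
    domains = []
    for i in range(count):
        # Hash seed + date + counter: deterministic, but looks random.
        digest = hashlib.sha256(f"{seed}-{day.isoformat()}-{i}".encode()).hexdigest()
        # Map each hex digit (0-15) to a letter a-p to form the label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:length])
        domains.append(label + ".com")
    return domains


today = date(2017, 1, 2)
yesterday = today - timedelta(days=1)

# A blacklist built only from the names generated for the previous day.
blacklist = set(toy_dga("example-seed", yesterday))
todays_domains = toy_dga("example-seed", today)

# Count how many of today's generated names the blacklist would block.
blocked = [d for d in todays_domains if d in blacklist]
print(len(blocked), "of", len(todays_domains), "blocked")
```

Because the date is part of the seed, each day produces a fresh set of names, so a blacklist compiled from earlier observations is unlikely to contain them.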
5. Proposed Method
Figure 1. Architecture for detecting malicious domain names
(inner units and their connection are not shown for deep layers).
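Since the architecture takes raw domain names as input, some character-level encoding is needed before the deep layers. The slides do not specify it, so the following is only a sketch under assumptions: the vocabulary `ALPHABET`, the padding index 0, the fixed length `MAX_LEN` and the helper `encode_domain` are all hypothetical.

```python
import string

# Assumed vocabulary: characters that commonly appear in domain names.
# Index 0 is reserved for padding.
ALPHABET = string.ascii_lowercase + string.digits + "-._"
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}
MAX_LEN = 64  # assumed fixed input length


def encode_domain(domain: str, max_len: int = MAX_LEN) -> list:
    """Map a domain name to a fixed-length list of integer character indices."""
    # Lowercase the name and skip characters outside the assumed vocabulary.
    indices = [CHAR_TO_INDEX[ch] for ch in domain.lower() if ch in CHAR_TO_INDEX]
    # Truncate long names, then right-pad short ones with zeros.
    indices = indices[:max_len]
    return indices + [0] * (max_len - len(indices))


encoded = encode_domain("example.com", max_len=16)
print(encoded)
```

Integer sequences of this kind are typically fed to an embedding layer, which lets the network learn its own character representation instead of relying on hand-crafted features.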
6. Description of the data set and Results
Data set 1 is from a real-time environment, i.e. it was collected
inside a LAN. In Data set 2, benign domain names are from Alexa [3]
and OpenDNS [4], and malicious domain names are from publicly
available DGA algorithms [5] and real-time OSINT feeds [6].
Table 1. Description of data set
7. Contd.
Table 2. Summary of test results on Data set 1 for classifying
domain names as either benign or malicious.
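The slides do not state which measures the result tables report. As a sketch, the usual measures for benign-versus-malicious classification can be computed from true and predicted labels as follows. The helper `binary_metrics` and the label convention (1 = malicious, 0 = benign) are assumptions, and the sample labels are made up for illustration, not results from this work.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 (1 = malicious, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # malicious, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign, passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # malicious, missed
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```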
8. Contd.
Table 3. Summary of test results on Data set 2 for classifying domain
names as either benign or malicious.
9. Contd.
Table 4. Summary of evaluation results: training is done on
Data set 1 and testing on Data set 2.
10. Contd.
Table 5. Summary of evaluation results: training is done on
Data set 2 and testing on Data set 1.
11. Summary
• Developed a robust, scalable distributed
framework, capable of analyzing very large
volumes of DNS logs at the local area network
(LAN) level in an organization and correlating
them to detect the attack patterns.
• Deep learning and machine learning algorithms
are used to detect and classify the malicious
domain names.
• Deep learning approaches performed well in
comparison to the classical machine learning
algorithms.
12. Future Work
• Apache Spark can process large amounts of data.
Other logs, such as system logs, proxy logs and
network logs, can therefore be collected and
analyzed with the proposed deep learning
architecture to detect malicious activities
inside an organization. This is one of the
significant directions for future work.
13. References
[1] Kührer, Marc, Christian Rossow, and Thorsten Holz.
"Paint it black: Evaluating the effectiveness of malware
blacklists." International Workshop on Recent Advances
in Intrusion Detection. Springer, Cham, 2014.
[2] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton.
"Deep learning." Nature 521.7553 (2015): 436-444.
[3] https://support.alexa.com/
[4] https://umbrella.cisco.com
[5] https://github.com/baderj/domain-generation-algorithms
[6] http://osint.bambenekconsulting.com/