1. Detecting Android Malware using
Long Short-term Memory (LSTM)
Vinayakumar R (1), K.P. Soman (1), Prabaharan Poornachandran (2) and Sachin Kumar S (1)
(1) Centre for Computational Engineering and Networking (CEN), Amrita School of
Engineering, Coimbatore, Amrita Vishwa Vidyapeetham,
Amrita University, India.
(2) Center for Cyber Security Systems and Networks, Amrita School of Engineering,
Amritapuri, Amrita Vishwa Vidyapeetham,
Amrita University, India.
2. Outline
• Introduction
• Background information / Related works
• Proposed Method – Deep Learning
• Description of the data set and Results
• Summary
• Future Work
• References
3. Introduction
• Android is the most widely used mobile platform
for smartphones and the current market leader, with a
market share of nearly 87.6% [1].
• As smartphone usage surged past that of personal
computers (PCs), malware writers followed suit and
turned their attention to creating malware for
smartphones.
• Android malware has surged, and the sheer number
of new malware instances calls for newer approaches,
since writing a signature for each malware sample
is a daunting task.
4. Background information / Related works
• Static and dynamic analysis are the most commonly used approaches.
• Static analysis collects a set of features from apps by unpacking or
disassembling them, without executing them.
• Dynamic analysis examines the run-time behavior of apps, such as
system calls, network connections, memory utilization,
power consumption and user interactions.
• Commercial systems use a combination of both mechanisms,
termed hybrid analysis.
• Deep learning is a newer field of machine learning that can
learn an optimal feature representation directly from raw
input data [2].
• The feature sets collected from static and dynamic analysis are
passed to a recurrent neural network, in particular long short-term
memory, to detect and classify malicious apps.
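The last bullet can be illustrated with a minimal sketch of a single LSTM layer that reads a sequence of feature vectors and emits one malware score. This is not the authors' network: the layer sizes, the random weights, the gate ordering and the names `init_params` and `lstm_malware_score` are all assumptions made for illustration, and a real model would be trained rather than randomly initialized.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_features, n_hidden, seed=0):
    """Random, untrained weights (illustrative only)."""
    rng = np.random.default_rng(seed)
    scale = 0.1
    return {
        # Input, forget, output and candidate weights stacked row-wise.
        "W": rng.normal(0.0, scale, (4 * n_hidden, n_features)),
        "U": rng.normal(0.0, scale, (4 * n_hidden, n_hidden)),
        "b": np.zeros(4 * n_hidden),
        # Logistic output unit on top of the last hidden state.
        "w_out": rng.normal(0.0, scale, n_hidden),
        "b_out": 0.0,
    }

def lstm_malware_score(sequence, params):
    """Run one LSTM layer over a (time_steps, n_features) sequence.

    Returns a value in (0, 1) that a trained model would interpret as
    the probability that the app is malicious.
    """
    sequence = np.asarray(sequence, dtype=float)
    n_hidden = params["U"].shape[1]
    h = np.zeros(n_hidden)  # hidden state
    c = np.zeros(n_hidden)  # memory cell state
    for x in sequence:
        z = params["W"] @ x + params["U"] @ h + params["b"]
        i, f, o, g = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g       # additive memory update
        h = o * np.tanh(c)      # exposed hidden state
    return float(sigmoid(params["w_out"] @ h + params["b_out"]))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # 20 time steps of 4 hypothetical dynamic features per step.
    seq = rng.random((20, 4))
    print(lstm_malware_score(seq, init_params(n_features=4, n_hidden=8)))
```

With untrained weights the score is meaningless; the sketch only shows how a feature sequence flows through the gates to a single classification output.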
6. Description of the data set and Results
• For static analysis, the publicly available data
set [3] is chosen. It contains Android
permissions collected from 279
low-privileged apps and 279 malicious apps
from MalGenome.
• For dynamic analysis, the most popular data
set [4] is chosen. It contains feature vectors
of battery, binder, memory and network
utilization of the device from 1330 malicious
and 408 benign applications.
• A subset of [4] is also used [5].
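As a rough illustration of what a permission-based static feature vector might look like, the sketch below encodes the permissions an app requests as a binary vector. The four-entry vocabulary and the function name `encode_permissions` are hypothetical; the slides do not describe how the data set [3] is actually encoded.

```python
import numpy as np

# Hypothetical vocabulary; a real one would cover every permission
# that occurs in the data set.
PERMISSIONS = [
    "android.permission.INTERNET",
    "android.permission.READ_SMS",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
]

def encode_permissions(requested, vocabulary=PERMISSIONS):
    """Return a 0/1 vector with one entry per vocabulary permission.

    Permissions that are not in the vocabulary are ignored.
    """
    requested = set(requested)
    return np.array([1 if p in requested else 0 for p in vocabulary],
                    dtype=np.int8)

if __name__ == "__main__":
    app = ["android.permission.SEND_SMS", "android.permission.INTERNET"]
    print(encode_permissions(app))
```

Each app then becomes one fixed-length row, and the benign or malicious label forms the classification target.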
10. Summary
• The effectiveness of the recurrent neural network (RNN), its
variant long short-term memory (LSTM), and static machine
learning classifiers is evaluated for Android malware detection
on time-varying sequences of benign and malware apps.
• The family of recurrent neural networks performed well in
comparison to the static machine learning classifiers.
Moreover, LSTM performed better than the plain recurrent
neural network.
• This is primarily because LSTM can store long-range
dependencies across time-steps and relate them to the
information in successive sequences.
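The long-range dependency argument can be made concrete by comparing how the state at one time-step depends on the previous one. The equations below use the standard textbook formulation of a plain RNN and of the LSTM memory cell; the slides do not state which exact variant was used, so the symbols (weights W and U, bias b, gates f_t and i_t, candidate \tilde{c}_t) are assumptions.

```latex
% Plain RNN: the state is rewritten through a squashing nonlinearity at every step.
h_t = \tanh(W x_t + U h_{t-1} + b), \qquad
\frac{\partial h_t}{\partial h_{t-1}} = \operatorname{diag}\!\left(1 - h_t^{2}\right) U

% LSTM memory cell: the state is updated additively, scaled by the forget gate f_t.
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t, \qquad
\left.\frac{\partial c_t}{\partial c_{t-1}}\right|_{\text{gates fixed}} = \operatorname{diag}(f_t)
```

Over k time-steps the RNN factor becomes a product of k matrices, each containing U and a derivative of at most 1, so it tends to shrink or blow up. The LSTM factor along the cell path is an element-wise product of forget gates, which the network can learn to keep close to 1.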
11. Future Work
• One direction is to apply the discussed LSTM
network topologies to real, raw Android malware
samples instead of feature vectors of granular
permissions (static analysis) and profiled application
features (dynamic analysis).
• Another is to study the internal mechanics
of a memory block at every time-step of the
LSTM, since it gives better results. One way to achieve
this is to transform the states of the LSTM network to
linear form and from that calculate the eigenvalues and
eigenvectors, to learn which eigenvector actually
carries the required application information from one
time-step to the next.
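The eigenvalue idea in the second bullet can be sketched as follows, assuming the state update has already been linearized to h_{t+1} = A h_t. How to obtain A from a trained LSTM is left open by the slides, so the matrix and the names `dominant_modes` and `retained_fraction` are purely illustrative.

```python
import numpy as np

def dominant_modes(A):
    """Eigen-decompose a linearized state transition h_{t+1} = A h_t.

    Returns eigenvalues sorted by decreasing magnitude and the matching
    eigenvectors as columns.
    """
    eigvals, eigvecs = np.linalg.eig(np.asarray(A, dtype=float))
    order = np.argsort(-np.abs(eigvals))
    return eigvals[order], eigvecs[:, order]

def retained_fraction(eigval, steps):
    """Fraction of a mode's amplitude that survives `steps` updates."""
    return abs(eigval) ** steps

if __name__ == "__main__":
    # Hypothetical linearization: one slowly decaying mode, one fast one.
    A = [[0.99, 0.0],
         [0.0, 0.1]]
    vals, vecs = dominant_modes(A)
    for val, vec in zip(vals, vecs.T):
        print(val, vec, retained_fraction(val, steps=50))
```

Eigenvectors whose eigenvalues have magnitude close to 1 carry information across many time-steps, while those with small magnitude are forgotten within a few updates.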
12. References
[1] M. Lindorfer, M. Neugschwandtner, L. Weichselbaum, Y. Fratantonio, V. van der
Veen, C. Platzer. Andrubis - 1,000,000 Apps Later: A View on Current Android
Malware Behaviors. Third International Workshop on Building Analysis Datasets
and Gathering Experience Returns for Security (BADGERS), IEEE, 2014 Sep 11,
pp. 3-17.
[2] LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature
521.7553 (2015): 436-444.
[3] L.C.C. Urcuqui and A.N. Cadavid. Framework for malware analysis in Android.
Sistemas & Telemática, 2016 Aug 5; 14(37), pp. 45-56.
[4] B. Amos, H. Turner, J. White. Applying machine learning classifiers to dynamic
Android malware detection at scale. 9th International Wireless Communications
and Mobile Computing Conference (IWCMC), 2013, pp. 1666-1671.
[5] Demertzis, K., & Iliadis, L. (2016). Bio-inspired hybrid intelligent method for
detecting android malware. In Knowledge, Information and Creativity Support
Systems (pp. 289-304). Springer International Publishing.