Malware is a nuisance for smartphone users, and an infected smartphone can do real harm to its owner. Ordinary users cannot easily identify malware, because its dangers are deeply concealed in the application package kit (APK) files available in the Google Play Store. This paper discusses the challenges of creating malware datasets. Long before a malware classification process and model can be built, the need for datasets with features representative of most types of malware has to be addressed systematically. Only once a quality dataset is available can a quality classification model be obtained using machine learning (ML) or deep learning (DL) algorithms. Malware classification as a whole is a full pipeline made up of several sub-processes. The authors purposely focus on building quality malware datasets rather than on ML itself, because implementing ML is a separate effort that follows once a reliable dataset is fully built. Dataset creation starts with extracting the Android Manifest from each file in the APK set and ends with labeling all the extracted APK files. The key contribution of this paper is a systematic way to generate datasets from any APK file.
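The two ends of the pipeline described above, manifest extraction and labeling, can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes each AndroidManifest.xml has already been decoded from the binary XML stored inside the APK into plain-text XML by an external tool, and that labels come from some external source. The names `extract_permissions`, `build_dataset`, `VOCABULARY`, and the sample manifests are hypothetical, and using requested permissions as features is one illustrative choice, since the abstract does not say which manifest features are used.

```python
import xml.etree.ElementTree as ET

# Key of the android:name attribute after ElementTree expands the namespace prefix.
ANDROID_NAME = "{http://schemas.android.com/apk/res/android}name"


def extract_permissions(manifest_xml):
    """Return the set of permission names requested in a decoded manifest."""
    root = ET.fromstring(manifest_xml)
    return {
        elem.attrib[ANDROID_NAME]
        for elem in root.iter("uses-permission")
        if ANDROID_NAME in elem.attrib
    }


def build_dataset(samples, vocabulary):
    """Turn (apk_name, manifest_xml, label) triples into labeled feature rows."""
    rows = []
    for apk_name, manifest_xml, label in samples:
        requested = extract_permissions(manifest_xml)
        row = {"apk": apk_name}
        for permission in vocabulary:
            # One binary feature per permission in the fixed vocabulary.
            row[permission] = int(permission in requested)
        row["label"] = label  # assumption: supplied by an external labeling source
        rows.append(row)
    return rows


# Hypothetical decoded manifests, for illustration only.
MANIFEST_A = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.a">
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <uses-permission android:name="android.permission.READ_CONTACTS"/>
</manifest>"""

MANIFEST_B = """<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.b">
  <uses-permission android:name="android.permission.INTERNET"/>
</manifest>"""

VOCABULARY = [
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
    "android.permission.SEND_SMS",
]

SAMPLES = [
    ("a.apk", MANIFEST_A, "malware"),
    ("b.apk", MANIFEST_B, "benign"),
]

if __name__ == "__main__":
    for row in build_dataset(SAMPLES, VOCABULARY):
        print(row)
```

Keeping the permission vocabulary fixed and ordered gives every APK a row with the same columns, so the rows can be written straight to a CSV file and passed to an ML or DL model later.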
A Systematic Review of Android Malware Detection Techniques (CSCJournals)
Malware detection is key to Android application security. The malware threat to Android users is increasing day by day. End users need security because they use mobile devices to communicate information. Therefore, developing malware detection and control technology should be a priority. This research extensively explores state-of-the-art techniques and mechanisms for detecting malware in Android applications through a systematic literature review. It categorizes current research into static, dynamic, and hybrid approaches and identifies the limitations and strengths of current work. Based on the restrictions of current malware detection technologies, it concludes that detection technologies that use statistical analysis consume more time, energy, and resources than machine learning techniques. The results reinforce the assertion that detection approaches designed for Android malware do not achieve 100% detection accuracy.
Mobile Application Security Testing, Testing for Mobility App | www.idexcel.com (Idexcel Technologies)
Application development has come a long way in the last two decades, but it is puzzling that, despite major security breaches, security testing takes a back seat to other forms of quality testing such as usability or functional testing.
Mobile security is one of the most important aspects of keeping our data secure from external attacks such as phishing and data hacking. These attacks can have disastrous effects and may even lead to social disturbance, for example when attackers make one's private data public.
Review on mobile threats and detection techniques (ijdpsjournal)
Over the last decade, smartphones have gained widespread usage. Mobile devices store personal details such as contacts and text messages. Because of this extensive growth, smartphones attract cyber-criminals. In this research work, we carry out a systematic review of the terms related to malware detection algorithms and summarize, in tabular form, the behavior of some known mobile malware. After careful consideration of the possible methods and algorithms for detecting mobile malware, we give recommendations for designing future malware detection algorithms, taking into account computational complexity and the detection ratio for mobile malware.
ANDROID UNTRUSTED DETECTION WITH PERMISSION BASED SCORING ANALYSIS (ijitcs)
Android smartphones are among the fastest-growing mobile phones, which makes them one of the most preferred targets of malware developers. Malware apps can penetrate the device and gain privileges that let them perform malicious activities such as reading the user's contacts and misusing private information, for example by sending SMS messages, and they can harm users by exploiting the private data stored on the device. This study implements the detection of untrusted Android applications, intended as a basis for future development in malware detection.
Smartphone users worldwide are often unaware that permissions are the basis of the malicious activities that can operate in an Android system and steal personal and private information. Android is an open system in which users are allowed to install applications from any site, including unsafe ones. However, the permission mechanism of the Android system is not enough to guarantee that an application cannot harm the user. This paper presents a permission scoring-based analysis that scrutinizes the permissions of installed applications and makes Android permissions more effective at informing users about the risk of an installed application. The framework classifies the sensitivity level of the permissions an application accesses. It uses a formula that calculates the sensitivity level of each permission and determines whether the installed application is untrusted. Our results show that, on a collection of 26 untrusted applications, the framework determines the applications' behavior correctly, consistently, and efficiently.
MALWARE DETECTION TECHNIQUES FOR MOBILE DEVICES (ijmnct)
Mobile devices have become very popular. Thanks to their portability and high performance, a mobile device has become a must-have for people who use information and communication technologies. In addition to rapid hardware evolution, mobile applications are also increasing in complexity and performance to cover most of their users' needs. Both software and hardware design have focused on increasing the performance and working hours of a mobile device. Different mobile operating systems are in use today, with different platforms and different market shares. Like all information systems, mobile systems are prone to malware attacks. Because mobile devices are personal, malware detection is very important and is a must-have tool on each device to protect private data and mitigate attacks. In this paper, we study and analyze different malware detection techniques used for mobile operating systems. We focus on the two competing mobile operating systems, Android and iOS. We assess each technique, summarizing its advantages and disadvantages. The aim of the work is to establish a basis for developing a mobile malware detection tool based on user profiling.
Comparative Study on Intrusion Detection Systems for Smartphones (iosrjce)
IOSR Journal of Computer Engineering (IOSR-JCE) is a double-blind peer-reviewed international journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes high-quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high-quality technical notes are invited for publication.
Adaptive Mobile Malware Detection Model Based on CBR (ijtsrd)
Today, mobile phones can hold a great deal of sensitive information. With the increasing capabilities of such phones, more and more malicious software (malware) targeting these devices has emerged. Although there are many mobile malware detection techniques, they use specific classifiers on selected features to get their best accuracy. Thus, an adaptive malware detection approach is required to effectively detect the concept drift of mobile malware and maintain accuracy. This paper proposes an adaptive malware detection approach based on the case-based reasoning technique to handle the concept drift issue in mobile malware detection. To demonstrate the design decisions of our approach, several experiments are conducted. A large feature set with 1,065 features from 10 different categories is used in the evaluation, which covers both the accuracy and the efficiency of the model. The experimental results show that our approach achieves acceptable performance and accuracy for malware detection. Kyaw Soe Moe | Mya Mya Thwe "Adaptive Mobile Malware Detection Model Based on CBR" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-6, October 2019, URL: https://www.ijtsrd.com/papers/ijtsrd28088.pdf Paper URL: https://www.ijtsrd.com/computer-science/computer-security/28088/adaptive-mobile-malware-detection-model-based-on-cbr/kyaw-soe-moe
With the advent of smartphones, technology has reshaped the way we interact with the world and access information. Accordingly, our needs and the solutions to them have changed along with the evolving technology. One of the areas most affected by technology is communication. With their various options and capabilities, instant messaging applications have come into use for communication, one of the biggest human needs. We can send text messages, video messages, and voice recordings and share locations using these applications. Moreover, we no longer need cell phone calls through GSM operators and instead prefer these applications for instant calls, as well as for sharing private information, not only in personal daily life but also for business. On the other hand, these applications bring risks along with their many benefits. One of them is privacy. As users, we do not want these applications to store our personal data. How do they keep our data? Do they pay the necessary attention to privacy? Another fact is that criminals can use these applications to communicate and execute a secret plan. If a criminal gets caught, what can be obtained as evidence from these messaging applications? In that case, we need to know what can be extracted from the mobile device. This research focuses on forensic analysis of instant messaging applications on the Android platform.
SYSTEM CALL DEPENDENCE GRAPH BASED BEHAVIOR DECOMPOSITION OF ANDROID APPLICAT... (IJNSA Journal)
Millions of developers and third-party organizations have flooded into the Android ecosystem because Android is open source and has low barriers to entry for developers. However, that also attracts many attackers. Over 90 percent of mobile malware targets Android. Though Android provides multiple security features and layers to protect user data and system resources, there are still some overprivileged applications in the wild, in the Google Play Store or third-party Android app stores. In this paper, we propose an approach to map system-level behaviors and Android APIs, based on the observation that system-level behaviors cannot be avoided but sensitive Android APIs can be evaded. To the best of our knowledge, our approach is the first work to decompose Android application behaviors based on system-level behaviors. We then map system-level behaviors and Android APIs through System Call Dependence Graphs. The study also shows that our approach can effectively identify potential permission abuse, with an almost negligible performance impact.
A Comprehensive Study on Security issues in Android Mobile Phone - Scope and ... (AM Publications)
Due to the tremendous development and growth in mobile phone software and hardware technologies, security issues are now a very big challenge for all concerned, such as scientists, manufacturers, designers, and industrialists. Usually, such technology takes time to be absorbed into the market, and this gives security teams time to develop effective security controls. The rapid growth of the smartphone market and the use of these devices for email, online banking, and accessing other forms of sensitive content has led to the emergence of a new and ever-changing threat landscape [1]. Along with this, the fact that anyone can be a user has put the smartphone in the hands of almost every person before proper security controls can be developed. Currently, Android has the biggest market share among all smartphone operating systems. As the power and features of such phones increase, their vulnerability also increases, making them prone to security threats. In the present paper, the authors make a systematic study of why Android security is important, what some of the potential vulnerabilities are, and what security measures have currently been adopted to ensure security.
Android is a Linux-based operating system used for smartphone devices. Since 2008, Android devices have gained a huge market share due to their open architecture and popularity. The increased popularity of Android devices and the associated benefits attracted malware developers, and the rate of Android malware applications increased between 2008 and 2016. In this paper, we propose a dynamic malware detection approach for Android applications. In dynamic analysis, system calls are recorded to calculate the density of the system calls. For the density calculation, we use system call sequences of two different lengths, 3-gram and 5-gram. The Naive Bayes algorithm is then applied to classify applications as benign or malicious. The proposed algorithm is evaluated on 100 real-world samples of benign and malware applications. We observe that the proposed method gives effective and accurate results. The 3-gram Naive Bayes algorithm classifies 84 malware applications correctly and 14 benign applications incorrectly. The 5-gram Naive Bayes algorithm classifies 88 malware applications correctly and 10 benign applications incorrectly. Mr. Tushar Patil | Prof. Bharti Dhote "Malware Detection in Android Applications" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5, August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd26449.pdf Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/26449/malware-detection-in-android-applications/mr-tushar-patil
MACHINE LEARNING APPLICATIONS IN MALWARE CLASSIFICATION: A METAANALYSIS LITER... (IJCI JOURNAL)
With a text mining and bibliometrics approach, this study reviews the literature on the evolution of malware classification using machine learning. This work takes literature from 2008 to 2022 on the subject of using machine learning for malware classification to understand the impact of this technology on malware classification. Throughout this study, we seek to answer three main research questions. RQ1: Is the application of machine learning for malware classification growing? RQ2: What is the most common machine-learning application for malware classification? RQ3: What are the outcomes of the most common machine learning applications? The analysis of 2186 articles resulting from a data collection process from peer-reviewed databases shows the trajectory of the application of this technology to malware classification as well as trends in both the machine learning and malware classification fields of study. This study performs quantitative and qualitative analysis using statistical and N-gram analysis techniques and a formal literature review to answer the proposed research questions. The research reveals methods such as support vector machines and random forests to be standard machine learning methods for malware classification in efforts to detect maliciousness or categorize malware by family. Machine learning is a highly researched technology with many applications, from malware classification and beyond.
MOST VIEWED ARTICLES IN ACADEMIA - INTERNATIONAL JOURNAL OF MOBILE NETWORK CO... (ijmnct)
International Journal of Mobile Network Communications & Telematics (IJMNCT) is an open access peer-reviewed journal that addresses the impacts and challenges of mobile communications and telematics. The journal also aims to focus on various areas such as e-commerce, e-governance, telematics, telelearning, nomadic computing, data management, related software and hardware technologies, and mobile user services. The journal documents practical and theoretical results that make a fundamental contribution to the development of mobile communication technologies.
ANDROINSPECTOR: A SYSTEM FOR COMPREHENSIVE ANALYSIS OF ANDROID APPLICATIONS (IJNSA Journal)
Android is an extensively used mobile platform and with evolution it has also witnessed an increased influx of malicious applications in its market place. The availability of multiple sources for downloading applications has also contributed to users falling prey to malicious applications. A major hindrance in blocking the entry of malicious applications into the Android market place is scarcity of effective mechanisms to identify malicious applications. This paper presents AndroInspector, a system for comprehensive analysis of an Android application using both static and dynamic analysis techniques. AndroInspector derives, extracts and analyses crucial features of Android applications using static analysis and subsequently classifies the application using machine learning techniques. Dynamic analysis includes automated execution of Android application to identify a set of pre-defined malicious actions performed by application at run-time.
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw... (IJECEIAES)
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to precisely delineate tumor boundaries from magnetic resonance imaging (MRI) scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The model is rigorously trained and evaluated, exhibiting remarkable performance metrics, including an impressive global accuracy of 99.286%, a high-class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of our proposed model. These findings underscore the model's competence in precise brain tumor localization, underscoring its potential to revolutionize medical image analysis and enhance healthcare outcomes. This research paves the way for future exploration and optimization of advanced CNN models in medical imaging, emphasizing addressing false positives and resource efficiency.
Embedded machine learning-based road conditions and driving behavior monitoring (IJECEIAES)
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Similar to Android-manifest extraction and labeling method for malware compilation and dataset creation
Malware detection techniques for mobile devicesijmnct
Mobile devices have become very popular nowadays, due to is portability and high performance, a mobile device became a must device for persons using information and communication technologies. In addition to hardware rapid evolution, mobile applications are also increasing in their complexity and performance to cover most the needs of their users. Both software and hardware design focused on increasing performance and the working hours of a mobile device. Different mobile operating systems are being used today with different platforms and different market shares. Like all information systems, mobile systems are prone to malware attacks. Due to
the personality feature of mobile devices, malware detection is very important and is a must tool in each device to protect private data and mitigate attacks. In
this paper, we will study and analyze different malware detection techniques used for mobile operating systems. We will focus on the to two competing mobile operating systems – Android and iOS. We will asset each technique summarizing its advantages and disadvantages. The aim of the work is to establish a basis for developing a mobile malware detection tool based on user profiling.
Comparative Study on Intrusion Detection Systems for Smartphonesiosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Adaptive Mobile Malware Detection Model Based on CBRijtsrd
Today, the mobile phones can maintain lots of sensitive information. With the increasing capabilities of such phones, more and more malicious software malware targeting these devices have emerged. However there are many mobile malware detection techniques, they used specified classifiers on selected features to get their best accuracy. Thus, an adaptive malware detection approach is required to effectively detect the concept drift of mobile malware and maintain the accuracy. An adaptive malware detection approach is proposed based on case based reasoning technique in this paper to handle the concept drift issue in mobile malware detection. To demonstrate the design decision of our approach, several experiments are conducted. Large features set with 1,065 features from 10 different categories are used in evaluation. The evaluation includes both accuracy and efficiency of the model. The experimental results prove that our approach achieves acceptable performance and accuracy for the malware detection. Kyaw Soe Moe | Mya Mya Thwe "Adaptive Mobile Malware Detection Model Based on CBR" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-6 , October 2019, URL: https://www.ijtsrd.com/papers/ijtsrd28088.pdf Paper URL: https://www.ijtsrd.com/computer-science/computer-security/28088/adaptive-mobile-malware-detection-model-based-on-cbr/kyaw-soe-moe
Technology has reshaped the way we interact with the world and access information
with the advent of smartphones. Accordingly, the needs we have and the solutions for our
needs have also changed along with the evolving technology. One of the most affected
matters from technology is communication. With the various options and capabilities,
instant messaging applications have been started to use for communication purpose
which is one of the biggest needs of human being. We can send text messages, video
messages, voice recordings and share locations using these applications. Even further,
we no longer need cell phone calls by GSM operators, and instead prefer these
applications for instant calls, as well as sharing private information with these
applications, not only for personal daily life, also for business need. On the other hand,
these applications bring risks with many benefits. One of them is privacy. We do not want
that these applications can store our personal data as its user. How do our best practices
keep our data? Do they give the necessary attention for privacy? Another fact is that these
applications can be used by criminals to communicate and execute a secret plan. If a
criminal gets caught, what can be obtained as evidence from these messaging
applications? This time we need to know what can be extracted from the mobile device.
This research focuses on forensics analysis of the instant messaging applications on the
Android platform
SYSTEM CALL DEPENDENCE GRAPH BASED BEHAVIOR DECOMPOSITION OF ANDROID APPLICAT...IJNSA Journal
Millions of developers and third-party organizations have flooded into the Android ecosystem due to Android’s open-source nature and low barriers to entry for developers. However, that also attracts many attackers. Over 90 percent of mobile malware targets Android. Though Android provides multiple security features and layers to protect user data and system resources, there are still overprivileged applications in the wild in the Google Play Store and third-party Android app stores. In this paper, we propose an approach to map system-level behaviors to Android APIs, based on the observation that system-level behaviors cannot be avoided but sensitive Android APIs can be evaded. To the best of our knowledge, our approach is the first work to decompose Android application behaviors based on system-level behaviors. We then map system-level behaviors and Android APIs through System Call Dependence Graphs. The study also shows that our approach can effectively identify potential permission abuse, with an almost negligible performance impact.
A Comprehensive Study on Security issues in Android Mobile Phone — Scope and ...AM Publications
Due to the tremendous development and growth of mobile phone software and hardware technologies, security is now a very big challenge for everyone concerned, including scientists, manufacturers, designers, and industrialists. Usually, such technology takes time to be absorbed into the market, and this gives security teams time to develop effective security controls. The rapid growth of the smart-phone market and the use of these devices for email, online banking, and accessing other forms of sensitive content has led to the emergence of a new and ever-changing threat landscape [1]. Along with this, the fact that anyone can be a user has led to the smart-phone appearing in the hands of almost every person before proper security controls can be developed. Currently, Android has the biggest market share among all smart-phone operating systems. As the power and features of such phones increase, their vulnerability also increases, making them prone to security threats. In the present paper, the authors make a systematic study of why Android security is important, what some of the potential vulnerabilities are, and what security measures have currently been adopted to ensure security.
Android is a Linux-based operating system used for smartphone devices. Since 2008, Android devices have gained a huge market share due to the platform's open architecture and popularity. The increased popularity of Android devices and the associated benefits attracted malware developers. The rate of Android malware applications increased between 2008 and 2016. In this paper, we propose a dynamic malware detection approach for Android applications. In dynamic analysis, system calls are recorded to calculate the density of the system calls. For the density calculation, we used two different lengths of system-call sequences: 3-gram and 5-gram. Furthermore, the Naive Bayes algorithm is applied to classify applications as benign or malicious. The proposed algorithm is applied to 100 real-world samples of benign and malware applications to detect malware. We observe that the proposed method gives effective and accurate results. The 3-gram Naive Bayes algorithm detects 84 malware applications correctly and 14 benign applications incorrectly. The 5-gram Naive Bayes algorithm detects 88 malware applications correctly and 10 benign applications incorrectly. Mr. Tushar Patil | Prof. Bharti Dhote "Malware Detection in Android Applications" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3 | Issue-5 , August 2019, URL: https://www.ijtsrd.com/papers/ijtsrd26449.pdf Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/26449/malware-detection-in-android-applications/mr-tushar-patil
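The abstract above combines system-call n-grams with a Naive Bayes classifier. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes a multinomial Naive Bayes with Laplace smoothing over raw n-gram counts (the paper's exact "density" feature is not specified here), and the class name `NgramNaiveBayes` and the toy traces are hypothetical.

```python
import math
from collections import Counter


def ngrams(calls, n):
    """Return the list of length-n sliding windows over a system-call trace."""
    return [tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over system-call n-grams (Laplace smoothing).

    Hypothetical helper for illustration; not taken from the paper.
    """

    def __init__(self, n=3):
        self.n = n
        self.counts = {}        # label -> Counter of n-gram occurrences
        self.docs = Counter()   # label -> number of training traces
        self.vocab = set()      # all n-grams seen during training

    def fit(self, traces, labels):
        for calls, label in zip(traces, labels):
            grams = ngrams(calls, self.n)
            self.counts.setdefault(label, Counter()).update(grams)
            self.docs[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, calls):
        total_docs = sum(self.docs.values())
        best_label, best_score = None, -math.inf
        for label, counter in self.counts.items():
            # Log prior plus smoothed log likelihood of each observed n-gram.
            score = math.log(self.docs[label] / total_docs)
            denom = sum(counter.values()) + len(self.vocab)
            for gram in ngrams(calls, self.n):
                score += math.log((counter[gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy traces, made up for illustration only.
benign = ["open", "read", "close", "open", "read", "close"]
malicious = ["open", "read", "socket", "connect", "sendto", "close"]
model = NgramNaiveBayes(n=3).fit([benign, malicious], ["benign", "malware"])
```

Switching between the 3-gram and 5-gram variants mentioned in the abstract would only change the `n` argument; real traces would come from a system-call recorder rather than hand-written lists.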
MACHINE LEARNING APPLICATIONS IN MALWARE CLASSIFICATION: A METAANALYSIS LITER...IJCI JOURNAL
With a text mining and bibliometrics approach, this study reviews the literature on the evolution
of malware classification using machine learning. This work takes literature from 2008 to 2022
on the subject of using machine learning for malware classification to understand the impact of
this technology on malware classification. Throughout this study, we seek to answer three main
research questions: RQ1: Is the application of machine learning for malware classification
growing? RQ2: What is the most common machine-learning application for malware
classification? RQ3: What are the outcomes of the most common machine learning
applications? The analysis of 2186 articles resulting from a data collection process from peer-reviewed databases shows the trajectory of the application of this technology on malware
classification as well as trends in both the machine learning and malware classification fields of
study. This study performs quantitative and qualitative analysis using statistical and N-gram
analysis techniques and a formal literature review to answer the proposed research questions.
The research reveals methods such as support vector machines and random forests to be
standard machine learning methods for malware classification in efforts to detect maliciousness
or categorize malware by family. Machine learning is a highly researched technology with
many applications, from malware classification and beyond.
MOST VIEWED ARTICLES IN ACADEMIA - INTERNATIONAL JOURNAL OF MOBILE NETWORK CO...ijmnct
International Journal of Mobile Network Communications & Telematics (IJMNCT) is an open access peer-reviewed journal that addresses the impacts and challenges of mobile communications and telematics. The journal also aims to focus on various areas such as ecommerce, e-governance, Telematics, Telelearning nomadic computing, data management, related software and hardware technologies, and mobile user services. The journal documents practical and theoretical results which make a fundamental contribution for the development of mobile communication technologies.
ANDROINSPECTOR: A SYSTEM FOR COMPREHENSIVE ANALYSIS OF ANDROID APPLICATIONSIJNSA Journal
Android is an extensively used mobile platform and with evolution it has also witnessed an increased influx of malicious applications in its market place. The availability of multiple sources for downloading applications has also contributed to users falling prey to malicious applications. A major hindrance in blocking the entry of malicious applications into the Android market place is scarcity of effective mechanisms to identify malicious applications. This paper presents AndroInspector, a system for comprehensive analysis of an Android application using both static and dynamic analysis techniques. AndroInspector derives, extracts and analyses crucial features of Android applications using static analysis and subsequently classifies the application using machine learning techniques. Dynamic analysis includes automated execution of Android application to identify a set of pre-defined malicious actions performed by application at run-time.
Similar to Android-manifest extraction and labeling method for malware compilation and dataset creation (20)
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...IJECEIAES
Medical image analysis has witnessed significant advancements with deep learning techniques. In the domain of brain tumor segmentation, the ability to
precisely delineate tumor boundaries from magnetic resonance imaging (MRI)
scans holds profound implications for diagnosis. This study presents an ensemble convolutional neural network (CNN) with transfer learning, integrating
the state-of-the-art Deeplabv3+ architecture with the ResNet18 backbone. The
model is rigorously trained and evaluated, exhibiting remarkable performance
metrics, including an impressive global accuracy of 99.286%, a high-class accuracy of 82.191%, a mean intersection over union (IoU) of 79.900%, a weighted
IoU of 98.620%, and a Boundary F1 (BF) score of 83.303%. Notably, a detailed comparative analysis with existing methods showcases the superiority of
our proposed model. These findings underscore the model’s competence in precise brain tumor localization, underscoring its potential to revolutionize medical
image analysis and enhance healthcare outcomes. This research paves the way
for future exploration and optimization of advanced CNN models in medical
imaging, emphasizing addressing false positives and resource efficiency.
Embedded machine learning-based road conditions and driving behavior monitoringIJECEIAES
Car accident rates have increased in recent years, resulting in losses in human lives, properties, and other financial costs. An embedded machine learning-based system is developed to address this critical issue. The system can monitor road conditions, detect driving patterns, and identify aggressive driving behaviors. The system is based on neural networks trained on a comprehensive dataset of driving events, driving styles, and road conditions. The system effectively detects potential risks and helps mitigate the frequency and impact of accidents. The primary goal is to ensure the safety of drivers and vehicles. Collecting data involved gathering information on three key road events: normal street and normal drive, speed bumps, circular yellow speed bumps, and three aggressive driving actions: sudden start, sudden stop, and sudden entry. The gathered data is processed and analyzed using a machine learning system designed for limited power and memory devices. The developed system resulted in 91.9% accuracy, 93.6% precision, and 92% recall. The achieved inference time on an Arduino Nano 33 BLE Sense with a 32-bit CPU running at 64 MHz is 34 ms and requires 2.6 kB peak RAM and 139.9 kB program flash memory, making it suitable for resource-constrained embedded systems.
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Neural network optimizer of proportional-integral-differential controller par...IJECEIAES
Wide application of proportional-integral-differential (PID)-regulator in industry requires constant improvement of methods of its parameters adjustment. The paper deals with the issues of optimization of PID-regulator parameters with the use of neural network technology methods. A methodology for choosing the architecture (structure) of neural network optimizer is proposed, which consists in determining the number of layers, the number of neurons in each layer, as well as the form and type of activation function. Algorithms of neural network training based on the application of the method of minimizing the mismatch between the regulated value and the target value are developed. The method of back propagation of gradients is proposed to select the optimal training rate of neurons of the neural network. The neural network optimizer, which is a superstructure of the linear PID controller, allows increasing the regulation accuracy from 0.23 to 0.09, thus reducing the power consumption from 65% to 53%. The results of the conducted experiments allow us to conclude that the created neural superstructure may well become a prototype of an automatic voltage regulator (AVR)-type industrial controller for tuning the parameters of the PID controller.
An improved modulation technique suitable for a three level flying capacitor ...IJECEIAES
This research paper introduces an innovative modulation technique for controlling a 3-level flying capacitor multilevel inverter (FCMLI), aiming to streamline the modulation process in contrast to conventional methods. The proposed
simplified modulation technique paves the way for more straightforward and
efficient control of multilevel inverters, enabling their widespread adoption and
integration into modern power electronic systems. Through the amalgamation of
sinusoidal pulse width modulation (SPWM) with a high-frequency square wave
pulse, this controlling technique attains energy equilibrium across the coupling
capacitor. The modulation scheme incorporates a simplified switching pattern
and a decreased count of voltage references, thereby simplifying the control
algorithm.
A review on features and methods of potential fishing zoneIJECEIAES
This review focuses on the importance of identifying potential fishing zones in seawater for sustainable fishing practices. It examines features such as sea surface temperature (SST) and sea surface height (SSH), along with the classifiers used to classify the data. The study underscores the importance of examining potential fishing zones using advanced analytical techniques. It thoroughly explores the methodologies employed by researchers, covering both past and current approaches. The examination centers on data characteristics and the application of classification algorithms to potential fishing zones. Furthermore, the prediction of potential fishing zones relies significantly on the effectiveness of classification algorithms. Previous research has assessed the performance of models such as support vector machines, naïve Bayes, and artificial neural networks (ANN). In previous results, the support vector machine (SVM) achieved 97.6% accuracy in classifying test data for fisheries classification, compared with 94.2% for naïve Bayes. Based on recent works in this area, several recommendations for future work are presented to further improve the performance of potential fishing zone models, which is important to the fisheries community.
Electrical signal interference minimization using appropriate core material f...IJECEIAES
As demand for smaller, quicker, and more powerful devices rises, Moore's law is strictly followed. The industry has worked hard to make little devices that boost productivity. The goal is to optimize device density. Scientists are reducing connection delays to improve circuit performance. This helped them understand three-dimensional integrated circuit (3D IC) concepts, which stack active devices and create vertical connections to diminish latency and lower interconnects. Electrical involvement is a big worry with 3D integrated circuits. Researchers have developed and tested through silicon via (TSV) and substrates to decrease electrical wave involvement. This study illustrates a novel noise coupling reduction method using several electrical involvement models. A 22% drop in electrical involvement from wave-carrying to victim TSVs introduces this new paradigm and improves system performance even at higher THz frequencies.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet has forced the United Nations and governments to promote green energies and electric transportation. The deployment of photovoltaic (PV) and electric vehicle (EV) systems has gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to include financial support and stability. The work in this paper introduces a hybrid PV and EV system to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system, including the equations required to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram, which sets the priorities and requirements of the system, is presented. The proposed approach allows plants to improve their power stability, especially during power outages. The presented information helps researchers and plant owners complete the necessary analysis while promoting the deployment of clean energy. The results of a case study of a dairy farm support the theoretical work and highlight the approach's benefits to existing plants. The short return on investment of the proposed approach supports the paper's novel approach to a sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line, which enhances the safety of the electrical network.
Bibliometric analysis highlighting the role of women in addressing climate ch...IJECEIAES
Fossil fuel consumption has increased quickly, contributing to climate change
that is evident in unusual flooding and droughts, and in global warming. Over
the past ten years, women's involvement in society has grown dramatically,
and they have succeeded in playing a noticeable role in reducing climate change.
A bibliometric analysis of data from the last ten years has been carried out to
examine the role of women in addressing climate change. The analysis's
findings are discussed in relation to the sustainable development goals (SDGs),
particularly SDG 7 and SDG 13. The results considered contributions made
by women in the various sectors while taking geographic dispersion into
account. The bibliometric analysis delves into topics including women's
leadership in environmental groups, their involvement in policymaking, their
contributions to sustainable development projects, and the influence of
gender diversity on attempts to mitigate climate change. This study's results
highlight how women have influenced policies and actions related to climate
change, point out gaps in the research, and offer recommendations on how
to increase the role of women in addressing climate change and
achieving sustainability. To achieve more successful results, this initiative
aims to highlight the significance of gender equality and encourage
inclusivity in climate change decision-making processes.
Voltage and frequency control of microgrid in presence of micro-turbine inter...IJECEIAES
The active and reactive load changes have a significant impact on voltage
and frequency. In this paper, in order to stabilize the microgrid (MG) against
load variations in islanding mode, the active and reactive power of all
distributed generators (DGs), including energy storage (battery), diesel
generator, and micro-turbine, are controlled. The micro-turbine generator is
connected to MG through a three-phase to three-phase matrix converter, and
the droop control method is applied for controlling the voltage and
frequency of MG. In addition, a method is introduced for voltage and
frequency control of micro-turbines in the transition state from grid-connected mode to islanding mode. A novel switching strategy of the matrix
converter is used for converting the high-frequency output voltage of the
micro-turbine to the grid-side frequency of the utility system. Moreover,
using the switching strategy, the low-order harmonics in the output current
and voltage are not produced, and consequently, the size of the output filter
would be reduced. In fact, the suggested control strategy is load-independent
and has no frequency conversion restrictions. The proposed approach for
voltage and frequency regulation demonstrates exceptional performance and
favorable response across various load alteration scenarios. The suggested
strategy is examined in several scenarios in the MG test systems, and the
simulation results are addressed.
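The droop control method named in the abstract above is commonly written as two linear relations, one between active power and frequency and one between reactive power and voltage. The sketch below shows that textbook form only; the function name `droop_setpoints`, the per-unit quantities, and all default coefficients are assumptions for illustration and are not taken from the paper.

```python
def droop_setpoints(p, q, f0=50.0, v0=1.0, p0=0.0, q0=0.0, m=0.01, n=0.05):
    """Textbook P-f / Q-V droop (illustrative values, not from the paper).

    p, q   : measured active and reactive power (assumed per-unit)
    f0, v0 : nominal frequency (Hz) and voltage (per-unit)
    p0, q0 : power set-points at nominal conditions
    m, n   : droop coefficients (hypothetical)
    """
    freq = f0 - m * (p - p0)   # frequency falls as active power rises
    volt = v0 - n * (q - q0)   # voltage falls as reactive power rises
    return freq, volt
```

For example, with these assumed coefficients a rise in active load lowers the frequency reference slightly, which is how several generators can share load without a communication link.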
Enhancing battery system identification: nonlinear autoregressive modeling fo...IJECEIAES
Precisely characterizing Li-ion batteries is essential for optimizing their
performance, enhancing safety, and prolonging their lifespan across various
applications, such as electric vehicles and renewable energy systems. This
article introduces an innovative nonlinear methodology for system
identification of a Li-ion battery, employing a nonlinear autoregressive with
exogenous inputs (NARX) model. The proposed approach integrates the
benefits of nonlinear modeling with the adaptability of the NARX structure,
facilitating a more comprehensive representation of the intricate
electrochemical processes within the battery. Experimental data collected
from a Li-ion battery operating under diverse scenarios are employed to
validate the effectiveness of the proposed methodology. The identified
NARX model exhibits superior accuracy in predicting the battery's behavior
compared to traditional linear models. This study underscores the
importance of accounting for nonlinearities in battery modeling, providing
insights into the intricate relationships between state-of-charge, voltage, and
current under dynamic conditions.
Smart grid deployment: from a bibliometric analysis to a surveyIJECEIAES
Smart grids are one of the last decades' innovations in electrical energy.
They bring relevant advantages compared to the traditional grid and
significant interest from the research community. Assessing the field's
evolution is essential to propose guidelines for facing new and future smart
grid challenges. In addition, knowing the main technologies involved in the
deployment of smart grids (SGs) is important to highlight possible
shortcomings that can be mitigated by developing new tools. This paper
contributes to the research trends mentioned above by focusing on two
objectives. First, a bibliometric analysis is presented to give an overview of
the current research level about smart grid deployment. Second, a survey of
the main technological approaches used for smart grid implementation and
their contributions are highlighted. To that effect, we searched the Web of
Science (WoS), and the Scopus databases. We obtained 5,663 documents
from WoS and 7,215 from Scopus on smart grid implementation or
deployment. With the extraction limitation in the Scopus database, 5,872 of
the 7,215 documents were extracted using a multi-step process. These two
datasets have been analyzed using a bibliometric tool called bibliometrix.
The main outputs are presented with some recommendations for future
research.
Use of analytical hierarchy process for selecting and prioritizing islanding ...IJECEIAES
One of the problems associated with power systems is the islanding
condition, which must be rapidly and properly detected to prevent any
negative consequences on the system's protection, stability, and security.
This paper offers a thorough overview of several islanding detection
strategies, which are divided into two categories: classic approaches,
including local and remote approaches, and modern techniques, including
techniques based on signal processing and computational intelligence.
Additionally, each approach is compared and assessed based on several
factors, including implementation costs, non-detected zones, declining
power quality, and response times using the analytical hierarchy process
(AHP). The multi-criteria decision-making analysis yields the following
overall weights, based on the comparison of all criteria together: passive
methods (24.7%), active methods (7.8%), hybrid methods (5.6%), remote
methods (14.5%), signal processing-based methods (26.6%), and computational
intelligence-based methods (20.8%). Thus, it can be seen from the total weight
that hybrid approaches are the least suitable to be chosen, while signal
processing-based methods are the most appropriate islanding detection
method to be selected and implemented in power system with respect to the
aforementioned factors. Using Expert Choice software, the proposed
hierarchy model is studied and examined.
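To make the AHP weighting step above concrete, here is a minimal sketch of how priority weights can be derived from a pairwise comparison matrix. It assumes the row geometric-mean approximation, which is one common way to estimate AHP priorities; it is not a description of how Expert Choice computes them, and the 3x3 matrix is hypothetical rather than the study's data.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights by the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    # Normalise so that the weights sum to 1.
    return [g / total for g in geo_means]


# Hypothetical reciprocal comparison matrix for three alternatives:
# entry [i][j] states how strongly alternative i is preferred over j.
pairwise = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(pairwise)
```

The resulting weights sum to 1, in the same way that the six percentages reported in the abstract sum to 100%.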
Enhancing of single-stage grid-connected photovoltaic system using fuzzy logi...IJECEIAES
The power generated by photovoltaic (PV) systems is influenced by
environmental factors. This variability hampers the control and utilization of
solar cells' peak output. In this study, a single-stage grid-connected PV
system is designed to enhance power quality. Our approach employs fuzzy
logic in the direct power control (DPC) of a three-phase voltage source
inverter (VSI), enabling seamless integration of the PV connected to the
grid. Additionally, a fuzzy logic-based maximum power point tracking
(MPPT) controller is adopted, which outperforms traditional methods like
incremental conductance (INC) in enhancing solar cell efficiency and
minimizing the response time. Moreover, the inverter's real-time active and
reactive power is directly managed to achieve a unity power factor (UPF).
The system's performance is assessed through MATLAB/Simulink
implementation, showing marked improvement over conventional methods,
particularly in steady-state and varying weather conditions. For solar
irradiances of 500 and 1,000 W/m², the results show that the proposed
method reduces the total harmonic distortion (THD) of the injected current
to the grid by approximately 46% and 38% compared to conventional
methods, respectively. Furthermore, we compare the simulation results with
IEEE standards to evaluate the system's grid compatibility.
Enhancing photovoltaic system maximum power point tracking with fuzzy logic-b...IJECEIAES
Photovoltaic systems have emerged as a promising energy resource that
caters to the future needs of society, owing to their renewable, inexhaustible,
and cost-free nature. The power output of these systems relies on solar cell
radiation and temperature. In order to mitigate the dependence on
atmospheric conditions and enhance power tracking, a conventional
approach has been improved by integrating various methods. To optimize
the generation of electricity from solar systems, the maximum power point
tracking (MPPT) technique is employed. To overcome limitations such as
steady-state voltage oscillations and improve transient response, two
traditional MPPT methods, namely fuzzy logic controller (FLC) and perturb
and observe (P&O), have been modified. This research paper aims to
simulate and validate the step size of the proposed modified P&O and FLC
techniques within the MPPT algorithm using MATLAB/Simulink for
efficient power tracking in photovoltaic systems.
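For readers unfamiliar with perturb and observe (P&O), the sketch below shows the conventional fixed-step rule, not the modified variant this paper proposes. The function names, the step size, and the toy power curve peaking at 30 V are all assumptions made for illustration.

```python
def perturb_and_observe(v, p, v_prev, p_prev, step=0.5):
    """Return the next voltage reference using fixed-step P&O."""
    dv = v - v_prev
    dp = p - p_prev
    if dp == 0:
        return v  # no power change: hold the operating point
    # Power rose: keep moving in the same direction; power fell: reverse.
    same_direction = (dp > 0) == (dv > 0)
    return v + step if same_direction else v - step


def toy_pv_power(v):
    """Hypothetical P-V curve with its maximum at v = 30 (not real panel data)."""
    return max(0.0, 900.0 - (v - 30.0) ** 2)


# Start below the maximum power point and let the rule climb towards it.
v_prev, v = 20.0, 20.5
p_prev = toy_pv_power(v_prev)
for _ in range(100):
    p = toy_pv_power(v)
    v_next = perturb_and_observe(v, p, v_prev, p_prev)
    v_prev, p_prev, v = v, p, v_next
```

On this toy curve the reference ends up hopping between 29.5, 30.0, and 30.5 V, a small-scale picture of the steady-state oscillation that the abstract cites as a limitation of conventional methods.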
Adaptive synchronous sliding control for a robot manipulator based on neural ...IJECEIAES
Robot manipulators have become important equipment in production lines, medical fields, and transportation. Improving the quality of trajectory tracking for
robot hands is always an attractive topic in the research community. This is a
challenging problem because robot manipulators are complex nonlinear systems
and are often subject to fluctuations in loads and external disturbances. This
article proposes an adaptive synchronous sliding control scheme to improve trajectory tracking performance for a robot manipulator. The proposed controller
ensures that the positions of the joints track the desired trajectory, synchronize
the errors, and significantly reduces chattering. First, the synchronous tracking
errors and synchronous sliding surfaces are presented. Second, the synchronous
tracking error dynamics are determined. Third, a robust adaptive control law is
designed: the unknown components of the model are estimated online by the neural network, and the parameters of the switching elements are selected by fuzzy
logic. The built algorithm ensures that the tracking and approximation errors
are ultimately uniformly bounded (UUB). Finally, the effectiveness of the constructed algorithm is demonstrated through simulation and experimental results.
Simulation and experimental results show that the proposed controller is effective with small synchronous tracking errors, and the chattering phenomenon is
significantly reduced.
Remote field-programmable gate array laboratory for signal acquisition and de...IJECEIAES
A remote laboratory utilizing field-programmable gate array (FPGA) technologies enhances students’ learning experience anywhere and anytime in embedded system design. Existing remote laboratories prioritize hardware access and visual feedback for observing board behavior after programming, neglecting comprehensive debugging tools to resolve errors that require internal signal acquisition. This paper proposes a novel remote embedded system design approach targeting FPGA technologies that is fully interactive via a web-based platform. Our solution provides FPGA board access and debugging capabilities beyond the visual feedback provided by existing remote laboratories. We implemented a lab module that users can seamlessly incorporate into their FPGA design. The module minimizes hardware resource utilization while enabling the acquisition of a large number of data samples from the signal during the experiments by adaptively compressing the signal prior to data transmission. The results demonstrate an average compression ratio of 2.90 across three benchmark signals, indicating efficient signal acquisition and effective debugging and analysis. This method allows users to acquire more data samples than conventional methods. The proposed lab allows students to remotely test and debug their designs, bridging the gap between theory and practice in embedded system design.
Detecting and resolving feature envy through automated machine learning and m...IJECEIAES
Efficiently identifying and resolving code smells enhances software project quality. This paper presents a novel solution, utilizing automated machine learning (AutoML) techniques, to detect code smells and apply move method refactoring. By evaluating code metrics before and after refactoring, we assessed its impact on coupling, complexity, and cohesion. Key contributions of this research include a unique dataset for code smell classification and the development of models using AutoGluon for optimal performance. Furthermore, the study identifies the top 20 influential features in classifying feature envy, a well-known code smell, stemming from excessive reliance on external classes. We also explored how move method refactoring addresses feature envy, revealing reduced coupling and complexity, and improved cohesion, ultimately enhancing code quality. In summary, this research offers an empirical, data-driven approach, integrating AutoML and move method refactoring to optimize software project quality. Insights gained shed light on the benefits of refactoring on code quality and the significance of specific features in detecting feature envy. Future research can expand to explore additional refactoring techniques and a broader range of code metrics, advancing software engineering practices and standards.
Smart monitoring technique for solar cell systems using internet of things ba...IJECEIAES
Rapidly and remotely monitoring the status parameters of solar cell systems (solar irradiance, temperature, and humidity) is critical for enhancing their efficiency. Hence, in the present article, an improved smart prototype based on the internet of things (IoT) and an embedded system built around the NodeMCU ESP8266 (ESP-12E) was implemented experimentally. Three different regions in Egypt (Luxor, Cairo, and El-Beheira) were chosen to study their solar irradiance profile, temperature, and humidity using the proposed IoT system. The monitoring data for solar irradiance, temperature, and humidity were visualized live by Ubidots through the hypertext transfer protocol (HTTP). The measured solar power radiation in Luxor, Cairo, and El-Beheira ranged between 216-1000, 245-958, and 187-692 W/m², respectively, during the solar day. The accuracy and rapidity of obtaining monitoring results using the proposed IoT system make it a strong candidate for monitoring solar cell systems. In addition, the solar power radiation results for the three regions indicate that Luxor and Cairo are more suitable places than El-Beheira to build a solar cell system station.
An efficient security framework for intrusion detection and prevention in int...IJECEIAES
Over the past few years, the internet of things (IoT) has advanced to connect billions of smart devices to improve quality of life. However, anomalies or malicious intrusions pose several security loopholes, leading to performance degradation and threat to data security in IoT operations. Thereby, IoT security systems must keep an eye on and restrict unwanted events from occurring in the IoT network. Recently, various technical solutions based on machine learning (ML) models have been derived towards identifying and restricting unwanted events in IoT. However, most ML-based approaches are prone to miss-classification due to inappropriate feature selection. Additionally, most ML approaches applied to intrusion detection and prevention consider supervised learning, which requires a large amount of labeled data to be trained. Consequently, such complex datasets are impossible to source in a large network like IoT. To address this problem, this proposed study introduces an efficient learning mechanism to strengthen the IoT security aspects. The proposed algorithm incorporates supervised and unsupervised approaches to improve the learning models for intrusion detection and mitigation. Compared with the related works, the experimental outcome shows that the model performs well in a benchmark dataset. It accomplishes an improved detection accuracy of approximately 99.21%.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTjpsjournal1
The rivalry between prominent international actors for dominance over Central Asia's hydrocarbon
reserves and the ancient silk trade route, along with China's diplomatic endeavours in the area, has been
referred to as the "New Great Game." This research centres on the power struggle, considering
geopolitical, geostrategic, and geoeconomic variables. Topics including trade, political hegemony, oil
politics, and conventional and nontraditional security are all explored and explained by the researcher.
Using Mackinder's Heartland, Spykman Rimland, and Hegemonic Stability theories, examines China's role
in Central Asia. This study adheres to the empirical epistemological method and has taken care of
objectivity. This study analyze primary and secondary research documents critically to elaborate role of
china’s geo economic outreach in central Asian countries and its future prospect. China is thriving in trade,
pipeline politics, and winning states, according to this study, thanks to important instruments like the
Shanghai Cooperation Organisation and the Belt and Road Economic Initiative. According to this study,
China is seeing significant success in commerce, pipeline politics, and gaining influence on other
governments. This success may be attributed to the effective utilisation of key tools such as the Shanghai
Cooperation Organisation and the Belt and Road Economic Initiative.
ACEP Magazine edition 4th launched on 05.06.2024Rahul
This document provides information about the third edition of the magazine "Sthapatya" published by the Association of Civil Engineers (Practicing) Aurangabad. It includes messages from current and past presidents of ACEP, memories and photos from past ACEP events, information on life time achievement awards given by ACEP, and a technical article on concrete maintenance, repairs and strengthening. The document highlights activities of ACEP and provides a technical educational article for members.
Harnessing WebAssembly for Real-time Stateless Streaming PipelinesChristina Lin
Traditionally, dealing with real-time data pipelines has involved significant overhead, even for straightforward tasks like data transformation or masking. However, in this talk, we’ll venture into the dynamic realm of WebAssembly (WASM) and discover how it can revolutionize the creation of stateless streaming pipelines within a Kafka (Redpanda) broker. These pipelines are adept at managing low-latency, high-data-volume scenarios.
Low power architecture of logic gates using adiabatic techniquesnooriasukmaningtyas
The growing significance of portable systems to limit power consumption in ultra-large-scale-integration chips of very high density, has recently led to rapid and inventive progresses in low-power design. The most effective technique is adiabatic logic circuit design in energy-efficient hardware. This paper presents two adiabatic approaches for the design of low power circuits, modified positive feedback adiabatic logic (modified PFAL) and the other is direct current diode based positive feedback adiabatic logic (DC-DB PFAL). Logic gates are the preliminary components in any digital circuit design. By improving the performance of basic gates, one can improvise the whole system performance. In this paper proposed circuit design of the low power architecture of OR/NOR, AND/NAND, and XOR/XNOR gates are presented using the said approaches and their results are analyzed for powerdissipation, delay, power-delay-product and rise time and compared with the other adiabatic techniques along with the conventional complementary metal oxide semiconductor (CMOS) designs reported in the literature. It has been found that the designs with DC-DB PFAL technique outperform with the percentage improvement of 65% for NOR gate and 7% for NAND gate and 34% for XNOR gate over the modified PFAL techniques at 10 MHz respectively.
Android-manifest extraction and labeling method for malware compilation and dataset creation
International Journal of Electrical and Computer Engineering (IJECE)
Vol. 13, No. 6, December 2023, pp. 6568~6577
ISSN: 2088-8708, DOI: 10.11591/ijece.v13i6.pp6568-6577
Journal homepage: http://ijece.iaescore.com
Djarot Hindarto1, Arko Djajadi2
1Faculty of Information and Communication Technology, University of Nasional, Jakarta, Indonesia
2Department of Engineering Physics, Faculty of Engineering and Informatics, Multimedia Nusantara University, Tangerang, Indonesia
Article Info
Article history: Received Jan 18, 2023; Revised Mar 27, 2023; Accepted Apr 7, 2023
ABSTRACT
Malware is a nuisance for smartphone users, and an infected smartphone can
cause them real harm. Identifying malware is not easy for ordinary users
because the danger is deeply concealed in application package kit (APK) files
available in the Android Play Store. This paper discusses the challenges of
creating malware datasets. Long before a malware classification process and
model can be built, the need for datasets with representative features for most
types of malware has to be addressed systematically. Only after a quality dataset
is available can a quality classification model be obtained using machine learning
(ML) or deep learning (DL) algorithms. Malware classification as a whole is a
full pipeline made up of several sub-processes. The authors purposefully
focus on the process of building quality malware datasets, not on ML itself,
because implementing ML requires a separate effort once a reliable dataset has
been built. The overall procedure for creating the malware dataset starts with the
extraction of the Android manifest from the set of APK files and ends with the
labeling method for all the extracted APK files. The key contribution of this
paper is a systematic way to generate datasets from any APK file.
Keywords:
Android application
Artificial neural network
Extract
Machine learning
Malware
This is an open access article under the CC BY-SA license.
Corresponding Author:
Arko Djajadi
Department of Engineering Physics, Faculty of Engineering and Informatics, Multimedia Nusantara
University
Scientia Boulevard Road, Gading Serpong, Tangerang, Indonesia
Email: arko@umn.ac.id
1. INTRODUCTION
The growth of the smartphone market over the last two decades has made Android one of the most
pervasive operating systems for smartphones, accounting for more than 80% of the global market.
This popularity comes at a cost: Android has become one of the primary targets of cybercrime.
Cyber attacks are prevalent on most internet-connected systems, and attackers use a variety of attack
models. Online applications, such as web applications that are not properly protected, are prone to
SQL injection, distributed denial of service (DDoS), defacement, and many other attacks. Over the
last 15 years, smartphone users have benefited from ever faster mobile connections, and nearly all
smartphones are now permanently connected to the internet. This holds in particular for the dominant
Android smartphones, which are therefore exposed to a wider range of cyber attacks and may be
hijacked by fake Android application package kit (APK) files. Even worse, the majority of Android
smartphone users tend to be unaware of the potentially catastrophic consequences once their
devices are infected. Combined with social engineering attacks, users’ mobile bank accounts, emails,
phone books, and social media apps fall quickly into the hands of cyber predators ready to exploit
the victims. Recent national news of looming attacks being handled by the national cyber security
forces confirms these critical attacks [1]–[3], indicating the continuing need for well-planned efforts to
perform penetration testing [4] and malware compilation. Cryptography can be applied to further protect data [5]
in case of cyber attacks, and scam or fraudulent phishing links can possibly be anticipated by federated learning [6].
Malware was first detected on the Android operating system in August 2010. It did not take long
before the number of malicious Android applications exceeded a thousand APKs in the following years. In the
third quarter of 2018, according to Google data, the total number of Android malware samples reached
3.2 million, a jump of 40% year on year [7], [8]. Data from several sources indicate that more than 1 billion
Android devices are at risk due to malware. In addition, two out of five active mobile phones have security
risks [9], and we believe that most Android users are not aware of them because the attacks are stealthy.
The risk is exacerbated when users do not update the operating system and APKs to the latest versions with
the most current security updates.
Users today often download applications from arbitrary sources, which often causes problems.
Installing an application and granting whatever it asks for, without making a conscious selection
or knowing the purpose of the application, causes security problems on the smartphone, such as
infiltration by malicious scripts. Such a script, when triggered, can perform an action that violates security,
for example a stealthy transfer of private data from the smartphone to the attackers. Ultimately, control
of the smartphone falls into the hands of the attackers.
Application development is currently very fast, thanks to frameworks that make
it easy to create Android applications. Several tools, some of them web-based, provide ways
to quickly create Android APK files, such as MIT App Inventor [10], [11], Flutter, Appery.io, and many more.
Others take ready-made applications, reverse engineer them, and add several functions to create new
applications. This can be done in a short time. Irresponsible parties exploit this speed for malicious
purposes, for example by adding scripts that enable storage access so that whoever inserted the script can
explore the user’s smartphone storage. Applications that have been infiltrated by malicious scripts or
programs are referred to as Android malicious software (malware).
Survey data on Android malware reveal that the ease and speed of creating Android
applications, coupled with reverse engineering of APKs, make this a very interesting area for security
research. According to data for February 2021 released by the AV-TEST Institute, there are over
350,000 malware samples every day, a sharp increase over the preceding 5 years. Attacks and threats
on the internet have become a major topic that is often discussed in campus forums and other forums. One
of the currently trending threats is malware on Android-based smartphones. Malware research commonly
uses a classification algorithm to detect whether a file is normal or malware. Many anti-virus and
anti-malware products sometimes cannot detect new malware variants for several reasons. There are two
types of analysis, namely static analysis and dynamic analysis, and both support the classification of
Android malware. With the advent of artificial intelligence, malware research can take advantage of this
technology, for example through malware classification using machine learning (ML). Making ML-based
malware classification accurate is an ongoing effort.
Malware on smartphones today can be very annoying and disturbing for users [12]. It
prevents the smartphone from running normally. This problem is often addressed by installing anti-malware or
antivirus software for smartphones. But another problem arises, namely the use of anti-malware from
untrusted sources, where the anti-malware itself has been infiltrated with malware. In addition, anti-malware
does not always work reliably, because new variants keep appearing and many anti-malware products do not
detect them. Anti-malware therefore remains only a partial solution. Anti-malware vendors continue to
innovate to detect malware; even so, they still cannot detect the latest variants that keep appearing.
Previous research has produced many works on malware analysis using static, dynamic, and hybrid
methods, the latter combining static and dynamic analysis. One of them is forensic analysis of mobile
devices using scoring (FAMOUS) of application permissions [13], which proposes a predictive approach to
forensics for detecting suspicious Android APKs. The study aims to detect malware and benign Android APKs
by weighting the prediction-based feature set using ML. Various experiments were carried out on the
properties of Android APKs, reaching an accuracy of 99%. FAMOUS uses only the permissions feature,
extracted from the AndroidManifest.xml file of each Android APK, for classification and analysis.
The next work is a longitudinal performance analysis (LPA) of ML-based Android malware detectors [14].
The aim of the study was to examine the performance degradation over time of various ML classifiers
trained with static features extracted from a collection of applications and date-labeled malware.
It is a static analysis with quantitative methods, namely collecting a malware dataset, extracting
application features, and recording the ML classification results for performance evaluation. The
investigation is repeated by training with different time periods and samples from the latest datasets.
In summary, the method chosen is
the static method with datasets for 2013, 2014 and 2015-2016. Also, the ML algorithms used are support vector
machine (SVM), J48 decision tree (DT), naïve Bayes (NB), simple logistic (SL), and random forest (RF).
The next work is decompiled-APK-based malicious code classification [15]. The purpose of this study is
to adapt a natural language processing technique to decompiled APK source code for the classification of
malicious source code. Using static analysis with quantitative methods, it proceeds as follows with the
Rocky framework: decompiling APK files into source code, preprocessing the source code, generalizing
N-tokens, building feature representations, and classification. The baseline algorithms are
permission-based, API-call-based, and neural network (NN) based. The Android malware dataset (AMD) from
Argus Lab contains 24,553 sample APKs, grouped into 135 types and 71 malware families, sampled from 2010
to 2017. The evaluation metrics are the confusion matrix, true positive ratio (TPR), F1, accuracy, receiver
operating characteristic (ROC), and area under the ROC curve (AUC) [16], [17]. Classification is done with
NB, RF, and logistic regression, with 10 validation tests. The study reached 97% accuracy.
Next is APK Auditor, a permission-based Android malware detection system [18]. APK Auditor
assesses malware using the permissions feature with three main components: an Android client, a
signature database, and a central server communicating with both. It presents a permission-based Android
malware detection system [19]–[22] that uses static analysis to classify benign and malicious Android
applications. The experiment used 8,762 APKs, consisting of 1,853 benign APKs and 6,909 malware APKs.
The resulting model reached an accuracy of 88%.
As the studies reviewed above show, there is not yet a standardized way to address
the key problem of malware identification. The research questions focus on the current state of the art
in data engineering for dataset creation and in the labeling approach to support ML algorithms for Android
malware study. The potential of ML methods to detect, analyze, and predict new variants of currently
widespread malware looks promising. Therefore, dataset creation and detection algorithms are
the two key enablers for solving the problem. Both are interesting and logical to pose as research questions
(RQ) as follows: i) How does feature extraction of APK files produce the best malware dataset? (RQ1) and ii)
How does feature selection from the dataset help optimize the resulting detection model? (RQ2)
The scope of this research is as follows. The method used to analyze malware is static
analysis. Publicly available data are collected for the dataset, which serves as material for building the
models to be tested. Data preparation, analysis of the dataset, which attributes are used, and which
attributes are most influential for malware are considered very important for modeling and for the
performance of ML algorithms. Dataset creation is a data engineering process that relies largely on the
feature extraction and feature selection methods required before data-consuming ML algorithms can start.
2. THE PROPOSED METHOD (EXTRACTING APK ANDROID FILES)
Figure 1 gives the big picture of the system that extracts Android APK files into a malware dataset,
after which the features contained in the APK files are analyzed. The resulting dataset is
the result of reverse engineering with the Jadx module [23], [24], a reverse engineering tool.
Reverse engineering is the process of converting APK files into source code form. This source code is
then analyzed further to determine whether the file is malware or benign.
Figure 1. Steps in extracting APK Android files
3. METHOD
3.1. Download APK File
The proposed research framework is as follows. The dataset is collected in the form of application package
kit (APK) files comprising malware APKs and benign APKs. The sources are the University of New Brunswick [25],
VirusShare [26], and VirusTotal [27]. After the malware and benign APK files are collected, the data
extraction process is carried out. APK files are converted and decompiled to obtain the permission features
and intent features, which are then processed into a dataset. This process is called feature extraction.
Feature selection is then performed to reduce the number of features, or dimensions, of the malware dataset.
The permission and intent features are used for modeling by dividing the data into training data and test data.
The APK files used for training come from the Google Play Store and the Canadian Institute for
Cybersecurity (UNB) and consist of five classes: benign, ransomware, riskware, banking, and short message
service (SMS) APKs. The downloaded files are stored in separate folders according to their class. To confirm
whether each file is malware or not, it is checked on the virustotal.com website [27], which can detect the
type of malware because it is supported by security companies such as Avast, Norton, and others. For this
experiment, about 14,170 APK files totaling 60 GB were downloaded from the sources above. Each Android APK
file is placed in a folder according to its type: benign files go in the benign folder, banking files in
the banking folder, and so on. Figure 2 shows the overall process in pseudocode. The collected Android APK
files are later extracted by malware type or family. From the output of reverse engineering, the file that
is processed is AndroidManifest.xml. The feature set that is processed comprises the permission features
and the intent features, both of which form the basis of the Android malware classification. It is hoped
that the extraction process will produce the best malware dataset.
Figure 2. Pseudocode for creating folders and storing APK malware
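The folder-based storage described above can be sketched in Python as follows. This is a minimal illustration, not the authors' script: the class names follow the five classes named in the text, while the function names (`make_class_folders`, `store_apk`, `label_of`) and the folder layout are assumptions made for this example.

```python
from pathlib import Path
import shutil

# The five classes named in the text; the lowercase folder names are an assumption.
CLASSES = ("benign", "ransomware", "riskware", "banking", "sms")


def make_class_folders(root: Path) -> dict[str, Path]:
    """Create one sub-folder per class under root and return them by class name."""
    folders = {}
    for name in CLASSES:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        folders[name] = folder
    return folders


def store_apk(apk: Path, label: str, root: Path) -> Path:
    """Copy a downloaded APK into the folder of its class and return the new path."""
    if label not in CLASSES:
        raise ValueError(f"unknown class: {label}")
    target = root / label / apk.name
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(apk, target)
    return target


def label_of(apk: Path) -> str:
    """The label of a stored APK is simply the name of its parent folder."""
    return apk.parent.name
```

Deriving the label from the folder name keeps the labeling step trivial later on, provided that every file has been verified before it is stored.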
After the malware APKs and benign APKs are downloaded, the next step is to check them before carrying
out the extraction process. The purpose is to verify that each file labeled as malware really is malware
rather than a benign APK, and likewise that each file labeled as benign really is benign. This also allows
each file to be categorized so that its class is not determined wrongly for classification. The result is
a dataset in which malware APKs and benign APKs are not mixed.
3.2. Checking APK file
Virustotal.com [27] is a reference for checking files that may contain malware. Not only APK files but
any file can be checked on the website. In everyday usage, any harmful file is called a virus, but a
virus is only one kind of malicious software, or malware. There are many families of malware [28], [29],
including viruses, trojans, ransomware, banking malware, SMS malware, and others. Some infect computers
and others infect smartphones. Malware is made with different goals, depending on its maker. Some
malware does not damage the operating system but steals data [30] or spies on smartphone users [31].
Some smartphone users do not realize that they are being spied on by malware that sends their
data to the malware maker’s servers [32]. This is very detrimental for smartphone users. Therefore, a
downloaded Android APK file should not be installed directly but should first be checked on
virustotal.com or another APK file checking website.
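File-checking services generally identify a file by a cryptographic digest of its contents, so a first step before any lookup is to hash the APK locally. The sketch below shows only that step, using Python's standard `hashlib`; the function name `file_digest` is hypothetical, and how the digest is then submitted to a checking service is deliberately left out because the paper does not describe it.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

For example, `file_digest(Path("sample.apk"), "md5")` yields a 32-character hexadecimal string, which can also serve as a collision-resistant file name when APKs from several sources are pooled.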
3.3. Extraction APK file
This stage reads the APK files in the folder and reverse engineers them to obtain the
AndroidManifest.xml file. For the permission features, the reverse engineering process extracts the APK
file and saves the output to the unpacked-permissions folder. The AndroidManifest.xml file is then parsed
with an XML parser to read the values of “uses-permission”, which are saved in the UpdatePermList.txt file.
For the intent features, the reverse engineering process extracts the APK file and saves the output to the
unpacked-intent folder. The AndroidManifest.xml file is then parsed with an XML parser to read the intent
feature values from its contents. The entries found under “application/activity/intent-filter/action”,
“application/receiver/intent-filter/action”, and “application/activity/intent-filter/category” are
assigned a value of 1 if present and 0 if not found. The results are then saved in the UpdateIntentList.txt
file. An APK file is the installable package format of the Android operating system. To inspect the
structure of an APK file, one can use the Jadx module or the Jadx graphical user interface (GUI).
Permission feature [33]: the APK file [34] is reverse engineered to produce the AndroidManifest.xml
file. The XML parser then checks the ‘uses-permission’ entries; if a permission is present, it is given
a value of 1, and if it is absent, a value of 0. Intent feature: the APK file is reverse engineered
to produce the AndroidManifest.xml file, and the parser checks “activity/intent-filter/action”,
“receiver/intent-filter/action”, and “activity/intent-filter/category”. If the checked entry is present,
it is given a value of 1, and if it is absent, a value of 0. The intent is one of the most basic features
in Android and is used to process data from other components [35]. Reverse engineering of Android APK
files [36] uses the Jadx module. The Jadx (APK-tools) module extracts Android APK files by creating
folders and unpacking the APK into source code, resources, and assets. The resources folder contains the
AndroidManifest.xml file. Figure 3 shows the pseudocode for performing the APK feature extraction.
Procedure Permission_Extraction(folder)
Begin
  do
    Read file APK(folder)
    Reverse file APK with Jadx module
    Save file AndroidManifest.xml in folder UnpackedPermission
    Read file AndroidManifest.xml in folder UnpackedPermission
    ParseXML in permission root.findall("uses-permission")
    if feature_permission=1
      set value_permission=1
    else
      set value_permission=0
    endif
  while not eof()
  save update_permission.txt
  save datapermission.csv
End
Procedure Intent_Extraction(folder)
Begin
  do
    Read file APK(folder)
    Reverse file APK with Jadx module
    Save file AndroidManifest.xml in folder UnpackedIntent
    Read file AndroidManifest.xml in folder UnpackedIntent
    ParseXML in intent root.findall("intent-filter/action")
    if intent-filter-action=1
      set value-filter-action=1
    else
      set value-filter-action=0
    endif
    if intent-filter-receiver=1
      set value-filter-receiver=1
    else
      set value-filter-receiver=0
    endif
    if intent-filter-category=1
      set value-filter-category=1
    else
      set value-filter-category=0
    endif
  while not eof()
  save update_intent.txt
  save dataintent.csv
End
Procedure merge_dataset
Begin
  merged_dataset=[]
  for row in dataset_permission:
    merged_dataset.append(row)
  for row in dataset_intent:
    merged_dataset.append(row)
  return merged_dataset
End
Figure 3. Pseudocode for performing the APK feature extraction
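The parsing step of the pseudocode can be illustrated with Python's standard `xml.etree.ElementTree`. The sketch assumes that the manifest has already been decoded to plain-text XML by the reverse engineering step; the function name `extract_features` and the sample manifest are hypothetical and serve only as an example. The three intent paths are those listed in the text.

```python
import xml.etree.ElementTree as ET

# Attributes such as android:name live in the Android XML namespace.
ANDROID_NS = "http://schemas.android.com/apk/res/android"
NAME = f"{{{ANDROID_NS}}}name"

# Paths relative to the <manifest> root, as listed in the text.
INTENT_PATHS = (
    "application/activity/intent-filter/action",
    "application/receiver/intent-filter/action",
    "application/activity/intent-filter/category",
)


def extract_features(manifest_xml: str) -> dict[str, set[str]]:
    """Collect declared permission names and intent-filter names from a manifest."""
    root = ET.fromstring(manifest_xml)
    permissions = {e.get(NAME) for e in root.findall("uses-permission")}
    intents = {e.get(NAME) for path in INTENT_PATHS for e in root.findall(path)}
    # Elements without an android:name attribute yield None; drop it.
    permissions.discard(None)
    intents.discard(None)
    return {"permissions": permissions, "intents": intents}


# Hypothetical manifest used only to exercise the function.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
          package="com.example.demo">
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <uses-permission android:name="android.permission.READ_SMS"/>
  <application>
    <activity android:name=".Main">
      <intent-filter>
        <action android:name="android.intent.action.MAIN"/>
        <category android:name="android.intent.category.LAUNCHER"/>
      </intent-filter>
    </activity>
    <receiver android:name=".Boot">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>"""

features = extract_features(SAMPLE_MANIFEST)
```

Returning sets of names rather than 0/1 flags keeps the parser independent of the final feature vocabulary, which is only known once all APKs have been processed.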
4. RESULTS AND DISCUSSION
The Android APK extraction experiment was run on a 2020 Macintosh notebook with 8 GB RAM and a
256 GB hard drive, using the Python programming language with the NumPy, pandas, and xml libraries.
NumPy and pandas are the main library packages for computational mathematics and data science. Extracting
the Android APK files took 7 days of non-stop processing, or about 43 s per APK on average. The process
covered 14,170 Android APKs, both malware and benign, and 1,179 features. It generated two datasets,
namely datapermission.csv and dataintent.csv. The two dataset files were merged, resulting in
datamalware.csv.
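One way to merge the two feature tables is to join them on the APK file name, so that each APK ends up with one row holding both its permission and its intent columns. The sketch below uses only the standard `csv` module to stay dependency-free; it is an illustration under stated assumptions, not the authors' pandas code. The key column `name`, the function name `merge_feature_tables`, and the sample rows are all hypothetical.

```python
import csv
import io


def merge_feature_tables(permission_rows, intent_rows, key="name"):
    """Join two lists of dict rows on key; features missing for an APK become 0."""
    merged = {}
    columns = [key]
    for rows in (permission_rows, intent_rows):
        for row in rows:
            target = merged.setdefault(row[key], {key: row[key]})
            for column, value in row.items():
                if column == key:
                    continue
                target[column] = int(value)
                if column not in columns:
                    columns.append(column)
    return columns, [{c: r.get(c, 0) for c in columns} for r in merged.values()]


# Hypothetical stand-ins for datapermission.csv and dataintent.csv.
PERMISSION_CSV = "name,SEND_SMS,READ_SMS\na.apk,1,0\nb.apk,0,1\n"
INTENT_CSV = "name,BOOT_COMPLETED\na.apk,1\nc.apk,1\n"

permission_rows = list(csv.DictReader(io.StringIO(PERMISSION_CSV)))
intent_rows = list(csv.DictReader(io.StringIO(INTENT_CSV)))
columns, rows = merge_feature_tables(permission_rows, intent_rows)
```

Filling absent features with 0 is a design choice that matches the binary encoding used in the dataset: an APK that does not appear in one table is treated as declaring none of that table's features.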
4.1. Extract APK
Table 1 shows the result of extracting the Android APK files, following the process
described above: reverse engineering with the JADX module [37] and parsing each
Android APK file. The permission features that appear most often are SEND_SMS (Developer),
ACCESS_COARSE_LOCATION (Developer), SYSTEM_ALERT_WINDOW, READ_PHONE_STATE,
RECEIVE_SMS, RECEIVE_BOOT_COMPLETED, GET_TASKS, READ_SMS, ACCESS_WIFI_STATE, and
WRITE_EXTERNAL_STORAGE. The following is a description of the features.
4.2. Dataset
The result of this research is a dataset that can contribute to the detection of malware and non-malware,
so that other researchers can directly use the dataset extracted from various original malware samples
on the internet. The extraction process was carried out according to the algorithm described in Figure 3.
Table 1 shows a sample of the resulting malware dataset.
Table 1. The result of the dataset creation process
Name File                               Android permission access       Android intent access
                                        Downloads   Bluetooth_Share     Package_Removed   Package_Replaced
ffa01b3ce624d6efc8028b3c2dfa17a4.apk    0           0                   0                 1
fe73930d1a24fb7d81693471c3677f8f.apk    1           0                   0                 1
f9e6378ebfbd69e77c451e32cf2af90c.apk    0           0                   0                 1
fa8ac1e84089e249e2e4a52cc588b810.apk    0           1                   0                 1
f83a2cc8303ea8af2c8f55c059564485.apk    0           1                   0                 0
f7296fa9243869375577e7770ca148f9.apk    1           1                   0                 0
f7013204327c182fb0bc2b9a45adc734.apk    1           1                   0                 0
f6c4919d0f465cc4e0d346c285ccd297.apk    1           1                   0                 1
f6515bfa39b1754867b6fd66c9cdc864.apk    0           0                   1                 1
fd6d6f5467370d6d3921192d8e7f6466.apk    0           0                   1                 0
fd4a86dfa65eb92eb8780e9cebdb822e.apk    0           0                   1                 0
f9c00e18740daf9a5dc3bd7dff25f14e.apk    0           0                   1                 0
f5e3158eba53be270164494bb4c830a2.apk    0           0                   1                 1
fe16e96706eb4a2b9132589a4f9fe582.apk    0           1                   0                 0
The final result of the extraction process is obtained from the Android file (.apk) by reverse
engineering, extracting the Android manifest file, and selecting the permission keywords and intent
keywords in the manifest. The result becomes a dataset file for Android malware classification
or detection. Table 1 shows the features in the resulting dataset. The leftmost column, NAME,
is the name of the Android APK file. The android_permission_ACCESS_All column is a request to
access all functions on a smartphone device. If an Android APK application executes certain instructions
and requests access to all functions on a smartphone device, this is an irregularity. A feature (column)
with a value of 1 indicates that the application requests the corresponding access right on the
smartphone device; a value of 0 indicates that it does not.
4.3. Discussion
APK files are the package files used to share and install apps on Android devices. To extract the source
code from an APK file, a tool called Jadx can be used. Jadx is a reverse engineering tool that decompiles the
DEX bytecode in an APK file into Java source code and provides a graphical interface (Jadx GUI) for browsing
the result. To use Jadx, download the tool from the official website and open the APK file with the Jadx GUI.
Once the file is open, the package hierarchy can be navigated to view the source code of each package, class,
and method. The decompiled source code can also be exported. Using Jadx reverse engineering to create a
dataset involves decompiling multiple APK files and then organizing the resulting source
code into a structured dataset. This dataset can then be used for various purposes such as code analysis, malware
7. ISSN: 2088-8708
Int J Elec & Comp Eng, Vol. 13, No. 6, December 2023: 6568-6577
6574
detection, and more. One way to create a dataset using Jadx is to first gather the set of APK files to be
decompiled. These files can be obtained from sources such as the Google Play Store or GitHub. Jadx is then
used to decompile each file and extract the source code. Next, the extracted source code is organized into a
structured dataset, for example by creating a new directory for each APK file and placing the decompiled
source code in the corresponding directory. Alternatively, the data can be organized in a CSV file, with each
row representing an APK file and its corresponding source code.
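The batch decompilation described above can be sketched in Python as follows. This is a minimal sketch, not the authors' script: it assumes the jadx command-line tool is installed and on the PATH, it relies on jadx's -d option for the output directory, and the function names and folder names are hypothetical.

```python
import subprocess
from pathlib import Path


def jadx_command(apk_path: Path, output_root: Path) -> list[str]:
    """Build the jadx CLI call that decompiles one APK into its own folder."""
    # One output directory per APK, named after the APK file stem.
    out_dir = output_root / apk_path.stem
    return ["jadx", "-d", str(out_dir), str(apk_path)]


def decompile_all(apk_dir: Path, output_root: Path) -> list[Path]:
    """Decompile every .apk in apk_dir and return the output folders that exist."""
    finished = []
    for apk in sorted(apk_dir.glob("*.apk")):
        # check=False so that a failure on one sample does not abort the batch.
        subprocess.run(jadx_command(apk, output_root), check=False)
        out_dir = output_root / apk.stem
        if out_dir.is_dir():
            finished.append(out_dir)
    return finished


# Example (hypothetical paths): show the command for a single sample.
print(jadx_command(Path("apks/sample.apk"), Path("decompiled")))
```

Keeping one directory per APK means the decoded AndroidManifest file of each sample can later be located from the sample name alone.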
The methodological framework of this research is reverse engineering followed by extraction of
the AndroidManifest file. Reverse engineering is performed with the JADX module: the Android file
(for example, file-android.apk) is decompiled into source-code files collected in a folder. This
collection of source-code files contains the AndroidManifest file, from which the permission keywords and
intent keywords are selected. If a permission keyword is enabled (declared in the manifest), it is written
into the dataset with a value of 1; if the permission keyword is disabled, the corresponding permission
feature in the dataset is 0. The algorithm is explained in the APK file extraction section, together with the
Feature_Extraction pseudocode.
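The keyword selection and 0/1 encoding can be illustrated with a short Python sketch. It is not the Feature_Extraction pseudocode itself: it assumes the manifest has already been decoded to text XML by the reverse engineering step, and the FEATURES list is a hypothetical four-keyword vocabulary chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Key of the android:name attribute after ElementTree expands the namespace prefix.
ANDROID_NAME = "{http://schemas.android.com/apk/res/android}name"

# Hypothetical keyword list; the paper's full vocabulary is not reproduced here.
FEATURES = [
    "android.permission.INTERNET",
    "android.permission.BLUETOOTH",
    "android.intent.action.PACKAGE_REMOVED",
    "android.intent.action.PACKAGE_REPLACED",
]


def manifest_features(manifest_xml: str, features=FEATURES) -> dict[str, int]:
    """Return 1 for each keyword declared in the manifest and 0 otherwise."""
    root = ET.fromstring(manifest_xml)
    declared = set()
    # Permissions are declared in <uses-permission android:name="...">.
    for elem in root.iter("uses-permission"):
        declared.add(elem.get(ANDROID_NAME))
    # Intents are declared in <action android:name="..."> inside intent filters.
    for elem in root.iter("action"):
        declared.add(elem.get(ANDROID_NAME))
    return {name: int(name in declared) for name in features}


SAMPLE_MANIFEST = """
<manifest xmlns:android="http://schemas.android.com/apk/res/android" package="com.example.app">
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <receiver android:name=".Receiver">
      <intent-filter>
        <action android:name="android.intent.action.PACKAGE_REMOVED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>
"""

print(manifest_features(SAMPLE_MANIFEST))
```

A fixed keyword list gives every APK a feature vector of the same length and column order, which is what a tabular dataset needs.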
The system is designed as follows. So far, the main difficulty in detecting malware APKs with a static
method has been obtaining a malware dataset. The first step is to obtain the original Android malware
samples (APK files), which can be downloaded from the University of New Brunswick [25], VirusShare [26],
and VirusTotal [27]. The second step is to reverse engineer the samples using the JADX module. The third
step is to read the Android manifest file in the reverse engineering output and select the permission and
intent keywords. The fourth step saves the results of the third step (the selected permission keywords and
intent keywords) into the dataset.csv file. Carrying out these four steps yields the dataset.
The design is illustrated in Figure 1, which shows the steps in extracting Android APK files.
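The fourth step, saving the selected keywords into a CSV dataset, might look like the following Python sketch. The write_dataset helper is hypothetical. The two example rows reuse values from Table 1, assuming its columns are read as Downloads, Bluetooth_Share, Package_Removed, and Package_Replaced.

```python
import csv
import io


def write_dataset(rows, feature_names, out_file):
    """Write one CSV row per APK; rows is an iterable of (apk_name, {feature: 0 or 1})."""
    writer = csv.writer(out_file)
    writer.writerow(["NAME", *feature_names])
    for apk_name, feats in rows:
        # Keywords that were not found in the manifest default to 0.
        writer.writerow([apk_name, *(feats.get(name, 0) for name in feature_names)])


FEATURE_NAMES = ["Downloads", "Bluetooth_Share", "Package_Removed", "Package_Replaced"]
ROWS = [
    ("ffa01b3ce624d6efc8028b3c2dfa17a4.apk", {"Package_Replaced": 1}),
    ("fe73930d1a24fb7d81693471c3677f8f.apk", {"Downloads": 1, "Package_Replaced": 1}),
]

# An in-memory buffer stands in for dataset.csv so the example has no side effects.
buffer = io.StringIO()
write_dataset(ROWS, FEATURE_NAMES, buffer)
print(buffer.getvalue())
```

To produce the actual file, the buffer can be replaced by a file opened with open("dataset.csv", "w", newline="").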
The dataset was evaluated in a simulation using grid search cross-validation (GridSearchCV) with 5 folds
and the multi-layer perceptron (MLP) classifier. The GridSearchCV code is given in Figure 4 and its results
are shown in Figure 5. Figure 5 shows the simulation results of the
malware.csv dataset using an artificial neural network (ANN) MLP. The dataset resulting from the reverse
engineering process and the extraction of the AndroidManifest file produces a model whose accuracy
reaches 100% on individual folds, with a mean 5-fold accuracy of 0.9989. The dataset comprises 14,170
malware samples. Simulations were also carried out with the ML algorithms decision tree (DT), support
vector machine (SVM), and k-nearest neighbors (KNN), so that the results of the ML algorithms can be
compared with those of neural networks (NNs) such as the ANN.
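from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
# Hyperparameter grid searched with 5-fold cross-validation
mlpc_params = {"alpha": [0.1, 0.01, 0.0001], "hidden_layer_sizes": [(10, 10, 10),
(100, 100, 100), (100, 100)], "solver": ["lbfgs", "adam", "sgd"], "activation": ["relu", "logistic"]}
mlpc = MLPClassifier()  # assumed definition; the original listing does not show how mlpc is created
mlpc_cv_model = GridSearchCV(mlpc, mlpc_params, cv=5, n_jobs=-1, verbose=2)
mlpc_cv_model.fit(X_train, y_train)
# Keep the best estimator found by the search and refit it on the training data
mlpc_tuned = mlpc_cv_model.best_estimator_
mlpc_tuned.fit(X_train, y_train)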
The K-fold weighted F1-score (f1_weighted) is computed as follows:
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
kf = KFold(shuffle=True, n_splits=5)  # 5-fold CV with shuffling
# argmax over axis 1 converts the one-hot rows of y_test to class indices
cv_results_kfold = cross_val_score(mlpc_tuned, X_test, np.argmax(y_test, axis=1), cv=kf, scoring="f1_weighted")
print("K-fold Cross Validation f1_weigted Results: ", cv_results_kfold)
print("K-fold Cross Validation f1_weigted Results Mean: ", cv_results_kfold.mean())
The result is as follows:
K-fold Cross Validation f1_weigted Results: [0.99823636 1. 1. 1. 1.]
K-fold Cross Validation f1_weigted Results Mean: 0.9996472717814008
The K-fold accuracy is computed as follows:
import numpy as np
from sklearn.model_selection import KFold, cross_val_score
kf = KFold(shuffle=True, n_splits=5)  # 5-fold CV with shuffling
# argmax over axis 1 converts the one-hot rows of y_test to class indices
cv_results_kfold = cross_val_score(mlpc_tuned, X_test, np.argmax(y_test, axis=1), cv=kf, scoring="accuracy")
print("K-fold Cross Validation accuracy Results: ", cv_results_kfold)
print("K-fold Cross Validation accuracy Results Mean: ", cv_results_kfold.mean())
The result is as follows:
K-fold Cross Validation accuracy Results: [1. 1. 0.99823633 1. 0.99646643]
K-fold Cross Validation accuracy Results Mean: 0.9989405525330142
Figure 4. Source code GridSearchCV and K-fold accuracy
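For readers unfamiliar with the f1_weighted score used in Figure 4, the following plain-Python sketch shows how a support-weighted F1-score is computed: the F1-score of each class is averaged with a weight equal to the share of true samples in that class. The function name and the toy labels are hypothetical. The sketch follows the usual definition of the weighted average and is not the scikit-learn implementation.

```python
from collections import Counter


def f1_weighted(y_true, y_pred):
    """Support-weighted mean of the per-class F1-scores."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for cls, n_cls in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_cls - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        # Each class contributes in proportion to its share of the true labels.
        score += (n_cls / total) * f1
    return score


# Toy labels: one sample of class 0 is misclassified as class 1.
y_true = [0, 0, 0, 1, 1]
y_pred = [0, 0, 1, 1, 1]
print(round(f1_weighted(y_true, y_pred), 4))  # 0.8
```

In the toy example both classes have an F1-score of 0.8, so the weighted mean is also 0.8. With imbalanced classes, the larger class dominates the result.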
8. Int J Elec & Comp Eng ISSN: 2088-8708
Android-manifest extraction and labeling method for malware compilation … (Djarot Hindarto)
6575
Figure 5. Performance simulation using ANN MLP
Table 2 shows a comparison of simulations using several algorithms: DT, SVM, KNN+PCA, and
ANN+GridSearchCV+MLP. The DT algorithm produces a very good precision, recall, and F1-score
of 100% on a relatively small dataset of 600 malware samples, but these metrics fall to about 0.89 to 0.92
on the 7,000-sample and 14,170-sample datasets. SVM shows the same pattern, and KNN+PCA remains
between 0.84 and 0.88 at every dataset size. DT, SVM, and KNN are ML algorithms. In this simulation, the
performance of DT and SVM is lower on the larger datasets than on the 600-sample dataset. The ANN behaves
differently: as the dataset grows, its precision, recall, and F1-score increase from 0.99 to 1.00. The
contribution of this study is an alternative dataset that can be used for further research. Future work is to
apply the dataset extracted from the APK files to ML and deep learning (DL) algorithms. An ANN with
GridSearchCV and MLP is used because the dataset is large. ML methods such as SVM and DT do not reach
the maximum precision, recall, and F1-score on the larger datasets, whereas the ANN with GridSearchCV
and MLP reaches 100% performance.
Table 2. Comparison of DT, SVM, KNN+PCA, ANN+GridSearchCV+MLP
Method                 Dataset size   Precision   Recall   F1-score   Support
Decision tree          600            * 1.00      1.00     1.00       58
Decision tree          7000           0.89        0.89     0.89       1401
Decision tree          14170          0.92        0.91     0.91       2834
SVM                    600            0.99        0.99     0.99       58
SVM                    7000           0.90        0.89     0.89       1401
SVM                    14170          0.91        0.91     0.90       2834
KNN+PCA                600            0.85        0.84     0.84       1401
KNN+PCA                7000           0.85        0.85     0.85       2834
KNN+PCA                14170          0.88        0.88     0.88       2834
ANN+GridSearchCV+MLP   600            0.99        0.99     0.99       116
ANN+GridSearchCV+MLP   7000           * 1.00      1.00     1.00       1401
ANN+GridSearchCV+MLP   14170          * 1.00      1.00     1.00       2834
5. CONCLUSION
The authors purposefully focus on the process of building quality malware datasets, which is seen as the
most demanding part of the approach and implementation, and not on machine learning itself, because
implementing machine learning requires a separate effort that is only possible after a reliable dataset is fully
built. The overall steps in creating the malware dataset have been described systematically, starting with the collection,
reverse engineering, followed by extraction of the Android Manifest from the APK file set, and ending with
the labeling method for all the extracted APK files. The core contribution of this paper is a systematic way
to generate datasets from any APK file. The results of this study are useful for
researchers working in various fields of ML. The constructed dataset can be used directly for various
purposes, especially supervised classification and malware identification.
REFERENCES
[1] “Avoid fraud with wedding invitation mode APK,” Republika.co.id, 2023.
https://kampus.republika.co.id/posts/200272/tips-menghindari-penipuan-dengan-apk-modus-undangan-pernikahan (accessed Feb. 03, 2023).
[2] “Alert! The latest online fraud mode through .APK files,” Departemen Komunikasi Bank Indonesia, 2023.
https://www.bi.go.id/id/publikasi/ruang-media/cerita-bi/Pages/Waspada!-Modus-Penipuan-Online-Terbaru-lewat-File-APK.aspx
(accessed Feb. 01, 2023).
[3] “Many fraud using APK files, understand how it works and tips to avoid it,” Kompas.com, 2023.
https://money.kompas.com/read/2023/02/05/053000226/ramai-penipuan-bermodus-file-apk-pahami-cara-kerja-dan-tips-menghindarinya?page=all (accessed Feb. 05, 2023).
[4] A. Djajadi and N. Sutisna, “Penetration testing: Dumping data from web application using SQL injection attack (case study:
eArsip),” Internetworking Indonesia Journal, vol. 13, no. 1, pp. 3–9, 2021.
[5] M. Chen and D. Kusuma Halim, “Federated learning for scam classification in small Indonesian language dataset: an initial study,”
Indonesian Journal of Electrical Engineering and Computer Science, vol. 30, no. 1, pp. 325–331, Apr. 2023, doi:
10.11591/ijeecs.v30.i1.pp325-331.
[6] A. A. Permana and L. A. Pratiwi, “Implementation of the advanced encryption standard (AES) algorithm for digital image security,”
Jurnal Teknik Informatika, vol. 15, no. 1, pp. 44–51, Jun. 2022, doi: 10.15408/jti.v15i1.25735.
[7] F. Di Cerbo, A. Girardello, F. Michahelles, and S. Voronkova, “Detection of malicious applications on Android OS,” in IWCF
2010: Computational Forensics, 2011, pp. 138–149.
[8] M. A. Omer et al., “Efficiency of Malware detection in Android system: A survey,” Asian Journal of Research in Computer Science,
pp. 59–69, Apr. 2021, doi: 10.9734/ajrcos/2021/v7i430189.
[9] X. Ge, Y. Pan, Y. Fan, and C. Fang, “AMDroid: Android Malware detection using function call graphs,” in 2019 IEEE 19th
International Conference on Software Quality, Reliability and Security Companion (QRS-C), Jul. 2019, pp. 71–77, doi:
10.1109/QRS-C.2019.00027.
[10] Z. R. Mohsin, A. M. Dayish, and B. A. Hamdan, “Android projects on Android attack application,” Journal of Xidian University,
vol. 14, no. 5, pp. 16–21, May 2020, doi: 10.37896/jxu14.5/233.
[11] W. Stallings, Cryptography and network security: Principles and practice, Seventh Ed. Harlow: Pearson Education Limited, 2017.
[12] S. Peng, S. Yu, and A. Yang, “Smartphone Malware and its propagation modeling: A survey,” IEEE Communications Surveys &
Tutorials, vol. 16, no. 2, pp. 925–941, 2014, doi: 10.1109/SURV.2013.070813.00214.
[13] A. Kumar, K. S. Kuppusamy, and G. Aghila, “FAMOUS: Forensic analysis of mobile devices using scoring of application
permissions,” Future Generation Computer Systems, vol. 83, pp. 158–172, Jun. 2018, doi: 10.1016/j.future.2018.02.001.
[14] S. Y. Yerima and S. Khan, “Longitudinal performance analysis of machine learning based Android malware detectors,” in 2019
International Conference on Cyber Security and Protection of Digital Services (Cyber Security), Jun. 2019, pp. 1–8, doi:
10.1109/CyberSecPODS.2019.8885384.
[15] R. Mateless, D. Rejabek, O. Margalit, and R. Moskovitch, “Decompiled APK based malicious code classification,” Future
Generation Computer Systems, vol. 110, pp. 135–147, Sep. 2020, doi: 10.1016/j.future.2020.03.052.
[16] K. Gajowniczek and T. Ząbkowski, “ImbTreeAUC: An R package for building classification trees using the area under the ROC
curve (AUC) on imbalanced datasets,” SoftwareX, vol. 15, Jul. 2021, doi: 10.1016/j.softx.2021.100755.
[17] A. R. Rachakonda and A. Bhatnagar, “A: Extending area under the ROC curve for probabilistic labels,” Pattern Recognition Letters,
vol. 150, pp. 265–271, Oct. 2021, doi: 10.1016/j.patrec.2021.06.023.
[18] K. A. Talha, D. I. Alper, and C. Aydin, “APK Auditor: Permission-based Android malware detection system,” Digital Investigation,
vol. 13, pp. 1–14, Jun. 2015, doi: 10.1016/j.diin.2015.01.001.
[19] S. K. Smmarwar, G. P. Gupta, S. Kumar, and P. Kumar, “An optimized and efficient android malware detection framework for
future sustainable computing,” Sustainable Energy Technologies and Assessments, vol. 54, Dec. 2022, doi:
10.1016/j.seta.2022.102852.
[20] A. Mathur, L. M. Podila, K. Kulkarni, Q. Niyaz, and A. Y. Javaid, “NATICUSdroid: A Malware detection framework for Android
using native and custom permissions,” Journal of Information Security and Applications, vol. 58, May 2021, doi:
10.1016/j.jisa.2020.102696.
[21] A. S. Shatnawi, Q. Yassen, and A. Yateem, “An Android Malware detection approach based on static feature analysis using machine
learning algorithms,” Procedia Computer Science, vol. 201, pp. 653–658, 2022, doi: 10.1016/j.procs.2022.03.086.
[22] P. Bhat and K. Dutta, “A multi-tiered feature selection model for android malware detection based on feature discrimination and
information gain,” Journal of King Saud University - Computer and Information Sciences, vol. 34, no. 10, pp. 9464–9477, Nov.
2022, doi: 10.1016/j.jksuci.2021.11.004.
[23] H. Ali et al., “Security hardened and privacy preserved Android Malware detection using fuzzy hash of reverse engineered source
code,” Security and Communication Networks, vol. 2022, pp. 1–11, Sep. 2022, doi: 10.1155/2022/7972230.
[24] P. Agrawal and B. Trivedi, “Unstructured data collection from APK files for Malware detection,” International Journal of Computer
Applications, vol. 176, no. 28, pp. 42–45, Jun. 2020, doi: 10.5120/ijca2020920308.
[25] UNB, “Android adware and general Malware Dataset (CIC-AAGM2017),” University of New Brunswick (UNB), 2017.
https://www.unb.ca/cic/datasets/andmal2017.html (accessed: Jan. 12, 2023).
[26] J.-M. Roberts and Melissa, “Virusshare: Report for a sample recently added to the system,” VirusShare.com, Accessed: Jan. 20,
2023. [Online]. Available: https://virusshare.com/
[27] VirusTotal, “Virustotal: Dataset malware,” VirusTotal. Accessed Jan. 05, 2023. [Online]. Available:
https://www.virustotal.com/gui/home/upload
[28] S. Turker and A. B. Can, “AndMFC: Android Malware family classification framework,” in 2019 IEEE 30th International
Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC Workshops), Sep. 2019, pp. 1–6, doi:
10.1109/PIMRCW.2019.8880840.
[29] A. Walenstein and M. Venable, “Exploiting similarity between variants to defeat malware,” in Proc. BlackHat Briefings DC, 2007,
pp. 1–12.
[30] A. Das and H. U. Khan, “Security behaviors of smartphone users,” Information & Computer Security, vol. 24, no. 1, pp. 116–134,
Mar. 2016, doi: 10.1108/ICS-04-2015-0018.
[31] D. Perakovic, S. Husnjak, and V. Remenar, “Research of security threats in the use of modern terminal devices,” in Conference:
Annals of DAAAM for 2012 & Proceedings of the 23rd International DAAAM Symposium, 2012, pp. 545–548.
[32] A.-D. Schmidt, F. Peters, F. Lamour, C. Scheel, S. A. Çamtepe, and Ş. Albayrak, “Monitoring Smartphones for anomaly detection,”
Mobile Networks and Applications, vol. 14, no. 1, pp. 92–106, Feb. 2009, doi: 10.1007/s11036-008-0113-x.
[33] F. Tchakounte and P. Dayang, “System calls analysis of Malwares on Android,” Maejo International Journal of Science and
Technology, vol. 2, no. 9, pp. 669–674, 2013.
[34] B. Gruver, “Smali/Baksmali,” Github.com. Accessed: Jan. 12, 2023. [Online]. Available: https://github.com/JesusFreke/smali
[35] S. Wu, P. Wang, X. Li, and Y. Zhang, “Effective detection of android malware based on the usage of data flow APIs and machine
learning,” Information and Software Technology, vol. 75, pp. 17–25, Jul. 2016, doi: 10.1016/j.infsof.2016.03.004.
[36] T. Liu, “Software vulnerability mining techniques based on data fusion and reverse engineering,” Wireless Communications and
Mobile Computing, vol. 2022, pp. 1–6, Apr. 2022, doi: 10.1155/2022/4329034.
[37] M. T. Kyaw, Y. N. Soe, and N. S. M. Kham, “Security analysis of Android application by using reverse engineering,” in Proceedings
of 2019 the 9th International Workshop on Computer Science and Engineering, 2019, pp. 171–177, doi:
10.18178/wcse.2019.03.029.
BIOGRAPHIES OF AUTHORS
Djarot Hindarto received the B.Eng. degree in computer engineering from Sepuluh
Nopember Institute of Technology (ITS), Indonesia, in 1994 and the Master of Information
Technology degree from Pradita University in 2022. Currently, he is a lecturer at the Faculty of
Communication and Information Technology (FKTI), Universitas Nasional (UNAS) Jakarta,
Indonesia. His research interests include security, artificial intelligence, deep learning,
machine learning, internet of things and blockchain. He can be contacted at email:
djarot.hindarto@civitas.unas.ac.id.
Arko Djajadi received his Bachelor and Master degrees from the Delft University
of Technology, Netherlands, in 1992 and his Ph.D. from the University of Manchester, UK, in
1999, all in Electrical and Electronics Engineering. His research interests include
smart embedded systems, mechatronics, instrumentation, Big Data and IoT, EV and renewable
energy, in both academic and industrial settings. Currently, he is with the Faculty of Engineering
and Informatics at the Multimedia Nusantara University (UMN). He can be contacted at email:
arko@umn.ac.id.