This paper presents a study comparing automated Monkey testing with human testing of Android apps. The study collected event traces from running five Android apps under both Monkey testing and human testing. By analyzing the event traces, the study found that:
1) Monkey testing and human testing generated similar numbers of unique events on average, but Monkey created more system events while humans created more UI events.
2) There was no correlation between testing time and number of unique events uncovered for either Monkey testing or human testing.
3) Statistically, Monkey testing and human testing performed indistinguishably in terms of event coverage, but Monkey was more effective at triggering system events while humans were better at triggering UI events.
An Empirical Comparison between Monkey Testing and Human Testing (WIP Paper)
Mostafa Mohammed
Virginia Tech
Blacksburg, Virginia
profmdn@vt.edu
Haipeng Cai
Washington State University
Pullman, Washington
haipeng.cai@wsu.edu
Na Meng
Virginia Tech
Blacksburg, Virginia
nm8247@vt.edu
Abstract
Android app testing is challenging and time-consuming because fully testing all feasible execution paths is difficult. Nowadays apps are usually tested in two ways: human testing or automated testing. Prior work compared different automated tools. However, some fundamental questions are still unexplored, including (1) how automated testing behaves differently from human testing, and (2) whether automated testing can fully or partially substitute human testing.

This paper presents our study to explore the open questions. Monkey has been considered one of the best automated testing tools due to its usability, reliability, and competitive coverage metrics, so we applied Monkey to five Android apps and collected their dynamic event traces. Meanwhile, we recruited eight users to manually test the same apps and gathered the traces. By comparing the collected data, we revealed that (i) on average, the two methods generated similar numbers of unique events; (ii) Monkey created more system events while humans created more UI events; (iii) Monkey could mimic human behaviors when apps have UIs full of clickable widgets to trigger logically independent events; and (iv) Monkey was insufficient to test apps that require information comprehension and problem-solving skills. Our research sheds light on future research that combines human expertise with the agility of Monkey testing.
CCS Concepts: • General and reference → Empirical studies
Keywords: Empirical, Monkey testing, human testing
ACM Reference Format:
Mostafa Mohammed, Haipeng Cai, and Na Meng. 2019. An Empirical Comparison between Monkey Testing and Human Testing (WIP Paper). In Proceedings of the 20th ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES '19), June 23, 2019, Phoenix, AZ, USA. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3316482.3326342
1 Introduction
As Android devices become popular, many new apps are built every day. In year 2016 alone, developers published 1.3 million new Android apps on Google Play [1], which can be translated to 3,561 new apps every day. Traditionally, software professionals test such apps by randomly clicking buttons or exploring different features shown in the User Interfaces (UIs). As new apps significantly increase day-by-day, it is almost infeasible for humans to quickly test all apps or reveal all of the scenarios when app failures occur.
Tools were built to automatically test Android apps [4, 5, 10, 11, 14, 16, 17, 22, 23]. For instance, Monkey treats an Android app as a black box, and randomly generates UI actions (e.g., click and touch) as test inputs. GUIRipper [10] dynamically builds a GUI model of the app under testing to trigger GUI events in a depth-first search (DFS) manner. ACTEve uses concolic testing to iteratively (1) create an event based on symbolic execution, (2) monitor any concrete execution triggered by the event, and (3) negate some path constraints to cover more execution paths.
Choudhary et al. conducted an empirical study to compare 14 automated testing tools [13]. The researchers showed that Monkey is the best because it (1) achieves the highest code coverage, (2) triggers the most software failures within a time limit, (3) is easy to use, and (4) is compatible with any Android framework version. Patel et al. further characterized the effectiveness of random testing by Monkey in five aspects (e.g., stress testing and parameter tuning), and reported that Monkey is on par with manual exploration in terms of code coverage at various granularity levels (e.g., block, statement) [19]. Machiry et al. showed that Monkey's coverage of UI events is comparable to that of the peer tools [16]. Nevertheless, it is still unknown how automated tools test Android apps differently from human testers.
This paper is intended to empirically contrast automated testing with human testing. Prior work shows that Monkey generally works at least as effectively as other automated tools [13, 18, 20], so we used Monkey as a representative automated testing tool. Specifically, we chose five Android apps from different domains, and executed the apps with Monkey.
Meanwhile, we recruited eight developers to manually test these apps. With prior work Droidfax [12], we instrumented each app to log any UI event (e.g., onClick()), lifecycle event (e.g., onCreate(...)), or system event (e.g., onLocationChanged()) triggered in the execution. In this way, we collected the event traces for Monkey testing and human testing, and compared the traces. We investigated three research questions:
RQ1: Does Monkey always trigger more diverse events when it runs an app for a longer time?
RQ2: Does Monkey trigger more or fewer unique events than humans within the same period of time?
RQ3: Can Monkey produce certain events more or less effectively than humans?
Our experiments show that there is no correlation between Monkey's event coverage and execution time. As execution time increases, Monkey inputs triggered more or fewer unique events between different runs of the same app. Additionally, we applied both Monkey testing and human testing to the same set of apps for the same periods of time (e.g., 15 minutes). Although humans were slower to generate test inputs, we observed no significant difference between the numbers of unique events triggered by the two methods. Furthermore, we found Monkey to effectively trigger more system events but fewer UI events than humans, probably because Monkey has no domain knowledge of the app usage.

Our study corroborates some of the observations by prior work [13, 19]. More importantly, we quantitatively and qualitatively analyzed the behavioral differences between Monkey and humans. Our findings not only provide insights on how developers should test apps to complement Monkey testing, but also shed light on future research that integrates human expertise into automated testing.
2 Methodology
For our study, we selected five Android apps from different application domains. These apps are:

• Amazon [2]: an online shopping app;
• Candy Crush [3]: a game app to earn scores by moving, matching, and crushing candies;
• Spotify [7]: an entertainment app to listen to music;
• Twitter [8]: a social media app for users to broadcast text messages and live streams; and
• Viber [9]: a call app for people to make free phone calls or send messages through the Internet.

The idea behind selecting these apps is to compare Monkey testing and human testing in different application scenarios.
Monkey Testing. We used Monkey to test the apps. Monkey ran each app five times with different lengths of execution time (i.e., 15, 20, 25, 30, and 60 minutes).
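Monkey itself ships with the Android SDK and is driven through adb; it is bounded by an event count and an inter-event delay rather than by wall-clock time, so a timed run like ours means choosing a count and throttle that roughly fill the window, or stopping the process when time is up. Below is a minimal sketch of one such invocation; the package name and all parameter values are illustrative assumptions, not the paper's actual settings:

    import subprocess

    # Drive Monkey via adb: 9,000 pseudo-random events with a 100 ms pause
    # between them (roughly 15 minutes); -s fixes the seed so a run can be
    # replayed. The package name here is an assumed example.
    subprocess.run(
        ["adb", "shell", "monkey",
         "-p", "com.viber.voip",   # target app package (assumed)
         "-s", "42",               # random seed
         "--throttle", "100",      # delay between events, in ms
         "-v",                     # verbose output
         "9000"],                  # number of events to inject
        check=True,
    )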
Human Testing. We invited eight students at Virginia Tech to use the apps. These students include five CS PhDs, an ECE PhD, a CS Master, and a CS Bachelor. To thoroughly compare Monkey and humans in various contexts, we intentionally asked the users to test apps in the following way (see Table 1). We required each user to spend two hours in total testing all apps, to limit their time contribution. However, to diversify the human-testing periods of each app, we asked each user to test one app for 15 minutes, one app for 20 minutes, one app for 25 minutes, and two apps each for 30 minutes.
Table 1. Minutes spent by each user to test the five apps

           Candy Crush  Amazon  Twitter  Spotify  Viber
    User 1      30        15      20       25      30
    User 2      15        20      25       30      30
    User 3      30        15      20       25      30
    User 4      30        30      15       20      25
    User 5      25        30      30       15      20
    User 6      20        25      30       30      15
    User 7      15        20      25       30      30
    User 8      30        30      15       20      25
Droidfax [12]. When Monkey or humans tested Android apps, we used an existing tool, Droidfax, to instrument apps and to generate execution traces. Given an app's apk file, Droidfax instruments the app's Dalvik bytecode and collects various dynamic execution information, such as triggered events, invoked methods, and involved Inter-Component Communications (ICCs). We used Droidfax's event tracing feature to log all triggered events, including UI events, lifecycle events, and system events. UI events (e.g., onClick(...)) are directly triggered by users' operations (e.g., button clicks) to use apps. Lifecycle events (e.g., onPause()) are triggered when an app's component has its status changed (e.g., stopped). System events are produced when any component of the Android system has its status changed (e.g., onNetworkActive()).

After instrumenting the apps with Droidfax, we installed the instrumented apps on an Android emulator [6] on a Linux laptop. When an instrumented app was tested by Monkey or a human, we used the command "adb logcat" to save log information to a file. For any run of any app, Droidfax created an event-trace log file. We analyzed these log files to investigate our research questions.
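The paper does not show the exact format of the log lines that Droidfax emits, so the tag and line layout assumed below are purely illustrative; the overall flow, however, mirrors the analysis described above: filter the saved logcat dump, extract each triggered callback name, and count distinct events. A minimal Python sketch:

    import re
    import sys
    from collections import Counter

    # Assumed (hypothetical) line shape for an instrumented event, e.g.
    #   "... D/DroidFax-Event: onClick(android.view.View)"
    EVENT_RE = re.compile(r"DroidFax-Event:\s*(\w+)\(")

    def event_counts(log_path):
        """Count occurrences of each event name in one 'adb logcat' dump."""
        counts = Counter()
        with open(log_path, errors="replace") as f:
            for line in f:
                match = EVENT_RE.search(line)
                if match:
                    counts[match.group(1)] += 1
        return counts

    if __name__ == "__main__":
        events = event_counts(sys.argv[1])
        print(f"{len(events)} unique events, {sum(events.values())} total")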
3 Major Findings
We present our investigation results for the research questions separately in Sections 3.1-3.3.
3.1 Correlation between Monkey's Event Coverage and Testing Time
Table 2 reports the number of unique events covered by different runs of Monkey testing, with each cell corresponding to one Monkey testing run. From the table, we do not see any correlation between Monkey's event coverage and testing time. As the time period increases, Monkey testing triggered more or fewer unique events. For instance, the 20-minute run of Candy Crush covered 24 unique events, which is larger than the number of events produced by any other run of the same app. Among these 24 events,
there were 2 events never triggered by any of the other 4 runs: onBind(...) and onReceive(...).
Table 2. Numbers of unique events covered by different runs of Monkey testing

    Time Span  Candy Crush  Amazon  Twitter  Spotify  Viber  Average
    15 min         18          36      52       38      56      40
    20 min         24          45      48       35      56      42
    25 min         23          40      45       34      63      41
    30 min         22          47      48       36      63      43
    60 min         22          44      55       45      57      45
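The paper draws the "no correlation" conclusion for RQ1 by inspecting Table 2 rather than by a formal test. As a quick sanity check, one could compute a rank correlation between time span and unique-event count per app from the table's numbers; a sketch, assuming scipy is available:

    from scipy.stats import spearmanr

    minutes = [15, 20, 25, 30, 60]
    coverage = {                      # unique events per run, from Table 2
        "Candy Crush": [18, 24, 23, 22, 22],
        "Amazon":      [36, 45, 40, 47, 44],
        "Twitter":     [52, 48, 45, 48, 55],
        "Spotify":     [38, 35, 34, 36, 45],
        "Viber":       [56, 56, 63, 63, 57],
    }
    for app, counts in coverage.items():
        rho, p = spearmanr(minutes, counts)   # rank correlation and p-value
        print(f"{app:12s} rho = {rho:+.2f}, p = {p:.2f}")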
3.2 Event Coverage Comparison between Monkey Testing and Human Testing
Table 3 presents the average numbers of unique events covered by different runs of human testing. If one app was tested by multiple users for the same period of time (e.g., 30 minutes), we reported the average number of unique events covered in the multiple log files. If an app was tested by only one user for a certain period of time (e.g., only User 1 ran Amazon for 15 minutes), we reported the number of unique events covered by the single log file.
Table 3. Average numbers of unique events covered by different runs of human testing

    Time Span  Candy Crush  Amazon  Twitter  Spotify  Viber  Average
    15 min         26          46      51       39      43      41
    20 min         15          46      48       39      54      40
    25 min         17          38      49       36      54      39
    30 min         31          41      52       37      57      44
Similar to the observation in Section 3.1, we also found that there was no correlation between humans' event coverage and testing time. As the time period increased, human testing triggered more or fewer unique events. One possible reason is that humans tested apps in a random way. Although users tried to investigate as many features as possible for an app, sometimes their investigation was still limited to a subset of the features to test. Consequently, longer testing time does not necessarily lead to higher coverage.
Comparing Table 2 and Table 3, we found that when testing the same app for the same period of time, Monkey covered more or fewer unique events than humans. To decide whether one approach outperforms the other in general, we conducted Student's t-test [15] to compare the event coverage metrics of both approaches for each app. Table 4 shows our comparison results, where Mean Δ = Mean_Monkey - Mean_human. To calculate Mean Δ for each app, we first computed the mean event coverage of each approach for the app, and then subtracted Mean_human from Mean_Monkey. The p-value is a probability ranging in [0, 1]; its value reflects whether the distributions of two data groups are significantly different. At the significance level α = 0.05, if p-value ≤ 0.05, the mean difference is significant.
Table 4. Mean event coverage comparison

             Candy Crush  Amazon  Twitter  Spotify  Viber
    Mean Δ      -0.5       -0.75   -1.75    -2.0     7.5
    p-value     0.9045     0.8210  0.3434   0.1289   0.0881
According to Table 4, we observed that all distribution differences are insignificant. On average, human testing covered more events than Monkey testing for four apps (i.e., Candy Crush, Amazon, Twitter, and Spotify), and Monkey testing covered more for one app (i.e., Viber). All p-values are greater than 0.05. This implies that Monkey testing works indistinguishably from human testing with respect to event coverage.
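As a concrete illustration, the Viber column of Table 4 can be reproduced from the per-run counts in Tables 2 and 3, using the four time spans both approaches share (these counts reproduce the reported Mean Δ of 7.5). A sketch using scipy's two-sample t-test, whose default equal-variance form is assumed here to match the Student's t-test used in the paper:

    from scipy.stats import ttest_ind

    # Unique events per Viber run at 15/20/25/30 minutes (Tables 2 and 3).
    monkey = [56, 56, 63, 63]
    human = [43, 54, 54, 57]

    t, p = ttest_ind(monkey, human)   # equal_var=True: Student's t-test
    print(f"Mean diff = {sum(monkey) / 4 - sum(human) / 4:+.1f}")  # +7.5
    print(f"p-value   = {p:.4f}")     # ~0.088: insignificant at alpha=0.05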
3.3 Event Frequency Comparison between Monkey Testing and Human Testing
To further compare how frequently Monkey and humans generated different kinds of events, we classified events into three categories: lifecycle events, UI events, and system events. We clustered the recorded events accordingly for each log file. Suppose a file contains N1 lifecycle events, N2 UI events, and N3 system events; we then normalized the event frequencies by computing the ratio of each type of event among all logged events, i.e.,

    ratio_i = N_i / (N_1 + N_2 + N_3),  (i ∈ [1, 3]).
In this way, we could uniformly compute the average event frequency distribution of each approach. As shown in Figure 1, M bars correspond to Monkey testing, while H bars are for human testing. Each stacked bar usually has three parts to manifest the ratios of lifecycle, UI, and system events.
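Concretely, the normalization is one line per log file once the three per-category counts are in hand; a minimal sketch, using the human/Viber percentages reported later in this section as example inputs:

    def event_ratios(n_lifecycle, n_ui, n_system):
        # Normalize raw counts into the per-category ratios of Figure 1.
        total = n_lifecycle + n_ui + n_system
        return [n / total for n in (n_lifecycle, n_ui, n_system)]

    # Example: humans on Viber (reported below as 38.3%, 5.4%, and 56.3%).
    print(event_ratios(383, 54, 563))   # -> [0.383, 0.054, 0.563]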
We observed three phenomena in Figure 1.
[Figure 1. Normalized event frequency comparison: for each app, stacked bars show the percentages of lifecycle, UI, and system events under Monkey (M) and human (H) testing.]
First, in most scenarios, system events were the most frequently generated events. To trigger such events, users may check for or modify the phone's status or settings (e.g., GPS and wifi). System events dominated the events triggered by Monkey, taking up 54-99% of the overall generated events among the five apps. Although such events were also
the dominant events triggered by humans for four apps, we observed an exception for Candy Crush. As Candy Crush is an interaction-intensive video game, it requires users to (1) first respond to pop-up dialogs by selecting or entering information, and (2) frequently use the mouse to move objects.

Figure 2 presents the first UI, to which users need to respond by clicking the "Play!" button. Although such testing behaviors look naïve to humans, they seem not well supported by Monkey testing. Two possible reasons can explain such insufficiency. First, Monkey cannot read dialogs, nor can it smartly make decisions or enter necessary information to proceed in the game. Second, Monkey does not know the game rules. It does not recognize candies, so it cannot move the mouse around to drag-and-drop candies. Consequently, Monkey cannot play the game for testing.
Figure 2. Snapshot of Candy Crush's starting UI
Second, Monkey usually generated lower percentages of UI events. Among the five apps, Monkey triggered 0.1-14.7% UI events, while humans triggered 5.4-61.6%. In our categorization, UI events include any user interaction with UI elements such as menu bars, dialogs, media players, views, and various widgets (e.g., information widgets). As a general-purpose random UI exerciser, Monkey does not intelligently locate operable UI elements [21]; instead, it randomly tries different pixels on the screen.
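For reference, Monkey's random exercising can be driven from a script via the documented adb interface [4]; the sketch below is one plausible wrapper, with the package name, event count, and seed shown as illustrative parameters rather than the study's actual settings.

```python
# Sketch of driving Monkey's random exerciser from a script, using the
# documented `adb shell monkey` interface [4]. Package name, event count,
# and seed are placeholders; adjust for the app and device under test.
import subprocess

def run_monkey(package: str, events: int = 500, seed: int = 42) -> None:
    """Fire `events` pseudo-random UI/system events at `package`."""
    subprocess.run(
        ["adb", "shell", "monkey",
         "-p", package,          # restrict generated events to this package
         "-s", str(seed),        # fixed seed for a reproducible event stream
         "-v",                   # verbose: log each generated event
         str(events)],
        check=True,              # raise if adb exits with a non-zero status
    )

# e.g., run_monkey("com.viber.voip", events=1000)
```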
Because the business logic of different apps varies widely, the UI elements related to application logic have diverse layouts. In contrast, the UI elements related to system configurations usually have simpler, fixed layouts. Between the two types of UIs, it therefore seems easier for Monkey to "accidentally" trigger system events than application-specific UI events. For example, we observed that Monkey frequently succeeded in opening and manipulating a panel of shortcuts for system settings; such operations can produce many system events.
Third, Monkey could mimic human behaviors for certain apps like Amazon, Twitter, and Viber. In particular for Viber, Monkey's percentages of lifecycle, UI, and system events were 40.8%, 4.8%, and 54.4%, while humans obtained quite close percentages: 38.3%, 5.4%, and 56.3%. For Twitter, Monkey's event distribution was 24.5%, 14.7%, and 60.8%, while humans' was 26.8%, 7.0%, and 66.3%. For Amazon, the distributions were less similar: Monkey's was 14.5%, 2.5%, and 83.0%, while humans' average was 12.6%, 11.1%, and 76.2%.
Figure 3. Snapshot of Viber's contact UI
Two possible reasons can explain why Monkey worked as effectively as humans for these apps. First, with many clickable items in each UI, these apps are simple to navigate. For instance, Viber only lets users call or message people and has only a few UIs. In particular, in the contact UI shown in Figure 3, almost every pixel corresponds to a clickable UI element. Therefore, even though Monkey knows nothing about the UI, its actions on different pixels can still effectively trigger events. Second, and more importantly, there is no implicit constraint on the sequence of user actions: no matter how random the generated actions are, Monkey can always trigger events and make progress.
Monkey worked less effectively than humans when testing Candy Crush and Spotify. As mentioned in Section 3.1, Candy Crush requires users to first click the "Play!" button to start the game. If Monkey keeps wasting time clicking pixels irrelevant to that button or pressing keys, it cannot even start the game for testing. Similarly, Spotify requires users to first click the "Play" button to listen to a selected song, and then to operate the media player buttons (e.g., "Fast Forward") to control how music is played. Such button-clicking sequence constraints are intuitive to humans but unknown to Monkey. Therefore, Monkey did not work well for apps requiring (1) UI element recognition and (2) action sequences following certain patterns.
4 Conclusion
We observed that Monkey is good at testing simple apps (1) whose UIs are full of operable widgets and (2) that require no sequential ordering among the events triggered in the same UI. Developers can apply Monkey testing to such simple apps multiple times to explore distinct usage scenarios. Monkey is not good at testing complex apps that require domain knowledge and decision making; developers can focus their manual effort on such apps. In the future, while humans test complex apps, we will instrument the apps to gather user inputs and triggered events. By inferring the cause-effect relationships between inputs and events, we can extract app-specific usage patterns, generalize them, and build automated tools that better mimic human testers.
Acknowledgments
We thank the anonymous reviewers for their valuable comments on an earlier version of this paper.
References
[1] 2017. App Stores Start to Mature – 2016 Year in Review. https://blog.appfigures.com/app-stores-start-to-mature-2016-year-in-review/. (2017).
[2] 2018. Amazon Shopping APK. https://apkpure.com/cn/amazon-shopping/com.amazon.mShop.android.shopping. (2018).
[3] 2018. Candy Crush APK. https://apkpure.com/cn/candy-crush-saga/com.king.candycrushsaga. (2018).
[4] 2018. Google. Android Monkey. http://developer.android.com/tools/help/monkey.html. (2018).
[5] 2018. Intent Fuzzer. https://www.nccgroup.trust/us/about-us/resources/intent-fuzzer/. (2018).
[6] 2018. Run Apps on the Android Emulator. https://developer.android.com/studio/run/emulator.html. (2018).
[7] 2018. Spotify Music APK. https://apkpure.com/cn/spotify-premium-music/com.spotify.music. (2018).
[8] 2018. Twitter APK. https://apkpure.com/cn/twitter/com.twitter.android. (2018).
[9] 2018. Viber Messenger APK. https://apkpure.com/cn/viber-messenger/com.viber.voip. (2018).
[10] Domenico Amalfitano, Anna Rita Fasolino, Porfirio Tramontana, Salvatore De Carmine, and Atif M. Memon. 2012. Using GUI Ripping for Automated Testing of Android Applications. In Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering (ASE 2012). ACM, New York, NY, USA, 258–261. https://doi.org/10.1145/2351676.2351717
[11] Saswat Anand, Mayur Naik, Mary Jean Harrold, and Hongseok Yang. 2012. Automated Concolic Testing of Smartphone Apps. In Proceedings of the ACM SIGSOFT 20th International Symposium on the Foundations of Software Engineering (FSE '12). ACM, New York, NY, USA, Article 59, 11 pages. https://doi.org/10.1145/2393596.2393666
[12] Haipeng Cai and Barbara Ryder. 2017. DroidFax: A Toolkit for Systematic Characterization of Android Applications. In International Conference on Software Maintenance and Evolution (ICSME). 643–647.
[13] Shauvik Roy Choudhary, Alessandra Gorla, and Alessandro Orso. 2015. Automated Test Input Generation for Android: Are We There Yet? (E). In Proceedings of the 2015 30th IEEE/ACM International Conference on Automated Software Engineering (ASE '15). IEEE Computer Society, Washington, DC, USA, 429–440. https://doi.org/10.1109/ASE.2015.89
[14] Shuai Hao, Bin Liu, Suman Nath, William G.J. Halfond, and Ramesh Govindan. 2014. PUMA: Programmable UI-automation for Large-scale Dynamic Analysis of Mobile Apps. In Proceedings of the 12th Annual International Conference on Mobile Systems, Applications, and Services (MobiSys '14). ACM, New York, NY, USA, 204–217. https://doi.org/10.1145/2594368.2594390
[15] Winston Haynes. 2013. Student's t-Test. Springer New York, New York, NY, 2023–2025. https://doi.org/10.1007/978-1-4419-9863-7_1184
[16] Aravind Machiry, Rohan Tahiliani, and Mayur Naik. 2013. Dynodroid: An Input Generation System for Android Apps. In Proceedings of the 2013 9th Joint Meeting on Foundations of Software Engineering (ESEC/FSE 2013). ACM, New York, NY, USA, 224–234. https://doi.org/10.1145/2491411.2491450
[17] Riyadh Mahmood, Nariman Mirzaei, and Sam Malek. 2014. EvoDroid: Segmented Evolutionary Testing of Android Apps. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering (FSE 2014). ACM, New York, NY, USA, 599–609. https://doi.org/10.1145/2635868.2635896
[18] Ke Mao, Mark Harman, and Yue Jia. 2016. Sapienz: Multi-objective Automated Testing for Android Applications. In Proceedings of the 25th International Symposium on Software Testing and Analysis. 94–105.
[19] Priyam Patel, Gokul Srinivasan, Sydur Rahaman, and Iulian Neamtiu. 2018. On the Effectiveness of Random Testing for Android: Or How I Learned to Stop Worrying and Love the Monkey. In Proceedings of the 13th International Workshop on Automation of Software Test. ACM, 34–37.
[20] Ting Su, Guozhu Meng, Yuting Chen, Ke Wu, Weiming Yang, Yao Yao, Geguang Pu, Yang Liu, and Zhendong Su. 2017. Guided, Stochastic Model-based GUI Testing of Android Apps. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering. 245–256.
[21] C. Sun, Z. Zhang, B. Jiang, and W. K. Chan. 2016. Facilitating Monkey Test by Detecting Operable Regions in Rendered GUI of Mobile Game Apps. In 2016 IEEE International Conference on Software Quality, Reliability and Security (QRS). 298–306. https://doi.org/10.1109/QRS.2016.41
[22] Heila van der Merwe, Brink van der Merwe, and Willem Visser. 2014. Execution and Property Specifications for JPF-android. SIGSOFT Softw. Eng. Notes 39, 1 (Feb. 2014), 1–5. https://doi.org/10.1145/2557833.2560576
[23] Wei Yang, Mukul R. Prasad, and Tao Xie. 2013. A Grey-box Approach for Automated GUI-model Generation of Mobile Applications. In Proceedings of the 16th International Conference on Fundamental Approaches to Software Engineering (FASE '13). Springer-Verlag, Berlin, Heidelberg, 250–265. https://doi.org/10.1007/978-3-642-37057-1_19