
How do Developers Test Android Applications?


Enabling fully automated testing of mobile applications has recently become an important topic of study for both researchers and practitioners. A plethora of tools and approaches have been proposed to aid mobile developers both by augmenting manual testing practices and by automating various parts of the testing process. However, current approaches for automated testing fall short in convincing developers about their benefits, leading to a majority of mobile testing being performed manually. With the goal of helping researchers and practitioners – who design approaches supporting mobile testing – to understand developers’ needs, we analyzed survey responses from 102 open source contributors to Android projects about their practices when performing testing. The survey focused on questions regarding practices and preferences of developers/testers in-the-wild for (i) designing and generating test cases, (ii) automated testing practices, and (iii) perceptions of quality metrics such as code coverage for determining test quality. Analyzing the information gleaned from this survey, we compile a body of knowledge to help guide researchers and professionals toward tailoring new automated testing approaches to the needs of a diverse set of open source developers.



  1. ICSME’17, Shanghai, China. Wednesday, September 20th, 2017. Mario Linares-Vásquez, Carlos Bernal-Cárdenas, Kevin Moran, & Denys Poshyvanyk. How do Developers Test Android Apps?
  2. Automation is clearly applicable to mobile testing… But what are developers’ goals & needs?
  3. “Automation applied to an inefficient operation will magnify the inefficiency” - Bill Gates
  4. EMPIRICAL STUDY: survey with open source developers on practices, goals, needs, and lessons learned. • Test Case Design Strategies • Perceptions of Testing Quality Metrics • Existing Tool Use • Preferences on Test Case Representation • Preferred Types of Testing
  5. EMPIRICAL STUDY: 16,331 GitHub Android projects → 10,000 developers contacted by email → online survey → 378 invalid surveys, 102 valid surveys
  6. EMPIRICAL STUDY: numeric, single-choice, and multiple-choice questions; 102 valid surveys; open coding by the authors (25%)
  7. PARTICIPANT DEMOGRAPHICS
  8. ACADEMIC LEVEL: BACHELOR 50.00%, MASTER 35.29%, HIGH SCHOOL 12.74%, PH.D. 1.96%, POSTDOC 0.98%
  9. PROGRAMMING EXPERIENCE: general programming, min 3 years, max 35 years, average 7.5 years; Android, min 0.5 years, max 9 years, average 4.5 years
  10. RESEARCH QUESTIONS • RQ1: Strategies for Designing Test Cases • RQ2: Test Case Preferences • RQ3: Current Testing Practices • RQ4: Test Quality Metrics
  11. SURVEY RESULTS
  12. RQ1: Strategies for Designing GUI Test Cases
  13. DOCUMENTING APP REQUIREMENTS: NO DOCS 57, NL DESCRIPTION 46, FEATURE LISTS 45, USER STORIES 43, USE CASES 6
  14. TEST CASE DESIGN STRATEGIES (# responses): SINGLE UC/F 70+, MULTIPLE UC/F 49, RANDOM INPUTS 16, OTHER 9
  15. TEST CASE DESIGN STRATEGIES
  16. TEST CASE DESIGN STRATEGIES “Based on the user stories or use cases. Define initial state (local data). Perform scenario (call a REST service / perform an activity, etc.). Validate final screen or REST service result.” “I look for boundary conditions - I try to work out what happens in the grey areas - I look for ways of breaking it. I also test a range of use cases: how can the user interact with the app?”
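
The strategy in the first quote (define the initial state, perform the scenario, validate the final screen) maps directly onto an instrumentation test. A minimal sketch, assuming the androidx.test Espresso/JUnit 4 APIs; the activity, view IDs, and expected text are hypothetical and only illustrate the pattern:

```kotlin
// Hypothetical example: LoginActivity, R.id.*, and the expected text are illustrative,
// not taken from the study. Assumes androidx.test (Espresso + JUnit 4) dependencies.
import androidx.test.espresso.Espresso.onView
import androidx.test.espresso.action.ViewActions.click
import androidx.test.espresso.action.ViewActions.typeText
import androidx.test.espresso.assertion.ViewAssertions.matches
import androidx.test.espresso.matcher.ViewMatchers.isDisplayed
import androidx.test.espresso.matcher.ViewMatchers.withId
import androidx.test.espresso.matcher.ViewMatchers.withText
import androidx.test.ext.junit.rules.ActivityScenarioRule
import androidx.test.ext.junit.runners.AndroidJUnit4
import org.junit.Rule
import org.junit.Test
import org.junit.runner.RunWith

@RunWith(AndroidJUnit4::class)
class LoginUseCaseTest {

    // Define initial state: launch the (hypothetical) activity under test.
    @get:Rule
    val activityRule = ActivityScenarioRule(LoginActivity::class.java)

    @Test
    fun validLogin_showsWelcomeScreen() {
        // Perform scenario: fill in the form and submit.
        onView(withId(R.id.username)).perform(typeText("alice"))
        onView(withId(R.id.password)).perform(typeText("secret"))
        onView(withId(R.id.login_button)).perform(click())

        // Validate final screen: the welcome message is displayed.
        onView(withText("Welcome, alice")).check(matches(isDisplayed()))
    }
}
```
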
  17. RQ2: Test Case Preferences
  18. AUTOMATED TEST CASE PREFERENCES. Test case type (# responses): NL 33, OTHER/NONE 30, ATA 22, ADB-INPUT 18. Test case output information (# responses): OUTPUTS 62, UC/F STEPS 52, STEPS 36, SCREENSHOTS 22, OTHER 18.
  19. RQ3: Current Testing Practices
  20. DISTRIBUTION OF TESTING EFFORT: MANUAL 59%, JUNIT 19%, ATA 10%, OTHER 5%, MONKEY 3%, CLOUD 2%, R&R 1%, GUI-RIPPING 1%
  22. DEVELOPER REASONING 1. Changing Requirements 2. Lack of Time for Testing 3. App Size 4. Lack of Automated Tool Knowledge 5. Costs of Automated Tests
  23. DEVELOPER REASONING “our app changes very frequently and we can’t afford unit testing and automated testing.” “I’m faster by testing the app and all it’s possibilities on the device itself, instead of writing separate Test Cases.” “Too much time required in configuring the components for automated testing.” “We prioritize feature work over automated testing. Automated tests have done little to prevent bugs, but incur significant overhead when creating new features or refactoring old.”
  24. TOOLS USED
  25. EXPERIENCES WITH RANDOM TESTING
  26. EXPERIENCES WITH RANDOM TESTING “Monkey is very useful for stress testing the application or to verify that there are no leaks (typically memory leaks) that build up over time. Sometimes, it also catches the odd bug as well.” “Good for stress testing, not very consistent results.” “I’ve ran Android Monkey and it found some defects, but the developers said ‘that barely happens’ or ‘that never happens’. So the defects weren’t looked into.”
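
The tool behind these quotes is Android's Monkey exerciser, normally invoked as `adb shell monkey -p <package> -v <event-count>`. A small host-side Kotlin sketch of that workflow, assuming adb is on the PATH and a device or emulator is connected; the package name is hypothetical, and fixing the seed makes the "that barely happens" failures easier to reproduce:

```kotlin
// Host-side sketch (not part of the study's tooling): shells out to adb to run
// Monkey against a hypothetical package. Assumes adb is on the PATH and a
// device or emulator is connected.
fun runMonkey(packageName: String, events: Int = 500, seed: Long = 42L): Int {
    val command = listOf(
        "adb", "shell", "monkey",
        "-p", packageName,        // restrict pseudo-random events to one app
        "-s", seed.toString(),    // fixed seed so a crashing run can be replayed
        "--throttle", "100",      // 100 ms pause between events
        "-v", events.toString()   // verbose output, followed by the event count
    )
    val process = ProcessBuilder(command)
        .inheritIO()              // stream Monkey's log to the console
        .start()
    return process.waitFor()      // exit status of the adb invocation
}

fun main() {
    val exitCode = runMonkey("com.example.app")
    println("Monkey finished with exit code $exitCode")
}
```
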
  27. RQ4: Test Quality Metrics
  28. TEST QUALITY METRICS • On the Use of Code Coverage: ~64% of participants said code coverage was not useful for measuring mobile test effectiveness • Useful metrics included: feature coverage, test code review, number of faults uncovered
  29. TEST QUALITY METRICS
  30. TEST QUALITY METRICS “[We] calculate total coverage based on features, covered elements etc.” “No. We measure the number of uncaught bugs and regressions over time that devs had to spend time fixing” “We use code coverage more as a guide to which part of the code base might need more attention in terms of writing more tests. We don’t really have other metric for measuring quality of the test cases.”
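
For context on what the respondents are judging: on Android, code coverage is typically collected via JaCoCo through the Android Gradle plugin. A minimal module-level build.gradle.kts sketch, assuming AGP 7.3 or newer (older plugin versions used isTestCoverageEnabled instead); the namespace is hypothetical:

```kotlin
// Hedged sketch of a module-level build.gradle.kts, not taken from the study:
// it only shows how the coverage data discussed above is commonly collected.
// Property names assume AGP 7.3+; older versions used isTestCoverageEnabled.
plugins {
    id("com.android.application")
    id("org.jetbrains.kotlin.android")
}

android {
    namespace = "com.example.app"   // hypothetical application id
    compileSdk = 34

    buildTypes {
        getByName("debug") {
            // Instrument debug builds so instrumentation and unit tests emit
            // JaCoCo execution data, which Gradle turns into a coverage report.
            enableAndroidTestCoverage = true
            enableUnitTestCoverage = true
        }
    }
}
```

With this in place, Gradle's variant coverage-report tasks produce the line/branch percentages that many participants found less meaningful than feature coverage or faults uncovered.
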
  31. SUMMARY OF RESULTS RQ1: Test Case Design • Usage models are heavily relied upon • Developers tend to design test cases around features/use cases. RQ2: Test Case Preferences • Natural Language Tests are preferred • Contextual information (e.g., screenshots, outputs) is preferred in test cases. RQ3: Current Tool Use • Testing APIs (e.g., JUnit, Robotium) dominate • Automated tools are typically not used • Primary barriers for adoption are expressiveness and maintenance. RQ4: Test Quality Metrics • Most mobile developers don’t consider code coverage a reliable metric • Instead, feature coverage, code review, and faults uncovered are used.
  37. WHAT CAN THE RESEARCH COMMUNITY DO?
  38. WHAT CAN THE RESEARCH COMMUNITY DO? • Support Multiple Testing Goals
  39. WHAT CAN THE RESEARCH COMMUNITY DO? • Support Multiple Testing Goals • Consider Different Test Quality Metrics
  40. WHAT CAN THE RESEARCH COMMUNITY DO? • Support Multiple Testing Goals • Consider Different Test Quality Metrics • Eliminate the Overhead of Automated Testing
  41. Any Questions? Thank you! Kevin Moran, Ph.D. candidate, kpmoran@cs.wm.edu, www.kpmoran.com. Denys Poshyvanyk, Associate Professor, denys@cs.wm.edu, cs.wm.edu/~denys. Mario Linares-Vásquez, Assistant Professor, m.linaresv@uniandes.edu.co, sistemas.uniandes.edu.co/~mlinaresv. Carlos Bernal-Cárdenas, Ph.D. candidate, cebernal@cs.wm.edu, cs.wm.edu/~cebernal
