This document summarizes Pavneet Singh Kochhar's study on mining testing questions from Stack Overflow. The study aims to understand the common challenges and topics that developers discuss about software testing. The dataset comprises 38,289 questions whose tags contain "test", posted on Stack Overflow between January 2009 and December 2014. Latent Dirichlet allocation (LDA) is used to extract topics and categories of discussion. The most common categories are test frameworks, databases, and client-server; these three are also the hot topics and have been discussed consistently throughout the study period. Mobile-related testing questions are increasing. The main challenges discussed include app testing, test frameworks, best practices, and database testing.
1. Mining Testing Questions on
Stack Overflow
Pavneet Singh Kochhar
Singapore Management University
kochharps.2012@smu.edu.sg
Fifth International Workshop on Software Mining
2. Software Testing, Why Bother?
Functionality -- Requirements
Bugs -- Software reliability
Costs -- Late bugs cost more
3. Software Testing, Why Bother?
• Horgan and Mathur [1]
– Adequate testing is critical to develop reliable
software
• Tassey [2]
– Inadequate testing costs the US economy 59
billion dollars annually
[1] J.R. Horgan and A.P. Mathur, “Software testing and reliability.”
McGraw-Hill, Inc., 1996.
[2] G. Tassey, “The economic impacts of inadequate infrastructure for
software testing,” National Institute of Standards and Technology, 2002.
4. Related Work
• Mining Questions asked by Web
Developers [1]
– 3 topics – JavaScript, HTML5, CSS
– Categories of discussions & hot topics.
– Temporal trends
– Prevalence in mobile-related discussions
– Challenges faced by web developers
[1] Bajaj et al., “Mining Questions asked by Web Developers,” MSR 2014.
5. Study Goals
To study common challenges and important
topics of discussion.
What are the questions asked by
developers about testing?
6. Stack Overflow
• Question Answering community
• 10 million questions; 4 million users
• Posts are related to:
– programming problems, software algorithms,
tools
• Share knowledge
• Seek expert advice
8. Dataset
• Collect all the questions Jan ‘09 – Dec ’14
• Keep only questions whose tags contain “test”
• Tags are predefined on Stack Overflow
e.g., unit-testing, testing, automated-tests.
Number of Questions: 38,289
Number of Askers: 25,292
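The tag filter described on this slide can be sketched as follows. This is a minimal illustration, not the study's actual script: the `Question` record and its field names are hypothetical, and matching is assumed to be a case-insensitive substring check for "test" inside each tag, which fits the slide's examples (unit-testing, testing, automated-tests).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Question:
    """Hypothetical record; field names are assumptions, not the dump's schema."""
    question_id: int
    asker_id: int
    tags: List[str]


def has_test_tag(question: Question) -> bool:
    """True if any tag contains the substring "test" (assumed matching rule)."""
    return any("test" in tag.lower() for tag in question.tags)


def select_testing_questions(questions: Iterable[Question]) -> List[Question]:
    """Keep only questions that carry at least one testing-related tag."""
    return [q for q in questions if has_test_tag(q)]


def count_askers(questions: Iterable[Question]) -> int:
    """Number of distinct users who asked the given questions."""
    return len({q.asker_id for q in questions})
```

A plain substring check would also match unrelated tags that happen to contain "test", so a real pipeline might need a reviewed tag list instead.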
10. Research Questions
RQ1: What are the categories of topics of testing
related discussion?
RQ2: What are the hot topics related to software
testing in terms of importance?
RQ3: Are there temporal trends present in
discussions related to software testing?
RQ4: How prevalent are testing-related topics in
discussions related to mobile web development?
RQ5: What are the main technical challenges related
to testing?
12. RQ1: Topics of Discussion
Dataset → Filter questions with “test” in tags → Extract question & accepted answer → Stop word removal & stemming → LDA → Categories
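The preprocessing and LDA steps named on this slide can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the stop list is tiny, `crude_stem` is a rough stand-in for a real stemmer, and `fit_lda` is a textbook collapsed Gibbs sampler with arbitrary hyperparameters. The slides do not say which LDA implementation or settings the study used.

```python
import random
import re
from collections import Counter

# Tiny illustrative stop list (assumption); real studies use a fuller list.
STOP_WORDS = {"a", "an", "the", "is", "to", "in", "of", "and", "i", "my",
              "how", "do", "for", "with", "on", "it"}


def crude_stem(word):
    """Rough suffix stripping; a stand-in for a real stemmer such as Porter's."""
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenize, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [Counter() for _ in range(num_topics)]
    topic_total = [0] * num_topics
    assignments = []

    def update(d, w, k, delta):
        doc_topic[d][k] += delta
        topic_word[k][w] += delta
        topic_total[k] += delta

    # Start from a random topic for every token.
    for d, doc in enumerate(docs):
        assignments.append([rng.randrange(num_topics) for _ in doc])
        for w, k in zip(doc, assignments[d]):
            update(d, w, k, +1)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token, then resample its topic from the conditional.
                update(d, w, assignments[d][i], -1)
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k_new
                update(d, w, k_new, +1)
    return topic_word


def top_words(topic_word, n=5):
    """Most frequent words per topic, used to label categories by hand."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]
```

In practice each document would be the text of a question joined with its accepted answer, and the top words of each topic would be inspected manually to name a category.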
13. RQ1: Topics of Discussion
Topic: Words (stemmed)
Test Framework: test, unit, run, suit, integr
Database: databas, db, creat, db, delet
Client Server: request, server, respons, client, http
Login: user, password, login, usernam
Threads: run, start, thread, process, call
Forms: button, window, form, click, element
Image Processing: imag, png, imgur, path
14. RQ2: Hot Topics of Discussion
Dataset → Filter questions with “test” in tags → Top 2000 sorted by view count → Extract accepted answer → Stop word removal & stemming → LDA → Categories
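Selecting the most viewed questions can be sketched in a few lines. The dictionary key `view_count` is a hypothetical field name, and the default of 2000 simply mirrors the figure on the slide.

```python
def top_viewed(questions, limit=2000):
    """Return the `limit` questions with the highest view count, most viewed first."""
    ranked = sorted(questions, key=lambda q: q["view_count"], reverse=True)
    return ranked[:limit]
```

Because Python's sort is stable, questions with equal view counts keep their original relative order.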
15. RQ2: Hot Topics of Discussion
Hot Topics
Test Framework
Database
Client Server
16. RQ3: Temporal Trends
Dataset → Filter questions with “test” in tags → Partition dataset (Jan-Jun ‘09, ..., Jul-Dec ‘14) → Stop word removal & stemming → LDA → Categories
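The partitioning step can be sketched as below, assuming six-month windows, which is what the first and last labels on the slide (Jan-Jun ‘09 and Jul-Dec ‘14) suggest. The `created` key and the `YYYY-Hn` label format are hypothetical choices made for this example.

```python
from collections import defaultdict
from datetime import date


def half_year_label(day: date) -> str:
    """Map a date to its half-year window, e.g. 2009-H1 for Jan-Jun 2009."""
    return f"{day.year}-H{1 if day.month <= 6 else 2}"


def partition_by_half_year(questions):
    """Group questions by the half-year in which they were created."""
    buckets = defaultdict(list)
    for question in questions:
        buckets[half_year_label(question["created"])].append(question)
    return dict(buckets)
```

Each bucket could then be preprocessed and passed to LDA separately, so that topic shares can be compared across windows.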
18. RQ4: Mobile Development
Dataset → Filter questions with “test” in tags → Filter by mobile tags (android, iphone, ios etc.) → Extract question & accepted answer → Stop word removal & stemming → LDA → Categories
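The mobile filter can be illustrated as follows. Only the three tags named on the slide are included; the slide ends its list with "etc.", so the study's full tag list is not known here and `MOBILE_TAGS` should be read as incomplete. Function names are hypothetical.

```python
# Only the tags named on the slide; the study's full list is not given.
MOBILE_TAGS = {"android", "iphone", "ios"}


def is_mobile_question(tags):
    """True if the question carries at least one of the known mobile tags."""
    return any(tag.lower() in MOBILE_TAGS for tag in tags)


def mobile_share(tag_lists):
    """Fraction of questions (given as lists of tags) that are mobile-related."""
    tag_lists = list(tag_lists)
    if not tag_lists:
        return 0.0
    return sum(is_mobile_question(tags) for tags in tag_lists) / len(tag_lists)
```

Computing `mobile_share` per time window is one simple way to check whether mobile-related testing questions are becoming more common.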
20. RQ5: Technical Challenges
Dataset → Filter questions with “test” in tags → Ranking based on formula → Filter top 50 questions → Qualitative Analysis
Ranking formula:
AMS_i = 3U_i − 25D_i + 10C_i + A_i + F_i
where U_i = number of users who upvoted, D_i = number of users who downvoted, C_i =
number of comments, A_i = number of answers, F_i = favorite count
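The ranking formula can be written out directly. The weights below are copied from the slide; the dictionary keys and function names are hypothetical, and the default of 50 mirrors the "top 50 questions" step.

```python
def ams(upvotes, downvotes, comments, answers, favorites):
    """Score from the slide: 3*U - 25*D + 10*C + A + F."""
    return 3 * upvotes - 25 * downvotes + 10 * comments + answers + favorites


def top_by_ams(questions, limit=50):
    """Rank questions by the score, highest first, and keep the first `limit`."""
    def score(q):
        return ams(q["upvotes"], q["downvotes"], q["comments"],
                   q["answers"], q["favorites"])
    return sorted(questions, key=score, reverse=True)[:limit]
```

For example, a question with 10 upvotes, 1 downvote, 2 comments, 3 answers and 4 favorites scores 30 − 25 + 20 + 3 + 4 = 32, which shows how heavily a single downvote is penalized.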
21. RQ5: Technical Challenges
• App Testing
“How to emulate GPS location in the Android
Emulator? I want to get longitude and latitude in
Android emulator for testing.”
Answer – Connecting to the emulator via Telnet
telnet localhost 5554
geo fix <longitude value> <latitude value>
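The same console exchange can be scripted instead of typed into Telnet. The sketch below assumes the emulator console is listening on localhost:5554, as in the answer above, and that it accepts the command without further setup; newer emulator versions may require an `auth` command first, which this sketch does not handle. The function names are hypothetical.

```python
import socket


def geo_fix_command(longitude, latitude):
    """Build the console command; note that longitude comes first."""
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be within [-180, 180]")
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be within [-90, 90]")
    return f"geo fix {longitude} {latitude}\n"


def send_geo_fix(longitude, latitude, host="localhost", port=5554, timeout=5.0):
    """Send the command to the emulator console and return its raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.recv(4096)  # discard the console's greeting banner
        conn.sendall(geo_fix_command(longitude, latitude).encode("ascii"))
        return conn.recv(4096).decode("ascii", errors="replace")
```

Calling `send_geo_fix(103.85, 1.29)` requires a running emulator; `geo_fix_command` can be checked on its own without one.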
22. RQ5: Technical Challenges
• Test Framework
“NUnit vs. MbUnit vs. MSTest vs. xUnit.net... I am
to choose the best one for us. But how? Does it
matter? Which one is most future proof? Should I
care about the features?”
• Best Practices
“I was wondering what the best practice is for unit
testing abstract classes and classes that extend
abstract classes.”
23. RQ5: Technical Challenges
• Database
“What strategies have you used for testing
database-driven applications, if any? What has
worked the best for you?”
• Web Testing
“What’s the best way to replicate a large load on an
asp.net web application? Is there an easy way to
simulate many requests on particular pages?”
24. Conclusion
• Discussion Categories: test framework,
database, client server, threads, forms etc.
• Hot Topics: test framework, database, client
server.
• Hot topics have been consistently discussed
from Jan ‘09 – Dec ‘14.
• Mobile related discussions have increased in
testing questions.
• Users often post questions related to app
testing, test framework, best practices and
testing database-driven applications.
25. Future Work
• Expand the study to other Community Question
Answering websites.
• Survey developers to get an in-depth
understanding of challenges faced by
developers.
27. Outline
• Motivation and Goals
• Overall Process
• Dataset
• Empirical Results
• Conclusion and Future Work
28. Threats to Validity
• Internal validity:
– We link bug reports to commits using bug ids
– We use Randoop for 5 minutes
• External validity:
– Only analyze 2 large software systems
• Construct validity:
– We use point biserial correlation
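For reference, the point biserial correlation mentioned above measures the association between a binary variable and a continuous one. The sketch below uses the standard formula r = (M1 − M0) / s · sqrt(n1·n0 / n²), where s is the population standard deviation of all values. The slide does not say which variables were correlated, so the argument names are generic.

```python
import math


def point_biserial(binary, values):
    """Point biserial correlation between a 0/1 variable and a numeric variable."""
    if len(binary) != len(values):
        raise ValueError("inputs must have the same length")
    ones = [v for b, v in zip(binary, values) if b == 1]
    zeros = [v for b, v in zip(binary, values) if b == 0]
    if not ones or not zeros:
        raise ValueError("both groups must be non-empty")
    n = len(values)
    mean_all = sum(values) / n
    # Population standard deviation over all values (divides by n).
    std = math.sqrt(sum((v - mean_all) ** 2 for v in values) / n)
    if std == 0:
        raise ValueError("values must not all be equal")
    mean_ones = sum(ones) / len(ones)
    mean_zeros = sum(zeros) / len(zeros)
    return (mean_ones - mean_zeros) / std * math.sqrt(len(ones) * len(zeros) / n ** 2)
```

The result is numerically the same as the Pearson correlation computed with the binary variable coded as 0 and 1.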
29. Related Work
• Empirical study on testing and coverage
– Gligoric et al. show that branch coverage is the
best measure for test suite quality [1]
– Namin et al. show that test suite size and
coverage are correlated with test suite
effectiveness [2]
– Gopinath et al. investigate the correlation
between coverage and a test suite’s
effectiveness in killing mutants [3]
[1] M. Gligoric, A. Groce, C. Zhang, R. Sharma, M. A. Alipour, and D. Marinov. Comparing non-adequate
test suites using coverage criteria, ISSTA, 2013.
[2] A. S. Namin and J. H. Andrews. The influence of size and coverage on test suite effectiveness, ISSTA, 2009.
[3] R. Gopinath, C. Jensen, and A. Groce. Code coverage for suite evaluation for developers, ICSE, 2014.