Identifying Bug-Prone API Methods using Crowdsourced Knowledge
1. IDENTIFICATION OF BUG-PRONE API METHODS USING CROWDSOURCED KNOWLEDGE
Mohammad Masudur Rahman
Department of Computer Science
University of Saskatchewan, Canada
CMPT-842: Mobile and Cloud Computing
Course Instructor: Dr. Ralph Deters
2. AN EXAMPLE BUGGY CODE!
7 API classes from 2 packages
7 Constructors
7 API method invocations
Fig: Zip file creation
4. GOOD NEWS: STACK OVERFLOW!
Launched in 2008
4M users; 10M questions; 21M answers
A massive body of information: programming languages, code examples, API issues & bugs, relevant knowledge
6. OUTLINE OF THE TALK
Stack Overflow Q & A
Exploratory study: 2 research questions, API method invocation database
BRACK
Evaluation using 8 systems
Validation with 2 studies
Take-home messages
8. EXPLORATORY STUDY: CONSTRUCTION OF
API METHOD INVOCATION DATABASE
Phase 1: SO Q&A threads → preprocessing → topic modeling → bug/error-related topics → bug/error-related threads (165,580)
Phase 2: SO Q & A thread → defective code & rectified code → island parsing → defective method calls & corrected method calls → API invocation database (49,425)
9. EXPLORATORY STUDY: RESEARCH QUESTIONS
RQ1: Are programming issues, errors or exceptions
reported at Stack Overflow frequently associated with
API method invocations?
RQ2: Are certain APIs and their methods more prone to
programming errors or bugs than the others?
12. EXPLORATORY STUDY SUMMARY
Programming issues, errors or exceptions reported at Stack Overflow are frequently associated with API method invocations.
Some APIs and their methods are more prone to programming errors or bugs than others.
14. BRACK: API BUG-PRONENESS
HEURISTICS—H1
API Context-Susceptibility (ACS)
Defective code
Dependency of an API invocation on its context
Context can alter the expected behaviour of the invocation
ACS estimates how vulnerable an API method invocation (e.g., BufferedReader.readLine()) is to its context
Based on programming errors reported at Stack Overflow
15. BRACK: API BUG-PRONENESS
HEURISTICS—H2
API Error-Associativity (AEA)
Code segments from bug-related Q & A of SO
AEA calculates the co-occurrence of an API method invocation in both defective and rectified code segments
Defective code
Rectified code
16. BRACK: API BUG-PRONENESS RANKING
Input: Defective code
Defective code → island parsing → API invocations → heuristic collector (using the API invocation database) → bug-proneness score calculator → bug-proneness ranking → bug-prone API method invocations
Output: Ranked bug-prone API method invocations
Detailed algorithm in the paper.
18. EXPERIMENTAL DESIGN
8 OSS systems; 3,821 bug-fixing commits; bug reports
Island parsing
Test cases & gold set → evaluation & validation
19. EXPERIMENT: RESEARCH QUESTIONS
RQ1: How does BRACK perform in identifying bug-prone API method
invocations from a given code segment?
RQ2: How effective are those heuristics, ACS and AEA, in identifying bug-prone API method invocations?
RQ3: Does BRACK show any bias to particular subject systems or API packages in such identification?
RQ4: Is BRACK comparable to the state-of-the-art in identifying bug-prone API method invocations from buggy code?
20. PERFORMANCE: ANSWER TO RQ1
Metric Top-3
Top-3 Accuracy 75.93%
Mean Reciprocal Rank@3 0.47
Mean Average Precision@3 59.04%
Mean Recall@3 34.44%
Fig: Performance for different Top-K
21. EFFECTIVENESS: ANSWER TO RQ2
Metric ACS (H1) AEA (H2) Combined (H1+H2)
Top-3 Accuracy 75.54% 61.77% 75.93%
MRR@3 0.47 0.44 0.47
MAP@3 58.47% 51.47% 59.04%
MR@3 33.18% 21.20% 34.44%
ACS is found to be more effective than AEA
Combination marginally improves the performance
Detailed analysis in the paper.
22. BIAS: ANSWER TO RQ3
Metric Small Systems (4) Medium Systems (4)
Top-3 Accuracy 77.23% 74.63%
MRR@3 0.50 0.44
MAP@3 61.41% 56.65%
MR@3 34.95% 33.93%
Small systems: <150 commits; medium systems: >400 commits.
MWU test on Top-3 accuracy: p-value = 0.75 > 0.05; the performance difference is NOT significant
Similar findings about API packages (in the paper)
23. STATE-OF-THE-ART
Chen and Kim, FSE 2015
Detects defective code in Stack Overflow and suggests
corresponding rectified code.
Subject to the availability of code clones.
Kim et al. FSE 2015
Applies 28 source code metrics and 12 software
process metrics.
Random Forest based machine learning classifier.
Limited generalizability.
25. THREATS TO VALIDITY
Internal Validity: Replication of existing studies in
our environment.
Best performing settings applied.
External Validity: Generalization of BRACK.
API invocation convention similar across various
languages.
Construct Validity: Appropriateness of the
performance metrics.
Metrics taken from existing literature.
Bias in gold set: Overlapping method invocation
assumption
JDK bug fixing history should be added.
Hello everyone!
My name is Mohammad Masudur Rahman
I am a 2nd-year PhD student at the University of Saskatchewan, Canada.
Today, I am going to talk about an automated technique for identifying bug-prone API methods from a given buggy code.
Let's take a look at this code. This code compiles, runs without any error, and produces a zip file.
The only problem is that the zip file is corrupted, which means the code is buggy.
Now this code contains 7 API classes, 7 constructors and 7 method invocations.
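The slides do not reproduce the snippet itself, so here is a minimal sketch of how such a failure can arise. This is my own reconstruction, under the assumption that the corruption comes from a never-closed stream; it is not the actual code from the talk:

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipExample {

    // Write one entry; when closeStream is false, the stream is never closed,
    // so the zip's central directory is never written and the file is corrupted.
    public static void writeZip(Path target, boolean closeStream) throws IOException {
        ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(target.toFile()));
        zos.putNextEntry(new ZipEntry("hello.txt"));
        zos.write("hello".getBytes(StandardCharsets.UTF_8));
        zos.closeEntry();
        if (closeStream) {
            zos.close(); // forgetting this line is the hypothetical bug
        }
    }

    // A corrupted archive makes ZipFile throw, even though writing "succeeded".
    public static boolean isReadable(Path target) {
        try (ZipFile zf = new ZipFile(target.toFile())) {
            return zf.getEntry("hello.txt") != null;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path good = Files.createTempFile("good", ".zip");
        Path bad = Files.createTempFile("bad", ".zip");
        writeZip(good, true);
        writeZip(bad, false);
        System.out.println("closed: " + isReadable(good) + ", unclosed: " + isReadable(bad));
    }
}
```

Without `close()` (or `finish()`), the deflated data and the central directory are never flushed, so a file is produced but readers reject it: exactly the kind of bug that compiles and runs silently.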
A developer’s responsibility is to debug this code line by line, check different parameter values and check for suspicious patterns.
Now, a debugging context could be bigger and might involve more API invocations.
Now, if there exists a tool that can predict which API invocations are more bug-prone, that could be a very helpful information for the developer during debugging.
Then the developer's inspection could be small but effective.
Our project provides exactly this type of support to the developer.
Now the task is not easy. Such prediction about API methods involves several challenges.
First: lack of sufficient and reliable sources for such information
-- No repository provides direct info on API method bug-proneness.
--API documentation does not contain such info, they just explain the simple usages.
-- Bug reports are a possible alternative source, but they might not be sufficient. Because, just from bug-report, one cannot simply determine which API methods are responsible for the bug.
Second: knowledge of API bug-proneness comes from long work experience; it cannot be learned overnight.
So, this knowledge is not trivial and cannot be gained quickly.
Good news is Stack Overflow. It’s a programming Q & A site launched in 2008.
It contains a massive body of relevant information for our task.
It has 4M registered users, 10 million questions, and 20M+ answers.
The questions are mostly related to programming languages such as Java, C#, Javascript, PHP, Android and so on.
The questions and the answers contain thousands of code examples.
Most importantly, they discuss various API issues, errors and bugs, which can be mined to provide support to developers.
Now let's take a look at this buggy code example related to the Java reflection API.
The question shows the defective code, and the invoke method is the source of the bug or error.
Then, in the rectified code, that error is corrected by another developer from the community, and this is the accepted answer.
Now, if we can collect such defective and rectified code segment pairs, and find that the same API invocations cause errors in various contexts,
then that suggests the target API invocation is bug-prone, i.e., prone to errors, misunderstanding, or confusion.
This is the outline of today's talk.
I would first discuss our exploratory study.
Then, based on the findings, we propose our technique, BRACK, for bug-prone method identification.
Then we discuss our experiments, evaluation and validations.
And then we finally conclude with discussions.
Now, this is what we do during the exploratory study.
Since we are interested in API errors and bugs, we collect bug/error-related questions from Stack Overflow.
For that we collect 500K question titles, perform natural language preprocessing and then perform topic modeling on them using LDA.
This provides a list of 200 topics from which we manually analyze and select 48 topics related to programming errors and bugs.
Then we separate questions discussing those topics—we got 165K questions like that.
Then in the second phase, we analyze each of those bug related questions and answers, and extract the defective and rectified code segments.
We then perform island parsing on the code segments, and extract the API method invocations.
Based on our observation, we conjecture that the invocations that overlap between defective and rectified code are likely connected to the bug. So, we store all the invocations from both code segments and develop an API invocation database.
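A rough sketch of these two steps, extracting invocations via a simplified island parse and keeping the overlap between defective and rectified code, might look like this. The regex and class names here are illustrative assumptions; the paper uses a proper island grammar:

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class InvocationExtractor {

    // Simplified "island parsing": skip everything except receiver.method( islands.
    // A real island grammar is far more robust than this regex approximation.
    private static final Pattern CALL =
            Pattern.compile("\\b(\\w+)\\s*\\.\\s*([a-z]\\w*)\\s*\\(");

    public static Set<String> extract(String code) {
        Set<String> calls = new LinkedHashSet<>();
        Matcher m = CALL.matcher(code);
        while (m.find()) {
            calls.add(m.group(1) + "." + m.group(2));
        }
        return calls;
    }

    // Invocations present in BOTH the defective and the rectified segment are
    // the ones the study conjectures to be connected to the reported bug.
    public static Set<String> overlap(String defective, String rectified) {
        Set<String> common = new LinkedHashSet<>(extract(defective));
        common.retainAll(extract(rectified));
        return common;
    }

    public static void main(String[] args) {
        String defective = "Method m = c.getMethod(\"run\"); m.invoke(null);";
        String rectified = "Method m = c.getMethod(\"run\"); m.invoke(obj);";
        System.out.println(overlap(defective, rectified)); // [c.getMethod, m.invoke]
    }
}
```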
Then in the exploratory study, we ask two research questions.
Are programming issues or errors related to API method invocations?
--If yes, then our support will make sense.
Do different API classes/methods have different levels of bug-proneness?
--If yes, then a ranking of bug-proneness will make sense.
We analyze the API invocation database to answer these research questions.
Now, this is the frequency distribution of the API invocations in the bug-related questions.
From both the probability mass function and the cumulative density function, it's clear that the distribution is heavy-tailed.
That means a small number of Q & A threads contain most of the density.
From the box plot, we can see a median invocation frequency of 3.
More importantly, the overlapped invocation frequency between defective and rectified code is close to 2.
So, yes, API invocations are pretty much associated with programming errors and bugs.
The JDK contains about 3K classes spread over various packages, and different packages have different numbers of API classes.
To determine package-level relative bug-proneness, we thus randomly choose 20 API classes from each package.
Then we determine their API method invocation frequency from the database we developed.
We repeated this random selection and counting process 10 times and obtained these statistics.
This shows that API classes from different packages have different proneness to errors or bugs.
In this case, we found Java IO and SQL classes have the maximum proneness to errors.
So, we can summarize the findings from the exploratory study.
--Programming errors/bugs are associated with API method invocations.
--Some APIs and their classes are more bug-prone than others.
Based on these exploratory findings, we propose our technique, BRACK, which identifies bug-prone API methods using crowdsourced knowledge.
Now, we use two heuristics to capture bug-proneness of an API method invocation.
For example, let's look at this buggy code; it throws a NullPointerException.
Now, if you consider these two invocations, which one is likely to cause such an exception? Obviously this one, right?
That's because it is heavily dependent on this context: the other API invocations.
We capture this concept as API Context-Susceptibility.
That means how vulnerable an invocation is to errors due to its context, i.e., the surrounding API invocations.
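As a concrete illustration, ACS could be operationalized like this. The formula below is my own simplified assumption, not necessarily the one in the paper:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ContextSusceptibility {

    // Hypothetical scoring (the paper's exact formula may differ): among the
    // defective Stack Overflow segments that contain the target invocation,
    // the fraction that also contain at least one invocation from the current
    // context. Higher means the target fails together with this context more often.
    public static double acs(String target, Set<String> context,
                             List<Set<String>> defectiveSegments) {
        int withTarget = 0;
        int withTargetAndContext = 0;
        for (Set<String> segment : defectiveSegments) {
            if (!segment.contains(target)) {
                continue;
            }
            withTarget++;
            for (String c : context) {
                if (segment.contains(c)) {
                    withTargetAndContext++;
                    break;
                }
            }
        }
        return withTarget == 0 ? 0.0 : (double) withTargetAndContext / withTarget;
    }

    public static void main(String[] args) {
        List<Set<String>> segments = Arrays.<Set<String>>asList(
                new HashSet<>(Arrays.asList("BufferedReader.readLine", "FileReader.read")),
                new HashSet<>(Arrays.asList("BufferedReader.readLine")));
        Set<String> context = new HashSet<>(Arrays.asList("FileReader.read"));
        System.out.println(acs("BufferedReader.readLine", context, segments)); // 0.5
    }
}
```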
Another heuristic we consider is called API Error-Associativity: how likely an invocation is to be associated with an error.
On Stack Overflow, we saw that API invocations appearing in the defective code and repeated in the rectified code are mostly associated with the reported error.
So, this heuristic calculates such occurrences from Stack Overflow code segments.
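A minimal sketch of that calculation, under the simplifying assumption that AEA is just the fraction of threads where the invocation survives from defective to rectified code:

```java
import java.util.Arrays;
import java.util.List;

public class ErrorAssociativity {

    // Hypothetical scoring (the paper's exact formula may differ): the fraction
    // of Q & A threads whose defective AND rectified code segments both contain
    // the invocation. Each thread is modeled as {defectiveCode, rectifiedCode}.
    public static double aea(String invocation, List<String[]> threads) {
        if (threads.isEmpty()) {
            return 0.0;
        }
        int both = 0;
        for (String[] thread : threads) {
            if (thread[0].contains(invocation) && thread[1].contains(invocation)) {
                both++;
            }
        }
        return (double) both / threads.size();
    }

    public static void main(String[] args) {
        List<String[]> threads = Arrays.asList(
                new String[] {"m.invoke(null)", "m.invoke(obj)"},
                new String[] {"m.invoke(null)", "r.run()"});
        System.out.println(aea("m.invoke", threads)); // 0.5
    }
}
```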
The next steps are pretty much straightforward.
So, for an input buggy code, we perform island parsing, extract the API invocations and collect those two heuristics for each of the invocations.
We then produce a bug-proneness score based on those heuristics as well as code contextual similarity for each invocation.
Then we rank those invocations based on bug-proneness, and recommend the Top-3 invocations.
As mentioned, besides the heuristics, we consider code contextual similarity.
When we calculate heuristics of the invocations from SO code, we also determine code similarity between input code and the defective code.
Thus, our bug-proneness is based on two heuristics and the contextual similarity.
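Putting the pieces together, the ranking step could be sketched as follows. The equal-weight sum is an illustrative assumption on my part; the actual score calculator is detailed in the paper:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BugPronenessRanker {

    // Hypothetical combination (the paper's weights and algorithm may differ):
    // score = ACS + AEA + contextual similarity.
    static double score(double[] features) {
        return features[0] + features[1] + features[2];
    }

    // features: invocation -> {acs, aea, contextualSimilarity};
    // returns the Top-3 invocations by descending bug-proneness score.
    public static List<String> rankTop3(Map<String, double[]> features) {
        List<String> invocations = new ArrayList<>(features.keySet());
        invocations.sort(Comparator.comparingDouble(
                (String inv) -> score(features.get(inv))).reversed());
        return invocations.subList(0, Math.min(3, invocations.size()));
    }

    public static void main(String[] args) {
        Map<String, double[]> features = new LinkedHashMap<>();
        features.put("ZipOutputStream.close", new double[] {0.9, 0.8, 0.5});
        features.put("FileOutputStream.write", new double[] {0.1, 0.2, 0.1});
        features.put("ZipEntry.setSize", new double[] {0.5, 0.5, 0.5});
        features.put("String.format", new double[] {0.0, 0.0, 0.0});
        System.out.println(rankTop3(features)); // highest combined scores first
    }
}
```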
Now, this is how we design our experiment.
We consult 8 open-source software systems and their bug reports.
Then we collect the bug-fixing commits and apply island parsing to the diff of each commit.
This provides the test cases and the gold set, which are used for evaluation and validation.
In our experiment we ask these four research questions.
How does our technique perform in identifying the bug-prone API invocations in terms of traditional performance metrics?
How effective are our proposed heuristics?
Does it show any bias to subject systems or API packages?
How does it perform compared to the state-of-the-art?
Well, this is our performance.
For Top-3 recommendation, we get 76% accuracy with 59% precision, which is quite promising.
When we check various Top-K values, we see that accuracy and precision rise logarithmically.
The recall is a bit low at 35%, but 76% accuracy still shows promise.
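For reference, the Top-K metrics quoted above follow their standard definitions, which can be sketched as below; this is generic metric code, not the paper's evaluation harness:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

public class TopKMetrics {

    // Top-K accuracy: fraction of test cases where at least one gold invocation
    // appears within the top K recommendations.
    public static double topKAccuracy(List<List<String>> ranked,
                                      List<Set<String>> gold, int k) {
        int hits = 0;
        for (int i = 0; i < ranked.size(); i++) {
            List<String> top = ranked.get(i)
                    .subList(0, Math.min(k, ranked.get(i).size()));
            for (String inv : top) {
                if (gold.get(i).contains(inv)) {
                    hits++;
                    break;
                }
            }
        }
        return (double) hits / ranked.size();
    }

    // Reciprocal rank of the first gold hit within the top K (0 if none);
    // MRR@K is this value averaged over all test cases.
    public static double rrAtK(List<String> ranked, Set<String> gold, int k) {
        for (int i = 0; i < Math.min(k, ranked.size()); i++) {
            if (gold.contains(ranked.get(i))) {
                return 1.0 / (i + 1);
            }
        }
        return 0.0;
    }

    public static void main(String[] args) {
        List<String> ranked = Arrays.asList("a.x", "b.y", "c.z");
        System.out.println(rrAtK(ranked, Collections.singleton("b.y"), 3)); // 0.5
    }
}
```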
When we consider the heuristics, we found Context-Susceptibility to be more effective.
The second heuristic marginally improves the performance, which justifies their combination in our ranking algorithm.
We then divide our subject systems into two groups: small systems with fewer than 150 commits and medium systems with more than 400 commits in our dataset.
These are average performance for both groups. Interestingly, we see their performance is pretty much similar.
From, the statistical tests, we also found that their performance is not significantly different.
We also found similar findings for API packages.
So, based on our experiments, our technique does not show bias to any subject system or API packages.
Then we compare with 2 existing systems.
The first one applies code clone detection on SO defective and rectified codes, and returns the rectified code as solution.
-- This is limited, because Stack Overflow needs to contain the matching code clones in this case.
The second study applies machine learning on source code and process metrics to determine bug-proneness of API classes.
Now, these are the findings.
We see that for each of the subject systems, our proposed technique provides noticeably better results, especially in accuracy.
The closest competitor is Kim et al., the technique based on metrics and machine learning.
Then when we consider the box plots, we see our performance is significantly higher than the state-of-the-art.
The recall is a bit lower.
But, still, the experiment demonstrates the potential of our technique.
We also identified a few threats to the validity of our findings.
Replication of the existing systems. We used their best settings for experiment.
Generalization of our technique. The API invocation convention is pretty much similar for various languages. We did for Java language, but it can be done for other languages as well.
Use of appropriate metrics: yes, we used metrics from the relevant literature, such as precision and recall, so they are appropriate.
Bias in gold set: Yes, there might be some bias in gold set development, but we are working on it.
So, to summarize, we propose a technique that identifies bug-prone API method invocations from a buggy code.
We used defective and rectified code from SO, developed an invocation database, and answered 2 research questions.
Then we proposed BRACK and conducted experiments using bug-fixing commits from OSS projects.
Then we evaluated and validated against the state-of-the-art.
All findings suggest that our technique has potential.
That’s all I have to say.
Thanks for your attention. Questions?