A preliminary study on using code smells to improve bug localization

Authors: Aoi Takahashi, Natthawute Sae-Lim, Shinpei Hayashi, Motoshi Saeki
Tokyo Institute of Technology, Japan

Introduction
u Bug Localization (BL)
Ø The process of identifying the location(s) of a given bug
Ø BL can help developers to fix the bug
[Existing work] IR-based technique [1] etc...
[1] Jian Zhou et al. "Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports" ICSE 2012
[Figure: IR-based BL workflow. The textual similarity between a bug description (e.g., "Exception when saving to a readonly directory. …") and the source code is calculated, and the scoring yields a module list ranked by the textual similarity. The developer uses this list to produce the fixed source code.]

Example (IR-based BL)
Rank Class name Textual similarity
1 ActionSaveProject 0.179
2 AbstractFilePersister 0.174
3 ZipFilePersister 0.150
4 XmiFilePersister 0.150
5 UmlFilePersister 0.142
6 ProjectFilePersister 0.128
7 FileConstants 0.123
8 ActionSaveProjectAs 0.122
9 ActionOpenProject 0.121
10 ProjectBrowser 0.121
…
Bug description: Exception when saving to a readonly directory. …
Module list ranked by the textual similarity in ArgoUML (#3790)
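The ranking in this example can be approximated with a small vector space model. The following is a minimal sketch, assuming smoothed TF-IDF weights and cosine similarity; it is not the TraceLab setup used in the study, and the function names and the toy source snippets are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and keep alphabetic tokens."""
    spaced = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return [t for t in re.split(r"[^A-Za-z]+", spaced.lower()) if t]


def rank_by_textual_similarity(bug_description, modules):
    """Return (module name, cosine similarity) pairs, best match first.

    `modules` maps a module name to its source text.
    """
    docs = {name: Counter(tokenize(src)) for name, src in modules.items()}
    n_docs = len(docs)
    # Document frequency of each term over the module corpus.
    df = Counter(term for counts in docs.values() for term in counts)

    def weigh(counts):
        # Smoothed TF-IDF (an assumption); unseen terms get a finite weight.
        return {t: tf * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
                for t, tf in counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = weigh(Counter(tokenize(bug_description)))
    scores = {name: cosine(query, weigh(c)) for name, c in docs.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy source snippets (hypothetical, not taken from ArgoUML).
toy_modules = {
    "ActionSaveProject": "class ActionSaveProject { void saveProject(File directory) { } }",
    "ProjectBrowser": "class ProjectBrowser { void showWindow() { } }",
}
ranking = rank_by_textual_similarity(
    "Exception when saving to a readonly directory.", toy_modules)
```

In this toy input only `ActionSaveProject` shares a term ("directory") with the bug description, so it is ranked first. A real setup would typically add stemming and stop-word removal.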

Problem and Goal
u Problem of IR-based BL
Ø The likelihood of each module having a bug (fault-proneness) is often overlooked
u Goal of this paper
Ø To improve the accuracy of IR-based BL by utilizing fault-proneness
→ We use code smells to represent fault-proneness

Code Smell [2]
u An indicator of a design flaw in source code
p Examples
Ø God Class
p Severity [3] → How strong a given code smell is (integer value)
p Code smells have been found to be related to change- and fault-proneness [4]
[2] Fowler "Refactoring: Improving the Design of Existing Code" Addison-Wesley, 1999
[3] Marinescu "Assessing technical debt by identifying design flaws in software systems" IBM Journal of Research and Development, 2012
[4] Khomh et al. "An exploratory study of the impact of antipatterns on class change- and fault-proneness" Empirical Software Engineering 2012

Example (IR-based BL)
[Figure: the ranked module list for the bug description "Exception when saving to a readonly directory. …", with a callout on the list: "There are code smells (God Class, Blob Class)".]

Overview of Our Technique
[Figure: overview of the technique. IR-based BL scores the bug description against the source code and produces a module list ranked by the textual similarity. Code smell detection on the source code yields the severity of code smells. Scoring combines both into a module list ranked by the proposed metric.]

Proposed metric
u Bug Likelihood Index (BLI) (m: Class or Method)
$$\mathrm{BLI}(m) = (1 - \alpha) \times \mathrm{Sim}(m) + \alpha \times \mathrm{Sev}(m)$$
Ø Sim(m): the textual similarity (normalized)
Ø Sev(m): the sum of the severity (normalized)
p α: Weight between Sim and Sev (0 ≤ α ≤ 1)
Ø α = 0 → Same as IR-based BL
Ø A parameter of the proposed technique
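The metric can be sketched in a few lines. This is an illustration only: the slides say that both terms are normalized but not how, so min-max normalization is an assumption here, as are the function names, the treatment of smell-free modules as severity 0, and the toy scores.

```python
def min_max_normalize(scores):
    """Scale a {module: score} mapping linearly into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no module stands out, so use 0 everywhere.
        return {m: 0.0 for m in scores}
    return {m: (s - lo) / (hi - lo) for m, s in scores.items()}


def bug_likelihood_index(similarity, severity_sum, alpha):
    """Combine textual similarity and summed smell severity per module.

    similarity:   {module: textual similarity to the bug description}
    severity_sum: {module: sum of the severity of its code smells}
    alpha:        weight of the severity term, 0 <= alpha <= 1
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    # Assumption: a module without detected smells has severity 0.
    severity_full = {m: severity_sum.get(m, 0) for m in similarity}
    sim = min_max_normalize(similarity)
    sev = min_max_normalize(severity_full)
    return {m: (1 - alpha) * sim[m] + alpha * sev[m] for m in similarity}


# Hypothetical scores for three modules.
toy_similarity = {"A": 0.2, "B": 0.1, "C": 0.15}
toy_severity = {"B": 5, "C": 10}
bli = bug_likelihood_index(toy_similarity, toy_severity, alpha=0.5)
ranked = sorted(bli, key=bli.get, reverse=True)
```

With these toy values, module C overtakes module A once the severity term is weighted in, while `alpha=0` reproduces the ranking by textual similarity alone.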

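Proposed Technique
[Figure: the module list ranked by IR-based BL only, next to the list ranked by the proposed technique, with a callout: "There are code smells (God Class, Blob Class)". Only the following rows of the BLI ranking are recoverable.]
Rank Class name BLI
6 Modeller 0.479
8 Import 0.460
9 ParserDisplay 0.449
10 StylePanel 0.445
…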

Preliminary Evaluation
RQ1: Can our approach improve IR-based BL?
RQ2: What is the best parameter value of our approach?
p Experimental Setup
[Tools]
n TraceLab [5]: calculating the textual similarity
n inFusion: detecting code smells
[Dataset] four open source projects [6] (ArgoUML, JabRef, jEdit, muCommander)
[Metric] Mean Average Precision (MAP)
[5] Dit et al. "A TraceLab-based Solution for Creating, Conducting, and Sharing Feature Location Experiments" ICPC, 2012
[6] Dit et al. "Feature location in source code: a taxonomy and survey" Journal of Software: Evolution and Process, 2013
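For reference, MAP can be computed as below. This is a minimal sketch of the common definition (average precision divided by the number of buggy modules in the gold set, then averaged over bug descriptions); the slides do not spell out the exact variant used, and the function names and toy rankings are hypothetical.

```python
def average_precision(ranking, buggy):
    """Average precision of one ranked module list.

    ranking: module names, best candidate first
    buggy:   set of modules that were actually fixed (the gold set)
    """
    hits, precision_sum = 0, 0.0
    for position, module in enumerate(ranking, start=1):
        if module in buggy:
            hits += 1
            # Precision at the rank of each buggy module.
            precision_sum += hits / position
    return precision_sum / len(buggy) if buggy else 0.0


def mean_average_precision(queries):
    """Mean of the average precision over (ranking, buggy set) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision(r, b) for r, b in queries) / len(queries)


# Two hypothetical bug descriptions with their rankings and gold sets.
toy_queries = [
    (["A", "B", "C", "D"], {"A", "C"}),  # AP = (1/1 + 2/3) / 2
    (["B", "A"], {"A"}),                 # AP = 1/2
]
toy_map = mean_average_precision(toy_queries)
```

A ranking that places every buggy module at the top scores 1.0, so a higher MAP means developers inspect fewer irrelevant modules.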

RQ1: Can our approach improve IR-based BL? (Method level)
[Figure: bar chart of MAP (y-axis, 0 to 0.06) for IR-based BL and our technique on ArgoUML, JabRef, jEdit, and muCommander. Improvement labels, in chart order: 281%, 184%, 68%, 36%.]
u Our approach can improve it! (Avg. 142.25%)

RQ1: Can our approach improve IR-based BL? (Class level)
[Figure: bar chart of MAP (y-axis, 0 to 0.25) for IR-based BL and our technique on ArgoUML, JabRef, jEdit, and muCommander. Improvement labels, in chart order: 36%, 34%, 24%, 28%.]
u Our approach can improve it! (Avg. 30.5%)

RQ2: What is the best parameter value of our approach?
[Figure: MAP (y-axis, 0 to 0.15) against α (x-axis, 0 to 1 in steps of 0.1) for ArgoUML at the class level and the method level. α = 0 is the same as IR-based BL. The best point of MAP is marked on each curve.]

RQ2: What is the best parameter value of our approach? (all projects)
[Figure: MAP (y-axis) against α (x-axis, 0 to 1 in steps of 0.1) for ArgoUML, JabRef, jEdit, and muCommander, with the best α marked on each curve.]
u The average α is approximately 0.41
u Combining code smells with IR-based BL is better than IR-based BL alone
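The search for the best α can be sketched as a grid sweep: compute MAP for each α from 0 to 1 and keep the α with the highest MAP. The sketch below assumes that the similarity and severity scores are already normalized to [0, 1]; the function names and toy data are hypothetical, and this is not the evaluation script of the study.

```python
def average_precision(ranking, buggy):
    """Average precision of one ranking against the set of buggy modules."""
    hits, precision_sum = 0, 0.0
    for position, module in enumerate(ranking, start=1):
        if module in buggy:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(buggy) if buggy else 0.0


def sweep_alpha(bugs, severity, steps=10):
    """Evaluate MAP for alpha = 0, 1/steps, ..., 1.

    bugs:     list of (similarity, buggy) pairs, one per bug description,
              where similarity maps module -> normalized textual similarity
    severity: module -> normalized severity sum (missing modules count as 0)
    Returns (best alpha, best MAP, {alpha: MAP}); ties go to the smaller alpha.
    """
    if not bugs:
        raise ValueError("at least one bug description is required")
    results = {}
    for i in range(steps + 1):
        alpha = i / steps
        ap_values = []
        for similarity, buggy in bugs:
            score = {m: (1 - alpha) * s + alpha * severity.get(m, 0.0)
                     for m, s in similarity.items()}
            ranking = sorted(score, key=score.get, reverse=True)
            ap_values.append(average_precision(ranking, buggy))
        results[alpha] = sum(ap_values) / len(ap_values)
    best_alpha = max(results, key=results.get)
    return best_alpha, results[best_alpha], results


# Hypothetical data: the buggy module B is textually second but smelly.
toy_bugs = [({"A": 1.0, "B": 0.8, "C": 0.0}, {"B"})]
toy_severity = {"A": 0.0, "B": 1.0, "C": 0.5}
best_alpha, best_map, curve = sweep_alpha(toy_bugs, toy_severity)
```

In this toy case `curve[0.0]` is the MAP of IR-based BL alone, and the sweep shows the MAP rising once the severity term receives enough weight.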

Conclusion
u The first step of using code smells to improve BL
Ø Combining the severity of code smells with existing information from the IR-based approach
u Applying our technique to four open source projects
Ø Increasing MAP on both the class and method levels in all projects (method level: ↑142.25%, class level: ↑30.50% on average)

Introduction
p In large-scale software, many bugs occur daily
Ø These bugs are described in bug descriptions
p It is difficult to fix these bugs
Ø Developers have to identify the location(s) of a bug from a large amount of source code
u Bug Localization (BL)
Ø The process of identifying the location(s) of a given bug
[Existing work] IR-based technique [1] etc...
Example bug description, ArgoUML (#3790): Exception when saving to a readonly directory. …

Related Work
p Information Retrieval (IR) [1]
p Dynamic Analysis [7]
p Repository Mining [8]
p Combined Techniques [9]
u IR needs only a bug description and source code
u No existing research combines code smells with these techniques
[1] Zhou et al. "Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports" ICSE 2012
[7] Liu et al. "How Does Execution Information Help with Information-Retrieval Based Bug Localization?" ICPC 2017
[8] Tantithamthavorn et al. "Using co-change histories to improve bug localization performance" SNPD 2013
[9] Youm et al. "Improved bug localization based on code change histories and bug reports" Information and Software Technology 2017

Example of Code Smells
p God Class
Ø A class that does too much work
p Blob Class
Ø A class that has too many lines of code
p Duplicated Code (not used in this study)
Ø The same source code occurs more than once

How to detect God Class [10]
1. ATFD (Access To Foreign Data) > FEW
2. WMC (Weighted Method Count) > VERY HIGH
3. TCC (Tight Class Cohesion) < ONE THIRD
A class is a God Class when all three conditions hold (AND).
[10] Lanza and Marinescu "Object-oriented metrics in practice: using software metrics to characterize, evaluate, and improve the design of object-oriented systems" Springer Science & Business Media, 2007
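The detection rule is a conjunction of three metric thresholds, which can be sketched as a simple predicate. The slide names the thresholds only symbolically (FEW, VERY HIGH, ONE THIRD), so the numeric defaults for FEW and VERY HIGH below are illustrative assumptions, as are the class and function names; this is not the implementation inside inFusion.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    atfd: int    # Access To Foreign Data
    wmc: int     # Weighted Method Count
    tcc: float   # Tight Class Cohesion, in [0, 1]


# Illustrative threshold values (assumptions, not taken from the slides).
FEW = 5
VERY_HIGH = 47
ONE_THIRD = 1 / 3


def is_god_class(metrics, few=FEW, very_high=VERY_HIGH, one_third=ONE_THIRD):
    """True when all three conditions of the detection rule hold."""
    uses_foreign_data = metrics.atfd > few      # ATFD > FEW
    is_complex = metrics.wmc > very_high        # WMC > VERY HIGH
    is_not_cohesive = metrics.tcc < one_third   # TCC < ONE THIRD
    return uses_foreign_data and is_complex and is_not_cohesive


# Hypothetical metric values for two classes.
suspect = ClassMetrics(atfd=12, wmc=60, tcc=0.1)
cohesive = ClassMetrics(atfd=12, wmc=60, tcc=0.5)
```

Here `is_god_class(suspect)` holds, while `cohesive` fails the TCC condition. Keeping the thresholds as parameters makes it easy to calibrate them per language or project.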

Dataset
Each dataset consists of bug descriptions, source code, and a gold set.
Project       Version         Bug descriptions   Classes   Methods
ArgoUML       0.20 - 0.24     74                 1476      12131
jEdit         2.0 - 2.6       36                 374       2947
JabRef        4.2 - 4.3       86                 406       5276
muCommander   0.8.0 - 0.8.5   81                 529       3916

Future Work
p More Preliminary Evaluation
Ø Increase the number of projects
Ø Combine code smells with other BL techniques
p Detailed Analysis
Ø Which code smells influence IR-based BL?
Ø In what cases is our technique effective?
Ø Is it necessary to use code smells? → We need to verify the software metrics
