A Preliminary Study on Using Code Smells to Improve Bug Localization
Authors:
Aoi Takahashi
Natthawute Sae-Lim
Shinpei Hayashi
Motoshi Saeki
Tokyo Institute of Technology, Japan
Introduction
u Bug Localization (BL)
Ø The process of identifying the location(s) of a given bug
Ø BL can help developers to fix the bug
[Existing work] IR-based technique [1] etc...
[1] Jian Zhou et al. "Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports" ICSE 2012
[Figure: IR-based BL workflow. The textual similarity between a bug description (e.g., "Exception when saving to a readonly directory. …") and the source code is calculated and used for scoring. The result is a module list ranked by the textual similarity, which the developer uses to produce the fixed source code.]
Example (IR-based BL)
Bug description in ArgoUML (#3790): Exception when saving to a readonly directory. …
Module list ranked by the textual similarity
Rank  Class name             Textual similarity
1     ActionSaveProject      0.179
2     AbstractFilePersister  0.174
3     ZipFilePersister       0.150
4     XmiFilePersister       0.150
5     UmlFilePersister       0.142
6     ProjectFilePersister   0.128
7     FileConstants          0.123
8     ActionSaveProjectAs    0.122
9     ActionOpenProject      0.121
10    ProjectBrowser         0.121
11    ...
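The ranking step above can be sketched in a few lines. This is only an illustration of the general idea of IR-based BL, assuming a TF-IDF vector space model with cosine similarity; the slides do not state which IR model was used. The function names and the two toy module bodies are hypothetical and are not actual ArgoUML code.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split prose and camelCase identifiers into lowercase word tokens."""
    return [w.lower() for w in re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", text)]


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: c * math.log(1 + n / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(
        sum(w * w for w in b.values())
    )
    return dot / norm if norm else 0.0


def rank_modules(bug_description, modules):
    """modules maps a module name to its source text.

    Returns [(name, similarity)] sorted from most to least similar.
    """
    names = list(modules)
    docs = [tokenize(modules[n]) for n in names] + [tokenize(bug_description)]
    vectors = tfidf_vectors(docs)
    query = vectors[-1]
    scored = [(name, cosine(query, vec)) for name, vec in zip(names, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy input (invented source text, for illustration only).
toy_modules = {
    "ActionSaveProject": "class ActionSaveProject { void saveProject(File directory) "
    "{ /* save file, report exception */ } }",
    "ProjectBrowser": "class ProjectBrowser { void showWindow() { /* layout panels */ } }",
}
ranking = rank_modules("Exception when saving to a readonly directory.", toy_modules)
```

With this toy input, the module that shares words with the bug description is ranked first, which is all that textual similarity can see; nothing about fault-proneness enters the score.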
Problem and Goal
u Problem of IR-based BL
Ø The likelihood of each module having a bug (fault-proneness) is often overlooked
u Goal of this paper
Ø To improve the accuracy of IR-based BL by utilizing fault-proneness
→ We use code smells to represent fault-proneness
Code Smell [2]
u An indicator of a design flaw in source code
p Examples
Ø God Class
p Severity [3] → How strong a given code smell is (integer value)
p Code smells have been found to be related to change- and fault-proneness [4]
[2] Fowler "Refactoring: Improving the Design of Existing Code" Addison-Wesley, 1999
[3] Marinescu "Assessing technical debt by identifying design flaws in software systems" IBM Journal of Research and Development, 2012
[4] Khomh et al. "An exploratory study of the impact of antipatterns on class change- and fault-proneness" Empirical Software Engineering, 2012
Example (IR-based BL)
Ø Same bug description and ranked module list as in the previous example
Ø There are code smells (God Class, Blob Class) in the ranked list
Overview of Our Technique
[Figure: Two scoring paths are combined. (1) Code smell detection on the source code, scored by the severity of code smells. (2) IR-based BL on the bug description (e.g., "Exception when saving to a readonly directory. …") and the source code, giving a module list ranked by the textual similarity. The output is a module list ranked by the proposed metric.]
Proposed metric
u Bug Likelihood Index (BLI) (m: Class or Method)
$\mathrm{BLI}(m) = (1 - \alpha) \times \mathrm{Sim}(m) + \alpha \times \mathrm{Sev}(m)$
Ø BLI(m): Proposed metric
Ø Sim(m): Textual similarity (normalized)
Ø Sev(m): The sum of the severity (normalized)
p α: Weight between Sim and Sev (0 ≤ α ≤ 1)
Ø α = 0 → Same as IR-based BL
Ø A parameter of the proposed technique
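A minimal sketch of how the metric could be computed: a weighted combination, with weight α, of the normalized textual similarity and the normalized sum of code smell severity. The slides only say "normalized", so min-max scaling is an assumption here, as are the function names. The similarity values below come from the example ranking; the severity sums are invented for illustration.

```python
def min_max_normalize(scores):
    """Scale raw scores into [0, 1]. Assumption: min-max scaling."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {k: 0.0 for k in scores}
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}


def bug_likelihood_index(similarity, severity, alpha):
    """Combine textual similarity and severity sum per module.

    similarity: {module: textual similarity}
    severity:   {module: sum of code smell severity}; missing modules count as 0
    alpha:      weight in [0, 1]; alpha = 0 reproduces the IR-based ranking
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    sim = min_max_normalize(similarity)
    sev = min_max_normalize({m: severity.get(m, 0) for m in similarity})
    return {m: (1 - alpha) * sim[m] + alpha * sev[m] for m in similarity}


def rank_by_bli(similarity, severity, alpha):
    """Module names sorted by descending BLI (ties broken by name)."""
    bli = bug_likelihood_index(similarity, severity, alpha)
    return sorted(bli, key=lambda m: (-bli[m], m))


# Textual similarity from the example; severity sums are hypothetical.
similarity = {
    "ActionSaveProject": 0.179,
    "AbstractFilePersister": 0.174,
    "ProjectBrowser": 0.121,
}
severity = {"ProjectBrowser": 12, "ActionSaveProject": 3}

ir_only = rank_by_bli(similarity, severity, alpha=0.0)
smell_only = rank_by_bli(similarity, severity, alpha=1.0)
```

Setting `alpha=0.0` leaves the IR-based order unchanged, while larger values let a smelly class with low textual similarity move up the list.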
Proposed Technique
IR-based BL only
Rank  Class name             Textual similarity
1     ActionSaveProject      0.179
2     AbstractFilePersister  0.174
3     ZipFilePersister       0.150
4     XmiFilePersister       0.150
5     UmlFilePersister       0.142
6     ProjectFilePersister   0.128
7     FileConstants          0.123
8     ActionSaveProjectAs    0.122
9     ActionOpenProject      0.121
10    ProjectBrowser         0.121
11    ...
Proposed technique
Rank  Class name             BLI
1     ProjectBrowser         0.674
2     ActionSaveProject      0.580
3     AbstractFilePersister  0.563
4     ZipFilePersister       0.489
5     XmiFilePersister       0.488
6     Modeller               0.479
7     UmlFilePersister       0.461
8     Import                 0.460
9     ParserDisplay          0.449
10    StylePanel             0.445
11    ...
There are code smells (God Class, Blob Class)
Preliminary Evaluation
RQ1: Can our approach improve IR-based BL?
RQ2: What is the best parameter value of our approach?
p Experimental Setup
[Tools]
n TraceLab [5]: calculating the textual similarity
n inFusion: detecting code smells
[Dataset] four open source projects [6] (ArgoUML, JabRef, jEdit, muCommander)
[Metric] Mean Average Precision (MAP)
[5] Dit et al. "A TraceLab-based Solution for Creating, Conducting, and Sharing Feature Location Experiments" ICPC, 2012
[6] Dit et al. "Feature location in source code: a taxonomy and survey" Journal of Software: Evolution and Process, 2013
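For reference, MAP can be sketched as follows. This uses the common textbook definition, in which the average precision of one bug report is taken over the positions of its buggy modules and divided by the size of the gold set; the slides do not spell out which variant was used, and the function names and toy rankings are hypothetical.

```python
def average_precision(ranked, gold):
    """Average precision of one ranked module list against its gold set."""
    hits = 0
    precision_sum = 0.0
    for position, module in enumerate(ranked, start=1):
        if module in gold:
            hits += 1
            precision_sum += hits / position  # precision at this hit
    return precision_sum / len(gold) if gold else 0.0


def mean_average_precision(queries):
    """queries: list of (ranked module list, gold set), one per bug report."""
    if not queries:
        return 0.0
    return sum(average_precision(r, g) for r, g in queries) / len(queries)


# Two toy bug reports with invented rankings and gold sets.
toy_queries = [
    (["A", "B", "C", "D"], {"A", "C"}),  # AP = (1/1 + 2/3) / 2
    (["B", "A"], {"A"}),                 # AP = 1/2
]
toy_map = mean_average_precision(toy_queries)
```

A higher MAP means the buggy modules tend to appear nearer the top of the ranked list across all bug reports.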
RQ1: Can our approach improve IR-based BL?
Method level
[Figure: Bar chart of MAP (y-axis, 0 to 0.06) comparing IR-based BL and our technique for ArgoUML, JabRef, jEdit, and muCommander. Improvement labels in that order: 281%, 184%, 68%, 36%.]
u Our approach can improve it! (Avg. 142.25%)
RQ1: Can our approach improve IR-based BL?
Class level
[Figure: Bar chart of MAP (y-axis, 0 to 0.25) comparing IR-based BL and our technique for ArgoUML, JabRef, jEdit, and muCommander. Improvement labels in that order: 36%, 34%, 24%, 28%.]
u Our approach can improve it! (Avg. 30.5%)
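The reported averages can be checked from the per-project percentages for RQ1. The short sketch below assumes that each percentage is the relative MAP gain over IR-based BL, which the slides imply but do not define; `relative_improvement` is a hypothetical helper.

```python
from statistics import mean


def relative_improvement(baseline_map, new_map):
    """Relative MAP gain in percent (assumed definition of the labels)."""
    return (new_map - baseline_map) / baseline_map * 100


# Per-project improvement labels, in the order shown on the charts.
method_level_gains = [281, 184, 68, 36]
class_level_gains = [36, 34, 24, 28]

method_level_avg = mean(method_level_gains)  # 142.25
class_level_avg = mean(class_level_gains)    # 30.5
```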
RQ2: What is the best parameter value of our approach?
[Figure: Line chart for ArgoUML plotting MAP (y-axis, 0 to 0.15) against α (x-axis, 0 to 1 in steps of 0.1) at class level and method level. α = 0 is the same as IR-based BL. The best point of MAP is annotated with its α value for each level.]
RQ2: What is the best parameter value of our approach?
u The average α is approximately 0.41
u Combining code smells with IR-based BL is better than IR-based BL alone
[Figure: Four line charts (ArgoUML, JabRef, jEdit, muCommander), each plotting MAP (y-axis) against α (x-axis, 0 to 1 in steps of 0.1) at class level and method level, with the best α annotated on each curve.]
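The search for the best parameter value can be sketched as a simple sweep: compute MAP for each α on a grid and keep the α with the highest MAP. The grid step of 0.01, the min-max normalization, the function names, and the toy data are all assumptions made for illustration, not the authors' setup.

```python
def normalize(scores):
    """Min-max scale scores into [0, 1] (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    return {k: (v - lo) / (hi - lo) if hi > lo else 0.0 for k, v in scores.items()}


def rank(similarity, severity, alpha):
    """Module names ordered by the weighted combination of both scores."""
    sim = normalize(similarity)
    sev = normalize({m: severity.get(m, 0) for m in similarity})
    bli = {m: (1 - alpha) * sim[m] + alpha * sev[m] for m in similarity}
    return sorted(bli, key=lambda m: (-bli[m], m))


def average_precision(ranked, gold):
    hits, total = 0, 0.0
    for position, module in enumerate(ranked, start=1):
        if module in gold:
            hits += 1
            total += hits / position
    return total / len(gold) if gold else 0.0


def map_at(alpha, bugs, severity):
    """MAP over all bug reports for one value of alpha."""
    return sum(
        average_precision(rank(sim, severity, alpha), gold) for sim, gold in bugs
    ) / len(bugs)


def best_alpha(bugs, severity, steps=100):
    """Grid search over alpha in [0, 1]; ties resolve to the smallest alpha."""
    alphas = [i / steps for i in range(steps + 1)]
    return max(alphas, key=lambda a: map_at(a, bugs, severity))


# Hypothetical data: (textual similarity per module, gold set) per bug report.
toy_severity = {"A": 0, "B": 8, "C": 2}
toy_bugs = [
    ({"A": 0.9, "B": 0.5, "C": 0.1}, {"B"}),
    ({"A": 0.2, "B": 0.3, "C": 0.8}, {"C"}),
]
chosen = best_alpha(toy_bugs, toy_severity)
```

In this toy setup an intermediate α beats both extremes: relying only on textual similarity misses the smelly buggy module, while relying only on severity demotes the module that the bug description matches well.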
Conclusion
u The first step of using code smells to improve BL
Ø Combining the severity of code smells with existing information from the IR-based approach
u Applying our technique to four open source projects
Ø Increasing MAP on both the class level (↑30.50%) and the method level (↑142.25%) in all projects
APPENDIX
Introduction
p In large-scale software, many bugs occur daily
Ø These bugs are described in bug descriptions
p It is difficult to fix these bugs
Ø Developers have to identify the location(s)
of a bug from a large amount of source code
u Bug Localization (BL)
Ø The process of identifying the location(s) of a given bug
[Existing work] IR-based technique [1] etc...
Example bug description in ArgoUML (#3790): Exception when saving to a readonly directory. …
[1] Jian Zhou et al. "Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports" ICSE 2012
Related Work
p Information Retrieval (IR) [1]
p Dynamic Analysis [7]
p Repository Mining [8]
p Combining Technique [9]
u IR needs only a bug description and source code
u No existing research combines code smells with existing BL techniques
[1] Zhou et al. "Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports" ICSE 2012
[7] Liu et al. "How Does Execution Information Help with Information-Retrieval Based Bug Localization?" ICPC 2017
[8] Tantithamthavorn et al. "Using co-change histories to improve bug localization performance" SNPD 2013
[9] Youm et al. "Improved bug localization based on code change histories and bug reports" Information and Software Technology 2017
Example of Code Smells
p God Class
Ø A class that does too much work
p Blob Class
Ø A class that has too many lines of code
p Duplicated Code (not used in this study)
Ø The same source code occurs more than once
How to detect God Class [10]
1. ATFD (Access To Foreign Data) > FEW
2. WMC (Weighted Method Count) > VERY HIGH
3. TCC (Tight Class Cohesion) < ONE THIRD
→ God Class if all three conditions hold (AND)
[10] Lanza and Marinescu "Object-oriented metrics in practice: using software metrics to characterize, evaluate, and improve the design of object-oriented systems" Springer Science & Business Media, 2007
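The detection rule above can be sketched as a single predicate over the three metrics. The numeric thresholds below (FEW = 5, VERY HIGH = 47, ONE THIRD = 1/3) are values commonly used for this detection strategy and are assumptions here; the slides give only the symbolic names, and the tool used in the study may apply different values. `ClassMetrics` and `is_god_class` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed numeric thresholds; tune them for the tool or project at hand.
FEW = 5
VERY_HIGH = 47
ONE_THIRD = 1 / 3


@dataclass
class ClassMetrics:
    name: str
    atfd: int    # Access To Foreign Data
    wmc: int     # Weighted Method Count
    tcc: float   # Tight Class Cohesion, in [0, 1]


def is_god_class(m, few=FEW, very_high=VERY_HIGH, one_third=ONE_THIRD):
    """True only if all three conditions hold (logical AND)."""
    return m.atfd > few and m.wmc > very_high and m.tcc < one_third


# Invented metric values for illustration.
candidates = [
    ClassMetrics("LargeController", atfd=12, wmc=80, tcc=0.10),
    ClassMetrics("SmallHelper", atfd=1, wmc=10, tcc=0.80),
    ClassMetrics("CohesiveButLarge", atfd=12, wmc=80, tcc=0.50),
]
god_classes = [c.name for c in candidates if is_god_class(c)]
```

Because the conditions are joined by AND, a large class that is still cohesive is not flagged.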
Dataset
[Figure: The dataset consists of bug descriptions, source code, and a gold set]
Project      Version        Bug descriptions  Classes  Methods
ArgoUML      0.20 - 0.24    74                1476     12131
jEdit        2.0 - 2.6      36                374      2947
JabRef       4.2 - 4.3      86                406      5276
muCommander  0.8.0 - 0.8.5  81                529      3916
Future Work
p More Preliminary Evaluation
Ø Increase the number of projects
Ø Combine code smell with other BL techniques
p Detailed Analysis
Ø Which code smells influence IR-based BL?
Ø In what cases is our technique effective?
Ø Is it necessary to use code smells? → We need to verify the software metrics

Mobile App Development Company In Noida | Drona Infotech
 
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
Safelyio Toolbox Talk Softwate & App (How To Digitize Safety Meetings)
 
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom KittEnhanced Screen Flows UI/UX using SLDS with Tom Kitt
Enhanced Screen Flows UI/UX using SLDS with Tom Kitt
 
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdfTop Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
Top Benefits of Using Salesforce Healthcare CRM for Patient Management.pdf
 
Oracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptxOracle Database 19c New Features for DBAs and Developers.pptx
Oracle Database 19c New Features for DBAs and Developers.pptx
 
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdfBaha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
Baha Majid WCA4Z IBM Z Customer Council Boston June 2024.pdf
 
Webinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for EmbeddedWebinar On-Demand: Using Flutter for Embedded
Webinar On-Demand: Using Flutter for Embedded
 

A preliminary study on using code smells to improve bug localization

  • 1. A Preliminary Study on Using Code Smells to Improve Bug Localization
       Authors: Aoi Takahashi, Natthawute Sae-Lim, Shinpei Hayashi, Motoshi Saeki (Tokyo Institute of Technology, Japan)
  • 2. Introduction
       - Bug Localization (BL)
         - The process of identifying the location(s) of a given bug
         - BL can help developers fix the bug
       - Existing work: IR-based techniques [1], etc.
       - IR-based BL workflow: calculate the textual similarity between the bug description (e.g., "Exception when saving to a readonly directory. …") and the source code (e.g., ClassA), score each module, and present the developer with a module list ranked by the textual similarity; the developer then fixes the source code.
       [1] Jian Zhou et al. "Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports," ICSE 2012
  • 3. Example (IR-based BL)
       Bug description (ArgoUML #3790): "Exception when saving to a readonly directory. …"
       Module list ranked by the textual similarity:
       Rank  Class name             Textual similarity
       1     ActionSaveProject      0.179
       2     AbstractFilePersister  0.174
       3     ZipFilePersister       0.150
       4     XmiFilePersister       0.150
       5     UmlFilePersister       0.142
       6     ProjectFilePersister   0.128
       7     FileConstants          0.123
       8     ActionSaveProjectAs    0.122
       9     ActionOpenProject      0.121
       10    ProjectBrowser         0.121
       11    …
  • 4. Problem and Goal
       - Problem of IR-based BL
         - The likelihood of each module having a bug (fault-proneness) is often overlooked
       - Goal of this paper
         - To improve the accuracy of IR-based BL by utilizing fault-proneness
         → We use code smells to represent fault-proneness
  • 5. Code Smell [2]
       - An indicator of a design flaw in source code
       - Examples
         - God Class
       - Severity [3] → How strong a given code smell is (integer value)
       - Code smells have been found to be related to change- and fault-proneness [4]
       [2] Fowler, "Refactoring: Improving the Design of Existing Code," Addison-Wesley, 1999
       [3] Marinescu, "Assessing technical debt by identifying design flaws in software systems," IBM Journal of Research and Development, 2012
       [4] Khomh et al., "An exploratory study of the impact of antipatterns on class change- and fault-proneness," Empirical Software Engineering, 2012
  • 6. Example (IR-based BL)
       Bug description: "Exception when saving to a readonly directory. …"
       Rank  Class name             Textual similarity
       1     ActionSaveProject      0.179
       2     AbstractFilePersister  0.174
       3     ZipFilePersister       0.150
       4     XmiFilePersister       0.150
       5     UmlFilePersister       0.142
       6     ProjectFilePersister   0.128
       7     FileConstants          0.123
       8     ActionSaveProjectAs    0.122
       9     ActionOpenProject      0.121
       10    ProjectBrowser         0.121
       11    …
       Annotation on the list: There are code smells (God Class, Blob Class)
  • 7. Overview of Our Technique
       - IR-based BL: bug description (e.g., "Exception when saving to a readonly directory. …") and source code (e.g., ClassA) → scoring → module list ranked by the textual similarity
       - Code smell detection: source code → scoring → the severity of code smells
       - Output: module list ranked by the proposed metric
  • 8. Proposed metric
       - Bug Likelihood Index (BLI), where m is a class or method:
         $\mathrm{BLI}(m) = (1 - \alpha) \times \mathrm{Sim}(m) + \alpha \times \mathrm{Sev}(m)$
         - BLI(m): proposed metric
         - Sim(m): textual similarity (normalized)
         - Sev(m): the sum of the severity (normalized)
       - α: weight between Sim and Sev (0 ≤ α ≤ 1)
         - α = 0 → Same as IR-based BL
         - A parameter of the proposed technique
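The Bug Likelihood Index on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide says both terms are normalized but does not say how, so min-max normalization is assumed here, modules without a detected smell are assumed to have severity 0, and the function names and the severity value in the demo are hypothetical.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale scores to [0, 1]. Min-max scaling is an assumption, not from the slides."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All modules have the same score, so the term carries no ranking signal.
        return {name: 0.0 for name in scores}
    return {name: (value - lo) / (hi - lo) for name, value in scores.items()}


def bug_likelihood_index(
    similarity: Dict[str, float],
    severity: Dict[str, float],
    alpha: float,
) -> Dict[str, float]:
    """BLI(m) = (1 - alpha) * normalized similarity + alpha * normalized severity sum."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    sim_n = min_max_normalize(similarity)
    # Assumption: a module with no detected smell has a severity sum of 0.
    sev_n = min_max_normalize({name: severity.get(name, 0.0) for name in similarity})
    return {
        name: (1.0 - alpha) * sim_n[name] + alpha * sev_n[name]
        for name in similarity
    }


def rank(scores: Dict[str, float]) -> List[str]:
    """Return module names ordered from most to least suspicious."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)


if __name__ == "__main__":
    # Similarities are taken from the ArgoUML example; the severity is made up.
    sim = {"ActionSaveProject": 0.179, "AbstractFilePersister": 0.174, "ProjectBrowser": 0.121}
    sev = {"ProjectBrowser": 10.0}
    for alpha in (0.0, 0.4, 0.7):
        print(alpha, rank(bug_likelihood_index(sim, sev, alpha)))
```

With alpha = 0 the ranking reduces to the textual-similarity ranking, which matches the slide's note that α = 0 is the same as IR-based BL.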
  • 9. Proposed Technique
       IR-based BL only:
       Rank  Class name             Textual similarity
       1     ActionSaveProject      0.179
       2     AbstractFilePersister  0.174
       3     ZipFilePersister       0.150
       4     XmiFilePersister       0.150
       5     UmlFilePersister       0.142
       6     ProjectFilePersister   0.128
       7     FileConstants          0.123
       8     ActionSaveProjectAs    0.122
       9     ActionOpenProject      0.121
       10    ProjectBrowser         0.121
       11    …
       Proposed technique:
       Rank  Class name             BLI
       1     ProjectBrowser         0.674
       2     ActionSaveProject      0.580
       3     AbstractFilePersister  0.563
       4     ZipFilePersister       0.489
       5     XmiFilePersister       0.488
       6     Modeller               0.479
       7     UmlFilePersister       0.461
       8     Import                 0.460
       9     ParserDisplay          0.449
       10    StylePanel             0.445
       11    …
       Annotation on the lists: There are code smells (God Class, Blob Class)
  • 10. Preliminary Evaluation
        - RQ1: Can our approach improve IR-based BL?
        - RQ2: What is the best parameter value of our approach?
        - Experimental Setup
          - Tools: TraceLab [5] (calculating the textual similarity); inFusion (detecting code smells)
          - Dataset: four open source projects [6] (ArgoUML, JabRef, jEdit, muCommander)
          - Metric: Mean Average Precision (MAP)
        [5] Dit et al. "A TraceLab-based Solution for Creating, Conducting, and Sharing Feature Location Experiments," ICPC, 2012
        [6] Dit et al. "Feature location in source code: a taxonomy and survey," Journal of Software: Evolution and Process, 2013
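For readers unfamiliar with the evaluation metric, here is a minimal sketch of Mean Average Precision over a set of bug reports. It assumes the common definition in which the average precision of one ranked list is divided by the size of its gold set; the slides do not spell out which variant was used, and the function names are illustrative.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average of precision@k over the ranks k where a relevant module appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, module in enumerate(ranked, start=1):
        if module in relevant:
            hits += 1
            precision_sum += hits / position
    # Assumption: normalize by the gold-set size, so unretrieved modules count as misses.
    return precision_sum / len(relevant)


def mean_average_precision(queries: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Mean of average precision over all (ranked list, gold set) pairs."""
    pairs = list(queries)
    if not pairs:
        return 0.0
    return sum(average_precision(ranked, gold) for ranked, gold in pairs) / len(pairs)


if __name__ == "__main__":
    # Two toy bug reports with made-up module names.
    reports = [
        (["A", "B", "C", "D"], {"A", "C"}),
        (["B", "A"], {"A"}),
    ]
    print(mean_average_precision(reports))
```

A higher MAP means the truly buggy modules tend to appear nearer the top of the ranked lists.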
  • 11. RQ1: Can our approach improve IR-based BL? (Method level)
        [Figure: bar chart of MAP (y-axis, 0 to 0.06) for IR-based BL and our technique; projects on the x-axis: ArgoUML, JabRef, jEdit, muCommander; improvement labels: 281%, 184%, 68%, 36%]
        - Our approach can improve it! (Avg. 142.25%)
  • 12. RQ1: Can our approach improve IR-based BL? (Class level)
        [Figure: bar chart of MAP (y-axis, 0 to 0.25) for IR-based BL and our technique; projects on the x-axis: ArgoUML, JabRef, jEdit, muCommander; improvement labels: 36%, 34%, 24%, 28%]
        - Our approach can improve it! (Avg. 30.5%)
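As a quick check, the two reported averages match the arithmetic mean of the four improvement labels shown on the method-level and class-level charts:

```latex
\[
\frac{281 + 184 + 68 + 36}{4} = \frac{569}{4} = 142.25\ (\%) \quad \text{(method level)}
\]
\[
\frac{36 + 34 + 24 + 28}{4} = \frac{122}{4} = 30.5\ (\%) \quad \text{(class level)}
\]
```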
  • 13. RQ2: What is the best parameter value of our approach?
        [Figure: MAP (y-axis, 0 to 0.15) versus α (x-axis, 0 to 1) for ArgoUML, at class level and method level]
  • 14. RQ2: What is the best parameter value of our approach?
        [Figure: MAP (y-axis, 0 to 0.15) versus α (x-axis, 0 to 1) for ArgoUML, at class level and method level; annotation: "Same as IR-based BL"]
  • 15. RQ2: What is the best parameter value of our approach?
        [Figure: MAP (y-axis, 0 to 0.15) versus α (x-axis, 0 to 1) for ArgoUML, at class level and method level; annotation: "The best point of MAP!", with the best α marked on each curve]
  • 16. RQ2: What is the best parameter value of our approach?
        [Figure: four panels (ArgoUML, JabRef, jEdit, muCommander), each plotting MAP versus α (x-axis, 0 to 1) at class level and method level, with the best α marked on each curve]
        - The average α is approximately 0.41
        - Combining code smells with IR-based BL is better than IR-based BL alone
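The parameter study behind these charts can be pictured as a simple grid search: compute MAP for each candidate α and keep the best one. The sketch below is an assumption-laden illustration, not the authors' tooling: it takes already-normalized scores as input, uses a step of 0.01 (the slides do not state the step size), and all names and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

Scores = Dict[str, float]        # module name -> normalized score in [0, 1]
Query = Tuple[Scores, Set[str]]  # (normalized textual similarity, gold set of buggy modules)


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Average precision of one ranked list, normalized by the gold-set size."""
    hits, total = 0, 0.0
    for position, module in enumerate(ranked, start=1):
        if module in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


def map_for_alpha(alpha: float, queries: List[Query], smell: Scores) -> float:
    """MAP over all bug reports when modules are ranked by BLI with the given alpha."""
    total = 0.0
    for similarity, gold in queries:
        bli = {
            module: (1.0 - alpha) * sim + alpha * smell.get(module, 0.0)
            for module, sim in similarity.items()
        }
        ranked = sorted(bli, key=lambda module: bli[module], reverse=True)
        total += average_precision(ranked, gold)
    return total / len(queries)


def best_alpha(queries: List[Query], smell: Scores, steps: int = 100) -> float:
    """Return the smallest alpha on an evenly spaced grid that maximizes MAP."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=lambda alpha: map_for_alpha(alpha, queries, smell))


if __name__ == "__main__":
    # Toy data: the buggy module B has mediocre similarity but a strong smell.
    smell_scores = {"A": 0.0, "B": 1.0, "C": 0.2}
    bug_reports = [({"A": 1.0, "B": 0.5, "C": 0.0}, {"B"})]
    print(best_alpha(bug_reports, smell_scores))
```

Because the search evaluates α = 0 as well, the IR-only baseline is always one of the candidates, so the selected α can never score below it on the data used for tuning.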
  • 17. Conclusion
        - The first step of using code smells to improve BL
          - Combining the severity of code smells with existing information from the IR-based approach
        - Applying our technique to four open source projects
          - MAP increased at both the class and method levels in all projects (method level: ↑142.25%; class level: ↑30.50%)
  • 19. Introduction
        - In large-scale software, many bugs occur daily
          - These bugs are described in bug descriptions (e.g., ArgoUML #3790: "Exception when saving to a readonly directory. …")
        - It is difficult to fix these bugs
          - Developers have to identify the location(s) of a bug in a large amount of source code
        - Bug Localization (BL)
          - The process of identifying the location(s) of a given bug
        - Existing work: IR-based techniques [1], etc.
        [1] Jian Zhou et al. "Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports," ICSE 2012
  • 20. Related Work
        - Information Retrieval (IR) [1]
        - Dynamic Analysis [7]
        - Repository Mining [8]
        - Combining Technique [9]
        - IR needs only a bug description and source code
        - No existing research combines code smells with these techniques
        [1] Zhou et al. "Where should the bugs be fixed? More accurate information retrieval-based bug localization based on bug reports," ICSE 2012
        [7] Liu et al. "How Does Execution Information Help with Information-Retrieval Based Bug Localization?" ICPC 2017
        [8] Tantithamthavorn et al. "Using co-change histories to improve bug localization performance," SNPD 2013
        [9] Youm et al. "Improved bug localization based on code change histories and bug reports," Information and Software Technology 2017
  • 21. Example of Code Smells
        - God Class
          - A class that does too much work
        - Blob Class
          - A class that has too many lines of code
        - Duplicated Code (not used in this study)
          - The same source code occurs more than once
  • 22. How to detect God Class [10]
        1. ATFD (Access To Foreign Data)
        2. WMC (Weighted Method Count)
        3. TCC (Tight Class Cohesion)
        God Class = (ATFD > FEW) AND (WMC > VERY HIGH) AND (TCC < ONE THIRD)
        [10] Lanza and Marinescu, "Object-oriented metrics in practice: using software metrics to characterize, evaluate, and improve the design of object-oriented systems," Springer Science & Business Media, 2007
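The detection rule on this slide is a conjunction of three metric thresholds, which can be sketched as a small predicate. The slide names the thresholds only symbolically (FEW, VERY HIGH, ONE THIRD), so the numeric values for FEW and VERY HIGH below are illustrative assumptions rather than the values used by inFusion or in the study, and the `ClassMetrics` type is hypothetical.

```python
from dataclasses import dataclass

# Illustrative threshold values (assumptions; the slide gives only symbolic names).
FEW = 5
VERY_HIGH_WMC = 47
ONE_THIRD = 1.0 / 3.0


@dataclass(frozen=True)
class ClassMetrics:
    """Per-class metric values, assumed to be computed by some external tool."""
    atfd: int   # Access To Foreign Data: foreign attributes the class accesses
    wmc: int    # Weighted Method Count: summed complexity of the class's methods
    tcc: float  # Tight Class Cohesion: fraction of directly connected method pairs


def is_god_class(
    metrics: ClassMetrics,
    few: int = FEW,
    very_high: int = VERY_HIGH_WMC,
    one_third: float = ONE_THIRD,
) -> bool:
    """All three conditions must hold, mirroring the AND on the slide."""
    return metrics.atfd > few and metrics.wmc > very_high and metrics.tcc < one_third


if __name__ == "__main__":
    print(is_god_class(ClassMetrics(atfd=12, wmc=80, tcc=0.10)))
    print(is_god_class(ClassMetrics(atfd=1, wmc=80, tcc=0.10)))
```

Keeping the thresholds as parameters makes it easy to substitute whatever values a particular detection tool or project calibration prescribes.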
  • 23. Dataset
        Dataset contents: bug descriptions, source code, gold set
        Project       Version        Bug descriptions  Classes  Methods
        ArgoUML       0.20 - 0.24    74                1476     12131
        jEdit         2.0 - 2.6      36                374      2947
        JabRef        4.2 - 4.3      86                406      5276
        muCommander   0.8.0 - 0.8.5  81                529      3916
  • 24. Future Work
        - More Preliminary Evaluation
          - Increase the number of projects
          - Combine code smells with other BL techniques
        - Detailed Analysis
          - Which code smells influence IR-based BL?
          - In what cases is our technique effective?
          - Is it necessary to use code smells? We need to verify the software metrics