Impact Analysis of Granularity Levels on Feature Location Technique

Chakkrit Tantithamthavorn (Ph.D. Student)
and Akinori Ihara, Hideaki Hata, Ken-ichi Matsumoto
Software Engineering Laboratory
Graduate School of Information Science
Nara Institute of Science and Technology

Outline
✤ Introduction
✤ Motivation
✤ Study Design
✤ Results
✤ Conclusion
Impact Analysis of Granularity Levels on Feature Location Technique (APRES’2014), Auckland, New Zealand.

Growing complexity makes software difficult to maintain.
Within 12 years, the product size has grown more than tenfold.
Figure: The evolution of the software size of the Eclipse Platform project.
Millions of lines of code!

WHERE is a bug?
Identifying WHERE a feature is implemented in the source code,
based on a given requirement, is painstaking and time-consuming.
Maintenance tasks: implement new features, enhance existing features, fix bugs.

IR-based feature localization helps get it done.
Current research adopts Information Retrieval
(IR) models to find source code entities that
are textually similar to a given issue report.
Figure: Overview of Information Retrieval based feature localization. A new bug report serves as the query, the source code entities form the document corpus, and retrieving, searching and ranking produce the top N search results.
Rank Method Score
1 foo() 0.98
2 bar() 0.854
3 foobar() 0.321
Granularity of source code entities in prior work:
Class-Level [Rao et al., 2011]
Function-Level [Lukins et al., 2010]
An Open Issue:
How granularity levels impact the performance and effort
of IR-based feature localization is not yet known.

A Motivating Example:
Figure: A bug report and the corresponding source code. 14 functions in a class; only 1 function is buggy. ~500 lines of code; only 1 line needs to be fixed.
1.) Class-level feature localization
might not be practical in reality.
2.) Class-level feature localization
requires a huge amount of extra
effort to locate bugs.

Study Design: Overview
Research Hypothesis: Function-level feature localization is
more practical than class-level feature localization.
To validate this hypothesis, we explore two research
questions by comparing the performance and effort of IR-based
feature localization at the class and function levels.
Research Questions
RQ1: Does function-level feature localization outperform class-level feature localization?
RQ2: How much effort does function-level feature localization save over class-level feature localization?

Study Design: Studied Projects
Reasons:
1.) These projects are large, active, real-world systems.
2.) Each software project carefully maintains a bug tracking system and
source code version control repositories.

Study Design: IR-based Feature Localization
Source code files or methods
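A minimal sketch of how a Vector Space Model ranking of source code entities against an issue report might look. This is an illustration only, not the tooling used in the study: the tokenizer, the smoothed TF-IDF weighting, and the names `tokenize` and `rank_entities` are all assumptions introduced here.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split camelCase identifiers, lowercase, and keep alphabetic tokens."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", text.lower())


def rank_entities(query, entities):
    """Rank entities (name -> text) by TF-IDF cosine similarity to the query.

    Hypothetical helper: the weighting scheme is an assumption, not the
    configuration reported in the talk.
    """
    docs = {name: Counter(tokenize(text)) for name, text in entities.items()}
    n_docs = len(docs)
    # Document frequency: in how many entities each term occurs.
    df = Counter(term for counts in docs.values() for term in counts)

    def weights(counts):
        # Smoothed IDF; terms that never occur in the corpus are ignored.
        return {
            t: tf * (math.log((1 + n_docs) / (1 + df[t])) + 1.0)
            for t, tf in counts.items()
            if t in df
        }

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q = weights(Counter(tokenize(query)))
    scored = [(name, cosine(q, weights(counts))) for name, counts in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

The same function can be run at either granularity: pass one text per file for class-level localization, or one text per method for function-level localization.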

Study Design: Effort-Based Evaluation
Figure: Ranked results for an issue report at the class level (Class 1 to Class 3, Rank1 to Rank3) and at the function level (Function A to Function F, Rank1 to Rank6), with each entity marked as related or non-related. The figure marks the LOC required to review suspicious entities, from 0 LOC up to the LOC threshold.
We used lines of code as a proxy to measure the effort required to find the first
relevant source code entity.
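The effort measure described above can be sketched as follows, assuming a developer reads every ranked entity in full, in rank order, until the first relevant one is reached. The function name `effort_to_first_relevant` and its inputs are hypothetical and do not come from the talk.

```python
def effort_to_first_relevant(ranked, loc, relevant):
    """Cumulative LOC inspected up to and including the first relevant entity.

    ranked:   entity names in rank order (best match first)
    loc:      mapping from entity name to its size in lines of code
    relevant: set of entity names that actually needed to change
    Returns None when no relevant entity appears in the ranking.
    """
    inspected = 0
    for entity in ranked:
        inspected += loc[entity]  # assumption: the whole entity is read
        if entity in relevant:
            return inspected
    return None
```

With made-up sizes, a class-level ranking `["Class1", "Class2"]` of 500 and 300 LOC where only `Class2` is relevant costs 800 LOC, while a function-level ranking of three functions of 20, 30 and 25 LOC where the third is relevant costs 75 LOC.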

RQ1: Does function-level feature localization outperform class-level
feature localization?
LOC-based Performance: The percentage of successfully localized bug reports at the LOC threshold.
Figure: LOC-based performance (%) versus inspected LOC (0 to 5,000) at the function (method) level and the class (file) level, with one panel each for Eclipse Platform, Eclipse PDE, and Eclipse JDT.
When inspecting 1,000 LOC, function-level feature localization
can localize 50% of issue reports, while class-level feature
localization can localize 40% of issue reports.
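The LOC-based performance metric for RQ1 can be sketched as below. It assumes that a report counts as successfully localized when its first relevant entity is reached within the LOC threshold; that reading, and the name `loc_based_performance`, are assumptions made for illustration.

```python
def loc_based_performance(efforts, threshold):
    """Percentage of bug reports localized within `threshold` LOC.

    efforts: one value per bug report, giving the LOC inspected to reach the
             first relevant entity, or None if no relevant entity was ranked.
    """
    if not efforts:
        return 0.0
    hits = sum(1 for e in efforts if e is not None and e <= threshold)
    return 100.0 * hits / len(efforts)
```

Evaluating this function over a range of thresholds (for example 0 to 5,000 LOC) for each granularity level yields curves of the kind plotted for the three projects.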

RQ2: How much effort does function-level feature localization
save over class-level feature localization?
Figure: Effort (LOC) required to find the first buggy location at the function (method) level and the class (file) level, with one panel each for Eclipse Platform, Eclipse PDE, and Eclipse JDT (x-axis: 0 to 5,000 LOC).
Function-level feature localization requires
113 LOC, while class-level feature
localization requires 906 LOC to locate the
first relevant source code entity (saves 7 times).
Figure: Effort (LOC) required to find 80% of buggy locations at the function (method) level and the class (file) level, with one panel each for Eclipse Platform and Eclipse PDE (x-axis: 0 to 30,000 LOC) and Eclipse JDT (x-axis: 0 to 60,000 LOC).
Function-level feature localization requires
1,309 LOC, while class-level feature
localization requires 2,744 LOC to locate
80% of relevant source code entities (saves 4.4 times).
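The second effort measure, the LOC needed to reach 80% of the buggy locations, can be sketched in the same way. The rounding rule (ceiling of the fraction of relevant entities) and the name `effort_to_recall` are assumptions; the talk does not state how the 80% cut-off was computed.

```python
import math


def effort_to_recall(ranked, loc, relevant, fraction=0.8):
    """Cumulative LOC inspected until `fraction` of relevant entities is found.

    Returns None when the ranking never reaches that share, or when there
    are no relevant entities at all.
    """
    if not relevant:
        return None
    needed = math.ceil(fraction * len(relevant))  # assumed rounding rule
    found = 0
    inspected = 0
    for entity in ranked:
        inspected += loc[entity]  # assumption: the whole entity is read
        if entity in relevant:
            found += 1
            if found >= needed:
                return inspected
    return None
```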

Summary
Goal: To investigate the impact of granularity levels on the performance and
effort of IR-based feature localization.
Approach:
We used the Vector Space Model (VSM) to localize bugs at the method and file
granularity levels. We evaluated on 1,968 bug reports with 10,959 files and
82,946 methods.
Main findings:
- For the same amount of inspection effort, function-level feature localization
outperforms class-level feature localization.
- Function-level feature localization requires 7 times less inspection effort to find
the first relevant bug location and 4.4 times less to find 80% of bug locations.

“Feature localization at the function level is effective in practice.”
Thank you for your attention
