Impact Analysis of Granularity Levels on Feature Location Technique
Chakkrit Tantithamthavorn (Ph.D. Student), Akinori Ihara, Hideaki Hata, and Ken-ichi Matsumoto
Software Engineering Laboratory 
Graduate School of Information Science 
Nara Institute of Science and Technology
Outline 
✤ Introduction 
✤ Motivation 
✤ Study Design 
✤ Results 
✤ Conclusion 
Impact Analysis of Granularity Levels on Feature Location Technique (APRES’2014), Auckland, New Zealand.
Growing complexity makes software difficult to maintain. 
Within 12 years, the product size has grown more than tenfold.
Figure: The evolution of the software size of the Eclipse Platform project (millions of lines of code).
WHERE is a bug? 
Identifying WHERE a feature is implemented in the source code based on a given requirement is painstaking and time-consuming.
Typical tasks:
✤ Implement new features
✤ Enhance existing features
✤ Fix bugs
IR-based feature localization helps get it done. 
Current research adopts Information Retrieval 
(IR) models to find source code entities that 
are textually similar to a given issue report. 
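
The textual-similarity ranking described above can be sketched as follows. This is only a minimal illustration, not the authors' implementation: it assumes a TF-IDF weighted cosine similarity (one common way to build a vector space model), and the tokenizer, the `rank_entities` helper, the sample method bodies, and the sample bug report are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase alphabetic tokens, breaking camelCase identifiers."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [token.lower() for token in re.findall(r"[A-Za-z]+", spaced)]


def rank_entities(query, entities):
    """Rank entities (name -> source text) by TF-IDF cosine similarity to the query."""
    docs = {name: Counter(tokenize(text)) for name, text in entities.items()}
    n_docs = len(docs)
    # Document frequency: in how many entities each term occurs.
    df = Counter(term for counts in docs.values() for term in counts)

    def weigh(counts):
        # Assumed weighting: tf * log(1 + N / df); terms unseen in the corpus are dropped.
        return {t: tf * math.log(1 + n_docs / df[t]) for t, tf in counts.items() if df[t]}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
        return dot / norm if norm else 0.0

    q = weigh(Counter(tokenize(query)))
    scored = [(name, cosine(q, weigh(counts))) for name, counts in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical function-level corpus and bug report.
entities = {
    "foo()": "void foo() { openFile(path); readBuffer(); }",
    "bar()": "void bar() { drawWindow(); repaintCanvas(); }",
    "foobar()": "void foobar() { closeFile(path); }",
}
ranking = rank_entities("Crash when open file and read buffer", entities)
for rank, (name, score) in enumerate(ranking, start=1):
    print(rank, name, round(score, 3))
```

The same function works at the class level if each entry in `entities` holds the text of a whole class or file instead of a single method; only the corpus changes, not the ranking step.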
Figure: Overview of Information Retrieval based feature localization. A new bug report serves as the query, the source code entities form the document corpus, and retrieving, searching, and ranking return the top N search results:

Rank  Method    Score
1     foo()     0.98
2     bar()     0.854
3     foobar()  0.321

Granularity of source code entities in prior work:
✤ Class-Level [Rao et al., 2011]
✤ Function-Level [Lukins et al., 2010]
An Open Issue:
How the granularity level affects the performance and effort of IR-based feature localization is not yet known.
A Motivating Example:
1.) Class-level feature localization might not be practical in reality: of the 14 functions in a class, only 1 function is buggy.
2.) Class-level feature localization requires a huge amount of extra effort to locate bugs: out of ~500 lines of code, only 1 line needs to be fixed.
Figure: A bug report and the corresponding source code.
Study Design: Overview 
Research Hypothesis: Function-level feature localization is more practical than class-level feature localization.
To validate this hypothesis, we explore two research questions by comparing the performance and effort of IR-based feature localization at the class and function levels.
Research Questions
RQ1: Does function-level feature localization outperform class-level feature localization?
RQ2: How much effort does function-level feature localization save over class-level feature localization?
Study Design: Studied Projects 
Reasons for selecting the studied projects:
1.) These projects are large, active, real-world systems.
2.) Each project carefully maintains a bug tracking system and a source code version control repository.
Study Design: IR-based Feature Localization 
Figure: IR-based feature localization over source code files or methods.
Study Design: Effort-Based Evaluation 
Figure: Ranked results for an issue report at the class level (Class 1, Class 2, Class 3, ...) and at the function level (Function A, Function B, ..., Function F, ...). The figure marks each entity as related or non-related and shows the LOC required to review suspicious entities against the LOC threshold.
We used lines of code as a proxy to measure effort required to find the first 
relevant source code entity. 
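
The LOC-as-effort idea above can be illustrated with a short sketch. It assumes that a developer reads every entity in ranked order in full, including the first relevant one; the slides do not spell out this detail. The function name, entity names, and LOC values below are hypothetical.

```python
def effort_to_first_relevant(ranked, relevant):
    """Return the cumulative LOC inspected up to and including the first relevant entity.

    ranked: list of (entity_name, loc) pairs in ranked order.
    relevant: set of entity names that are actually related to the issue report.
    Returns None when no relevant entity appears in the ranking.
    """
    inspected = 0
    for name, loc in ranked:
        inspected += loc  # assumption: the whole entity is read
        if name in relevant:
            return inspected
    return None


# Hypothetical rankings for the same issue report at two granularity levels.
class_ranking = [("Class 1", 300), ("Class 2", 450), ("Class 3", 500)]
function_ranking = [("Function A", 20), ("Function B", 35), ("Function C", 15)]

class_effort = effort_to_first_relevant(class_ranking, {"Class 2"})
function_effort = effort_to_first_relevant(function_ranking, {"Function B"})
print(class_effort, function_effort)
```

In both rankings the relevant entity sits at rank 2, yet the class-level ranking costs far more LOC to reach it, which is why a rank-based measure alone would hide the difference.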
RQ1: Does function-level feature localization outperform class-level 
feature localization? 
LOC-based Performance: The percentage of successfully localized bug reports at the LOC threshold.
Figure: LOC-based performance (%) versus inspected LOC (0 to 5,000) at the method (function) and file (class) levels, with one panel each for Eclipse Platform, Eclipse PDE, and Eclipse JDT.
When inspecting 1,000 LOC, function-level feature localization can localize 50% of issue reports, while class-level feature localization can localize 40% of issue reports.
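
The LOC-based performance measure defined on this slide can be sketched as below. The sketch assumes that a bug report counts as successfully localized when the effort to reach its first relevant entity is within the LOC threshold; the function name and the per-report effort values are hypothetical, not data from the study.

```python
def loc_based_performance(efforts, loc_threshold):
    """Percentage of bug reports localized within loc_threshold lines of code.

    efforts: one value per bug report, the LOC inspected before a relevant
    entity was reached, or None if the report was never localized.
    """
    if not efforts:
        return 0.0
    hits = sum(1 for effort in efforts if effort is not None and effort <= loc_threshold)
    return 100.0 * hits / len(efforts)


# Hypothetical efforts (in LOC) for five bug reports.
efforts = [55, 750, 1200, None, 980]
at_500 = loc_based_performance(efforts, 500)
at_1000 = loc_based_performance(efforts, 1000)
print(at_500, at_1000)
```

Evaluating the function at a series of thresholds (500, 1,000, 1,500, ...) yields one curve per granularity level, which is the shape of the plots summarized above.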
RQ2: How much effort does function-level feature localization 
save over class-level feature localization? 
Figure: Effort (LOC) required to find the first buggy location and to find 80% of buggy locations, at the method (function) and file (class) levels, with one panel each for Eclipse Platform, Eclipse PDE, and Eclipse JDT.
Function-level feature localization requires 113 LOC, while class-level feature localization requires 906 LOC to locate the first relevant source code entity (saves 7 times).
Function-level feature localization requires 1,309 LOC, while class-level feature localization requires 2,744 LOC to locate 80% of relevant source code entities (saves 4.4 times).
Summary 
Goal: To investigate the impact of granularity levels on the performance and effort of IR-based feature localization.
Approach: We used the Vector Space Model (VSM) to localize bugs at the method and file granularity levels. We evaluated on 1,968 bug reports with 10,959 files and 82,946 methods.
Main findings:
- For the same amount of inspection effort, function-level feature localization outperforms class-level feature localization.
- Function-level feature localization saves 7 times the inspection effort to find the first relevant bug location and 4.4 times to find 80% of bug locations.
“Feature localization at the function-level is effective in practice.” 
Thank you for your attention

Impact Analysis of Granularity Levels on Feature Location Technique

  • 1. Impact Analysis of Granularity Levels on Feature Location Technique. Chakkrit Tantithamthavorn (Ph.D. Student), Akinori Ihara, Hideaki Hata, and Ken-ichi Matsumoto. Software Engineering Laboratory, Graduate School of Information Science, Nara Institute of Science and Technology.
  • 2. Outline: ✤ Introduction ✤ Motivation ✤ Study Design ✤ Results ✤ Conclusion. Impact Analysis of Granularity Levels on Feature Location Technique (APRES’2014), Auckland, New Zealand.
  • 3. Growing complexity makes software difficult to maintain. Within 12 years, the product size has grown more than tenfold, to millions of lines of code. [Figure: the evolution of the software size of the Eclipse Platform project.]
  • 4. WHERE is a bug? Identifying WHERE a feature is implemented in the source code, based on a given requirement, is painstaking and time-consuming. Typical tasks: implementing new features, enhancing existing features, and fixing bugs.
  • 5. IR-based feature localization helps get it done. Current research adopts Information Retrieval (IR) models to find source code entities that are textually similar to a given issue report. [Figure: overview of Information Retrieval based feature localization. A new bug report serves as the query and the source code entities form the document corpus; retrieving, searching, and ranking produce the top N search results, e.g. rank 1: foo(), score 0.98; rank 2: bar(), score 0.854; rank 3: foobar(), score 0.321.] Source code entities have been studied at the class level [Rao et al., 2011] and at the function level [Lukins et al., 2010]. An open issue: how the granularity level affects the performance and effort of IR-based feature localization is not yet known.
  • 6. A Motivating Example (a bug report and its source code): the class contains 14 functions and about 500 lines of code, but only 1 function is buggy and only 1 line needs to be fixed. 1.) Class-level feature localization might not be practical in reality. 2.) Class-level feature localization requires a huge amount of extra effort to locate bugs.
  • 7. Study Design: Overview. Research Hypothesis: Function-level feature localization is more practical than class-level feature localization. To validate this hypothesis, we explore two research questions by comparing the performance and effort of IR-based feature localization at the class and function levels. Research Questions: RQ1: Does function-level feature localization outperform class-level feature localization? RQ2: How much effort does function-level feature localization save over class-level feature localization?
  • 8. Study Design: Studied Projects. Reasons: 1.) These projects are large, active, real-world systems. 2.) Each project carefully maintains a bug tracking system and a source code version control repository.
  • 9. Study Design: IR-based Feature Localization. [Figure: IR-based feature localization over source code files or methods.]
  • 10. Study Design: Effort-Based Evaluation. We used lines of code (LOC) as a proxy to measure the effort required to find the first relevant source code entity. [Figure: ranked results for one issue report at the class level (Class 1, Class 2, Class 3, ...) and at the function level (Function A through Function F, ...), each entity marked as related or non-related; the LOC required to review the suspicious entities down the ranking is compared against a LOC threshold.]
  • 11. RQ1: Does function-level feature localization outperform class-level feature localization? LOC-based performance: the percentage of successfully localized bug reports at a given LOC threshold. [Figure: LOC-based performance (%) versus inspected LOC (0 to 5,000) for Eclipse Platform, Eclipse PDE, and Eclipse JDT, method level versus file level.] When inspecting 1,000 LOC, function-level feature localization can localize 50% of issue reports, while class-level feature localization can localize 40% of issue reports.
  • 12. RQ2: How much effort does function-level feature localization save over class-level feature localization? [Figure: plots of inspected LOC, method level versus file level, for Eclipse Platform, Eclipse PDE, and Eclipse JDT, showing the effort required to find the first buggy location and the effort required to find 80% of buggy locations.] Function-level feature localization requires 113 LOC, while class-level feature localization requires 906 LOC, to locate the first relevant source code entity (saves 7 times). Function-level feature localization requires 1,309 LOC, while class-level feature localization requires 2,744 LOC, to locate 80% of relevant source code entities (saves 4.4 times).
  • 13. Summary. Goal: To investigate the impact of granularity levels on the performance and effort of IR-based feature localization. Approach: We used the Vector Space Model (VSM) to localize bugs at method and file granularity levels. We evaluated on 1,968 bug reports with 10,959 files and 82,946 methods. Main findings: - For the same amount of inspection effort, function-level feature localization outperforms class-level feature localization. - Function-level feature localization saves 7 times the inspection effort to find the first relevant bug location and 4.4 times to find 80% of bug locations.
  • 15. “Feature localization at the function-level is effective in practice.” Thank you for your attention.
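
The ranking and the effort-based evaluation described in the slides can be illustrated with a minimal sketch. It is not the authors' implementation: the tokenizer, the smoothed TF-IDF weighting, and every function and entity name below are hypothetical, and it assumes that the LOC of the first relevant entity itself counts toward the inspection effort.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens; the slides do not describe preprocessing.
    return "".join(c.lower() if c.isalnum() else " " for c in text).split()


def rank_entities(query, entities):
    """Rank entities (name -> text) by TF-IDF cosine similarity to the query (a simple VSM)."""
    docs = {name: Counter(tokenize(text)) for name, text in entities.items()}
    n = len(docs)
    df = Counter(term for tf in docs.values() for term in tf)
    # Smoothed inverse document frequency (one common variant, chosen here for illustration).
    idf = {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}

    def weigh(tf):
        # Terms unseen in the corpus get weight 0.
        return {t: c * idf.get(t, 0.0) for t, c in tf.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        na = math.sqrt(sum(w * w for w in a.values()))
        nb = math.sqrt(sum(w * w for w in b.values()))
        return dot / (na * nb) if na and nb else 0.0

    q = weigh(Counter(tokenize(query)))
    scored = [(name, cosine(q, weigh(tf))) for name, tf in docs.items()]
    return sorted(scored, key=lambda pair: -pair[1])


def loc_to_first_relevant(ranking, loc, relevant):
    """LOC read, going down the ranking, until the first relevant entity has been read."""
    inspected = 0
    for name in ranking:
        inspected += loc[name]
        if name in relevant:
            return inspected
    return None  # no relevant entity appears in the ranking


def loc_based_performance(efforts, threshold):
    """Percentage of reports whose first relevant entity is reached within `threshold` LOC."""
    hits = sum(1 for e in efforts if e is not None and e <= threshold)
    return 100.0 * hits / len(efforts)


if __name__ == "__main__":
    # Hypothetical function-level corpus and bug report.
    functions = {
        "parse": "parser reads token null pointer",
        "render": "draw widget on screen",
    }
    ranked = rank_entities("null pointer in parser", functions)
    names = [name for name, _ in ranked]
    print(names)
    print(loc_to_first_relevant(names, {"parse": 35, "render": 20}, {"parse"}))
```

Running the same two evaluation functions on a class-level ranking and on a function-level ranking of the same issue reports gives the kind of comparison the slides report: one LOC-to-first-hit value per report, and one performance percentage per LOC threshold.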