
Investigating Automatic Static Analysis Results to Identify Quality Problems: an Inductive Study

Background: Automatic static analysis (ASA) tools examine source code to discover “issues”, i.e. code patterns that are symptoms of bad programming practices and that can lead to defective behavior. Studies in the literature have shown that these tools find defects earlier than other verification activities, but they produce a substantial number of false positive warnings. For this reason, an alternative approach is to use the set of ASA issues to identify defect-prone files and components rather than focusing on the individual issues.
Aim: We conducted an exploratory study to investigate whether ASA issues can be used as early indicators of faulty files and components and, for the first time, whether they point to a decay of specific software quality attributes, such as maintainability or functionality. Our aim is to understand the critical parameters and feasibility of such an approach to feed into future research on more specific quality and defect prediction models.
Method: We analyzed an industrial C# web application using the Resharper ASA tool and explored if significant correlations exist in such a data set.
Results: We found promising results when predicting defect-prone files. A set of specific Resharper categories are better indicators of faulty files than common software metrics or the collection of issues of all issue categories, and these categories correlate to different software quality attributes.
Conclusions: Our advice for future research is to perform the analysis at file rather than component level and to evaluate the generalizability of the categories. We also recommend using larger datasets, as we learned that data sparseness can lead to challenges in the proposed analysis process.
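
The correlation analysis mentioned in the Method can be illustrated with a small sketch. The slides name Spearman correlation for the component-level comparison of ASA issues and defects; the code below is a minimal standard-library version of that statistic, not the authors' analysis script. The function names and the per-component counts are hypothetical and chosen only for illustration. In practice a library routine such as `scipy.stats.spearmanr` would normally be used, since it also reports a p-value.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical counts for five components (not data from the study).
issues_per_component = [12, 40, 7, 55, 23]
defects_per_component = [1, 4, 0, 6, 2]
rho = spearman_rho(issues_per_component, defects_per_component)
```

Because the statistic works on ranks, it only asks whether components with more issues of a category also tend to have more defects, without assuming a linear relationship between the two counts.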

  1. Investigating Automatic Static Analysis Results to Identify Quality Problems: an Inductive Study
     Antonio Vetro’ – Nico Zazworka – Forrest Shull – Carolyn Seaman – Michele A. Shaw
     IEEE Software Engineering Workshop (SEW-35), 12-13 October 2012, Heraclion, Crete, Greece
  2. Introduction and motivations
  3. Automatic Static Analysis (ASA) - process
     Diagram: the ASA tool takes source code plus rules and patterns and reports issues (warnings); fixing the issues yields refactored source code.
  4. Use of ASA: research streams
     • Looking at ASA issues to identify defects in single lines of code
     • Looking at large sets of issues as early indicators of the more defect-prone modules
  5. Research streams (figure)
     Legend: code, bug, ASA issue, ASA issue related to a bug
  6. Looking at ASA issues to identify defects in single lines of code
     • Useful as an early verification technique
     • It can shorten the defect insert-remove time
     • However, many studies report a high rate of false positive ASA issues (from 30% to 96%)
  7. Looking at ASA issues to identify defects in single lines of code
     • On realistically sized applications, ASA tools typically generate thousands of issues
     • The output needs further refinement and tailoring by developers to be useful
  8. Research streams (figure)
     Legend: software module (class, file, component)
  9. Looking at large sets of issues as early indicators of the more defect-prone modules
     • Useful as a proxy for defect location
     • It can provide guidance for inspection/test planning
     • Many studies report positive correlations between the number of defects and the number of ASA issues
  10. This study
  11. Research streams: looking at ASA issues to identify defects in single lines of code, and looking at large sets of issues as early indicators of the more defect-prone modules
  12. Contributions
      • New tool/language/application combination (Resharper / C# / Web application).
      • Analysis at two granularity levels, i.e. software components and source code files.
      • We investigate whether specific types of ASA issues can be linked to specific quality dimensions.
  13. Study context
      • Web-based industrial application (C#) of about 35 KLOC
      • 78 fixed and closed defects reported in the JIRA tracking system
      • ASA tool: Resharper
  14. Study goals
      Legend: G = goal, RQ = research question, C = component, F = file
      G1: Understand whether/which ASA issues are indicators of defect-proneness
      • RQ C1: Which ASA issue categories can identify defect-prone components?
      • RQ F1: Which ASA issue categories can identify defect-prone files?
      G2: Understand whether/which ASA issues are related to specific software quality characteristics
      • RQ C2: Which ASA issue categories can point to defect-prone components that impact various system quality characteristics?
      • RQ F2: Which ASA issue categories can point to defect-prone files that impact various system quality characteristics?
  15. Goal 1
      G1: Understand whether/which ASA issues are indicators of defect-proneness
      • RQ C1: Which ASA issue categories can identify defect-prone components?
        Analysis: ASA issues vs defects, Spearman correlation
      • RQ F1: Which ASA issue categories can identify defect-prone files?
        Analysis: issues in non-defect-prone files vs issues in defect-prone files, Mann-Whitney test
  16. Goal 2
      G2: Understand whether/which ASA issues are related to specific software quality characteristics
      • RQ C2: Which ASA issue categories can point to defect-prone components that impact various system quality characteristics?
        Analysis: ASA issues vs defects, Spearman correlation
      • RQ F2: Which ASA issue categories can point to defect-prone files that impact various system quality characteristics?
        Analysis: issues in non-defect-prone files vs issues in defect-prone files, Mann-Whitney test
  17. Goal 2
      SW Quality: Functionality, Reliability, Efficiency, Usability, Portability, Maintainability
      Reference: A. Vetro’, N. Zazworka, C. Seaman, and F. Shull, "Using the ISO/IEC 9126 product quality model to classify defects: a controlled experiment", IET Digest 2012, 187 (2012), DOI:10.1049/ic.2012.0025
  18. Mapping between ASA issues, Defects, Files, Components
  19. Mapping between ASA issues, Defects, Files, Components
  20. Results
  21. RQ C1-C2: Which ASA issue categories can identify defect-prone components?
      Correlation between Resharper issue categories and defects per component. Column "All" covers all defects (RQ C1); columns F, FR, FU, R, U are defect types.

      Resharper issue category                    All      F     FR     FU      R      U
      Common Practices and Code Improvements    -0.14  -0.13  -0.34   0.07      0   -0.2
      Compiler Warnings                           0.3   0.31   0.48   0.28   0.04   0.25
      Constraints Violations                     0.11    0.1   0.03   0.09   0.23   0.18
      Language Usage Opportunities               0.57   0.53   0.55    0.5    0.2   0.43
      Potential Code Quality Issues              0.54    0.5   0.51   0.44   0.22   0.44
      Redundancies in Code                       0.52   0.49   0.47   0.33   0.39   0.53
      Redundancies in Symbol Declarations        0.42   0.45   0.01   0.28   0.17   0.14
      Unused symbols                             0.53   0.53   0.75   0.57   0.33   0.56
      Sum of Resharper issues                    0.19   0.18    0.1   0.09   0.23   0.23

      Significant values (90%) are shown in bold on the slide.
  22. RQ F1: Which ASA issue categories can identify defect-prone files?

      Resharper issue category                   Pval
      ASP.NET                                      NA
      Common Practices and Code Improvements    0.983
      Compiler Warnings                         0.333
      Constraints Violations                    0.014
      Language Usage Opportunities              0.026
      Potential Code Quality Issues             0.021
      Redundancies in Code                     <0.001
      Redundancies in Symbol Declarations       0.969
      Unused symbols                               NA
      Sum                                       0.133

      RQ F2

      Quality characteristic – Resharper issue category    Pval
      F – Constraints Violations                           0.013
      F – Redundancies in Code                             0.002
      FR – Compiler Warnings                               0.001
      FU – Constraints Violations                          0.002
      FU – Redundancies in Code                            0.004
      FU – Sum                                             0.062
      R – Redundancies in Code                             0.033
      R – Sum                                              0.029
      U – Constraints Violations                           0.085
      U – Language Usage Opportunities                     0.042
      U – Potential Code Quality Issues                   <0.001
      U – Redundancies in Code                             0.033

      Significant values (90%) are shown in bold on the slide.
  23. Follow-up analysis: sorting files by different indicators
      Figure: cumulative distribution of defects in files and indicators
  24. Follow-up analysis: sorting components by different indicators
      Figure: cumulative distribution of defects in components and indicators
  25. Conclusions
  26. Summary
      • Few Resharper categories had positive correlations with defects at component level.
      • Several Resharper categories were concentrated in defect-prone files.
      • The issues with higher correlations identify problems regarding code readability, performance and, more generally, maintainability.
      • When the defects were classified according to the ISO 9126 quality characteristics, different ASA issue categories were positively correlated with different quality characteristics.
      • When comparing the capability of Resharper issues to detect the faultiest modules, specific ASA issues were more efficient than the sum of all issues or traditional indicators (i.e. software metrics).
  27. Recommendations for future work
      • Analysis at file level might lead to more promising results than at component level.
      • The project should be at least as large as, and preferably larger than, our medium-sized project, to avoid the data sparseness problems we found in our study.
      • Understand whether results for specific categories are useful in other environments or whether tailoring is necessary.
      • Provide practitioner-oriented methods to build prediction models rather than building new models.
  28. Questions?
      Investigating Automatic Static Analysis Results to Identify Quality Problems: an Inductive Study
      Antonio Vetro’ – Nico Zazworka – Forrest Shull – Carolyn Seaman – Michele A. Shaw
      antonio.vetro@polito.it
      IEEE Software Engineering Workshop (SEW-35), 12-13 October 2012, Heraclion, Crete, Greece
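
The file-level comparison named on the Goal 1 and Goal 2 slides (issues in non-defect-prone files vs issues in defect-prone files, Mann-Whitney test) can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' procedure: the function names and the issue counts are hypothetical, and the p-value uses the plain normal approximation without tie or continuity correction, which is rough for samples this small. A library routine such as `scipy.stats.mannwhitneyu` would be the usual choice in practice.

```python
from math import erfc, sqrt


def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: each pair with a > b counts 1, each tie 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


def two_sided_p_normal(u, n_a, n_b):
    """Two-sided p-value from the normal approximation (no corrections)."""
    mu = n_a * n_b / 2
    sigma = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (u - mu) / sigma
    return erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))


# Hypothetical issue counts of one category per file (not data from the study).
issues_in_defect_prone_files = [9, 14, 11, 20]
issues_in_other_files = [2, 5, 3, 8, 1]

u_stat = mann_whitney_u(issues_in_defect_prone_files, issues_in_other_files)
p_value = two_sided_p_normal(
    u_stat, len(issues_in_defect_prone_files), len(issues_in_other_files)
)
```

A small p-value for a category suggests that its issue counts differ between the two groups of files, which is the kind of evidence the RQ F1 and RQ F2 tables report per category.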

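
The follow-up analysis on slides 23 and 24 sorts files or components by different indicators and looks at the cumulative distribution of defects. One plausible reading of that procedure, offered only as a sketch, is: rank the modules by an indicator in descending order and track the share of all defects covered so far. The function name, field names, file names and numbers below are all hypothetical.

```python
def cumulative_defect_share(modules, indicator):
    """Rank modules by `indicator` (descending) and return the running
    share of all defects covered after each module."""
    total = sum(m["defects"] for m in modules)
    if total == 0:
        raise ValueError("no defects to distribute")
    ranked = sorted(modules, key=lambda m: m[indicator], reverse=True)
    shares = []
    running = 0
    for m in ranked:
        running += m["defects"]
        shares.append(running / total)
    return shares


# Hypothetical files with two indicators (not data from the study).
files = [
    {"name": "A.cs", "loc": 900, "redundancies": 2, "defects": 0},
    {"name": "B.cs", "loc": 300, "redundancies": 15, "defects": 5},
    {"name": "C.cs", "loc": 600, "redundancies": 9, "defects": 3},
    {"name": "D.cs", "loc": 150, "redundancies": 1, "defects": 2},
]

by_category = cumulative_defect_share(files, "redundancies")
by_size = cumulative_defect_share(files, "loc")
```

An indicator whose curve rises faster reaches the faultiest modules earlier in the ranking. In this made-up data, sorting by the issue category covers half of the defects with the first file, while sorting by size covers none.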