Evaluating SZZ Implementations Through a Developer-informed Oracle (ICSE 2021)

The SZZ algorithm for identifying bug-inducing changes has been widely used to evaluate defect prediction techniques and to empirically investigate when, how, and by whom bugs are introduced. Over the years, researchers have proposed several heuristics to improve the accuracy of SZZ, providing various implementations of SZZ. However, fairly evaluating those implementations on a reliable oracle is an open problem: SZZ evaluations usually rely on (i) the manual analysis of the SZZ output to classify the identified bug-inducing commits as true or false positives; or (ii) a golden set linking bug-fixing and bug-inducing commits. In both cases, these manual evaluations are performed by researchers with limited knowledge of the studied subject systems. Ideally, there should be a golden set created by the original developers of the studied systems.
We propose a methodology to build a "developer-informed" oracle for the evaluation of SZZ variants. We use Natural Language Processing (NLP) to identify bug-fixing commits in which developers explicitly reference the commit(s) that introduced a fixed bug. This was followed by a manual filtering step aimed at ensuring the quality and accuracy of the oracle. Once built, we used the oracle to evaluate several variants of the SZZ algorithm in terms of their accuracy. Our evaluation helped us to distill a set of lessons learned to further improve the SZZ algorithm.
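
As a rough illustration of the keyword-based part of this idea, the sketch below scans a commit message for phrases such as "introduced by <hash>" and returns the commit hashes they reference. The keyword list, the regular expression, and the function name `find_referenced_commits` are assumptions made for this example. The oracle described above relies on NLP analysis and a manual filtering step, which a regular expression alone does not reproduce.

```python
import re

# Hypothetical keyword list: verbs that often link a fix to the commit
# that caused the bug. The keywords used for the real oracle may differ.
_KEYWORDS = r"(?:introduced|caused|broken|induced)"

# A keyword, then "by"/"in"/"with", an optional "commit" and "#",
# then an abbreviated or full Git hash (7 to 40 hex digits).
_PATTERN = re.compile(
    rf"\b{_KEYWORDS}\s+(?:by|in|with)\s+(?:commit\s+)?#?([0-9a-f]{{7,40}})\b",
    re.IGNORECASE,
)


def find_referenced_commits(message: str) -> list[str]:
    """Return the lower-cased hashes a commit message blames for a bug."""
    return [match.lower() for match in _PATTERN.findall(message)]


if __name__ == "__main__":
    msg = "fixes a search bug introduced by 2508e12 and fixes a typo in the README.md"
    print(find_referenced_commits(msg))  # ['2508e12']
```

A pattern like this will also match hex-looking tokens that are not commits (for example an 8-digit date after "introduced in"), which is one reason a manual validation pass remains necessary.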

Giovanni Rosa (University of Molise), Luca Pascarella (Università della Svizzera italiana), Simone Scalabrino (University of Molise), Rosalia Tufano (Università della Svizzera italiana), Gabriele Bavota (Software Institute, USI Università della Svizzera italiana), Michele Lanza (Software Institute, USI Università della Svizzera italiana), Rocco Oliveto (University of Molise)

* Live Presentation: https://www.youtube.com/watch?v=ZiuAaysj_Sk

* IEEE Digital Library: https://www.computer.org/csdl/proceedings-article/icse/2021/029600a436/1sEXo2zV4OI


  1. 1. Evaluating SZZ Implementations Through a Developer-informed Oracle Giovanni Rosa, Luca Pascarella, Simone Scalabrino, Rosalia Tufano, Gabriele Bavota, Michele Lanza, Rocco Oliveto
  2. 2. Where do bugs come from?
  3. 3. Understanding where bugs are introduced allows us to… find out which changes can lead to a problem and avoid them in the future
  4. 4. Understanding where bugs are introduced allows us to… estimate how error-prone a program is
  5. 5. Understanding where bugs are introduced allows us to… better allocate resources in testing activities
  6. 6. Śliwerski, Zimmermann, Zeller @ MSR 2005
  7. 7. SZZ in a nutshell (Step 1): bug report analysis
  8. 8. SZZ in a nutshell (Step 1): bug report analysis; (A) bug-fixing commit, (B) git blame, (C) buggy commit
  9. 9. SZZ in a nutshell (Steps 1 and 2): bug report analysis; (A) bug-fixing commit, (B) git blame, (C) buggy commit; filtering of resulting commits
  10. 10. SZZ in a nutshell (Steps 1 to 3): bug report analysis; (A) bug-fixing commit, (B) git blame, (C) buggy commit; filtering of resulting commits; bug-inducing commit
  11. 11. Different SZZ variants proposed
  12. 12. There is a problem
  13. 13. Evaluating and comparing the SZZ variants (Da Costa et al. @ TSE 2016)
  14. 14. Evaluating and comparing the SZZ variants (Da Costa et al. @ TSE 2016): small datasets used for evaluation
  15. 15. Evaluating and comparing the SZZ variants (Da Costa et al. @ TSE 2016): small datasets used for evaluation; validation manually performed by researchers
  16. 16. The way: define a dataset validated by the developers
  17. 17. fixes a search bug introduced by 2508e12 and fixes a typo in the README.md
  18. 18. Developer-informed dataset
  19. 19. Mining of commits, 2011 to 2020
  20. 20. Heuristic approach (1): keyword-based filter, AI-powered syntax analysis, duplicate commits removal
  21. 21. Heuristic approach (2): keyword-based filter, AI-powered syntax analysis, duplicate commits removal
  22. 22. Heuristic approach (3): keyword-based filter, AI-powered syntax analysis, duplicate commits removal
  23. 23. Manual validation: false positives, bug report data
  24. 24. Bug report data. Commit message: “fixes #1740 quote pov-ray binary on windows this fixes a bug introduced by #3523741…”. URL: https://tracker.freecadweb.org/view.php?id=1740. Date when the issue is reported.
  25. 25. Analyzed commits: 19.6M; extracted commits: 3.6k; after manual validation: 1.9k
  26. 26. Top programming languages (bar chart, axis from 0 to 370): C, Python, C++, JS, Java, PHP, Ruby, C#
  27. 27. Final number of commits: 1.1k; commits with issue report: 129
  28. 28. How do different variants of SZZ perform in identifying bug-inducing changes?
  29. 29. B-SZZ (Śliwerski et al. @ MSR 2005)
  30. 30. B-SZZ (Śliwerski et al. @ MSR 2005), AG-SZZ (Kim et al. @ ASE 2006), DJ-SZZ (Williams and Spacco @ ISSTA 2008), R-SZZ and L-SZZ (Davies et al. @ JSE 2013)
  31. 31. B-SZZ (Śliwerski et al. @ MSR 2005), AG-SZZ (Kim et al. @ ASE 2006), DJ-SZZ (Williams and Spacco @ ISSTA 2008), R-SZZ and L-SZZ (Davies et al. @ JSE 2013), MA-SZZ (Da Costa et al. @ TSE 2016), RA-SZZ (Neto et al. @ SANER 2018)
  32. 32. Open-source implementations: SZZ Unleashed (DJ-SZZ), OpenSZZ (B-SZZ), PyDriller (AG-SZZ), RA-SZZ (RA-SZZ)
  33. 33. Our experiment (Steps 1 to 3): bug report analysis; (A) bug-fixing commit, (B) git blame, (C) buggy commit; filtering of resulting commits; bug-inducing commit
  34. 34. Results 0.66 (R-SZZ) Precision Recall F1-score 0.72 (SZZ@UNL) 0.61 (R-SZZ)
  35. 35. Results 0.66 (R-SZZ) Precision Recall F1-score 0.72 (SZZ@UNL) 0.61 (R-SZZ) 0.09 (SZZ@UNL) 0.19 (SZZ@OPN) Java only 0.16 (SZZ@UNL)
  36. 36. Qualitative Analysis
  37. 37. What have we learned?
  38. 38. Lesson 1: “The buggy line is not always impacted in the bug-fix”
  39. 39. Lesson 2: “SZZ is sensitive to history rewriting”
  40. 40. Lesson 3: “Looking at the big picture in code changes”
  41. 41. Summary
  42. 42. Take a look at our SZZ implementation! https://github.com/grosa1/pyszz
  43. 43. Heuristic approach, example: “fixes a search bug introduced by 2508e12 and fixes a typo in the README.md”
  44. 44. • Explain the dataset with and without issues
  45. 45. • Details on the dataset and the number of languages analyzed
  46. 46. • Report the results of the Java-only runs
  47. 47. • Example: difference between Lessons 1 and 3
  48. 48. • Keywords used for the heuristic and the relations considered with spaCy
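
The precision, recall, and F1-score named in the results slides can be computed against such an oracle roughly as in the sketch below. It assumes that both the oracle and the SZZ output are mappings from a bug-fixing commit to a set of bug-inducing commits, and that counts are pooled over all bug-fixing commits before the ratios are taken. The slides do not state the aggregation, so that choice, the function name `evaluate`, and the sample hashes are assumptions made for illustration.

```python
def evaluate(
    oracle: dict[str, set[str]],
    predicted: dict[str, set[str]],
) -> tuple[float, float, float]:
    """Return (precision, recall, F1) of an SZZ output against an oracle.

    Only bug-fixing commits present in the oracle are scored.
    """
    # Bug-inducing commits that SZZ reported and the oracle confirms.
    true_positives = sum(
        len(oracle[fix] & predicted.get(fix, set())) for fix in oracle
    )
    # Everything SZZ reported for the fixes in the oracle.
    n_predicted = sum(len(predicted.get(fix, set())) for fix in oracle)
    # Everything the oracle lists as bug-inducing.
    n_expected = sum(len(inducing) for inducing in oracle.values())

    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_expected if n_expected else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up hashes, for illustration only.
    oracle = {"fix1": {"aaa1111"}, "fix2": {"bbb2222"}}
    predicted = {"fix1": {"aaa1111", "ccc3333"}, "fix2": {"ddd4444"}}
    print(evaluate(oracle, predicted))
```

In this toy input SZZ reports three commits, one of which is in the oracle, so precision is 1/3 and recall is 1/2.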
