Patterns for Extracting High Level
Information from Bug Reports
Rodrigo Souza1,*
Christina Chavez1
Roberto Bittencourt2
1 Federal University of Bahia, Brazil
2 State University of Feira de Santana, Brazil
DAPSE’13: International Workshop on Data Analysis Patterns in Software Engineering
* speaker; email: rodrigo@dcc.ufba.br
May 21, 2013 San Francisco, USA
Bug reports provide insight about…
- the quality of the software
- the quality of the process
Bug reports are like oysters…
If you look inside, you may find something valuable
In This Talk
Two patterns to help you extract information about
the software verification process
1. Fixers and Verifiers
2. Testing Phases
Fixers and Verifiers
Find the quality engineering team (if it exists).
1. Context
2. Problem
3. Solution
4. Discussion
Fixers and Verifiers
Developers assume specific roles in a team
fixer: fixes bugs
verifier: verifies if fixes are appropriate
A quality engineering team is formed by
verifiers, who perform most of the
verifications in the project
(among other activities)
The roles should be taken into account in
data analysis
You can’t judge a verifier by
the number of fixes
Fixers and Verifiers: Problem
Find the quality engineering team
(if it exists)
Fixers and Verifiers: Solution
Ingredients
You’ll need, for each developer:
- Number of times they changed status to VERIFIED (i.e., verifications)
- Number of times they changed resolution to FIXED (i.e., fixes)
Directions
1 For each developer, compute the ratio:
verifications / (1 + fixes)
2 Choose a threshold and assume that a developer is a verifier if
ratio > threshold
How? See steps 2.1 to 2.3.
Directions
2.1 For each ratio, use it as the threshold and compute:
- the number of verifiers in the project (x)
- the % of verifications performed by verifiers (y)
2.2 Plot this data
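The sweep in step 2.1 might look like the sketch below. The input format (`{developer: (verifications, fixes)}`), the row field names, and the numbers are assumptions for illustration; the team size is expressed as a percentage of all developers, as on the plot axes.

```python
def threshold_sweep(counts):
    """counts: {developer: (verifications, fixes)}.
    Returns one row per candidate threshold (each distinct ratio)."""
    ratios = {d: v / (1 + f) for d, (v, f) in counts.items()}
    total_verifications = sum(v for v, _ in counts.values())
    rows = []
    for threshold in sorted(set(ratios.values())):
        team = [d for d, ratio in ratios.items() if ratio > threshold]
        by_team = sum(counts[d][0] for d in team)
        rows.append({
            "threshold": threshold,
            # x: size of the would-be QE team, as % of developers
            "team_pct": 100 * len(team) / len(counts),
            # y: % of all verifications performed by that team
            "verif_pct": (100 * by_team / total_verifications
                          if total_verifications else 0.0),
        })
    return rows

# Made-up counts for four hypothetical developers.
counts = {"ana": (90, 0), "bob": (5, 4), "cy": (5, 9), "dee": (0, 20)}
rows = threshold_sweep(counts)
```

Each row is one point of the plot in step 2.2: `team_pct` on the x-axis, `verif_pct` on the y-axis.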
Directions
How to choose a threshold?
[Plot: x-axis: size of QE team = number of verifiers (%), 2% to 14%; y-axis: % of verifications by verifiers, 30% to 90%; each point is labeled with its ratio (threshold candidate), from 1 to 40]
Directions
2.3 Fit an arm, find the elbow!
[Plot: x-axis: size of QE team = number of verifiers (%), 2% to 12%; y-axis: % of verifications by verifiers, 30% to 90%; points labeled with threshold candidates from 1 to 40]
Directions
3 If % of verifications by verifiers is high*, they form a quality engineering team.
* e.g., > 50% (the example shown: 84%)
Fixers and Verifiers: Discussion
Don’t use the absolute number of verifications,
because developers may fix & verify
simple bugs
If developers are expected to change roles
over time, use sliding windows.
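A sliding-window variant could recompute the ratio per developer for each time window. The sketch below assumes events of the form `(developer, action, day)` with `day` as a number (for instance, days since the first report); the window size and step are arbitrary choices for the example.

```python
from collections import Counter

def windowed_ratios(events, window, step):
    """Returns [(window_start, {developer: ratio})] for windows
    [start, start + window), advancing by `step`."""
    if not events:
        return []
    days = [day for _, _, day in events]
    first, last = min(days), max(days)
    results = []
    start = first
    while start <= last:
        verifications, fixes = Counter(), Counter()
        for developer, action, day in events:
            if start <= day < start + window:
                if action == "VERIFIED":
                    verifications[developer] += 1
                elif action == "FIXED":
                    fixes[developer] += 1
        developers = set(verifications) | set(fixes)
        results.append(
            (start, {d: verifications[d] / (1 + fixes[d]) for d in developers})
        )
        start += step
    return results

# A hypothetical developer who fixes early on and verifies later.
events = ([("ana", "FIXED", d) for d in (0, 1, 2)]
          + [("ana", "VERIFIED", d) for d in (10, 11, 12, 13)])
history = windowed_ratios(events, window=7, step=7)
```

Here the developer looks like a fixer in the first window and like a verifier in the second, which a single project-wide ratio would hide.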
Testing Phase
Identify testing phases in the software development life cycle.
1. Context
2. Problem
3. Solution
4. Discussion
Testing Phase
In mature projects, new features and bug fixes
are verified before being released to the public
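When are bugs verified?
[Diagram: two timelines. In the first, fixes and verifications alternate (Fix, Verify, Fix, Verify, Fix, Verify). In the second, the fixes come first and the verifications are grouped at the end (Fix, Fix, Fix, Verify, Verify, Verify); that group is the testing phase.]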
Failing to recognize testing phases
can mislead your analyses
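[Diagram: without a testing phase (Fix, Verify, Fix bug #5, Verify bug #5, Fix, Verify), the time between fixing and verifying bug #5 ~ complexity. With a testing phase (Fix bug #7, Fix, Fix, Verify bug #7, Verify, Verify), the time between fixing and verifying bug #7 ~ …?]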
Testing Phase: Problem
Identify testing phases in the software
development life cycle
Testing Phase: Solution
Ingredients
You’ll need:
-  Time of verifications
-  Release dates (optional)
Directions (solution #1)
1 Plot the accumulated number of verifications over time
[Plot: x-axis: time; y-axis: accumulated number of verifications]
Directions (solution #1)
2 If possible, highlight release dates
[Plot: x-axis: time; y-axis: accumulated number of verifications; release dates highlighted]
Directions (solution #1)
3 Find cliffs, especially before release dates (they are testing phases)
[Plot: x-axis: time; y-axis: accumulated number of verifications; cliffs before release dates]
Directions (solution #2)
1 Apply Kleinberg’s algorithm to verification times in order to detect verification bursts
[Figure: verification times with the detected bursts marked]
Directions (solution #2)
2 There’s no 2.
Bursts (= testing phases)
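To give a feel for burst detection, here is a simplified two-state sketch in the spirit of Kleinberg's model; it is not the full algorithm, which allows a hierarchy of burst levels. Gaps between consecutive verifications are modeled as exponential with a base rate (state 0) or a rate `s` times higher (state 1), entering the burst state costs `gamma * ln(n)`, and the cheapest state sequence is found with dynamic programming. The function name, the numeric timestamps, and the defaults `s=2.0` and `gamma=1.0` are assumptions.

```python
import math

def two_state_bursts(times, s=2.0, gamma=1.0):
    """times: numeric timestamps (e.g., days). Returns [(start, end)] bursts."""
    times = sorted(times)
    gaps = [b - a for a, b in zip(times, times[1:])]
    n = len(gaps)
    total = times[-1] - times[0] if times else 0
    if n == 0 or total <= 0:
        return []
    rates = [n / total, s * n / total]   # base rate, burst rate
    up_cost = gamma * math.log(n)        # cost of moving from state 0 to 1

    def gap_cost(state, gap):
        # Negative log-likelihood of an exponential gap at the state's rate.
        return rates[state] * gap - math.log(rates[state])

    # Viterbi over the gaps; the automaton starts in state 0.
    prev, back = [0.0, math.inf], []
    for gap in gaps:
        step, cur = [], []
        for j in (0, 1):
            options = [prev[i] + (up_cost if (i, j) == (0, 1) else 0.0)
                       for i in (0, 1)]
            best = 0 if options[0] <= options[1] else 1
            step.append(best)
            cur.append(options[best] + gap_cost(j, gap))
        back.append(step)
        prev = cur

    # Trace the cheapest state sequence backwards.
    state = 0 if prev[0] <= prev[1] else 1
    states = [state]
    for step in reversed(back[1:]):
        state = step[state]
        states.append(state)
    states.reverse()

    # Merge consecutive burst-state gaps into (start, end) intervals.
    bursts, start = [], None
    for k, st in enumerate(states):
        if st == 1 and start is None:
            start = times[k]
        elif st == 0 and start is not None:
            bursts.append((start, times[k]))
            start = None
    if start is not None:
        bursts.append((start, times[-1]))
    return bursts

# Made-up stream: a verification every 10 days, except daily from day 40 to 50.
times = [0, 10, 20, 30, 40] + list(range(41, 51)) + [60, 70, 80, 90]
bursts = two_state_bursts(times)
```

Raising `gamma` makes the detector more reluctant to open a burst; raising `s` requires bursts to be denser.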
Testing Phase: Discussion
If the number of verifications on a particular day is too high, they may be mass updates.
Look out for mass updates and remove them before looking for testing phases.
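A simple filter along these lines is sketched below. What counts as "too high" is project-specific, so `max_per_day` is a hypothetical parameter to be tuned by inspecting the data; days are plain numbers in this made-up example.

```python
from collections import Counter

def drop_mass_updates(verification_days, max_per_day):
    """Removes every verification that happened on a day with more than
    `max_per_day` verifications, treating such days as mass updates."""
    per_day = Counter(verification_days)
    return [day for day in verification_days if per_day[day] <= max_per_day]

# Day 3 has six verifications and is treated as a mass update.
days = [1, 1, 2, 3, 3, 3, 3, 3, 3, 4]
cleaned = drop_mass_updates(days, max_per_day=3)
```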
Testing phases are less common in projects
with quality engineering teams
Thank you!
Go beyond the surface to
find pearls in bug reports!
