4. Existing Approaches Aren’t Adding Value
• Their predictions are obvious to practitioners
• They require a large amount of effort
• They treat all defects alike, yet not all defects are equally important
So… what can we do?
FOCUS ON HIGH-IMPACT DEFECTS!
5. Impact Is In The Eye of The Beholder!
Customers care about Breakages:
Break existing functionality
Affect established customers
Hurt company image
Developers care about Surprises:
Occur in unexpected locations
Low pre-, high post-release defects
Catch developers off-guard
Lead to schedule interruptions
7. Study Overview
Part 1: Exploratory Study of Breakages and Surprises
Part 2: Prediction of Breakages and Surprises
Part 3: Understanding Prediction Models of Breakages and Surprises
Part 4: Value of Focusing on Breakages and Surprises
8. Exploratory Study of Breakages and Surprises
[Venn diagram: 10% of all files have post-release defects; breakages and surprises each occur in only 2% of files, with a 6% overlap]
Rare (2% of files), hence very difficult to model
Small (6%) overlap, hence breakages and surprises should be studied separately
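The slide's proportions can be sketched with ordinary set arithmetic. The file names and counts below are hypothetical, chosen only to mimic the reported rarity and small overlap, not taken from the study's data:

```python
# Hypothetical file sets echoing the slide's proportions: a 1000-file
# project where breakages and surprises each cover 2% of files.
all_files = {f"f{i}.c" for i in range(1000)}
breakages = {f"f{i}.c" for i in range(0, 20)}     # 2% of files
surprises = {f"f{i}.c" for i in range(18, 38)}    # 2% of files, small overlap

print(len(breakages) / len(all_files))            # 0.02
overlap = len(breakages & surprises) / len(breakages | surprises)
print(f"overlap: {overlap:.0%}")                  # overlap: 5%
```

Because both populations are this rare, a model trained on all files sees very few positive examples, which is why the slide calls them very difficult to model.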
9. Predicting Breakages and Surprises
Part 1: Exploratory Study of Breakages and Surprises
Part 2: Prediction of Breakages and Surprises (this part)
Part 3: Understanding Prediction Models of Breakages and Surprises
Part 4: Value of Focusing on Breakages and Surprises
11. Factors Used to Model Breakages and Surprises
Traditional: Size, Pre-release defects, Age
Co-changed files: Number, churn, size, pre-release changes, pre-release defects
Time: Latest change
13. Understanding Prediction Models of Breakages and Surprises
Part 1: Exploratory Study of Breakages and Surprises
Part 2: Prediction of Breakages and Surprises
Part 3: Understanding Prediction Models of Breakages and Surprises (this part)
Part 4: Value of Focusing on Breakages and Surprises
15. Important Factors for High-Impact Defects
[Bar charts: deviance explained (%) on the y-axis (0-40%) for the Traditional, Co-change, and Time factor groups, across releases R1.1, R2.1, R3, R4, and R4.1; one panel for Breakages, one for Surprises]
16. Value of Focusing on Breakages and Surprises
Part 1: Exploratory Study of Breakages and Surprises
Part 2: Prediction of Breakages and Surprises
Part 3: Understanding Prediction Models of Breakages and Surprises
Part 4: Value of Focusing on Breakages and Surprises (this part)
19. Take Home Messages
1. Breakages and surprises are different: they occur in only 2% of files and are hard to predict
2. We achieve a 2-3X improvement in precision with high recall
3. Breakages are best predicted by traditional metrics; surprises by co-change and time metrics
4. Building specialized models saves 40-50% of inspection effort
http://research.cs.queensu.ca/home/emads/data/FSE2011/hid_artifact.html
21. Quantifying Effort Savings

General model:
                Actual Yes   Actual No
Predicted Yes       26          538
Predicted No         7          875

Specialized model:
                Actual Yes   Actual No
Predicted Yes       26          320
Predicted No         7         1093

With recall held equal (both models catch 26 of the 33 actual defects), the specialized model produces 320 false positives instead of 538: effort savings of (538 - 320) / 538 ≈ 41%!
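The ~41% figure can be reproduced from the slide's two confusion matrices. A minimal sketch, assuming (inferred from the ~41% arithmetic, since the slide's layout is ambiguous) that the general model is the one with 538 false positives and the specialized model the one with 320:

```python
# Confusion-matrix counts from the slide; which matrix belongs to which
# model is an assumption inferred from the reported ~41% savings.
general = {"tp": 26, "fp": 538, "fn": 7, "tn": 875}
specialized = {"tp": 26, "fp": 320, "fn": 7, "tn": 1093}

# Both models catch the same 26 defects (equal recall), so the saved
# effort is the drop in false positives: defect-free files that no
# longer need inspection.
savings = 1 - specialized["fp"] / general["fp"]
print(f"effort savings: {savings:.0%}")  # effort savings: 41%
```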
22. Remaining Challenges
• “We tend to test features, not files”
– Can we predict defects for features?
• “Without knowing more about the nature of the defect or recommendations for how to fix it, I am not sure how we can use it”
– Can we predict the nature of defects?
– Can we provide specific remediation strategies for predicted defects?
• e.g., surprises mostly relate to incorrectly implemented requirements
24. Effect of Factors on Breakages and Surprises
[Bar chart: effect of Pre-release defects, Size, No. of co-changed files, Churn of co-changed files, and Latest change on Breakages and Surprises; plotted effects include 154, 39, -85, -19, and -92; y-axis from -150 to 200]
25. High Impact Defects: Summary
Can we identify them? Yes: 2-3X precision, ~70% recall
What factors best predict them? Breakages: traditional; Surprises: co-change and release schedule
What is the value of focusing on them? 40-50% effort savings
26. Current approaches predict the obvious
Focus on high-impact defects, i.e., breakages and surprises
Pre-release defects and size predict breakages
Number and churn of co-changed files and late changes predict surprises
Using specialized models reduces effort by 40-50%
28. Breakage Defects
Defects that break
existing functionality
Affect an established
customer base
Hurt quality image
29. Surprise Defects
Flag files with defects in
unexpected locations
Catch practitioners
off guard
Interrupt schedules
High ratio of post-release to pre-release defects
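Since surprises are characterized by a high ratio of post- to pre-release defects, flagging candidates can be sketched as a simple ratio test. The threshold, helper name, and file paths below are hypothetical illustrations, not the study's exact definition:

```python
# Minimal sketch: flag "surprise" candidates as files whose post-release
# defect count is high relative to their pre-release defect count.
# The threshold (ratio > 1 with at least one post-release defect) is a
# hypothetical choice for illustration only.
def surprise_files(defects, ratio_threshold=1.0):
    flagged = []
    for path, (pre, post) in defects.items():
        ratio = post / max(pre, 1)   # avoid division by zero
        if post > 0 and ratio > ratio_threshold:
            flagged.append(path)
    return flagged

counts = {  # file -> (pre-release defects, post-release defects)
    "net/driver.c": (0, 3),  # quiet pre-release, defective after: surprise
    "ui/menu.c":    (8, 2),  # noisy pre-release: expected, not a surprise
}
print(surprise_files(counts))  # ['net/driver.c']
```

Files like the first one are exactly what catches practitioners off guard: nothing in the pre-release record suggested they were risky.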
30. Predicting Breakages and Surprises
Explanatory power: Breakages 17.8%, Surprises 13.1%
(State of the art for post-release defect models: 17.7 - 27.9%)
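The explanatory power above (deviance explained, the y-axis of the important-factors charts) can be sketched as one minus the ratio of model deviance to null-model deviance for a logistic defect model. The labels and probabilities below are hypothetical:

```python
import math

def log_likelihood(y, p):
    # Bernoulli log-likelihood of predicted probabilities p for labels y.
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

def deviance_explained(y, p_model):
    # Deviance is -2 * log-likelihood; the null model predicts the
    # overall base defect rate for every file, so the -2 factors cancel.
    base = sum(y) / len(y)
    ll_null = log_likelihood(y, [base] * len(y))
    return 1 - log_likelihood(y, p_model) / ll_null

# Hypothetical defect labels and model probabilities for five files:
y = [1, 0, 0, 0, 1]
p = [0.7, 0.2, 0.1, 0.2, 0.6]
print(f"{deviance_explained(y, p):.1%}")
```

A value of 17.8% would mean the model's factors account for 17.8% of the deviance left unexplained by the base rate alone.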
31. Stability of Important Factors
[Table: stability of important factors for Breakages across releases R1.1, R2.1, R3, R3.1, and R4.1; factors include No. of co-changed files, Late changes, Pre-release defects, Size, and Churn of co-changed files, grouped as highly stable, mainly stable, or not stable]
32. Stability of Important Factors
[Charts: important factors across releases R1.1, R2.1, R3, R3.1, and R4.1; one panel for Breakages, one for Surprises]
36. Factors Used to Model High-Impact Defects
Traditional: Size, Pre-release defects, Age
Co-changed files: Number, churn, size, pre-release changes, pre-release defects
Release schedule: Latest changes
38. Evaluation of Prediction Model
Data is split 2/3 for training (build the model) and 1/3 for testing (input to the model; predicted outcome is compared against the actual one)

                Actual Yes   Actual No
Predicted Yes      TP           FP
Predicted No       FN           TN

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
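The two formulas and the split can be sketched directly from confusion-matrix counts; the counts reused below appear on the effort-savings slide, and the file IDs are hypothetical:

```python
# Precision and recall from confusion-matrix counts, plus the deck's
# 2/3 - 1/3 train/test split.
def precision_recall(tp, fp, fn):
    return tp / (tp + fp), tp / (tp + fn)

# Counts reused from the effort-savings slide:
prec, rec = precision_recall(tp=26, fp=320, fn=7)
print(f"precision={prec:.2f} recall={rec:.2f}")  # precision=0.08 recall=0.79

# 2/3 training, 1/3 testing (a plain split; the deck does not say
# whether files are shuffled first):
files = list(range(900))
train, test = files[:600], files[600:]
```

Low precision with high recall is exactly the regime the deck describes: most actual defects are caught, at the cost of many flagged-but-clean files.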
Editor's Notes
How is this slide related to the previous one?
Way too many terms that are not defined:
- Predictive power
- Relative impact
- Effort savings
Just remove all the green stuff for now. You need to sell your work here, not the exact technique; the exact technique can be presented and detailed later on.
Avoid green text; it is very hard on the eyes.
Also, you never get back to these questions. They need to be answered later in your presentation (so the presentation should be structured around them, and your conclusion should highlight the answers too).
The black magic picture means that your methodology is black magic
Predictors are a way to study this thing. Your paper is not about predictors; it is about studying what makes things happen. You are using prediction models as a tool for your study.
What are the best predictors?
Factors… maybe say Causes?
What is this graph? How is it measured? What is your y-axis? You need a slide before this to explain how the graph is generated and the intuition behind it.
I do not get how you measured effort savings?
What do you mean by File or LOC? You need a slide before this to explain what you are doing. In the last slide you said you are comparing false positives, but I do not see that; I just see File and LOC.
Have one model box. Then have angled inputs so both inputs and outputs are visible, i.e., two lines in and two lines out.
I would use the overview running slide here and show the take-home message below each point.
Put a basic model of how defect prediction works and how people use it, so attendees understand what defect prediction is about – not everyone knows this stuff
Precision is the fraction of retrieved instances that are relevant, while recall is the fraction of relevant instances that are retrieved.