Predicting Re-opened Bugs
A Case Study on the Eclipse Project
Emad Shihab, A. Ihara, Y. Kamei, W. Ibrahim,
M. Ohira, B. Adams, A. E. Hassan and K. Matsumoto
emads@cs.queensu.ca
SAIL, Queen’s University, Canada
NAIST, Japan
When you discover a bug …
Report bug → Fix bug → Verify fix → Close bug → Re-opened
Degrade quality …
Increase maintenance costs …
Unnecessary re-work …
Research questions …
1. Which attributes indicate re-opened bugs?
2. Can we accurately predict if a bug will be re-opened using the extracted attributes?
Approach overview
1. Mine code and bug repositories
2. Extract attributes
3. Determine best attributes
4. Predict re-opened bugs
Our dimensions …
1. Work habit
2. Bug report
3. Bug fix
4. People
Work habit attributes
1. Time (Hour of day)
2. Weekday
3. Day of month
4. Month
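As an illustration, the four work habit attributes could be derived from a single timestamp. This is a minimal sketch: the slides do not say which bug event the timestamp belongs to, so the function simply takes any `datetime`, and the function name and dictionary keys are hypothetical.

```python
from datetime import datetime

# Weekday names indexed by datetime.weekday() (Monday == 0).
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def work_habit_attributes(timestamp: datetime) -> dict:
    """Derive the four work habit attributes from one bug event timestamp.

    Which event the timestamp represents is an assumption left to the caller.
    """
    return {
        "time": timestamp.hour,                    # hour of day, 0-23
        "weekday": WEEKDAYS[timestamp.weekday()],  # name of the weekday
        "day_of_month": timestamp.day,             # 1-31
        "month": timestamp.month,                  # 1-12
    }


# Illustrative timestamp, not taken from the Eclipse data.
example = work_habit_attributes(datetime(2010, 10, 13, 14, 30))
print(example)
```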
Bug report attributes
1. Component
2. Platform
3. Severity
4. Priority
5. CC list
6. Priority changed
7. Description size
8. Description text
9. Number of comments
10. Comment size
11. Comment text
Attributes are grouped into metadata and textual data.
Bug fix attributes
1. Time to resolve (in days)
2. Last status
3. Number of edited files
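A possible way to compute the time to resolve attribute is sketched below. It assumes the value is the number of whole days between the report and the resolution; the slides only state the unit (days), so both that definition and the function name are assumptions.

```python
from datetime import datetime


def time_to_resolve_days(reported: datetime, resolved: datetime) -> int:
    """Whole days elapsed between reporting and resolving a bug (assumed definition)."""
    if resolved < reported:
        raise ValueError("resolution time precedes report time")
    return (resolved - reported).days


# Illustrative timestamps, not taken from the Eclipse data.
days = time_to_resolve_days(datetime(2010, 1, 1, 9, 0), datetime(2010, 1, 7, 17, 0))
print(days)
```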
People attributes
1. Reporter Name
2. Reporter experience
3. Fixer name
4. Fixer experience
Research question 1
Which attributes indicate re-opened bugs?
Comment text, description text and fix location (component) are the best indicators.
Top node analysis setup
1. Build 10 decision trees for each attribute set
2. Record the frequency and level of each attribute
3. Repeat using all attributes
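Decision tree prediction model
[Figure: example decision tree. Internal nodes split on attributes such as No. files (>= 5 / < 5), Dev exp (>= 3 / < 3), Month, Time (>= 12 / < 12) and Time to resolve (>= 6 / < 6 and >= 24 / < 24); leaves are labelled Re-opened or Not Re-opened. Node depths are marked Level 1, Level 2 and Level 3.]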
Top node analysis example with 3 trees
Tree 1: Comment (Level 1); Time, No. comments (Level 2)
Tree 2: Comment (Level 1); Time, No. files (Level 2)
Tree 3: No. files (Level 1); Time, Description size (Level 2)

Level    Frequency  Attribute
Level 1  2          Comment
Level 1  1          No. files
Level 2  3          Time
Level 2  1          No. comments
Level 2  1          No. files
Level 2  1          Description size
...
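The counting step of the top node analysis can be sketched in a few lines. The tree representation below (a tuple of attribute name and list of child nodes) and the function name are hypothetical; the three trees encode the example from the slide, not trees learned from the Eclipse data.

```python
from collections import Counter


def top_node_frequencies(trees, max_level=2):
    """Count how often each attribute appears at each of the top levels.

    Each tree node is a (attribute, children) tuple; leaves are omitted.
    """
    counts = {level: Counter() for level in range(1, max_level + 1)}
    for tree in trees:
        frontier = [tree]  # nodes at the current level, starting at the root
        for level in range(1, max_level + 1):
            next_frontier = []
            for attribute, children in frontier:
                counts[level][attribute] += 1
                next_frontier.extend(children)
            frontier = next_frontier
    return counts


# The three example trees from the slide.
trees = [
    ("Comment", [("Time", []), ("No. comments", [])]),
    ("Comment", [("Time", []), ("No. files", [])]),
    ("No. files", [("Time", []), ("Description size", [])]),
]

freq = top_node_frequencies(trees)
for level, counter in freq.items():
    for attribute, count in counter.most_common():
        print(f"Level {level}: {count} x {attribute}")
```

Run on the three example trees, the counts match the frequency table on the slide: Comment appears twice and No. files once at Level 1, and Time appears three times at Level 2.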
Which attributes best indicate re-opened bugs?
Work habit attributes
9 × Month
1 × Time (Hour of day)
Weekday
Day of month
Which attributes best indicate re-opened bugs?
Bug report attributes
Component
Platform
Severity
Priority
CC list
Priority changed
Description size
Description text
Number of comments
Comment size
10 × Comment text
Attributes are grouped into metadata and textual data.
Which attributes best indicate re-opened bugs?
Bug fix attributes
7 × Time to resolve
3 × Last status
Number of files in fix
Which attributes best indicate re-opened bugs?
People attributes
5 × Reporter name
5 × Fixer name
Reporter experience
Fixer experience
Combining all attributes
Level    Frequency  Attribute
Level 1  10         Comment text
Level 2  19         Description text
Level 2  1          Component
Research question 2
Can we accurately predict if a bug will be re-opened using the extracted attributes?
Our models can correctly predict re-opened bugs with 63% precision and 85% recall.
Decision tree prediction model
[Figure: the same example decision tree as above, with node depths marked Level 1 to Level 3.]
Performance measures
                         Actual re-opened  Actual not re-opened
Predicted re-opened      TP                FP
Predicted not re-opened  FN                TN

Re-opened precision: $\frac{TP}{TP + FP}$
Re-opened recall: $\frac{TP}{TP + FN}$
Not re-opened precision: $\frac{TN}{TN + FN}$
Not re-opened recall: $\frac{TN}{TN + FP}$
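The four measures above can be computed directly from the confusion matrix counts. The sketch below uses made-up counts purely for illustration; they are not results from the study, and the function and key names are hypothetical.

```python
def performance_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision and recall for both classes, from confusion matrix counts."""

    def ratio(numerator: int, denominator: int) -> float:
        # Return 0.0 when a class was never predicted or never occurs.
        return numerator / denominator if denominator else 0.0

    return {
        "re_opened_precision": ratio(tp, tp + fp),
        "re_opened_recall": ratio(tp, tp + fn),
        "not_re_opened_precision": ratio(tn, tn + fn),
        "not_re_opened_recall": ratio(tn, tn + fp),
    }


# Illustrative counts only (not data from the Eclipse case study).
measures = performance_measures(tp=6, fp=4, fn=2, tn=8)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```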
Predicting re-opened bugs
Precision and recall (%)
Attribute set  Precision  Recall
Work habits    33         74
Bug report     63         83
Bug fix        21         83
People         27         67
Predicting NOT re-opened bugs
Precision and recall (%)
Attribute set  Precision  Recall
Work habits    93         71
Bug report     97         91
Bug fix        93         39
People         91         66
Combining all attributes
Precision and recall (%)
Class          Precision  Recall
Re-opened      63         85
NOT re-opened  97         90
Bug comments are important …
Bug report is most important set
Comment text most important bug report attribute
What words are important?
Important words
Re-opened: control, background, debugging, breakpoint, blocked, platforms
Not re-opened: verified, duplicate, screenshot, important, testing, warning
Predicting re-opened bugs
Attribute set  Re-opened (Pr / Re)  Not re-opened (Pr / Re)
Work habits    33% / 74%            93% / 71%
Bug report     63% / 83%            97% / 91%
Bug fix        21% / 83%            93% / 39%
People         27% / 67%            91% / 66%
Predicting NOT re-opened bugs
Work habits: Pr: 93%, Re: 71%
Bug report: Pr: 97%, Re: 91%
Bug fix: Pr: 93%, Re: 39%
People: Pr: 91%, Re: 66%
Predicting re-opened bugs
Re-opened: Pr: 63%, Re: 85%
Not re-opened: Pr: 97%, Re: 90%
Approach overview
1. Mine code and bug repositories
2. Attributes of re-opened bugs
3. Predict re-opened bugs
4. Measure performance
Predicting re-opened bugs
[Figure: precision and recall for the Work habits, Bug report, Bug fix and People attribute sets.]
Which attributes best indicate re-opened bugs?
Work habits: Month (9), Time (1)
Bug report: Comment text (10)
Bug fix: Time to fix (7), Last status (3)
People: Fixer (5), Reporter (5)
A typical work day…

Wcre2010 shihab
