Defect Prediction: Accomplishments and Future Challenges

@CREST COW52 workshop


  1. Defect Prediction: Accomplishments and Future Challenges. Yasutaka Kamei (POSL Lab, Kyushu University) and Emad Shihab (CSE, Concordia University)
  2. POSL Lab. ❖ 2 PhD students ❖ 7 master's students ❖ 5 undergraduates. N. Ubayashi, Y. Kamei. Improving Software Quality; Scaling up MSR Analysis; Understanding OSS Collaboration
  3. Defect Prediction. Fumio Akiyama, "An Example of Software System Debugging," IFIP, 1971
  4. Defect Prediction. Fumio Akiyama, "An Example of Software System Debugging," IFIP, 1971. Background; Accomplishments; Future Challenges
  5. What is Defect Prediction? Describe the relationship between various software metrics and software defects: predicting where defects might appear; understanding the effect of metrics
  6. Leverages Data from Repositories: communication and discussions; source code and development history of a project; bug reports or feature requests
  7. Measure Source Code ❖ Complexity ❖ Cohesion ❖ Churn ❖ … ❖ # Previous Defects
  8. and Build a Prediction Model: statistical or machine learning techniques
  9. Predict a Defect
  10. Predict a Defect
  11. Predict a Defect. # Bugs: 0; # Bugs: 7; # Bugs: 2
  12. Its Performance is Evaluated: compare the predicted and actual number of defects in each file.
  13. Background; Accomplishments; Future Challenges
  14. Accomplishment: Data, Metrics, Modeling, Performance
  15. Lack of Availability and Openness. The early 2000s: companies almost never wanted to disclose the quality of their software, and OSS datasets were rarely shared
  16. Defect prediction studies started sharing their data. The early 2000s; Current Trend
  17. For Example: MSR Data showcase track; ESEC/FSE Replication package track (5 more mins + 1 extra page); tera-PROMISE (more than 1TB of data, more than 45 datasets)
  18. Accomplishment: Data, Metrics, Modeling, Performance
  19. Most Papers Used Data from Source Code Repositories. The early 2000s
  20. We Can Extract Data to Measure Various Types of Metrics. The early 2000s; Current Trend. Gerrit, Git, GitHub, RHSA, Mylyn
  22. Accomplishment: Data, Metrics, Modeling, Performance
  23. Defect Prediction Requires Sufficient Historical Data. The early 2000s; Past; Current
  24. Building Cross-Project Defect Prediction Models for Projects with Limited Data. The early 2000s; Current Trend; Other Projects; Past; Current
  25. Accomplishment: Data, Metrics, Modeling, Performance
  26. Many studies used standard statistical measures. The early 2000s: how well defect prediction models explain defects
  27. More Practical Performance Evaluations. The early 2000s: how well defect prediction models explain defects. Current Trend: considering the effort required to address the predicted defects
  28. Background; Accomplishments; Future Challenges
  29. Consider new markets
  30. Defect Prediction + Mobile Apps
  31. Defect Prediction + Green Mining
  32. Defect Prediction + Green Mining. We anticipate new markets to be an area of significant growth in the future.
  33. Keeping Up with the Fast Pace of Development. Firefox project; reducing release cycles to days or even hours; 1,000 improvements in 3 months
  34. Just-In-Time (JIT) Quality Assurance. Diagram labels: Prediction Model; Developers; Software Changes; Still Fresh; Low Risk; High Risk; Accepted; Try Again!! Example of Change Features: NF (number of modified files), DEV (the number of developers), EXP (developer experience). Highlighted change (lines 4 to 6 of the example), Risk 0.90:
      1: bool existFile(
      2:     String fileName){
      3:   File file = null;
      4:   file = fopen(fileName);
      5:   if(file == null)
      6:     return true;
      7:   else
      8:     return false;
      9: }
      Kamei et al. TSE, 2013.
  35. Just-In-Time (JIT) Quality Assurance (same diagram, change features, and code example as slide 34). We need to evaluate how to integrate JIT models into the CI process. Suggest how much effort developers will spend to find and fix defects
  36. Making our Models More Accessible. Replication Packages; Prediction Models; Other Researchers and Practitioners
  37. Our Models and Techniques: Simple and Extendable
  38. Commit Guru (Rosen et al. FSE 2015). Via the Web. Its source code is freely available
  39. Conclusion
  40. What is Defect Prediction? Describe the relationship between various software metrics and software defects. File; Prediction model; Output
  41. Accomplishment: Data, Metrics, Modeling, Performance
  42. Defect Prediction + Mobile Apps. Mobile applications play a significant role in our daily life
  43. Defect Prediction: Accomplishments and Future Challenges
      Yasutaka Kamei, Principles of Software Languages Group (POSL), Kyushu University, Fukuoka, Japan. Email: kamei@ait.kyushu-u.ac.jp
      Emad Shihab, Dept. of Computer Science and Software Engineering, Concordia University, Montréal, Canada. Email: eshihab@encs.concordia.ca
      Abstract: As software systems play an increasingly important role in our lives, their complexity continues to increase. The increased complexity of software systems makes the assurance of their quality very difficult. Therefore, a significant amount of recent research focuses on the prioritization of software quality assurance efforts. One line of work that has been receiving an increasing amount of attention for over 40 years is software defect prediction, where predictions are made to determine where future defects might appear. Since then, there have been many studies and many accomplishments in the area of software defect prediction. At the same time, there remain many challenges that face that field of software defect prediction. The paper aims to accomplish four things. First, we provide a brief overview of software defect prediction and its various components. Second, we revisit the challenges of software prediction models as they were seen in the year 2000, in order to reflect on our accomplishments since then. Third, we highlight our accomplishments and current trends, as well as, discuss the game changers that had a significant impact on software defect prediction. Fourth, we highlight some key challenges that lie ahead in the near (and not so near) future in order for us as a research community to tackle these future challenges.
      I. INTRODUCTION
      ... future and allocate SQA resources to defect-prone artifacts (e.g., subsystems and files) [58] and (2) to understand the effect of factors on the likelihood of finding a defect and derive practical guidelines for future software development projects [9, 45].
      Due to its importance, defect prediction work has been at the focus of researchers for over 40 years. Akiyama [3] first attempted to build defect prediction models using size-based metrics and regression modelling techniques in 1971. Since then, there have been a plethora of studies and many accomplishments in the software defect prediction area [23]. At the same time, there remain many challenges that face software defect prediction. Hence, we believe that it is a perfect time to write a Future of Software Engineering (FoSE) paper on the topic of software defect prediction. The paper is written from a budding university researchers' point of view and aims to accomplish four things. First, we provide a brief overview of software defect prediction and its various components. Second, we revisit the challenges of ...
  44. Accomplishment: Data, Metrics, Modeling, Performance. What is Defect Prediction? Describe the relationship between various software metrics and software defects. File; Prediction model; Output. Defect Prediction + Mobile Apps: mobile applications play a significant role in our daily life
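
The two evaluation styles mentioned in the deck (comparing predicted and actual defect counts per file on slide 12, and considering the effort needed to address predicted defects on slide 27) can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the slides or the paper: the file names, the LOC values, the actual defect counts, the use of LOC as a proxy for inspection effort, the 20% effort budget, and the choice to rank files by predicted defect density. Only the predicted counts 0, 7, and 2 echo the numbers shown on slide 11.

```python
from dataclasses import dataclass


@dataclass
class FileResult:
    name: str         # hypothetical file name
    loc: int          # lines of code, assumed here as a proxy for inspection effort
    predicted: float  # predicted number of defects
    actual: int       # defects actually found later (hypothetical values)


def mean_absolute_error(results):
    """Traditional view: average gap between predicted and actual counts."""
    return sum(abs(r.predicted - r.actual) for r in results) / len(results)


def defects_found_within_effort(results, effort_fraction=0.2):
    """Effort-aware view: share of all defects found when files are inspected
    in order of predicted defect density (predicted / LOC), stopping at the
    first file that no longer fits in the LOC budget."""
    total_loc = sum(r.loc for r in results)
    total_defects = sum(r.actual for r in results)
    if total_defects == 0:
        return 0.0
    budget = effort_fraction * total_loc
    ranked = sorted(results, key=lambda r: r.predicted / r.loc, reverse=True)
    spent = 0
    found = 0
    for r in ranked:
        if spent + r.loc > budget:
            break
        spent += r.loc
        found += r.actual
    return found / total_defects


# Hypothetical data; predicted counts reuse the 0 / 7 / 2 shown on slide 11.
results = [
    FileResult("A.java", loc=100, predicted=0, actual=1),
    FileResult("B.java", loc=2000, predicted=7, actual=6),
    FileResult("C.java", loc=200, predicted=2, actual=3),
]

print("MAE:", mean_absolute_error(results))
print("Defects found at 20% effort:", defects_found_within_effort(results, 0.2))
```

In this toy data the file with the most predicted defects is also by far the largest, so it does not fit into a 20% LOC budget. That is the kind of situation an effort-aware evaluation is meant to expose, and one that a plain error measure would not show.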
