Software dependencies, work dependencies and Failure Proneness
1. Software Dependencies, Work Dependencies,
and their impact on Failures
Jeffrey A. Roberts, James D. Herbsleb, Marcelo Cataldo and Audris Mockus
Presented By
Muthukumaran Kasinathan
2. Introduction
“Software faults are often the result of violated dependencies
that are not recognized by developers implementing software”
3. Research Questions
What is the relative impact of syntactic and logical
dependencies on the failure proneness of a software
system?
Do higher levels of work dependencies lead to higher levels
of failure proneness of a software system?
4. Syntactic Dependencies
“Syntactic dependencies will be computed between source
code files by identifying data, function and method references
crossing the boundary of each source code file”
Focuses on Control and Dataflow relationships
These dependencies could be:
Inflow, Outflow Data Dependencies
Inflow, Outflow Functional Dependencies
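As a minimal sketch of how such counts could be derived, the snippet below tallies cross-file references per file and per kind. It assumes that inflow means references arriving at a file from other files and that outflow means references a file makes to other files. The function name `flow_counts`, the tuple layout, and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter

def flow_counts(references):
    """Count cross-file references per (file, kind).

    references: iterable of (source_file, target_file, kind),
    where kind is "data" or "function" (assumed encoding).
    """
    inflow = Counter()
    outflow = Counter()
    for src, dst, kind in references:
        if src == dst:
            continue  # only references crossing a file boundary count
        outflow[(src, kind)] += 1  # src refers to something in another file
        inflow[(dst, kind)] += 1   # dst is referred to from another file
    return inflow, outflow

# Hypothetical references, for illustration only.
example_refs = [
    ("a.c", "b.c", "function"),
    ("a.c", "b.c", "data"),
    ("c.c", "b.c", "function"),
    ("b.c", "b.c", "function"),  # same file, ignored
]
inflow, outflow = flow_counts(example_refs)
```

Keeping data and function references in separate counters mirrors the four measures listed above.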
5. Logical Dependencies
Relate source-code files that are modified together as part of
an MR
If only one file was changed for an MR, then there are no
dependencies
Using the Commit information from the Version control
system, a logical dependency matrix (LDM) was created
LDM is a symmetric matrix of source-code files where Cij
represents the sum, across all releases, of the number of
times files i and j were changed together as part of an MR
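The construction of the LDM can be sketched as follows, assuming the commit data has already been reduced to a mapping from MR identifier to the files changed under it. The name `build_ldm`, the dictionary representation of the matrix, and the sample MRs are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def build_ldm(mr_to_files):
    """Return co-change counts keyed by (file_i, file_j), stored symmetrically."""
    ldm = defaultdict(int)
    for files in mr_to_files.values():
        unique = sorted(set(files))
        # An MR that touches a single file yields no pairs, so no dependency.
        for a, b in combinations(unique, 2):
            ldm[(a, b)] += 1
            ldm[(b, a)] += 1  # mirror the entry to keep the matrix symmetric
    return dict(ldm)

# Hypothetical MR data, for illustration only.
example_mrs = {
    "MR-1": ["a.c", "b.c"],
    "MR-2": ["a.c", "b.c", "c.c"],
    "MR-3": ["c.c"],
}
ldm = build_ldm(example_mrs)
```

Here `ldm[("a.c", "b.c")]` plays the role of Cij: the number of MRs in which both files were changed.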
6. Logical Dependencies
“Logical Dependencies between the source code files will be
calculated by identifying source code files that are changed
together as part of software development”
Identifies important dependencies that are not visible or
covered in Syntactic Dependencies.
These dependencies could be:
Number of Logical Dependencies
Clustering of Logical Dependencies
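One possible reading of these two measures is sketched below. It assumes that the number of logical dependencies of a file is the count of distinct files it was changed with, and that clustering is the fraction of pairs among those files that were also changed together (a local clustering coefficient). The exact definitions used in the study may differ; `logical_measures` and the sample data are hypothetical.

```python
from itertools import combinations

def logical_measures(neighbours, focal):
    """Return (number of logical dependencies, clustering) for one file.

    neighbours maps each file to the set of files it was changed with.
    """
    linked = neighbours.get(focal, set())
    count = len(linked)
    if count < 2:
        return count, 0.0  # clustering is undefined for fewer than two neighbours
    possible = count * (count - 1) // 2
    actual = sum(
        1
        for a, b in combinations(sorted(linked), 2)
        if b in neighbours.get(a, set())
    )
    return count, actual / possible

# Hypothetical co-change neighbourhoods, for illustration only.
example = {
    "a.c": {"b.c", "c.c", "d.c"},
    "b.c": {"a.c", "c.c"},
    "c.c": {"a.c", "b.c"},
    "d.c": {"a.c"},
}
count_a, clustering_a = logical_measures(example, "a.c")
```

For `a.c`, only one of the three possible neighbour pairs (`b.c` and `c.c`) is itself linked, so the clustering value is one third.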
7. Work Dependencies
Impact of human and organizational factors on the failure proneness of
software systems
Impact of Lack of proper communication and coordination between developers
Identification and management of work dependencies
These dependencies are:
Workflow Dependencies
Coordination Requirements
8. Data Collection
Examined two large software development projects:
Project A
Complex distributed system
Data cover 3 years of development activity
The company had 114 developers grouped into 8 development teams
across 3 development locations
Approximately 5 million lines of C code distributed across 7,737
source-code files
Project B
Embedded software system
40 developers in the project over a period of 5 years
1.2 million lines of code written in C and C++
9. Data Collection
In both projects, every change to source code was controlled
by Modification Requests (MR)
Every change made to Source code has to be committed to
Version Control System
Information Used for this Analysis:
A total of 8,257 and 3,372 MRs for Project A and Project B,
respectively
Version control system from both projects
The source code itself from both projects
10. Measuring Failure
Goal is to investigate failure proneness at the file level
File bugginess: indicates whether a file has been modified
in the course of resolving a defect
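A minimal sketch of this binary measure, assuming each MR record carries a flag saying whether it resolves a defect and the list of files it changed. The field names `is_defect` and `files`, the function name, and the sample records are hypothetical.

```python
def file_bugginess(mrs, all_files):
    """Return {file: True/False}, True if any defect MR modified the file."""
    buggy = {f: False for f in all_files}
    for mr in mrs:
        if mr["is_defect"]:
            for f in mr["files"]:
                buggy[f] = True  # one defect-resolving change is enough
    return buggy

# Hypothetical MR records, for illustration only.
example_mrs = [
    {"id": "MR-1", "is_defect": True, "files": ["a.c", "b.c"]},
    {"id": "MR-2", "is_defect": False, "files": ["c.c"]},
]
flags = file_bugginess(example_mrs, ["a.c", "b.c", "c.c"])
```

The result is a yes/no outcome per file, which suits the logistic regression models used in the analysis.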
11. Results
Analysis consists of two stages:
First Stage: Focus on examining the relative impact of
each dependency type on failure proneness of source-
code files
Second Stage: Verified the consistency of the initial
results by conducting a number of confirmatory analyses
Constructed several logistic regression models
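For reference, a logistic regression on a binary file-level outcome has the general form below. The symbols are illustrative assumptions rather than the study's notation: y_i is 1 if file i was modified to resolve a defect, x_ik is the value of predictor k for file i (for example LOC or a dependency measure), and beta_k is its coefficient.

```latex
\[
\log \frac{P(y_i = 1)}{1 - P(y_i = 1)} = \beta_0 + \sum_{k=1}^{K} \beta_k \, x_{ik}
\]
```

A positive beta_k means that higher values of predictor k are associated with higher odds that the file is failure prone, which is how the positive associations reported for the models can be read.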
12. Results
Model I:
Based on LOC and Average Lines Changed
LOC is positively associated with failure proneness
Average lines changed is also positively associated with defects
Model II:
Introduces Syntactic Dependency measures by:
Inflow Data
Has a significant impact on failure proneness
Inflow Functional
This type of syntactic dependency has less impact on failure
proneness
13. Results
Model III:
A higher number of logical dependencies is associated with an
increase in the likelihood of failure
Model IV:
Workflow dependencies do increase the likelihood of defects
Model V:
Coordination requirements have a higher impact in Project A and
a lesser impact in Project B
14. Conclusion
All dependency types increase failure proneness
Logical dependencies have the highest impact, followed by
workflow dependencies and then syntactic dependencies