This document summarizes a scoping study on software defect density (DD). The study analyzed DD data from 109 software projects to address 5 research questions. Key findings include: 1) the typical DD is 7.47 defects per KLOC, 2) open source projects have lower DD than closed source, 3) C projects have higher DD than Java, 4) larger projects have lower DD, and 5) project age is not correlated with DD. Threats to the study's validity include survivorship bias and limited process data. Overall, the study provides initial evidence on typical DD values and relationships between DD and characteristics like project size and type.
3. Defect Density
• Product quality is measured in terms of Defect Density (DD):
DD = CND / Size
• where CND is the cumulative number of post-release defects in the observed period,
• and Size is measured at the end of the observed period in thousands of lines of code (KLoC).
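For concreteness, a minimal sketch of this computation in Python (the function name and the numbers are illustrative, not taken from the study):

```python
def defect_density(post_release_defects: int, size_loc: int) -> float:
    """Defect density in defects per KLoC: DD = CND / Size, with Size in KLoC."""
    return post_release_defects / (size_loc / 1000.0)

# Illustrative example: 150 post-release defects in a 20,000-LoC release -> 7.5 defects/KLoC
print(defect_density(150, 20_000))
```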
4. RQ1: What are the Typical Figures of DD in Software Projects?
Such figures provide managers with benchmarks to define quality goals upfront and to evaluate and assess quality.
5. RQ2: Is there a difference in DD between open and closed source projects?
Since the context, motivation, and development process often differ between open source and proprietary projects, we aim at finding some evidence, at least in terms of DD.
6. RQ3: Is there a difference in DD among programming languages?
Different programming languages encompass, for example, varying styles, expressive power, and abstraction levels. Such differences are likely to influence the DD of projects.
7. RQ4: What is the relationship between DD and project size?
It provides researchers with evidence of a relationship between project size and DD.
8. RQ5: What is the relationship between DD and project age?
It provides researchers with evidence of the evolution of reliability over time, not only in terms of failure occurrence (the traditional reliability definition) but also in terms of DD.
9. Scoping Study
• Followed the framework of Arksey and O’Malley for conducting our scoping study.
The framework has five stages:
• Stage 1: Research Questions Definition
• Stage 2: Relevant Studies Identification
• Stage 3: Selection of Studies
• Stage 4: Charting the Data
• Stage 5: Results
10. Search Strategy
A three-phase search strategy.
Phase 1:
• Considers papers published in top software engineering journals from 2000 to 2011.
• The search string “Defect Density” OR “Fault Density” OR “Reliability” is used.
Phase 2:
• We search two publication databases, IEEE Xplore and the ACM Digital Library.
11. ….
• The search keywords are “Software Defect Density” and “Software Fault Density”.
Phase 3:
• We followed the references of the 18 selected papers to identify other relevant studies.
• We particularly searched the PROMISE proceedings.
This led to a selection of 19 papers in total, containing DD data about 110 projects.
12. Inclusion Criteria
• Are related to software engineering
• Are related to software projects
• Contain DD figures directly, or contain data that allows DD to be computed indirectly (such as the number of post-release defects and size in LOC)
• Mention that DD was computed after the particular release or project (operation phase, post-release phase, etc.)
13. Data Analysis and Hypotheses
To answer the research questions we used descriptive statistics and regression analysis, and, where applicable, we also statistically tested some hypotheses.
Concerning RQ1: we used descriptive statistics and cumulative distribution diagrams.
Concerning RQ4 and RQ5: we performed regression analysis.
14. …..
We answer RQs 2 and 3 by means of hypothesis testing.
Concerning RQ2:
H2₀: There is no significant difference in terms of DD between open source and closed source projects.
H2a: There is a significant difference in terms of DD between open source and closed source projects.
15. ….
Concerning RQ3:
H3.1₀: There is no significant difference in terms of DD among projects adopting different languages.
H3.1a: There is a significant difference in terms of DD among projects adopting different languages.
If the above null hypothesis can be rejected, we conduct a post-hoc investigation of the pairwise differences.
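As a rough illustration of how these tests could be run (a sketch using SciPy with hypothetical DD arrays, not the authors’ actual analysis scripts):

```python
# Sketch of the RQ2/RQ3 hypothesis tests; the dd_* lists are hypothetical
# placeholders for the per-project DD values collected in the study.
from scipy import stats

dd_open = [2.1, 0.5, 4.7, 3.2]       # DD of open source projects (illustrative)
dd_closed = [8.4, 5.6, 12.1, 6.3]    # DD of closed source projects (illustrative)

# RQ2: two-sided Mann-Whitney U test (non-parametric, suited to skewed DD data)
u_stat, p_rq2 = stats.mannwhitneyu(dd_open, dd_closed, alternative="two-sided")

# RQ3: Kruskal-Wallis across languages, then post-hoc pairwise Mann-Whitney tests
dd_c = [9.8, 7.9, 11.2]
dd_java = [3.5, 5.1, 2.2]
dd_cpp = [1.3, 8.7, 15.0]
h_stat, p_rq3 = stats.kruskal(dd_c, dd_java, dd_cpp)

alpha_bonferroni = 0.05 / 3          # Bonferroni correction for three pairwise tests
if p_rq3 < 0.05:
    pairs = [("C vs Java", dd_c, dd_java),
             ("C vs C++", dd_c, dd_cpp),
             ("Java vs C++", dd_java, dd_cpp)]
    for name, a, b in pairs:
        _, p = stats.mannwhitneyu(a, b, alternative="two-sided")
        print(name, "significant" if p < alpha_bonferroni else "not significant", p)
```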
16. DD Typical Figures

           N    Mean  Median  Std. Dev.
Projects   109  7.47  4.3     7.99
17. DD vs. Open/Closed Source
Mann-Whitney, p = 0.004
Open source projects have a DD that is on average 4 defects/KLoC smaller than closed source ones.

                N   Mean  Median  Std. Dev.
Closed source   77  8.6   5.4     8.49
Open source     32  4.66  2.75    5.84
18. DD vs. Programming Languages
Kruskal-Wallis; post-hoc Mann-Whitney with Bonferroni correction, αB = α/3 = 0.016 (p = 0.003, p = 0.009).
C projects have a DD that is on average 4.1 defects per KLoC higher than Java ones.

       N   Mean  Median  Std. Dev.
C      43  10.0  7.9     7.98
Java   45  5.9   3.5     6.22
C++    11  8.73  1.3     13.17
19. DD vs. Size of Project
log(DD) = 5.12 − 0.341 · log(Size)
The negative correlation for the log-values has a higher, though still small, practical relevance.
The regression’s p-value is smaller than 0.001 and the corresponding adjusted R² is 22%.
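A sketch of how such a log-log fit could be reproduced (the size and dd vectors below are hypothetical; the slides do not state the log base or the exact size unit used in the fit):

```python
# Sketch of the size-vs-DD regression on log-transformed values.
import numpy as np
from scipy import stats

size = np.array([12_000, 80_000, 450_000, 2_500_000], dtype=float)  # project size in LoC (illustrative)
dd = np.array([9.5, 6.1, 2.8, 0.4])                                 # defects/KLoC (illustrative)

log_size, log_dd = np.log(size), np.log(dd)
fit = stats.linregress(log_size, log_dd)

# Slide 19 reports log(DD) = 5.12 - 0.341 * log(Size), p < 0.001, adjusted R^2 = 22%
print(f"log(DD) = {fit.intercept:.2f} + {fit.slope:.3f} * log(Size), p = {fit.pvalue:.3g}")
print(f"R^2 = {fit.rvalue ** 2:.2f}")
```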
20. We find evidence that the larger the project, the lower the defect density.
Mann-Whitney, p < 0.016 (Bonferroni correction: αB = α/3 = 0.016)

Defect Density by size range:
Size range        N   Mean  Median  Std. Dev.
Up to 400K LoC    88  8.63  5.3     8.4
400K to 2M LoC    15  3.3   2.8     2.72
Above 2M LoC      5   0.38  0.40    0.28
21. DD vs. Age of Project
DD = 1.86 + 0.59 ⋅ Age
The p-value of the regression is 0.011 and the corresponding adjusted R² is 13.8%.
22. Contingency Tables

Size \ Type   Closed       Open
Small         63 (82%)     25 (78%)
Medium        12 (15%)     4 (12%)
Large         2 (3%)       3 (10%)
Total         77 (100%)    32 (100%)

Type \ Lang   C            C++          Java         Other
Closed        39 (91%)     11 (100%)    20 (44%)     7 (70%)
Open          4 (9%)       0 (0%)       25 (56%)     3 (30%)
Total         43 (100%)    11 (100%)    45 (100%)    10 (100%)
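As an aside, a sketch of how such contingency tables could be rebuilt from a per-project table with pandas (the CSV file and its column names are hypothetical):

```python
# Rebuild the size-class vs. development-mode contingency table from per-project data.
import pandas as pd

projects = pd.read_csv("projects.csv")  # hypothetical columns: name, type, size_loc, dd

# Bucket projects into the size classes used on slides 20 and 22
projects["size_class"] = pd.cut(
    projects["size_loc"],
    bins=[0, 400_000, 2_000_000, float("inf")],
    labels=["Small (<400K)", "Medium (400K-2M)", "Large (>2M)"],
)

counts = pd.crosstab(projects["size_class"], projects["type"])                        # raw counts
percent = pd.crosstab(projects["size_class"], projects["type"], normalize="columns")  # column shares
print(counts)
print((percent * 100).round(1))
```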
23. Speculative Hypotheses
• Small projects are more frequent among closed source projects (82%) than among open source ones (78%).
• The unbalanced use of C and Java between open source and closed source projects may have some influence.
• “Size is negatively correlated with defect density”: large projects typically put more effort into testing.
24. Threats to Validity
• “Survivorship bias” in the reported figures.
• Papers calling DD by another name may have been missed.
• Little control over programming language and development mode.
• Size and defect counts are easily subject to ambiguities.
• 109 projects is not a negligible number, but it is a very limited percentage of the number of projects released overall.
25. Main Limitations
Lack of process variables:
• Testing effort
• Quality effort in general
• Code churn
• Code history
• Experience (team and manager)
These are probably key factors for DD.
26. Conclusion
• The first study to mine the literature for DD figures that had not been gathered and analyzed before.
• Found that development mode is a factor for DD.
• Programming language is sometimes a factor for DD.
• Found that project size is relevant, while age is not a factor for DD.
Such figures provide quality managers and project managers with benchmarks to define quality goals upfront, to evaluate the quality of a project during development, and to assess the quality of a project post mortem.
Since the context, motivation, and development process often differ between open source and proprietary projects, as the anecdotal story goes, one would expect a different quality of products. We aim at finding some evidence, at least in terms of DD.
Such differences are likely to influence the DD of projects.
A lot of analysis has been conducted on the relationship between module size and DD, but we could not find any on project size and DD. RQ4 provides researchers with evidence of a relationship between project size and DD.
A reasonable expectation is that the longer a project has been released, the more defects are found and therefore the higher the defect density. In addition, it also provides researchers with evidence of the evolution of reliability over time, not only in terms of failure occurrence (the traditional reliability definition) but also in terms of DD.
Journal of Systems and Software, IEEE Transactions on Software Engineering, ACM Transactions on Software Engineering and Methodology, IEEE Software, Empirical Software Engineering, Information and Software Technology.
To determine to what extent the dependent variable varies as a function of one or more independent variables.
The DD is reported on a logarithmic scale; we observe that it spans nearly three orders of magnitude, from 0.05 to close to 50.0.
Open source projects have a DD that is on average 4 defects/KLoC smaller than closed source ones.
C projects have a DD that is on average 4.1 defects per KLoC higher than Java ones; the Bonferroni correction for multiple tests is applied.
Since both the DD and size distributions are extremely skewed (actually requiring a dual log scale to have a discernible representation), the negative correlation for the log-values has a higher, though still small, practical relevance.
This may give the impression that the reported data used to calculate DD are subject to “survivorship bias”. As in any overview, the composition is subject to debate [16], and we acknowledge that this selection is not definitive: it is deemed to evolve indefinitely. It is well known that measuring size in lines of code is subject to variations due to programming language and modes of measuring (with or without comments, with or without blank lines, including libraries or not, etc.). Here too there may be differences in the precision of the measurement processes used and in the definition of a defect. This lack of precision in defect data may give estimates with larger errors, but following [16] we believe that this is better than having no data at all and relying on intuition.