This document discusses key measurements for testers, including precision vs accuracy. It provides examples to illustrate the difference between precision and accuracy. The document also discusses goals for testing, including using the SMART framework. It introduces the GQM methodology for defining test goals and questions. Additional sections cover test planning and resources, defect tracking, defect prediction, and release criteria.
2. Precision vs. Accuracy
Accuracy
Saying PI = 3 is accurate, but not precise
I’m 2 meters tall, which is accurate, but not precise
Precision
Saying PI = 4.378383 is precise, but not accurate
Airline flight times are precise to the minute, but not accurate
Number of significant digits is the key
3. Precision vs. Accuracy
People make assumptions about accuracy based on precision
“365 days” is not the same as “1 year” or “4 quarters” or even “52 weeks”
“10,000 staff hours” is not the same as “5 staff years”
Unwarranted precision is the enemy of accuracy (e.g., 395.7 days +/- 6 months)
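As a concrete illustration (not from the original slides; the rounding rule is an assumption), here is a minimal sketch that rounds an estimate so its stated precision does not outrun its accuracy:

```python
import math

def report_estimate(value_days: float, uncertainty_days: float) -> str:
    """Round an estimate so its precision matches its uncertainty."""
    if uncertainty_days <= 0:
        return f"{value_days:g} days"
    # Keep only as many digits as the uncertainty justifies (one significant figure of the error).
    digits = -int(math.floor(math.log10(uncertainty_days)))
    return f"{round(value_days, digits):g} +/- {round(uncertainty_days, digits):g} days"

# "395.7 days +/- 6 months" overstates precision; report it at the precision the error supports.
print(report_estimate(395.7, 180))  # -> "400 +/- 200 days"
```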
4. Introduction
Good Goals
A goal should be SMART
Specific
Measurable/Testable
Attainable
Relevant
Time-bound
Can use a Purpose, Issue, Object format
6. Introduction
GQM Example
Goal
Purpose: Improve by 10%
Issue: the timeliness of
Object (process): change request processing
Viewpoint: from the project manager’s viewpoint
Question: What is the current change request processing speed?
Measures: average cycle time; standard deviation; % of cases outside the upper limit
Question: Is the performance of the process improving?
Measures: (current average cycle time * 100) / baseline average cycle time; subjective rating of manager’s satisfaction
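As an illustration only (the data and the helper are assumptions, not part of the slides), the measures behind these two questions could be computed from recorded change-request cycle times roughly as follows:

```python
from statistics import mean, stdev

def cycle_time_measures(cycle_times_days, upper_limit_days, baseline_avg_days):
    """GQM measures for change-request processing speed and improvement vs. baseline."""
    avg = mean(cycle_times_days)
    return {
        "average_cycle_time": avg,
        "std_dev": stdev(cycle_times_days),
        "pct_outside_upper_limit": 100.0 * sum(t > upper_limit_days for t in cycle_times_days)
                                   / len(cycle_times_days),
        # Question: is the process improving? 90 or less means the 10% improvement goal is met.
        "pct_of_baseline": avg * 100.0 / baseline_avg_days,
    }

print(cycle_time_measures([4, 6, 9, 3, 12, 5], upper_limit_days=10, baseline_avg_days=7.2))
```

The subjective satisfaction rating would still be collected from the project manager directly rather than computed.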
7. Project Evaluation: Quality
Test Planning and Resources
Do we have enough testing resources?
How many tests do we need to run (estimated)?
How long does each test case take to design and write?
How long does each test take, on average?
How many full testing cycles do we expect? (more than one, especially for early test cycles)
How many person-days do we need (# tests * time per test * # of cycles)? (see the sketch below)
How many testing staff do we have?
How long will the testing phase take with our current staff?
Is the testing phase too long (i.e., our current staff is not sufficient)? Do we have to test less, or can we add staff?
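The arithmetic behind these questions, as a minimal sketch (all numbers are illustrative assumptions):

```python
def test_phase_estimate(num_tests, hours_per_run, cycles, testers, test_hours_per_day=6.0):
    """Estimate execution effort (person-days) and calendar length of the testing phase."""
    person_hours = num_tests * hours_per_run * cycles      # tests * time per test * cycles
    person_days = person_hours / test_hours_per_day
    calendar_days = person_days / testers                  # with the current staff
    return person_days, calendar_days

person_days, calendar_days = test_phase_estimate(num_tests=400, hours_per_run=0.5,
                                                 cycles=3, testers=4)
print(f"{person_days:.0f} person-days; about {calendar_days:.0f} working days with current staff")
```

If the resulting calendar time does not fit the project schedule, the same formula shows how much additional staff, or how much test reduction, would be needed.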
8. Project Evaluation: Quality
Reported/Corrected Software Defects
[Chart: defects found, defects fixed, and defects open (0-100%) over time, from the start to the end of the testing phase.]
From Manager’s Handbook for Software Development, Revision 1, NASA, Software Engineering Laboratory, 1990
9. Project Evaluation: Quality
Reported/Corrected Software Defects – Actual Project
[Chart: number of defect reports (in thousands, 0-1.0) found, open, and fixed over 40 weeks of testing.]
11. Project Evaluation: Quality
Statistics on Effort per Defect
Data on the time required to fix defects, categorized by type of defect, provides a basis for estimating the remaining defect correction work
Need to collect data on fix time in the defect tracking system
Data on the phases in which defects are injected and later detected gives you a measure of the efficiency of the development process. If 95% of the defects are detected in the same phase they were created, the project has an efficient process
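A minimal sketch of that estimate (defect categories, average fix times, and counts are assumed, not from the slides):

```python
# Average fix effort per defect type, derived from fix times recorded in the defect tracking system.
avg_fix_hours = {"logic": 5.0, "interface": 2.0, "data": 4.0, "performance": 12.0}

# Defects currently open, by type.
open_defects = {"logic": 40, "interface": 25, "data": 10, "performance": 3}

remaining_hours = sum(count * avg_fix_hours[kind] for kind, count in open_defects.items())
print(f"Estimated remaining defect-correction work: {remaining_hours:.0f} hours")  # 326 hours
```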
12. Project Evaluation: Quality
A Defect Fix Time Model for Testing
25% of defects: 2 hours
50% of defects: 5 hours
20% of defects: 10 hours
4% of defects: 20 hours
1% of defects: 50 hours
From Software Metrics: Establishing a Company-wide Program, by Robert B. Grady and Deborah L. Caswell, 1987
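Following the usage sketched in the editor’s notes for this slide, such a model could drive a rough test-schedule estimate along these lines (defect density, code size, and staffing are assumptions):

```python
# Fraction of defects vs. hours to find and fix each one (the HP model above).
fix_time_model = [(0.25, 2), (0.50, 5), (0.20, 10), (0.04, 20), (0.01, 50)]

def testing_weeks(ksloc, defects_per_ksloc, testers, test_hours_per_week=30.0):
    """Estimate calendar weeks of testing from code size, expected defect density, and staff."""
    expected_defects = ksloc * defects_per_ksloc
    hours_per_defect = sum(frac * hours for frac, hours in fix_time_model)  # about 6.3 hours
    return expected_defects * hours_per_defect / (testers * test_hours_per_week)

print(f"About {testing_weeks(ksloc=50, defects_per_ksloc=8, testers=5):.0f} weeks of testing")
```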
13. Product Characterization: Quality
Defects
Defects are one of the most often used measures of quality
Definitions of defects differ
Only items found by customers? Testers?
Items found during upstream reviews?
Only non-trivial items?
Small enhancements?
Timing of “defect” detection is an important part of defect characterization
A “product defect” may be different than a “process defect”
14. Product Evaluation: Testing
System Test Profile
[Chart: tests planned, tests executed, and tests passed (0-140) over the system test phase.]
From NASA, Recommended Approach to Software Development, 1992
15. Product Evaluation: Testing
System Test Profile
[Chart: tests planned, tests executed, and tests passed (0-140) over the system test phase; in this example the tests-passed line falls behind tests executed.]
From NASA, Recommended Approach to Software Development, 1992
16. Product Evaluation: Testing
Cumulative Defects Found in Testing
[Chart: Error Rate Model – cumulative errors per KSLOC (0-8) across Design, Code/Test, System Test, and Acceptance Test, showing the historical norm with upper and lower bounds.]
From Manager’s Handbook for Software Development, Revision 1, NASA, Software Engineering Laboratory, 1990
17. Product Evaluation: Testing
Cumulative Defects – Actual Project
[Chart: Error Rate Model – an actual project plotted against the historical norm and its upper and lower bounds (cumulative errors per KSLOC, 0-8, across Design, Code/Test, System Test, and Acceptance Test).]
From Manager’s Handbook for Software Development, Revision 1, NASA, Software Engineering Laboratory, 1990
18. Product Prediction
Predicting Future Defect Rates
Increasing factors:
System size
Application complexity
Compressing the schedule (up to a 4x increase)
More staff
Lower productivity
Decreasing factors:
Simplifying the application/problem at hand
Extending the planned development time (can cut defects in half)
Fewer staff
Higher productivity
19. Product Prediction
Defect Density Prediction
To judge whether we’ve found all the defects for an application, estimate its defect density
Need statistics on the defect density of past, similar projects
Use this data to predict the expected density on this project
For example, if our prior projects had a defect density between 7 and 9.5 defects/KLOC, we expect a similar density on our new project
If our new project has 100,000 lines of code, we expect to find between 700 and 950 defects total
If we’ve found 600 defects so far, we’re not done: we expect to find between 100 and 350 more defects
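The slide’s arithmetic, written out as a small sketch:

```python
def remaining_defects(kloc, density_low, density_high, found_so_far):
    """Predict how many more defects to expect from historical defect density (defects/KLOC)."""
    low = max(0, kloc * density_low - found_so_far)
    high = max(0, kloc * density_high - found_so_far)
    return low, high

low, high = remaining_defects(kloc=100, density_low=7, density_high=9.5, found_so_far=600)
print(f"Expect roughly {low:.0f} to {high:.0f} more defects")  # 100 to 350, as on the slide
```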
20. Product Prediction
Distribution of Software Defect Origins and Severities
Highest severity faults come from requirements and design
[Chart: defect counts by severity level (Minor, Moderate, Major, Critical) and by origin (Requirements, Design, Coding, Documentation, Bad Fixes).]
21. Product Prediction
Defect Modeling
Model the number of defects expected based on past experience
Model the number of defects in requirements, design, construction, etc.
Two approaches:
Model defects based on effort hours, i.e., X defects will be introduced per hour worked
Model defects per KSLOC (or other size unit) based on past experience and the code growth curve
22. Product Prediction
Defect Modeling (continued)
Approach 1: SEI data, based on PSP data:
Design: 1.76 defects injected/hour
Coding: 4.20 defects injected/hour
Approach 2:
Defects/KSLOC total are about 40 (30-85)
10% requirements (4/KLOC)
25% design (10/KLOC)
40% coding (16/KLOC)
15% user documentation (6/KLOC)
10% bad fixes (4/KLOC)
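A sketch applying both approaches to a hypothetical project (the effort hours and code size are assumptions; the rates come from the slide):

```python
# Approach 1: defects injected per effort hour (SEI/PSP figures above).
injected_per_hour = {"design": 1.76, "coding": 4.20}

# Approach 2: defects per KSLOC by origin (about 40/KSLOC in total).
defects_per_ksloc = {"requirements": 4, "design": 10, "coding": 16,
                     "user documentation": 6, "bad fixes": 4}

def defects_from_effort(design_hours, coding_hours):
    return design_hours * injected_per_hour["design"] + coding_hours * injected_per_hour["coding"]

def defects_from_size(ksloc):
    return {origin: rate * ksloc for origin, rate in defects_per_ksloc.items()}

print(f"{defects_from_effort(design_hours=200, coding_hours=500):.0f} defects injected from effort")
print(defects_from_size(ksloc=50))  # e.g. {'requirements': 200, 'design': 500, 'coding': 800, ...}
```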
23. Product Prediction
Predicted and Actual Defects Found
[Chart: defects (0-800) by development phase (Analysis, High Level Design, Low Level Design, Construction, Unit Test, Project Integration Test, Release Integration Test, System Test, Beta, General Availability), showing the phase injection estimate, phase expected removal, phase actual removal, cumulative injection estimate and re-estimate, cumulative expected removal, cumulative actual removal, and a size re-estimate.]
From Edward F. Weller, Practical Applications of Statistical Process Control, IEEE Software, May/June 2000
25. Release Measures
Defect Counts
Defect counts give a quantitative handle on how much work the project team still has to do before it can release the software
Graph the cumulative reported defects, open defects, and fixed defects
When the software is nearing release, the number of open defects should trend downward, and the fixed-defects line should be approaching the reported-defects line
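A minimal sketch of the bookkeeping behind such a graph (the weekly counts are assumed; in practice they come from the defect tracking system):

```python
from itertools import accumulate

# Weekly counts of defects reported and fixed (assumed data).
found_per_week = [30, 45, 40, 35, 25, 15, 10, 5]
fixed_per_week = [10, 25, 35, 40, 35, 25, 15, 10]

cum_found = list(accumulate(found_per_week))
cum_fixed = list(accumulate(fixed_per_week))
cum_open = [f - x for f, x in zip(cum_found, cum_fixed)]   # open = reported - fixed

for week, (f, x, o) in enumerate(zip(cum_found, cum_fixed, cum_open), start=1):
    print(f"week {week}: reported {f:3d}  fixed {x:3d}  open {o:3d}")
# Near release, the 'open' column should trend down and 'fixed' should approach 'reported'.
```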
26. Release Measures
Defect Trends – Near Release: All Defects
[Chart: number of defect reports (in thousands, 0-1.0) found, open, and fixed over 40 weeks of testing, with the release target marked.]
27. Release Measures
Defect Trends – Near Release: Severity 1 and 2
[Chart: the same plot restricted to severity 1 and 2 defect reports.]
28. Release Measures
Construx Measurable Release Criteria
Acceptance testing successfully completed
All open change requests dispositioned
System testing successfully completed
All requirements implemented, based on the spec
All review goals have been met
Declining defect rates are seen
Declining change rates are seen
No open Priority A defects exist in the database
Code growth has stabilized
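One way to make such criteria operational is a simple readiness check over the measured data; the names, values, and thresholds below are hypothetical, not Construx’s:

```python
# Hypothetical snapshot of the measures behind a few of the criteria above.
measured = {
    "acceptance_testing_completed": True,
    "open_priority_a_defects": 2,
    "weekly_defect_rates": [38, 35, 29, 24],    # last four weeks of reported defects
    "code_growth_pct_last_month": 0.4,
}

unmet = []
if not measured["acceptance_testing_completed"]:
    unmet.append("acceptance testing not completed")
if measured["open_priority_a_defects"] > 0:
    unmet.append("open Priority A defects in the database")
rates = measured["weekly_defect_rates"]
if any(later >= earlier for earlier, later in zip(rates, rates[1:])):
    unmet.append("defect rates not declining")
if measured["code_growth_pct_last_month"] > 1.0:
    unmet.append("code growth not stabilized")

print("Ready to release" if not unmet else "Not ready: " + "; ".join(unmet))
```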
29. Release Measures
HP Measurable Release Criteria
Breadth – testing coverage of user-accessible and internal functions
Depth – branch coverage testing
Reliability – continuous hours of operation under stress; stability; ability to recover gracefully from defect conditions
Remaining defect density at release
From Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, 1992
30. Release Measures
Post-Release Defect Density by Whether Release Criteria Were Met
[Chart: post-release incoming defects submitted by customers (3-month moving average, normalized by KLOC) from release (MR) through month 12, comparing products that did not meet the release criteria, the worst product that met them, and the average of products that met them.]
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
31. Release Measures: Defect Counts
Defect Plot Before Release
[Chart: number of severity 1, severity 2, and combined severity 1 & 2 defects (0-12) over time, against the release target.]
From Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, 1992
34. Process Evaluation
Status Example
[Chart: units (0-800) created, reviewed, and tested against the target during the implementation phase.]
From NASA, Manager’s Handbook for Software Development, Revision 1, 1990
35. Goal #1 – Improve Software Quality
Postrelease Discovered Defect Density
[Chart: number of open serious and critical defect reports (0-1.0) from Nov-84 through Jan-93, split into older products and those less than 12 months old, against a 10X improvement goal.]
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
36. Goal #1 – Improve Software Quality
Prerelease Defect Density
Question: How can we predict software quality based on early development processes?
[Chart: defects found in test per KLOC (0-80) by project release date, Oct-80 through Dec-88, with a linear trend line.]
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
37. Goal #3 – Improve Productivity
Defect Repair Efficiency
Question: How efficient are defect-fixing activities? Are we improving?
[Chart: defects fixed per engineer-month (0-5), 1987 through 1991.]
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
38. Goal #4 – Maximize Customer Satisfaction
Mean Time to Fix Critical and Serious Defects
Question: How long does it take to fix a problem?
[Chart: days (0-250) spent in each resolution state, monthly from 7/18/1990 through 8/18/1991.]
AR = awaiting release; QA = final QA testing; KP = known problem; AD = awaiting data; LC = lab classification; MR = marketing review
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
Editor's Notes
Main Point: Cover Slide
Notes:
Administrivia:
Hours (9-4:30) or (8:30-4)
Breaks (15 minutes 10:30, 2:30) (10:00, 2:00)
Lunch 12-1 (or 11:30-12:30)
Messages
Facilities
Notebook
Show enthusiasm for the topic!
Ask them for their enthusiasm!
(maybe needs a slide on the morass of measurement – every awful measurement you’ve seen or heard of?)
Don’t place laptop in center of table, I block screen. Put it at end of the table in our seminar room.
Misc.:
References: None
Main Point: Precision is different than accuracy
Notes:
Accuracy = correct; closest to the truth
Precision = Exactness; how finely you express the point.
Precision implies accuracy that may not exist.
So what would be the best way to say the project will take a year?
Exactly a year?
About a year?
12 months?
364 days?
Miscellaneous: Story of the 70 million and 6 year old dinosaur
Reference: None
Main Point: People are misled
Notes:
How you communicate leads people to believe how confident you are.
A precise, but inaccurate, estimate communicates a highly confident, accurate estimate.
Miscellaneous: Story of the 70 million and 6 year old dinosaur
Reference: None
Main Point: SMART Goals
Notes:
The goals can either be project, area, or organizational in nature. We can talk about what the group is chartered to do.
Purpose, Issue, Object format is from Basili’s group at Maryland. It also has viewpoint – see template. You may or may not find viewpoint useful.
Make sure to have a good flow between Smart and GQM
Misc: None
References: None
Main Point: Diagram of GQM – how goals, questions and measures relate to each other
Notes:
Goal: A point, end or place that an individual or group is striving to reach
Question: decision maker issues related to progress toward or attainment of one or more goals
Measure – A measurable characteristic of the organization or project that provides data which helps answer one or more of the questions.
The same measure can be used in order to answer different questions under the same goal, and may answer questions under different goals as well.
It’s an engineering problem. Smallest number of measurements to support the largest number of goals
Sometimes a combination of two answers a question, too
Misc: None
Reference: Basili, Victor R., Gianluigi Caldiera, and H. Dieter Rombach, “The Goal Question Metric Approach”, in Encyclopedia of Software Engineering, Wiley, 1994, available at http://ftp.cs.umd.edu/pub/sel/papers/gqm.pdf
Main Point: Example of GQM – how goals, questions and measures relate to each other
Notes:
Goal: A point, end or place that an individual or group is striving to reach
Goals have a purpose (like improve), an issue, an object or process and a viewpoint. See example above. It also has a quality issue, timeliness here.
Question: decision maker issues related to progress toward or attainment of one or more goals
Measure – A measurable characteristic of the organization or project that provides data which helps answer one or more of the questions.
Example of a goal, question and measures
Misc: None
Reference: Basili, Victor R., Gianluigi Caldiera, and H. Dieter Rombach, “The Goal Question Metric Approach”, in Encyclopedia of Software Engineering, Wiley, 1994, available at http://ftp.cs.umd.edu/pub/sel/papers/gqm.pdf
Main Point: Test Planning measures
Notes:
Now we’ll talk about measures related to Test Planning
These are very basic, and could be done by test phase
Basically, how many tests do we think we’ll need to develop
How long does each take to run (on average, from past experience)
Thus how long will the whole set take to run (total effort)?
How many staff do we have?
Thus, how long will the testing phase (this particular phase) last?
Is this ok for the project schedule, or do we need more staff? Do we have to run fewer tests?
Allow for multiple runs, especially of early testing phases!! Allow for 3 or four cycles
Time to find a defect varies by test type. It’s less for unit test, more for system test. Time to find a defect increases as the testing phase continues, as most defects are already found (or at least most easy ones!). So time to find the average defect isn’t linear – 2x time may not yield 2x defects. The defect detection curve is a Rayleigh curve, at least on a large project. Peaks then tails off slowly.
More important to talk about input (characteristics evaluation)
Misc.: None
References: None
Main Point: Information about Reported and Corrected Defects plots
Notes:
Plot defects found, defect fixed, defects still open on one graph
This is a healthy project. No huge backlog. Strive for a graph like this. This is a basic graph that everyone should be plotting. Most shops have this data in their defect tracking system. How many found, how many fixed, how many still open, over time. You’ll have to figure the numbers out over time, or save them every Friday or whatever. The dates the defect was entered into the defect tracking system, and the date it was fixed are known, so you can reconstruct this curve.
Key information is in the slope of the open defects curve
Open defects should decline as testing progresses, unless there is inadequate staff correcting problems, or the software is exceptionally buggy.
People introduce defects every hour they work. So the defect introduction curve looks like a Rayleigh curve (seen earlier).
If count of open defects is growing, possible causes may be:
Inadequate staffing to correct defects
Software very unreliable
Ambiguous or volatile requirements
If count of open defects is decreasing, possible causes may be:
Stable development
Testing/development progressing
Reliable, well-designed software (easy to fix defects)
Misc.: None
References: [NASA90] page 6-9
Main Point: Information about Reported and Corrected Defects plots
Notes:
Data shown is from an actual project. This is more likely to be reality in your project! Have a celebration when the white line and blue line cross. You’re fixing defects faster than you’re finding them.
Draw on whiteboard an unhealthy open curve with a second hump.
Most customers are happy with a project that has 95% of defects removed. Most companies don’t even manage that. It costs significantly more to get to 99% and 99.9%. Putnam says 25% more for 99% and 50% more for 99.9%, but a recent paper says up to 10 times more expensive.
From the Trajectory Computation and Orbital Products System, developed from 1983 to 1987. The total size was 1.2 million SLOC.
Early in testing, defects were not getting corrected. The cause was lower quality software. Defects were not found during system testing. Corrective actions: staffing was increased at week 20 to help address open defects. System attained stability (fixed and open lines crossed) at week 35, with defects being corrected faster than they were being reported.
Caution about metrics based on defect closure rate – the easy ones get corrected first, leaves the harder ones for the end.
Misc.: None
References: [NASA90] page 6-9
Main Point: Defect Rate
Notes:
How long we need to test depends on the final reliability we need.
This is a theoretical curve, but actual data has been shown to fit this curve. And, if we test for 2x the time, we’ll find 2x the defects, if we’re in the peak of the curve, but not later on.
It makes sense – as we work, we inject defects into the project. Since the effort curve (at least for large projects) follows a Rayleigh curve, the defect rate curve also is a Rayleigh curve
The first line labeled 95% is the reliability we often aim for. This translates into a MTBF of about 8-9 hours, which is enough for a typical batch program that usually doesn’t need to run for a long time without failure
99% translates into a MTBF of more like 1.8 weeks – which is needed for software that must run in an online environment for days without failure. If 95% is a project time of 1.0, 99% is 1.25 (25% longer) according to Putnam
99.9% translates into a MTBF of 10+ weeks – and a project time of 1.5 (50% longer) relative to 95% defect removal.
Note that the time to find a defect varies as time goes on. Peaks then tails off slowly – i.e. gets longer and longer to find each new defect at the end.
I didn’t talk about collecting the data. How did it come to be?
# Defects doesn’t = reliability
What is a defect?
Misc.: None
References: Putnam and Myers, pages 125-130
Main Point: Effort per defect info related to release readiness
Notes:
From SPSG page 225
See slide xx showing effort per defect data from HP
Need to collect effort to fix defects.
Then can generate average fix times by type of defect.
Then, if you know how many defects you have open, you know how many person-weeks of effort you still have to go to fix the remaining defects on the project
This works grossly, for planning purposes, when there are hundreds of defects expected
Towards the end of a project we don’t know how long it will take us to fix the last few nasty defects, so this breaks down a bit.
And, just because you haven’t found any for a week, doesn’t mean there aren’t any there!
Industry data: JPL: 5-17 hours to fix a defect
Misc.: None
References: SPSG pages 221-235, chapter 16
Main Point: HP’s Defect model – move to testing section?
Notes:
This model is for how long defects take to find and fix. This model was developed by Henry Kohoutek at HP
From actual data for one of HP’s product lines
You could figure it out for your defects
You can use a model like this to estimate the testing hours needed for your project.
To use the model, start from the total code size (this is known at the start of test)
Multiply it by the expected defect density; this yields the total defects expected
Then calculate, using the model, the total time necessary to discover all the defects
Then the available staffing can be applied to the total time to predict the testing schedule
Also, what percentage of your fixes break something else? It can be as high as 25%
Does this time include confirmation time for the fix? Yes
Got a question about schedule slips – schedule slips are systemic – i.e. see EV slides
Misc.: None
References: From Software Metrics: Establishing a Company-Wide Program, by Robert B. Grady and Deborah L. Caswell, 1987, page 128
Main Point: A little bit about defects
Notes:
What is a defect in your company? A review issue? A small enhancement? A customer enhancement request? Defects found in unit testing?
What’s a defect isn’t completely clear either
Do you count things found in reviews?
What about unit test defects if they’re found by the programmers themselves? Are they counted?
Some shops put enhancements (either internally generated or customer requests) into the defect tracking system, as a convenient place to store them. Are they ‘defects’?
Need to weed out duplicates, etc.
Some companies use several defect tracking systems so finding out the total can be difficult (one for defects in production, another during testing, for example)
Misc.: None
References: None
Main Point: System Test Profile
Notes:
An example of plotting testing progress, number of tests planned, number of tests executed, number of tests passed. We expect more or less linear growth as we test. This is an easy graph to do. Want tests executed and tests passed lines close together.
This is for an actual project shown in NASA Recommended Approach to Software Development, page 8-18
What’s happening here? Testing starts off well, then levels off and finally continues at a lower rate
Cause: midway through the phase, testers found they did not have the input coefficients needed to test flight software. There was a long delay before the data became available, and testing momentum declined.
This S-shaped curve can be tracked for several items of interest during software development, for example:
Completion of design reviews over time
Completion of code inspections over time
Completion of code integration over time (the graph above)
Completion of component test in terms of number of test cases attempted and successful over time
Completion of system test in terms of number of test cases attempted and successful over time
See also units coded, read, tested graph shown earlier for build testing
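A minimal plotting sketch for this kind of test profile, using made-up weekly counts:
```python
# Plot cumulative tests planned, executed, and passed per week of system
# test. The data below is invented purely for illustration.
import matplotlib.pyplot as plt

weeks = list(range(1, 11))
planned  = [20, 40, 60, 80, 100, 120, 140, 160, 180, 200]
executed = [18, 38, 55, 70,  80,  88, 100, 120, 145, 170]
passed   = [17, 36, 52, 66,  75,  82,  93, 112, 136, 160]

plt.plot(weeks, planned, label="Tests planned")
plt.plot(weeks, executed, label="Tests executed")
plt.plot(weeks, passed, label="Tests passed")
plt.xlabel("Week of system test")
plt.ylabel("Cumulative test cases")
plt.title("System test profile")
plt.legend()
plt.show()
```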
Misc.: None
References: NASA Recommended Approach to Software Development, page 8-18
Main Point: System Test Profile
Notes:
An example of plotting testing progress: number of tests planned, number of tests executed, number of tests passed. We expect more or less linear growth as we test. This is an easy graph to do. We want the tests-executed and tests-passed lines close together.
This is a made-up slide, showing a project with more problems than the one on the prior slide – tests passed are falling behind tests executed. This is a pattern to watch out for!
This S-shaped curve can be tracked for several items of interest during software development, for example:
Completion of design reviews over time
Completion of code inspections over time
Completion of code integration over time (the graph above)
Completion of component test in terms of number of test cases attempted and successful over time
Completion of system test in terms of number of test cases attempted and successful over time
See also units coded, read, tested graph shown earlier for build testing
Misc.: None
References: NASA Recommended Approach to Software Development, page 8-18
Main Point: Defect Rates
Notes:
Track defects vs. total estimated size of the project
You need to define what a defect is. People often use what’s in the defect tracking system. This graph is of defects found in test only
NASA has developed software development processes which reduce defects – for example requirements and design reviews. If you work in a shop without those processes, your defects in test will be much higher than shown here.
In testing, NASA has found the defect rates are halved in each succeeding testing phase (not counting defects found in requirements and design reviews)
4 defects/KSLOC in construction/unit test
2 defects/KSLOC in system test
1 defect/KSLOC in acceptance test
A graph of typical defect rates – NASA data. This shows their model upper and lower bounds as well as the expected rates
If a project’s defect rate is above the model upper bounds, possible causes:
Unreliable software
Misinterpreted requirements
Extremely complex software
If a project’s defect rate is below the model bounds, possible causes:
Reliable software
“Easy” problem
Inadequate testing
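A small sketch of the phase-halving pattern above; the project size is an illustrative assumption, while the per-phase rates are the NASA figures from these notes:
```python
# Expected defect counts per test phase, using the NASA rule of thumb that
# the defect rate is roughly halved in each succeeding test phase.

size_ksloc = 50     # illustrative project size
rates = {           # defects per KSLOC found in each test phase (NASA data)
    "construction/unit test": 4,
    "system test": 2,
    "acceptance test": 1,
}

for phase, rate in rates.items():
    print(f"{phase:>24}: expect about {rate * size_ksloc} defects")
```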
Misc.: None
References: [NASA90] page 6-8
Main Point: Defect Rates
Notes:
Defect rates from one actual project. What’s going on?
This actual project had a lower defect rate and lower defect detection rate
In this case, this was an early indication of high quality
There was smooth progress in detecting defects – it’s not like they weren’t testing all along
This was one of the highest quality systems produced
If the defect density is lower than expected, possible causes:
Size estimate is high (good)
Inspection defect detection is low (bad)
Work product quality is high (as above example) (good)
Insufficient level of detail in work product (bad)
If the defect density is higher than expected, possible causes:
Size estimate is low (bad)
Work product quality is poor (bad)
Inspection defect detection is high (good)
Too much detail in work product (good or bad)
Misc.: None
References: [NASA90] page 6-8
Main Point: Factors which affect the defect rate
Notes: Several things increase the number of faults we put into the system, and their opposites tend to decrease it.
A Japanese study showed that perceived schedule pressure drove defects up by a factor of 4.
Schedule beyond the minimum (25%?) decreases faults inserted – to less than half
If you know your history, here are the factors that can affect it up or down
Misc. X
References: Measures for Excellence, Reliable Software on Time, Within Budget, by Lawrence H. Putnam and Ware Myers, 1992 This is from Putnam, Chapter 8, pages 135-146
Main Point: Defect density prediction related to release readiness
Notes:
For example past projects have found defects per KSLOC to be
In the range of 7 to 10
You have a system of 100,000 lines of code and have found 600 defects so far = 6 defects per KSLOC.
Based on past experience from 2 projects, you expect 7 to 10, or 700 to 1000 defects in 100,000 lines of code
If you are trying to remove 95% of all defects before release, you need to plan to find 665 – 950 defects based on your past experience
If you have past experience on a number of projects, you may know your average lifetime defect rate is 7.4 ± 0.4 defects per KSLOC, which is a much tighter range
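The arithmetic from these notes, written out as a small sketch (only the numbers already given above are used):
```python
# Defect density prediction for release readiness, using the worked example
# from the notes: 100 KSLOC, 600 defects found so far, a historical range of
# 7-10 defects/KSLOC, and a goal of removing 95% of defects before release.

size_ksloc = 100
found_so_far = 600
low, high = 7, 10          # historical defects per KSLOC
removal_goal = 0.95        # fraction of lifetime defects to remove before release

expected_low, expected_high = low * size_ksloc, high * size_ksloc
target_low = removal_goal * expected_low
target_high = removal_goal * expected_high

print(f"Found so far:        {found_so_far} ({found_so_far / size_ksloc:.1f} per KSLOC)")
print(f"Expected lifetime:   {expected_low}-{expected_high} defects")
print(f"Find before release: {target_low:.0f}-{target_high:.0f} defects")
```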
Misc.: None
References: SPSG pages 221-235, chapter 16
Main Point: Worst faults come early
Notes: Critical means the system doesn’t work
Major means some of the system doesn’t work
Moderate means the system works but with data corruption
Minor is a workmanship issue
What is the percentage of total defects introduced in each phase?
Vertical bars are 5% each.
We see that the faults that will cause us the most pain come from activities early in the lifecycle. This highlights our problem with brute force quality by testing at the end. Too late to find them, too expensive.
We see coding errors start to pick up at the Major severity, but Major is still dominated by design errors.
Coding probably gives the most number of errors but perhaps not the highest cost to the organization.
Capers Jones says this might be a typical distribution for a medium to large system, of 50,000 LOC and larger
He says for a small system of 5000 LOC or less, coding defects would be more than 50% of the total
Misc. X
Reference: Capers Jones, Applied Software Measurement, page 368
Main Point: Defect Modeling
Notes:
Defect modeling is another approach
We’ve seen this already in prior slides
Can model based on number of lines of code and past experience on defects per KSLOC
Or on defects inserted per hour of effort in requirements, design, construction etc.
Or based on anything else you might have tracked historical data on – defects per class, for example
Model defects against size in whatever units makes sense – KLOC, FP, use cases, classes, whatever
You can also insert defects deliberately and see how many are caught by testing. Say you insert 100. Testing finds 220 defects: 200 not inserted and 20 inserted. Since testing found 20% of the inserted ones, predict that it also found 20% of the actual defects. Thus the total number of actual defects is about 1000, and 800 remain (plus the 80 inserted ones still in the code) = 880 total. Predicting that the same percentage of actual defects is found as of inserted defects is only valid if the inserted defects match the type distribution of the actual defects. If you do this, insert the defects in a branch of the code in the CM library, test the branch, then discard the branch.
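A small sketch of that seeding estimate, using the same example numbers from the notes:
```python
# Defect-seeding estimate: seed a known number of defects, test, and scale
# up by the fraction of seeded defects that testing finds.

seeded = 100               # defects deliberately inserted
found_seeded = 20          # seeded defects found by testing
found_real = 200           # non-seeded (real) defects found by testing

capture_rate = found_seeded / seeded                 # 0.20
estimated_real_total = found_real / capture_rate     # 1000
real_remaining = estimated_real_total - found_real   # 800
seeded_remaining = seeded - found_seeded             # 80

print(f"Capture rate:              {capture_rate:.0%}")
print(f"Estimated real defects:    {estimated_real_total:.0f}")
print(f"Remaining (real + seeded): {real_remaining + seeded_remaining:.0f}")
```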
Misc.: None
References: SPSG pages 221-235, chapter 16
Main Point: Defect Modeling
Notes:
Some industry numbers as an example for defect modeling
Based on defects injected per hour of design and construction
Based on defects per KSLOC in different test phases
This is a level 3 prediction: you won’t have this data to start
Capers Jones (1999 seminar at Cx) says 83% of defects exist before a single line of code is written
Misc.: None
References: SPSG pages 221-235, chapter 16; NASA Manager’s Handbook for Software Development, page 6-8 and Watts S Humphrey ‘Measuring Software Quality’ presented at SEPG 2000
Main Point: Predicted and actual defects found
Notes:
If you track and record all defects, you can develop a profile of defects by phase for your organization. Then you can develop a graph that shows expected defect numbers as the project progresses
If you then find more or fewer defects than expected, you can research why
If you know you usually catch 50% of the requirements defects in inspections, you can predict how many you’ll catch and how many will ‘escape’
Phase containment numbers are graphed above – the defect injection estimate, expected defect removal, and actual defect removal by phase, plus cumulative numbers
Defect removal in unit test was higher than estimated, which meant that a lower number of defects were removed in integration test and system test phases. Without accurate defect removal data from the unit test phase, these low numbers would be of more concern with respect to product quality
This is a level 3 metric! Phase-by-phase release criteria are about both product and process quality.
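A minimal sketch of predicting removals and escapes per phase from historical containment rates. The injection counts and most containment percentages are illustrative; the 50% requirements-inspection figure is the example from the notes. One simplifying assumption here is that escaped defects are as detectable in later phases as newly injected ones:
```python
# Predict defect removals and escapes per phase from injection estimates
# and historical containment (detection) rates. Numbers are illustrative.

phases = [
    # (phase, defects injected, fraction caught by that phase's reviews/tests)
    ("requirements", 40, 0.50),
    ("design",       80, 0.60),
    ("code",        120, 0.70),
]

escaped = 0.0
for phase, injected, containment in phases:
    in_phase = injected + escaped          # new defects plus earlier escapes
    removed = in_phase * containment
    escaped = in_phase - removed
    print(f"{phase:>12}: {removed:.0f} removed, {escaped:.0f} escape to the next phase")
```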
Misc.: None
References: From Edward F. Weller, Practical Applications of Statistical Process Control, IEEE Software May/June 2000
Main Point: Measures related to Inspections and other reviews
Notes:
Example of defect profile by type
Could be other type classifications
HP found defect percent by type varied a lot between divisions – see green HP book page 139 ff
These are IEEE classifications
Specification is requirements – specs don’t describe the needs of the users
Functionality – Incorrect or incompatible product features – is also requirements
Data Handling, Computation and Logic are coding errors
UI, Data definition and error checking are design errors
Misc.: None
References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, Prentice Hall PTR, 1992 , page 139 ff
Main Point: What defect counts tell us about release readiness
Notes:
From SPSG page 224
See slide xx showing defects graph
If the project’s quality level is out of control, and the project is thrashing, you might see a steadily increasing number of open defects. Steps need to be taken to improve the quality of the existing designs and code before adding more new functionality.
Misc.: None
References: SPSG pages 221-235, chapter 16
Main Point: Information about Reported and Corrected Defects plots
Notes:
Data shown is theoretical – I moved the lines from the actual project to show values at release.
Defect counts give a quantitative handle on how much work the project team still has to do before it can release the software
Graph the cumulative reported defects, open defects and fixed defects
When the software is nearing release, the number of open defects should trend downward and get near zero, and the fixed defects should be approaching the reported defects line
Misc.: None
References: [NASA90] page 6-9
Main Point: Information about Reported and Corrected Defects plots
Notes:
Data shown is theoretical – I moved the lines from the actual project to show values at release.
Defect counts give a quantitative handle on how much work the project team still has to do before it can release the software
Graph the cumulative reported defects, open defects and fixed defects
When the software is nearing release, the number of open defects should trend downward and get near zero, and the fixed defects should be approaching the reported defects line
And, just because you haven’t found any for a week, doesn’t mean there aren’t any there! The open line should be near zero for more than one week!
Typically there are no sev 1 or sev 2 defects; and you look at the sev 3 with product support and fix some of the remaining ones – the ones product support says will be a problem – before release.
So the total defects slide (the one before) doesn’t get to zero before release, although it gets low; but the sev 1 and sev 2 plot gets to zero.
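A small sketch of a release-readiness check along these lines; the severity counts, weekly totals, and threshold below are illustrative:
```python
# Release-readiness check: no open severity 1 or 2 defects, and the open
# defect count low and not rising for more than one week (illustrative data).

open_by_severity = {1: 0, 2: 0, 3: 14, 4: 22}
weekly_open_totals = [95, 70, 52, 41, 38, 36]   # open defects, most recent last

no_showstoppers = open_by_severity[1] == 0 and open_by_severity[2] == 0
trending_down = all(b <= a for a, b in zip(weekly_open_totals, weekly_open_totals[1:]))
low_enough = weekly_open_totals[-1] <= 40        # illustrative threshold

print("Ready to consider release:", no_showstoppers and trending_down and low_enough)
```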
Misc.: None
References: [NASA90] page 6-9
Main Point: HP release criteria
Notes:
From green HP book
Misc.: None
References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 76
Main Point: HP release criteria
Notes:
From green HP book
Misc.: None
References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 76
Main Point: HP post-release criteria plot of critical/serious defects, by whether met release criteria or not
Notes:
From green HP book, page 78
Certification meant following the criteria for testing and release
The bottom line is an average of a dozen projects that met the new release criteria
This graph compares products that met the criteria for release (shown 2 slides ago) vs. ones that didn’t.
Graph shows that a combination of good development and testing processes will enable you to confidently predict a low incoming defect rate
This is in some ways an experiment. When standard release criteria were implemented, not all projects met them, so HP could later compare post-release defects in products that met the criteria vs. products that didn’t.
Misc.: None
References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 78
Main Point: HP release criteria plot of critical/serious defects
Notes:
From green HP book, page 77
Plot in book includes defects/KLOC on the right
The target line (3 defects) is at 0.02
6 defects is at 0.04
9 defects is at 0.06
The goal of the target defects was for the test cycle before the final one
Goal for final test cycle would be 0 Sev 1 and 2 defects
Note it included some critical or serious defects – but not many!
Track this (Sev 1 & 2) during the whole testing phase, not just at the end (also track total defects)
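These counts and densities together imply a product size of roughly 150 KLOC; that size is an inference from the figures above, not something stated in the book. A one-line check:
```python
# Invert the density calculation: size (KLOC) = defect count / (defects per KLOC).
pairs = [(3, 0.02), (6, 0.04), (9, 0.06)]   # (defect count, defects per KLOC)
for defects, density in pairs:
    print(f"{defects} defects at {density} defects/KLOC implies {defects / density:.0f} KLOC")
```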
Misc.: None
References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 76
Main Point: Need combination for reasonable level of removal
Notes: A study was done to determine effectiveness of various techniques.
The “check” ones are personal desk checking.
Function testing covers related modules
Integration testing covers the whole system
As you can see, a desk check can do fairly well when run right, or downright lousy when not.
The lowest effectiveness rate was unit testing.
The real message though is look what happens with the combined number.
If our goal is 95%, we may not need to do all of these. This is where software engineering comes into play. What is the right set for our project that will get us there for the least cost?
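A small sketch of how individual removal activities combine: if each activity removes a fraction of the defects that reach it, the combined removal is 1 - product(1 - e_i). The effectiveness figures below are placeholders, not the study's numbers:
```python
# Combine the removal effectiveness of several techniques, assuming each
# technique removes a fixed fraction of the defects that reach it.

activities = {                  # illustrative per-activity removal rates
    "desk check":        0.35,
    "unit test":         0.25,
    "function test":     0.35,
    "integration test":  0.45,
    "design inspection": 0.55,
}

escape = 1.0
for name, effectiveness in activities.items():
    escape *= (1.0 - effectiveness)

combined = 1.0 - escape
print(f"Combined removal effectiveness: {combined:.0%}")   # about 92% with these figures
```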
Misc. Most research doesn’t include requirements which many purists don’t consider part of a software project. Too much variability and may end up in hardware.
Reference: Capers Jones, Programming Productivity, p.179
Main Point: Development status model
Notes:
For a single build (single stage of a staged development project)
This is a key progress indicator
It is an indirect software quality indicator
The model must represent how development is done – the development methodology
Expect a lag between coding and review and between review and testing
This is a measure of process quality. We want to keep consistent space between the lines. If reviewing and testing fall behind, the project is falling behind, even though coding is going well.
If the project suddenly catches up, be suspicious: they probably didn’t do reviews and testing as thoroughly as they should have.
Monitor only major activities
Misc.: None
References: NASA Manager’s Handbook for Software Development, Revision 1, page 6-11
Main Point: Example for an actual project of development status model
Notes:
Shows target, units coded, units reviewed, units tested
The project shown finished code and unit testing nearly on schedule. When severe problems were encountered during system integration and testing, it was found that insufficient unit testing had resulted in poor quality software. Details shown above.
Note the miracle finish at 1 – where all of a sudden code review and unit testing catch up with coding near the deadline, when there had been a 3-4 week lag
Cause:
Some crucial testing information was not available
Short cuts were taken in reviews and unit testing to meet schedules
Result: project entered system testing phase with poor quality software. To bring the software up to standard, the system test phase took 100% longer than expected (!)
Misc.: None
References: NASA Manager’s Handbook for Software Development, Revision 1, page 6-11
Main Point: HP post-release defects over time, as part of measuring process improvement
Notes:
From green HP book, page 207
A corporate-wide HP goal was to improve post-release product defect density by a factor of 10 in five years
This graph, from one division, shows its progress in meeting that goal
Goal: improve software quality
Questions this graph helps to answer:
What is our current software quality?
This graph shows post-release defect density of products according to one of HP’s 10X improvement measures. It is only an after-the-fact indicator of the quality level produced by our processes, and thus can only influence future products through cause-effect analysis.
Misc.: None
References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 207
Main Point: HP Division Prerelease Defect Density
Notes:
From green HP book, page 199
A corporate-wide HP goal was to improve post-release product defect density by a factor of 10 in five years
This graph, from one division, shows its progress in meeting that goal
Goal: improve software quality
Questions this graph helps to answer:
How can we predict product quality based on early development processes?
This data can be used to predict the performance of the graph on the prior slide. For an unchanging process, there is a roughly predictable ratio between pre- and post-release defects.
Keep in mind that an upward trend in this graph could show either better testing techniques or poorer pretest defect avoidance.
A downward trend could reflect better pre-test defect avoidance or poorer testing.
Misc.: None
References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 199
Main Point: Defect Repair Efficiency (defects fixed per engineering month)
Notes:
A corporate wide HP goal was to improve productivity
Our primary cost is in engineering months and calendar months. Our output today is most effectively measured in KLOC or function points for new development, and in defects fixed for maintenance. Productivity measurement is a particularly sensitive topic. It is best not to measure at any finer level of detail than in these examples, and it is best to drive improvements from figure 15-4.
Goal: improve productivity
Questions this graph helps to answer: How efficient are defect-fixing activities?
This graph shows the trend of efficiency in fixing defects. It helps to ensure that we reduce the average effort to fix defects, besides whatever staffing actions we might take to reduce the backlog. (This graph does not show real data.)
Misc.: None
References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 201
Main Point: Mean time to fix serious and critical defects
Notes:
A corporate wide HP goal was to maximize customer satisfaction
The next three graphs show more direct aspects of customer satisfaction. They deal with responsiveness to important customer problems and indirectly with how well we understand all our customer’s needs
Goal: Maximize customer satisfaction
Questions this graph helps to answer: How long does it take to fix a problem?
The trend of the total area under the curve is related to how long customers have to wait before they see fixes. The largest area represents the best opportunity to shorten cycle time. MR = marketing review, LC = Lab classification, KP = Known Problem, AD = waiting for data, QA = final quality assurance testing, AR = awaiting release
Misc.: None
References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 202, figure 15-8