Estimating Total Defects WA Integrated Asset Management, 28 November 2016
TotalDefectsSoftwareProject.docx 1
Estimating the Total Defects in a Software
Project before Completion
Hein Aucamp
WA Integrated Asset Management
WA Integrated Asset Management is an
Infrastructure Asset Management company
serving Local Government and the mining
sector based in Perth, Western Australia.
www.waiam.com.au
In my recent Asset Management work on a buildings register with around 1 million components, I
have become interested in statistical methods for estimating the quality of the information. This
reminded me of a technique I discovered years ago in my work in software project management,
which I hope you will enjoy.
In 2003 I attended a session on software testing. A question arose about estimating the total defects in
a project before project completion. The answer given was that there was no way of doing this. But in
fact there are 2 ways. I want to mention one briefly and discuss the other in more detail.
A technique to estimate the total defects tells a project team at least 2 important things:
1. Their present progress in removing defects.
2. The remaining defects they still have to discover.
Here is the brief mention: the method of seeding. Seeding comes from biological investigations. A
specific number of tagged fish are released into a population, and investigators monitor the proportion
of tagged fish in subsequent catches. So if a catch of 1,000 fish contains 5 tagged fish, and if 100
tagged fish were released, the estimated total population is 100 / 5 * 1,000: a total of 20,000.
Seeding is used in software projects by deliberately introducing defects (and keeping a record of them!),
and then measuring what proportion of the introduced defects turn up among the defects actually found.
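The seeding arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_total` and its parameters are hypothetical, the software figures are invented for the example, and the software case rests on the assumption that genuine defects are found at the same rate as seeded ones.

```python
def estimate_total(marked_total: int, marked_in_sample: int, sample_size: int) -> float:
    """Scale a sample up by the fraction of marked items recovered.

    marked_total     -- tagged fish released (or defects deliberately seeded)
    marked_in_sample -- tagged fish in the catch (or seeded defects found)
    sample_size      -- size of the catch (or number of genuine defects found)
    """
    if marked_in_sample == 0:
        raise ValueError("no marked items recovered: the estimate is undefined")
    return marked_total / marked_in_sample * sample_size


# The fish example from the text: 100 released, 5 of them in a catch of 1,000.
fish = estimate_total(marked_total=100, marked_in_sample=5, sample_size=1000)
print(fish)  # 20000.0

# Hypothetical software numbers (not from the article): 20 seeded defects,
# 10 of them rediscovered, alongside 40 genuine defects found.
genuine = estimate_total(marked_total=20, marked_in_sample=10, sample_size=40)
print(genuine)  # 80.0
```

If no marked items are recovered, the ratio cannot be formed, so the sketch raises an error rather than returning a number.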
But I am not talking about seeding in detail here. Rather, my focus is a formula mentioned by Steve
McConnell in his Software Project Survival Guide (Microsoft Press 1998).
McConnell mentions a technique where 2 independent testers collect defects on a project. If Tester 1
finds A defects, and Tester 2 finds B defects, and if there are C common defects in their findings, then
the estimate of the total defects is T = A * B / C.
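As a minimal sketch, the formula can be applied to two testers' defect lists held as Python sets. The function name and the defect identifiers below are hypothetical, and returning infinity when there is no overlap is a choice made for this illustration.

```python
def estimate_total_defects(found_by_1: set, found_by_2: set) -> float:
    """Estimate total defects as T = A * B / C from two testers' findings."""
    a = len(found_by_1)                # A: defects found by Tester 1
    b = len(found_by_2)                # B: defects found by Tester 2
    c = len(found_by_1 & found_by_2)   # C: defects found by both
    if c == 0:
        # No common defects: the formula divides by zero.
        return float("inf")
    return a * b / c


# Hypothetical defect identifiers, for illustration only.
tester_1 = {"D1", "D2", "D3", "D4", "D5", "D6"}
tester_2 = {"D4", "D5", "D6", "D7", "D8", "D9"}

estimate = estimate_total_defects(tester_1, tester_2)
known = len(tester_1 | tester_2)       # distinct defects found so far
print(estimate)          # 12.0
print(estimate - known)  # 3.0 defects estimated still undiscovered
```

Subtracting the distinct defects already found from the estimate gives the team a rough figure for what remains to be discovered.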
McConnell mentions this formula without elaboration, so let’s investigate it in some detail.
If there are T defects in a project, and one of them is picked at random, the probability that it is
any particular defect is 1/T. The probability of a particular defect occurring in a randomly chosen set
of size A is therefore A/T. The probability of a particular defect occurring in a set of size B is B/T.
Now the assumption is that the probability of a particular defect occurring in both Set A and Set B is
(A/T) * (B/T). (We shall accept the assumption at this stage, and investigate it below.) We have thus
the probability of a point occurring in Set C: the set of common defects.
But we also have a second, direct expression for the probability of a point occurring in Set C, which
is simply C/T. By equating the 2 expressions and solving for T, we get McConnell’s formula.
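Written out step by step, using the same symbols A, B, C and T as above, and assuming C is not zero, the algebra is:

```latex
\[
\frac{A}{T}\cdot\frac{B}{T} = \frac{C}{T}
\;\Longrightarrow\;
\frac{AB}{T^{2}} = \frac{C}{T}
\;\Longrightarrow\;
AB = CT
\;\Longrightarrow\;
T = \frac{AB}{C}, \qquad C \neq 0 .
\]
```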
But how valid is this assumption: that the probability of a particular defect occurring in both Set A
and Set B is (A/T) * (B/T)? This calculation of probability applies strictly only to independent events
(like rolling a die twice in succession).
In testing there is the real (but remote) possibility that the 2 testers will each discover exactly half of
the defects, with none in common. In that case, McConnell’s formula will suggest infinite defects,
when in fact all are known.
Let’s take an example of a total of 12 defects, where Tester 1 discovers 6 defects and Tester 2
discovers 6. The number of unique sets that each can discover is a well-known mathematical
combination: (12!) / (6! * 6!). The result is 924.
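The count can be checked quickly in Python. This is a small sketch using the standard library's `math.comb` (available from Python 3.8); the variable names are illustrative only.

```python
import math

total_defects = 12
found_each = 6

# Number of distinct 6-defect subsets of 12 defects: 12! / (6! * 6!).
subsets = math.comb(total_defects, found_each)
print(subsets)  # 924

# The same value written out with factorials, as in the text.
by_factorials = math.factorial(12) // (math.factorial(6) * math.factorial(6))
print(by_factorials)  # 924
```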
The probability of Tester 1 discovering 6 defects which entirely exclude the 6 of Tester 2’s set is
1/924, or 0.11%, in which case they will conclude from the formula that there are infinite defects
(when in reality all are known).
On the other hand, the probability of their both discovering the same 6 is also 0.11%, in which case
they will conclude from the formula that all defects are known (when in reality only half are known).
McConnell’s formula gives the correct total only when there are exactly 3 common defects: T = 6 * 6 / 3 = 12.
The chart below shows the probabilities of the two 6-point samples having each possible number of
common points (0 to 6):
[Chart: probability of each number of common points]
The probability of having 0 or 6 points in common is so remote that it does not appear on the chart.
These are the 0.11% cases which lead to wildly wrong conclusions.
The probability of having 3 points in common (on which McConnell’s formula depends) is 43.29%.
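The probabilities quoted here can be reproduced with a short calculation. The sketch below assumes that each tester's findings are a uniformly random 6-defect subset of the 12, so the number of common defects follows a hypergeometric distribution. It is not necessarily the author's own derivation, and the names are hypothetical. For each possible overlap it also prints the total that the formula T = A * B / C would give.

```python
from math import comb

TOTAL = 12   # defects actually present
A = 6        # defects found by Tester 1
B = 6        # defects found by Tester 2


def overlap_probability(k: int) -> float:
    """P(exactly k common defects) when both testers sample uniformly at random."""
    # Choose k of Tester 1's defects and B - k of the defects Tester 1 missed.
    return comb(A, k) * comb(TOTAL - A, B - k) / comb(TOTAL, B)


distribution = {k: overlap_probability(k) for k in range(0, min(A, B) + 1)}

for k, p in distribution.items():
    # With no common defects the formula divides by zero, shown here as inf.
    estimate = float("inf") if k == 0 else A * B / k
    print(f"{k} common: probability {p:7.2%}, estimated total {estimate:g}")

# Combined probability of the three middle outcomes (2, 3 or 4 in common).
middle = sum(distribution[k] for k in (2, 3, 4))
print(f"2 to 4 common: {middle:.2%}")
```

Under this model the estimates for 2, 3 and 4 common defects are 18, 12 and 9, so the most likely outcomes bracket the true total of 12.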
The chart shows that the probabilities favour the formula’s being a reasonable estimate. The 3 middle
columns (2, 3 or 4 points in common) account for 91.99% of the probability, making extreme estimation
errors unlikely. [1]

[1] If you are interested in the maths, please email me at hein.aucamp@waiam.com.au, and I will send you the
detail. It involves combining groups of different length from partitions in the sample space.
