SlideShare a Scribd company logo
Toolsfor
Software Engineers
The Art of Testing Less Without
Sacrificing Code Quality
Kim Herzig£$, Michaela Greiler$, Jacek Czerwonka$, Brendan Murphy£
£Microsoft Research, Cambridge
$Microsoft Corporation, Tools for Software Engineers (TSE)
IMPROVING TESTING PROCESSES
Release cycles impact verification process
• Testing becomes bottleneck for development.
• How much testing is enough?
• How reliable and effective are tests?
• When should we run a test?
ENGINEERING PROCESS
Engineers desktop Integration process
SYSTEM AND INTEGRATION TESTING
Quality gates
• Developer have to pass quality gates (no control over test selection)
• Checking system constraints: e.g. compatibility or performance
• Failures not isolated
 involve human inspections
 causes development freeze for corresponding branch
SYSTEM AND INTEGRATION TESTING
Software testing is expensive
• 10k+ gates executed, 1M+ test cases
• Different branches, architectures, languages, …
• Aims to find code issues as early as possible
• Slows down product development
RESEARCH OBJECTIVE
Only run effective and reliable tests
• Not every tests performs equally well, depends on code base
• Reduce execution frequency of tests that cause false test alarms
(failures due to test and infrastructure issues)
Do not sacrifice code quality
• Run every test at least once on every code change
• Eventually find all code defects, taking risk of finding defects later ok.
Running less tests increases code velocity
• We cannot run all tests on all code changes anymore.
• Identify tests that are more likely to find defects (not coverage).
Effectiveness
Reliability
High cost,
unknown value
$$$$
High cost,
low value
$$$$
Low cost,
good value
$$
Low cost,
low value
$
high
lowhigh
low
HISTORIC TEST FAILURE PROBABILITIES
Analyzing past test runs: failure probabilities
• How often did the test fail and detected a code defect? (𝑃𝑇𝑃)
• How often did the test report a false test alarm? (𝑃𝐹𝑃)
time
Quality
Gate
Build Build
?
Build
Execution history
These probabilities depend on the execution context!
COST MODEL
What is the cost of running a test?
• Treat test executions as investment: what is our return of investment
Check before running a test (considering the context)
𝐶𝑜𝑠𝑡 𝐸𝑥𝑒𝑐𝑢𝑡𝑖𝑜𝑛 > 𝐶𝑜𝑠𝑡 𝑆𝑘𝑖𝑝 ? suspend ∶ execute test
𝐶𝑜𝑠𝑡 𝐸𝑥𝑒𝑐𝑢𝑡𝑖𝑜𝑛 = 𝐶𝑜𝑠𝑡 𝑀𝑎𝑐ℎ𝑖𝑛𝑒/𝑇𝑖𝑚𝑒 ∗ 𝑇𝑖𝑚𝑒 𝐸𝑥𝑒𝑐𝑢𝑡𝑖𝑜𝑛 + "Cost of potential false alarm"
= 𝐶𝑜𝑠𝑡 𝑀𝑎𝑐ℎ𝑖𝑛𝑒/𝑇𝑖𝑚𝑒 ∗ 𝑇𝑖𝑚𝑒 𝐸𝑥𝑒𝑐𝑢𝑡𝑖𝑜𝑛 + (𝑃𝐹𝑃 ∗ 𝐶𝑜𝑠𝑡 𝐷𝑒𝑣𝑒𝑙𝑜𝑝𝑒𝑟/𝑇𝑖𝑚𝑒 ∗𝑇𝑖𝑚𝑒 𝑇𝑟𝑖𝑎𝑔𝑒 )
𝐶𝑜𝑠𝑡 𝑆𝑘𝑖𝑝 = "Potential cost of elapsing a bug to next higher branch level"
= 𝑃 𝑇𝑃 ∗ 𝐶𝑜𝑠𝑡 𝐷𝑒𝑣𝑒𝑙𝑜𝑝𝑒𝑟/𝑇𝑖𝑚𝑒 ∗ 𝑇𝑖𝑚𝑒 𝐹𝑟𝑒𝑒𝑧𝑒 𝑏𝑟𝑎𝑛𝑐ℎ ∗ #𝐷𝑒𝑣𝑒𝑙𝑜𝑝𝑒𝑟𝑠 𝐵𝑟𝑎𝑛𝑐ℎ
DOES IT PAY OFF?
Less test executions
reduce cost
Taking risk
increases cost
~11 month period
> 30 million test execs
multiple branches
~3 month period
> 1.2 million test execs
single branch
~12 month period
> 6.5 million test execs
multiple branches
Windows
Results
Simulated on
Windows 8.1
development period
(BVT only)
Tools for
Software Engineers
ACROSS ALL PRODUCTS
TABLE I. SIMULATION RESULTS FOR MICROSOFT WINDOWS, OFFICE, AND DYNAMICS.
Windows Office Dynamics
Measurement Rel. improvement Cost
improvement
Rel. improvement Cost
improvement
Rel. improvement Cost
improvement
Test executions 40.58% -- 34.9% -- 50.36% --
Test time 40.31% $1,567,607.76 40.1% $76,509.24 47.45% $19,979.03
Test result inspection 33.04% $61,532.80 21.1% $104,880.00 32.53% $2,337,926.40
Escaped defects 0.20% $11,970.56 8.7% $75,326.40 13.40% $310,159.42
Total cost balance $1,617,170.00 $106,063.24 $2,047,746.01
Results vary
• Branching structure
• Runtime of tests
• We save cost on all products
Fine-tuning possible, better results but not general
DYNAMIC & SELF-ADAPTIVE
Probabilities are dynamic (change over time)
• Skipping tests influences risk factors (of higher level branches)
• Tests re-enabled when code quality drops
• Feedback-loop between decision points
0%
10%
20%
30%
40%
50%
60%
70%
relativetestreductionrate
Time (Windows 8.1)
Training period
automatically enable tests again
IMPACT ON DEVELOPMENT PROCESS
Secondary Improvements
• Machine Setup
We may lower the number of machines allocated to testing process
• Developer satisfaction
Removing false test failures increases confidence in testing process
Development speed
• Impact on development speed hard to estimate through simulation
• Product teams invest as they believe that removing tests:
 Increases code velocity (at least lower bound)
 Avoids additional changes due to merge conflicts
 Reduces the number of required integration branches as their main purpose is to test product
“We used the data your team has provided to cut a bunch of bad content and are running a much leaner BVT system […]
we’re panning out to scale about 4x and run in well under 2 hours” (Jason Means, Windows BVT PM)
©2013 Microsoft Corporation. All rights reserved.
Toolsfor
Software Engineers

More Related Content

What's hot

TDD in the ABAP world - sitNL 2013 edition
TDD in the ABAP world - sitNL 2013 editionTDD in the ABAP world - sitNL 2013 edition
TDD in the ABAP world - sitNL 2013 edition
Hendrik Neumann
 
Mining Performance Regression Testing Repositories for Automated Performance ...
Mining Performance Regression Testing Repositories for Automated Performance ...Mining Performance Regression Testing Repositories for Automated Performance ...
Mining Performance Regression Testing Repositories for Automated Performance ...
SAIL_QU
 
Regression and performance testing
Regression and performance testingRegression and performance testing
Regression and performance testing
Himanshu
 
03 test specification and execution
03   test specification and execution03   test specification and execution
03 test specification and execution
Clemens Reijnen
 
Understanding the Rationale for Updating a Function's Comment
Understanding the Rationale for Updating a Function's CommentUnderstanding the Rationale for Updating a Function's Comment
Understanding the Rationale for Updating a Function's Comment
SAIL_QU
 
Case Study: Automated Code Reviews In A Grown SAP Application Landscape At EW...
Case Study: Automated Code Reviews In A Grown SAP Application Landscape At EW...Case Study: Automated Code Reviews In A Grown SAP Application Landscape At EW...
Case Study: Automated Code Reviews In A Grown SAP Application Landscape At EW...
Virtual Forge
 
Embedded Testing 2015
Embedded Testing 2015Embedded Testing 2015
Embedded Testing 2015
Vassilis Rizopoulos
 
Simple Railroad Command Protocol
Simple Railroad Command ProtocolSimple Railroad Command Protocol
Simple Railroad Command Protocol
Ankit Singh
 
ABAP Code Retreat Frankfurt 2016: TDD - Test Driven Development
ABAP Code Retreat Frankfurt 2016: TDD - Test Driven DevelopmentABAP Code Retreat Frankfurt 2016: TDD - Test Driven Development
ABAP Code Retreat Frankfurt 2016: TDD - Test Driven Development
Hendrik Neumann
 
H testing and debugging
H testing and debuggingH testing and debugging
H testing and debugging
missstevenson01
 
Software Testing & Debugging
Software Testing & DebuggingSoftware Testing & Debugging
Software Testing & Debugging
Computing Cage
 
Bert Zuurke - A Lean And Mean Approach To Model-Based Testing - EuroSTAR 2010
Bert Zuurke - A Lean And Mean Approach To Model-Based Testing - EuroSTAR 2010Bert Zuurke - A Lean And Mean Approach To Model-Based Testing - EuroSTAR 2010
Bert Zuurke - A Lean And Mean Approach To Model-Based Testing - EuroSTAR 2010
TEST Huddle
 
Software testing
Software testingSoftware testing
Software testing
nidhip216
 
Practical unit testing in c & c++
Practical unit testing in c & c++Practical unit testing in c & c++
Practical unit testing in c & c++
Matt Hargett
 
An Insight into the Black Box and White Box Software Testing
An Insight into the Black Box and White Box Software Testing An Insight into the Black Box and White Box Software Testing
An Insight into the Black Box and White Box Software Testing
BugRaptors
 
Improving the accuracy and reliability of data analysis code
Improving the accuracy and reliability of data analysis codeImproving the accuracy and reliability of data analysis code
Improving the accuracy and reliability of data analysis code
Johan Carlin
 
Software Testing Tecniques
Software Testing TecniquesSoftware Testing Tecniques
Software Testing Tecniques
ersanbilik
 
All you need to know about regression testing | David Tzemach
All you need to know about regression testing | David TzemachAll you need to know about regression testing | David Tzemach
All you need to know about regression testing | David Tzemach
David Tzemach
 
Software Testing Strategies
Software Testing StrategiesSoftware Testing Strategies
Software Testing Strategies
Alpana Bhaskar
 
Debugging
DebuggingDebugging

What's hot (20)

TDD in the ABAP world - sitNL 2013 edition
TDD in the ABAP world - sitNL 2013 editionTDD in the ABAP world - sitNL 2013 edition
TDD in the ABAP world - sitNL 2013 edition
 
Mining Performance Regression Testing Repositories for Automated Performance ...
Mining Performance Regression Testing Repositories for Automated Performance ...Mining Performance Regression Testing Repositories for Automated Performance ...
Mining Performance Regression Testing Repositories for Automated Performance ...
 
Regression and performance testing
Regression and performance testingRegression and performance testing
Regression and performance testing
 
03 test specification and execution
03   test specification and execution03   test specification and execution
03 test specification and execution
 
Understanding the Rationale for Updating a Function's Comment
Understanding the Rationale for Updating a Function's CommentUnderstanding the Rationale for Updating a Function's Comment
Understanding the Rationale for Updating a Function's Comment
 
Case Study: Automated Code Reviews In A Grown SAP Application Landscape At EW...
Case Study: Automated Code Reviews In A Grown SAP Application Landscape At EW...Case Study: Automated Code Reviews In A Grown SAP Application Landscape At EW...
Case Study: Automated Code Reviews In A Grown SAP Application Landscape At EW...
 
Embedded Testing 2015
Embedded Testing 2015Embedded Testing 2015
Embedded Testing 2015
 
Simple Railroad Command Protocol
Simple Railroad Command ProtocolSimple Railroad Command Protocol
Simple Railroad Command Protocol
 
ABAP Code Retreat Frankfurt 2016: TDD - Test Driven Development
ABAP Code Retreat Frankfurt 2016: TDD - Test Driven DevelopmentABAP Code Retreat Frankfurt 2016: TDD - Test Driven Development
ABAP Code Retreat Frankfurt 2016: TDD - Test Driven Development
 
H testing and debugging
H testing and debuggingH testing and debugging
H testing and debugging
 
Software Testing & Debugging
Software Testing & DebuggingSoftware Testing & Debugging
Software Testing & Debugging
 
Bert Zuurke - A Lean And Mean Approach To Model-Based Testing - EuroSTAR 2010
Bert Zuurke - A Lean And Mean Approach To Model-Based Testing - EuroSTAR 2010Bert Zuurke - A Lean And Mean Approach To Model-Based Testing - EuroSTAR 2010
Bert Zuurke - A Lean And Mean Approach To Model-Based Testing - EuroSTAR 2010
 
Software testing
Software testingSoftware testing
Software testing
 
Practical unit testing in c & c++
Practical unit testing in c & c++Practical unit testing in c & c++
Practical unit testing in c & c++
 
An Insight into the Black Box and White Box Software Testing
An Insight into the Black Box and White Box Software Testing An Insight into the Black Box and White Box Software Testing
An Insight into the Black Box and White Box Software Testing
 
Improving the accuracy and reliability of data analysis code
Improving the accuracy and reliability of data analysis codeImproving the accuracy and reliability of data analysis code
Improving the accuracy and reliability of data analysis code
 
Software Testing Tecniques
Software Testing TecniquesSoftware Testing Tecniques
Software Testing Tecniques
 
All you need to know about regression testing | David Tzemach
All you need to know about regression testing | David TzemachAll you need to know about regression testing | David Tzemach
All you need to know about regression testing | David Tzemach
 
Software Testing Strategies
Software Testing StrategiesSoftware Testing Strategies
Software Testing Strategies
 
Debugging
DebuggingDebugging
Debugging
 

Similar to The Art of Testing Less without Sacrificing Quality @ ICSE 2015

Testing As A Bottleneck - How Testing Slows Down Modern Development Processes...
Testing As A Bottleneck - How Testing Slows Down Modern Development Processes...Testing As A Bottleneck - How Testing Slows Down Modern Development Processes...
Testing As A Bottleneck - How Testing Slows Down Modern Development Processes...
TEST Huddle
 
Keynote AST 2016
Keynote AST 2016Keynote AST 2016
Keynote AST 2016
Kim Herzig
 
Istqb foundation level day 1
Istqb foundation level   day 1Istqb foundation level   day 1
Istqb foundation level day 1
Shuchi Singla AKT,SPC4,PMI-ACP,ITIL(F),CP-AAT
 
Can we induce change with what we measure?
Can we induce change with what we measure?Can we induce change with what we measure?
Can we induce change with what we measure?
Michaela Greiler
 
Software Testing- Principles of testing- Mazenet Solution
Software Testing- Principles of testing- Mazenet SolutionSoftware Testing- Principles of testing- Mazenet Solution
Software Testing- Principles of testing- Mazenet Solution
Mazenetsolution
 
Soft quality & standards
Soft quality & standardsSoft quality & standards
Soft quality & standards
Prince Bhanwra
 
Soft quality & standards
Soft quality & standardsSoft quality & standards
Soft quality & standards
Prince Bhanwra
 
Making the Unstable Stable - An Intro To Testing
Making the Unstable Stable - An Intro To TestingMaking the Unstable Stable - An Intro To Testing
Making the Unstable Stable - An Intro To Testing
Cameron Presley
 
ISTQB - CTFL Summary v1.0
ISTQB - CTFL Summary v1.0ISTQB - CTFL Summary v1.0
ISTQB - CTFL Summary v1.0
Samer Desouky
 
A Software Testing Intro
A Software Testing IntroA Software Testing Intro
A Software Testing Intro
Evozon Test Lab
 
Database Unit Testing Made Easy with VSTS
Database Unit Testing Made Easy with VSTSDatabase Unit Testing Made Easy with VSTS
Database Unit Testing Made Easy with VSTS
Sanil Mhatre
 
Vlsi testing
Vlsi testingVlsi testing
Vlsi testing
Dilip Mathuria
 
Lightning Talks by Globant - Automation (This app runs by itself )
Lightning Talks by Globant -  Automation (This app runs by itself ) Lightning Talks by Globant -  Automation (This app runs by itself )
Lightning Talks by Globant - Automation (This app runs by itself )
Globant
 
Small is Beautiful- Fully Automate your Test Case Design
Small is Beautiful- Fully Automate your Test Case DesignSmall is Beautiful- Fully Automate your Test Case Design
Small is Beautiful- Fully Automate your Test Case Design
Georgina Tilby
 
Why Automated Testing Matters To DevOps
Why Automated Testing Matters To DevOpsWhy Automated Testing Matters To DevOps
Why Automated Testing Matters To DevOps
dpaulmerrill
 
Alexander Podelko - Context-Driven Performance Testing
Alexander Podelko - Context-Driven Performance TestingAlexander Podelko - Context-Driven Performance Testing
Alexander Podelko - Context-Driven Performance Testing
Neotys_Partner
 
Test Driven Development
Test Driven DevelopmentTest Driven Development
Test Driven Development
Sergey Aganezov
 
First steps in testing analytics: Does test code quality matter?
First steps in testing analytics: Does test code quality matter?First steps in testing analytics: Does test code quality matter?
First steps in testing analytics: Does test code quality matter?
Andy Zaidman
 
testing
testingtesting
testing
Rashmi Deoli
 
Testing- Fundamentals of Testing-Mazenet solution
Testing- Fundamentals of Testing-Mazenet solutionTesting- Fundamentals of Testing-Mazenet solution
Testing- Fundamentals of Testing-Mazenet solution
Mazenetsolution
 

Similar to The Art of Testing Less without Sacrificing Quality @ ICSE 2015 (20)

Testing As A Bottleneck - How Testing Slows Down Modern Development Processes...
Testing As A Bottleneck - How Testing Slows Down Modern Development Processes...Testing As A Bottleneck - How Testing Slows Down Modern Development Processes...
Testing As A Bottleneck - How Testing Slows Down Modern Development Processes...
 
Keynote AST 2016
Keynote AST 2016Keynote AST 2016
Keynote AST 2016
 
Istqb foundation level day 1
Istqb foundation level   day 1Istqb foundation level   day 1
Istqb foundation level day 1
 
Can we induce change with what we measure?
Can we induce change with what we measure?Can we induce change with what we measure?
Can we induce change with what we measure?
 
Software Testing- Principles of testing- Mazenet Solution
Software Testing- Principles of testing- Mazenet SolutionSoftware Testing- Principles of testing- Mazenet Solution
Software Testing- Principles of testing- Mazenet Solution
 
Soft quality & standards
Soft quality & standardsSoft quality & standards
Soft quality & standards
 
Soft quality & standards
Soft quality & standardsSoft quality & standards
Soft quality & standards
 
Making the Unstable Stable - An Intro To Testing
Making the Unstable Stable - An Intro To TestingMaking the Unstable Stable - An Intro To Testing
Making the Unstable Stable - An Intro To Testing
 
ISTQB - CTFL Summary v1.0
ISTQB - CTFL Summary v1.0ISTQB - CTFL Summary v1.0
ISTQB - CTFL Summary v1.0
 
A Software Testing Intro
A Software Testing IntroA Software Testing Intro
A Software Testing Intro
 
Database Unit Testing Made Easy with VSTS
Database Unit Testing Made Easy with VSTSDatabase Unit Testing Made Easy with VSTS
Database Unit Testing Made Easy with VSTS
 
Vlsi testing
Vlsi testingVlsi testing
Vlsi testing
 
Lightning Talks by Globant - Automation (This app runs by itself )
Lightning Talks by Globant -  Automation (This app runs by itself ) Lightning Talks by Globant -  Automation (This app runs by itself )
Lightning Talks by Globant - Automation (This app runs by itself )
 
Small is Beautiful- Fully Automate your Test Case Design
Small is Beautiful- Fully Automate your Test Case DesignSmall is Beautiful- Fully Automate your Test Case Design
Small is Beautiful- Fully Automate your Test Case Design
 
Why Automated Testing Matters To DevOps
Why Automated Testing Matters To DevOpsWhy Automated Testing Matters To DevOps
Why Automated Testing Matters To DevOps
 
Alexander Podelko - Context-Driven Performance Testing
Alexander Podelko - Context-Driven Performance TestingAlexander Podelko - Context-Driven Performance Testing
Alexander Podelko - Context-Driven Performance Testing
 
Test Driven Development
Test Driven DevelopmentTest Driven Development
Test Driven Development
 
First steps in testing analytics: Does test code quality matter?
First steps in testing analytics: Does test code quality matter?First steps in testing analytics: Does test code quality matter?
First steps in testing analytics: Does test code quality matter?
 
testing
testingtesting
testing
 
Testing- Fundamentals of Testing-Mazenet solution
Testing- Fundamentals of Testing-Mazenet solutionTesting- Fundamentals of Testing-Mazenet solution
Testing- Fundamentals of Testing-Mazenet solution
 

More from Kim Herzig

Code Ownership and Software Quality: A Replication Study @ MSR 2015
Code Ownership and Software Quality: A Replication Study @ MSR 2015Code Ownership and Software Quality: A Replication Study @ MSR 2015
Code Ownership and Software Quality: A Replication Study @ MSR 2015
Kim Herzig
 
Predicting Defects Using Change Genealogies (ISSE 2013)
Predicting Defects Using Change Genealogies (ISSE 2013)Predicting Defects Using Change Genealogies (ISSE 2013)
Predicting Defects Using Change Genealogies (ISSE 2013)
Kim Herzig
 
Mining and Untangling Change Genealogies (PhD Defense Talk)
Mining and Untangling Change Genealogies (PhD Defense Talk)Mining and Untangling Change Genealogies (PhD Defense Talk)
Mining and Untangling Change Genealogies (PhD Defense Talk)
Kim Herzig
 
The Impact of Tangled Code Changes
The Impact of Tangled Code ChangesThe Impact of Tangled Code Changes
The Impact of Tangled Code Changes
Kim Herzig
 
Mining Cause Effect Chains from Version Archives - ISSRE 2011
Mining Cause Effect Chains from Version Archives - ISSRE 2011Mining Cause Effect Chains from Version Archives - ISSRE 2011
Mining Cause Effect Chains from Version Archives - ISSRE 2011
Kim Herzig
 
Network vs. Code Metrics to Predict Defects: A Replication Study
Network vs. Code Metrics  to Predict Defects: A Replication StudyNetwork vs. Code Metrics  to Predict Defects: A Replication Study
Network vs. Code Metrics to Predict Defects: A Replication Study
Kim Herzig
 
Capturing the Long Term Impact of Changes
Capturing the Long Term Impact of ChangesCapturing the Long Term Impact of Changes
Capturing the Long Term Impact of Changes
Kim Herzig
 
Software Engineering Course 2009 - Mining Software Archives
Software Engineering Course 2009 - Mining Software ArchivesSoftware Engineering Course 2009 - Mining Software Archives
Software Engineering Course 2009 - Mining Software Archives
Kim Herzig
 

More from Kim Herzig (8)

Code Ownership and Software Quality: A Replication Study @ MSR 2015
Code Ownership and Software Quality: A Replication Study @ MSR 2015Code Ownership and Software Quality: A Replication Study @ MSR 2015
Code Ownership and Software Quality: A Replication Study @ MSR 2015
 
Predicting Defects Using Change Genealogies (ISSE 2013)
Predicting Defects Using Change Genealogies (ISSE 2013)Predicting Defects Using Change Genealogies (ISSE 2013)
Predicting Defects Using Change Genealogies (ISSE 2013)
 
Mining and Untangling Change Genealogies (PhD Defense Talk)
Mining and Untangling Change Genealogies (PhD Defense Talk)Mining and Untangling Change Genealogies (PhD Defense Talk)
Mining and Untangling Change Genealogies (PhD Defense Talk)
 
The Impact of Tangled Code Changes
The Impact of Tangled Code ChangesThe Impact of Tangled Code Changes
The Impact of Tangled Code Changes
 
Mining Cause Effect Chains from Version Archives - ISSRE 2011
Mining Cause Effect Chains from Version Archives - ISSRE 2011Mining Cause Effect Chains from Version Archives - ISSRE 2011
Mining Cause Effect Chains from Version Archives - ISSRE 2011
 
Network vs. Code Metrics to Predict Defects: A Replication Study
Network vs. Code Metrics  to Predict Defects: A Replication StudyNetwork vs. Code Metrics  to Predict Defects: A Replication Study
Network vs. Code Metrics to Predict Defects: A Replication Study
 
Capturing the Long Term Impact of Changes
Capturing the Long Term Impact of ChangesCapturing the Long Term Impact of Changes
Capturing the Long Term Impact of Changes
 
Software Engineering Course 2009 - Mining Software Archives
Software Engineering Course 2009 - Mining Software ArchivesSoftware Engineering Course 2009 - Mining Software Archives
Software Engineering Course 2009 - Mining Software Archives
 

Recently uploaded

AJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR NIET GreNo Guava Project File.pdfAJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
RDhivya6
 
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptxLEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
yourprojectpartner05
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
hozt8xgk
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
International Food Policy Research Institute- South Asia Office
 
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills MN
 
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Sérgio Sacani
 
Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)
Sciences of Europe
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
Vandana Devesh Sharma
 
Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...
Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...
Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...
frank0071
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
PirithiRaju
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
Areesha Ahmad
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
Scintica Instrumentation
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
University of Maribor
 
Farming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptxFarming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptx
Frédéric Baudron
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
Sérgio Sacani
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
Advanced-Concepts-Team
 
Summary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdfSummary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdf
vadgavevedant86
 
Methods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdfMethods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdf
PirithiRaju
 
Discovery of An Apparent Red, High-Velocity Type Ia Supernova at 𝐳 = 2.9 wi...
Discovery of An Apparent Red, High-Velocity Type Ia Supernova at  𝐳 = 2.9  wi...Discovery of An Apparent Red, High-Velocity Type Ia Supernova at  𝐳 = 2.9  wi...
Discovery of An Apparent Red, High-Velocity Type Ia Supernova at 𝐳 = 2.9 wi...
Sérgio Sacani
 

Recently uploaded (20)

AJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR NIET GreNo Guava Project File.pdfAJAY KUMAR NIET GreNo Guava Project File.pdf
AJAY KUMAR NIET GreNo Guava Project File.pdf
 
23PH301 - Optics - Optical Lenses.pptx
23PH301 - Optics  -  Optical Lenses.pptx23PH301 - Optics  -  Optical Lenses.pptx
23PH301 - Optics - Optical Lenses.pptx
 
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptxLEARNING TO LIVE WITH LAWS OF MOTION .pptx
LEARNING TO LIVE WITH LAWS OF MOTION .pptx
 
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
快速办理(UAM毕业证书)马德里自治大学毕业证学位证一模一样
 
Direct Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart AgricultureDirect Seeded Rice - Climate Smart Agriculture
Direct Seeded Rice - Climate Smart Agriculture
 
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ...
 
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
Evidence of Jet Activity from the Secondary Black Hole in the OJ 287 Binary S...
 
Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)Sciences of Europe journal No 142 (2024)
Sciences of Europe journal No 142 (2024)
 
Compexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titrationCompexometric titration/Chelatorphy titration/chelating titration
Compexometric titration/Chelatorphy titration/chelating titration
 
Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...
Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...
Juaristi, Jon. - El canon espanol. El legado de la cultura española a la civi...
 
11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf11.1 Role of physical biological in deterioration of grains.pdf
11.1 Role of physical biological in deterioration of grains.pdf
 
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of ProteinsGBSN - Biochemistry (Unit 6) Chemistry of Proteins
GBSN - Biochemistry (Unit 6) Chemistry of Proteins
 
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu...
 
Randomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNERandomised Optimisation Algorithms in DAPHNE
Randomised Optimisation Algorithms in DAPHNE
 
Farming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptxFarming systems analysis: what have we learnt?.pptx
Farming systems analysis: what have we learnt?.pptx
 
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...
 
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
ESA/ACT Science Coffee: Diego Blas - Gravitational wave detection with orbita...
 
Summary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdfSummary Of transcription and Translation.pdf
Summary Of transcription and Translation.pdf
 
Methods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdfMethods of grain storage Structures in India.pdf
Methods of grain storage Structures in India.pdf
 
Discovery of An Apparent Red, High-Velocity Type Ia Supernova at 𝐳 = 2.9 wi...
Discovery of An Apparent Red, High-Velocity Type Ia Supernova at  𝐳 = 2.9  wi...Discovery of An Apparent Red, High-Velocity Type Ia Supernova at  𝐳 = 2.9  wi...
Discovery of An Apparent Red, High-Velocity Type Ia Supernova at 𝐳 = 2.9 wi...
 

The Art of Testing Less without Sacrificing Quality @ ICSE 2015

  • 1. Toolsfor Software Engineers The Art of Testing Less Without Sacrificing Code Quality Kim Herzig£$, Michaela Greiler$, Jacek Czerwonka$, Brendan Murphy£ £Microsoft Research, Cambridge $Microsoft Corporation, Tools for Software Engineers (TSE)
  • 2. IMPROVING TESTING PROCESSES Release cycles impact verification process • Testing becomes bottleneck for development. • How much testing is enough? • How reliable and effective are tests? • When should we run a test?
  • 4. SYSTEM AND INTEGRATION TESTING Quality gates • Developer have to pass quality gates (no control over test selection) • Checking system constraints: e.g. compatibility or performance • Failures not isolated  involve human inspections  causes development freeze for corresponding branch
  • 5. SYSTEM AND INTEGRATION TESTING Software testing is expensive • 10k+ gates executed, 1M+ test cases • Different branches, architectures, languages, … • Aims to find code issues as early as possible • Slows down product development
  • 6. RESEARCH OBJECTIVE Only run effective and reliable tests • Not every tests performs equally well, depends on code base • Reduce execution frequency of tests that cause false test alarms (failures due to test and infrastructure issues) Do not sacrifice code quality • Run every test at least once on every code change • Eventually find all code defects, taking risk of finding defects later ok. Running less tests increases code velocity • We cannot run all tests on all code changes anymore. • Identify tests that are more likely to find defects (not coverage).
  • 7. Effectiveness Reliability High cost, unknown value $$$$ High cost, low value $$$$ Low cost, good value $$ Low cost, low value $ high lowhigh low HISTORIC TEST FAILURE PROBABILITIES Analyzing past test runs: failure probabilities • How often did the test fail and detected a code defect? (𝑃𝑇𝑃) • How often did the test report a false test alarm? (𝑃𝐹𝑃) time Quality Gate Build Build ? Build Execution history These probabilities depend on the execution context!
  • 8. COST MODEL What is the cost of running a test? • Treat test executions as investment: what is our return of investment Check before running a test (considering the context) 𝐶𝑜𝑠𝑡 𝐸𝑥𝑒𝑐𝑢𝑡𝑖𝑜𝑛 > 𝐶𝑜𝑠𝑡 𝑆𝑘𝑖𝑝 ? suspend ∶ execute test 𝐶𝑜𝑠𝑡 𝐸𝑥𝑒𝑐𝑢𝑡𝑖𝑜𝑛 = 𝐶𝑜𝑠𝑡 𝑀𝑎𝑐ℎ𝑖𝑛𝑒/𝑇𝑖𝑚𝑒 ∗ 𝑇𝑖𝑚𝑒 𝐸𝑥𝑒𝑐𝑢𝑡𝑖𝑜𝑛 + "Cost of potential false alarm" = 𝐶𝑜𝑠𝑡 𝑀𝑎𝑐ℎ𝑖𝑛𝑒/𝑇𝑖𝑚𝑒 ∗ 𝑇𝑖𝑚𝑒 𝐸𝑥𝑒𝑐𝑢𝑡𝑖𝑜𝑛 + (𝑃𝐹𝑃 ∗ 𝐶𝑜𝑠𝑡 𝐷𝑒𝑣𝑒𝑙𝑜𝑝𝑒𝑟/𝑇𝑖𝑚𝑒 ∗𝑇𝑖𝑚𝑒 𝑇𝑟𝑖𝑎𝑔𝑒 ) 𝐶𝑜𝑠𝑡 𝑆𝑘𝑖𝑝 = "Potential cost of elapsing a bug to next higher branch level" = 𝑃 𝑇𝑃 ∗ 𝐶𝑜𝑠𝑡 𝐷𝑒𝑣𝑒𝑙𝑜𝑝𝑒𝑟/𝑇𝑖𝑚𝑒 ∗ 𝑇𝑖𝑚𝑒 𝐹𝑟𝑒𝑒𝑧𝑒 𝑏𝑟𝑎𝑛𝑐ℎ ∗ #𝐷𝑒𝑣𝑒𝑙𝑜𝑝𝑒𝑟𝑠 𝐵𝑟𝑎𝑛𝑐ℎ
  • 9. DOES IT PAY OFF? Less test executions reduce cost Taking risk increases cost ~11 month period > 30 million test execs multiple branches ~3 month period > 1.2 million test execs single branch ~12 month period > 6.5 million test execs multiple branches
  • 10. Windows Results Simulated on Windows 8.1 development period (BVT only) Tools for Software Engineers
  • 11. ACROSS ALL PRODUCTS TABLE I. SIMULATION RESULTS FOR MICROSOFT WINDOWS, OFFICE, AND DYNAMICS. Windows Office Dynamics Measurement Rel. improvement Cost improvement Rel. improvement Cost improvement Rel. improvement Cost improvement Test executions 40.58% -- 34.9% -- 50.36% -- Test time 40.31% $1,567,607.76 40.1% $76,509.24 47.45% $19,979.03 Test result inspection 33.04% $61,532.80 21.1% $104,880.00 32.53% $2,337,926.40 Escaped defects 0.20% $11,970.56 8.7% $75,326.40 13.40% $310,159.42 Total cost balance $1,617,170.00 $106,063.24 $2,047,746.01 Results vary • Branching structure • Runtime of tests • We save cost on all products Fine-tuning possible, better results but not general
  • 12. DYNAMIC & SELF-ADAPTIVE Probabilities are dynamic (change over time) • Skipping tests influences risk factors (of higher level branches) • Tests re-enabled when code quality drops • Feedback-loop between decision points 0% 10% 20% 30% 40% 50% 60% 70% relativetestreductionrate Time (Windows 8.1) Training period automatically enable tests again
  • 13. IMPACT ON DEVELOPMENT PROCESS Secondary Improvements • Machine Setup We may lower the number of machines allocated to testing process • Developer satisfaction Removing false test failures increases confidence in testing process Development speed • Impact on development speed hard to estimate through simulation • Product teams invest as they believe that removing tests:  Increases code velocity (at least lower bound)  Avoids additional changes due to merge conflicts  Reduces the number of required integration branches as their main purpose is to test product “We used the data your team has provided to cut a bunch of bad content and are running a much leaner BVT system […] we’re panning out to scale about 4x and run in well under 2 hours” (Jason Means, Windows BVT PM)
  • 14. ©2013 Microsoft Corporation. All rights reserved. Toolsfor Software Engineers

Editor's Notes

  1. * Shorter release cycles simply restrict the number of tests that we can execute: less time for verification! * At the same time systems get more complex.
  2. These tests are predefines: no control: unit testing you can choose your set Many tests do not check functionality but constraints: these tests act as security policy—managing risk of defects elapsing
  3. The system is designed for maximal protection But that is not possible anymore: we simply cannot run all tests on all code changes anymore Which one to run? Coverage: no!
  4. GOLDEN RULE! The distinct number of tests remains the same: we only reduce the execution frequency of “bad” tests
  5. NO EXTRA DATA COLLECTED! No runtime overhead Data exists in every (!) test execution database (at least it should) 
  6. FOR EACH CODE CHANGE WE RUN EVERY TEST AT LEAST ONCE BEFORE TRUNK INTEGRATION: This check is only performed when we know that the test will be executed in a later stage, e.g. branch closer to root node of tree
  7. Show relative test reduction rate Show relative test execution runtime Elapsed defects Cost balance