The Art of Testing Less without Sacrificing Quality @ ICSE 2015

Tools for Software Engineers
The Art of Testing Less Without Sacrificing Code Quality
Kim Herzig£$, Michaela Greiler$, Jacek Czerwonka$, Brendan Murphy£
£Microsoft Research, Cambridge
$Microsoft Corporation, Tools for Software Engineers (TSE)

IMPROVING TESTING PROCESSES
Release cycles impact the verification process
• Testing becomes a bottleneck for development.
• How much testing is enough?
• How reliable and effective are tests?
• When should we run a test?

ENGINEERING PROCESS
[Figure: engineer's desktop and integration process]

SYSTEM AND INTEGRATION TESTING
Quality gates
• Developers have to pass quality gates (no control over test selection)
• Checking system constraints, e.g. compatibility or performance
• Failures are not isolated:
  - they involve human inspection
  - they cause a development freeze for the corresponding branch

SYSTEM AND INTEGRATION TESTING
Software testing is expensive
• 10k+ gates executed, 1M+ test cases
• Different branches, architectures, languages, …
• Aims to find code issues as early as possible
• Slows down product development

RESEARCH OBJECTIVE
Only run effective and reliable tests
• Not every test performs equally well; effectiveness depends on the code base
• Reduce execution frequency of tests that cause false test alarms
(failures due to test and infrastructure issues)
Do not sacrifice code quality
• Run every test at least once on every code change
• Eventually find all code defects; accepting the risk of finding some defects later is OK.
Running fewer tests increases code velocity
• We cannot run all tests on all code changes anymore.
• Identify tests that are more likely to find defects (not coverage).

[Figure: quadrant chart with axes Effectiveness and Reliability (each from low to high). Quadrant labels: "High cost, unknown value" ($$$$); "High cost, low value" ($$$$); "Low cost, good value" ($$); "Low cost, low value" ($).]

HISTORIC TEST FAILURE PROBABILITIES
Analyzing past test runs: failure probabilities
• How often did the test fail and detect a code defect? (P_TP)
• How often did the test report a false test alarm? (P_FP)
[Figure: timeline of builds passing through a quality gate; the execution history of earlier builds informs the decision ("?") for the next build.]
These probabilities depend on the execution context!
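As a rough illustration of how such probabilities could be derived, the sketch below counts past outcomes per test and execution context. The `Execution` record, the outcome labels, and the use of a plain string as the context are assumptions made for this example; the slides do not specify how runs are stored or how a context is encoded.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Execution:
    """One past test run (hypothetical record layout)."""
    test: str
    context: str   # assumed encoding, e.g. "branchA/x64"
    outcome: str   # "pass", "defect" (true positive) or "false_alarm"


def failure_probabilities(history):
    """Return {(test, context): (p_tp, p_fp)} from past executions."""
    counts = defaultdict(lambda: {"runs": 0, "defect": 0, "false_alarm": 0})
    for run in history:
        entry = counts[(run.test, run.context)]
        entry["runs"] += 1
        if run.outcome in ("defect", "false_alarm"):
            entry[run.outcome] += 1
    return {
        key: (c["defect"] / c["runs"], c["false_alarm"] / c["runs"])
        for key, c in counts.items()
    }
```

Keying the counts by both test and context reflects the point above: the same test can have different failure probabilities on different branches or architectures.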

COST MODEL
What is the cost of running a test?
• Treat test executions as an investment: what is our return on investment?
Check before running a test (considering the context)
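\[
\mathit{Cost}_{\mathit{Execution}} > \mathit{Cost}_{\mathit{Skip}} \;?\; \text{suspend} : \text{execute test}
\]
\[
\begin{aligned}
\mathit{Cost}_{\mathit{Execution}} &= \mathit{Cost}_{\mathit{Machine/Time}} \cdot \mathit{Time}_{\mathit{Execution}} + \text{(cost of potential false alarm)} \\
&= \mathit{Cost}_{\mathit{Machine/Time}} \cdot \mathit{Time}_{\mathit{Execution}} + P_{FP} \cdot \mathit{Cost}_{\mathit{Developer/Time}} \cdot \mathit{Time}_{\mathit{Triage}}
\end{aligned}
\]
\[
\begin{aligned}
\mathit{Cost}_{\mathit{Skip}} &= \text{(potential cost of a bug escaping to the next higher branch level)} \\
&= P_{TP} \cdot \mathit{Cost}_{\mathit{Developer/Time}} \cdot \mathit{Time}_{\mathit{Freeze\ branch}} \cdot \#\mathit{Developers}_{\mathit{Branch}}
\end{aligned}
\]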
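The decision rule above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the authors' implementation: the `ExecutionContext` class, its field names, and the hourly units are hypothetical, and the numbers in the usage example are made up.

```python
from dataclasses import dataclass


@dataclass
class ExecutionContext:
    """Inputs to the cost model for one test in one context (hypothetical names)."""
    p_tp: float                      # probability the test finds a code defect
    p_fp: float                      # probability the test raises a false alarm
    machine_cost_per_hour: float
    execution_hours: float
    developer_cost_per_hour: float
    triage_hours: float              # time to inspect a false alarm
    freeze_hours: float              # time the branch is frozen by an escaped defect
    developers_on_branch: int


def cost_execution(c: ExecutionContext) -> float:
    # Machine cost plus the expected cost of triaging a false alarm.
    return (c.machine_cost_per_hour * c.execution_hours
            + c.p_fp * c.developer_cost_per_hour * c.triage_hours)


def cost_skip(c: ExecutionContext) -> float:
    # Expected cost of a defect escaping to the next higher branch level.
    return (c.p_tp * c.developer_cost_per_hour
            * c.freeze_hours * c.developers_on_branch)


def should_skip(c: ExecutionContext) -> bool:
    # Suspend the test only when running it costs more than skipping it.
    return cost_execution(c) > cost_skip(c)


# Made-up example: a test that only ever produced false alarms in this context.
flaky = ExecutionContext(p_tp=0.0, p_fp=0.2, machine_cost_per_hour=1.0,
                         execution_hours=0.5, developer_cost_per_hour=100.0,
                         triage_hours=1.0, freeze_hours=4.0,
                         developers_on_branch=10)
print(should_skip(flaky))
```

A test with even a small defect-finding probability tends to stay enabled under this rule, because the skip cost is multiplied by the number of developers affected by a frozen branch.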

DOES IT PAY OFF?
Fewer test executions reduce cost; taking risk increases cost.
Simulation data sets:
• ~11 month period, > 30 million test executions, multiple branches
• ~3 month period, > 1.2 million test executions, single branch
• ~12 month period, > 6.5 million test executions, multiple branches

WINDOWS RESULTS
Simulated on the Windows 8.1 development period (BVT only)

ACROSS ALL PRODUCTS
TABLE I. SIMULATION RESULTS FOR MICROSOFT WINDOWS, OFFICE, AND DYNAMICS.

                       | Windows                      | Office                       | Dynamics
Measurement            | Rel. improv. | Cost improv.  | Rel. improv. | Cost improv.  | Rel. improv. | Cost improv.
Test executions        | 40.58%       | --            | 34.9%        | --            | 50.36%       | --
Test time              | 40.31%       | $1,567,607.76 | 40.1%        | $76,509.24    | 47.45%       | $19,979.03
Test result inspection | 33.04%       | $61,532.80    | 21.1%        | $104,880.00   | 32.53%       | $2,337,926.40
Escaped defects        | 0.20%        | $11,970.56    | 8.7%         | $75,326.40    | 13.40%       | $310,159.42
Total cost balance     |              | $1,617,170.00 |              | $106,063.24   |              | $2,047,746.01
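The arithmetic behind the "Total cost balance" row can be checked with a short script. The reading used here, that the balance is the test-time saving plus the inspection saving minus the cost of escaped defects, is an assumption inferred from the figures in Table I, not something the slides state. It reproduces the Windows and Dynamics totals exactly and the Office total to within $0.40. The dictionary keys are hypothetical names.

```python
from decimal import Decimal

# Cost figures copied from Table I (US dollars); key names are hypothetical.
RESULTS = {
    "Windows": {"test_time": Decimal("1567607.76"),
                "inspection": Decimal("61532.80"),
                "escaped_defects": Decimal("11970.56")},
    "Office": {"test_time": Decimal("76509.24"),
               "inspection": Decimal("104880.00"),
               "escaped_defects": Decimal("75326.40")},
    "Dynamics": {"test_time": Decimal("19979.03"),
                 "inspection": Decimal("2337926.40"),
                 "escaped_defects": Decimal("310159.42")},
}


def total_cost_balance(row):
    """Savings from skipped runs and inspections minus the cost of escaped defects."""
    return row["test_time"] + row["inspection"] - row["escaped_defects"]


for product, row in RESULTS.items():
    print(product, total_cost_balance(row))
```

`Decimal` is used instead of floats so that the cent values add up exactly.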
Results vary
• Branching structure
• Runtime of tests
• We save cost on all products
Fine-tuning is possible and gives better results, but the tuned settings are not general

DYNAMIC & SELF-ADAPTIVE
Probabilities are dynamic (change over time)
• Skipping tests influences risk factors (of higher level branches)
• Tests re-enabled when code quality drops
• Feedback-loop between decision points
[Figure: relative test reduction rate (0% to 70%) over time for Windows 8.1, with annotations "Training period" and "automatically enable tests again".]

IMPACT ON DEVELOPMENT PROCESS
Secondary Improvements
• Machine setup
We may lower the number of machines allocated to the testing process
• Developer satisfaction
Removing false test failures increases confidence in the testing process
Development speed
• Impact on development speed is hard to estimate through simulation
• Product teams invest because they believe that removing tests:
  - increases code velocity (at least a lower bound)
  - avoids additional changes due to merge conflicts
  - reduces the number of required integration branches, whose main purpose is to test the product
“We used the data your team has provided to cut a bunch of bad content and are running a much leaner BVT system […]
we’re panning out to scale about 4x and run in well under 2 hours” (Jason Means, Windows BVT PM)
