Every code
coverage tool is
measuring the
wrong thing
(on purpose)
A deep(-ish) dive
into
code coverage
About Me
Sean Reilly
@seanjreilly
Who uses code
coverage now?
Why use code
coverage?
“We want to know how
well
our code is tested”
Code coverage is good
• Testing code coverage is good
• How it’s done is often not so good
• Why is that?
The Starting Point
• “How well is our code tested?”
• This is a qualitative measure
• Computers don’t do qualitative
• Can we make it quantitative?
A Quantitative Measure
“How many lines* of code can I delete without causing any tests** to fail?”
*statements, methods, branches, etc.
**or compilation
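As a rough illustration of the deletion-based measure, here is a minimal Java sketch. It is a toy model, not a real tool: the "program" is just a list of steps applied to an int, and `DeletionMeasure`, `sampleProgram` and `sampleTest` are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntUnaryOperator;
import java.util.function.Predicate;

// Toy model (an assumption for illustration): a "program" is a list of
// steps applied in order to an int value.
final class DeletionMeasure {

    // Runs every step in order on the input.
    static int run(List<IntUnaryOperator> steps, int input) {
        int value = input;
        for (IntUnaryOperator step : steps) {
            value = step.applyAsInt(value);
        }
        return value;
    }

    // Counts the steps that can be deleted, one at a time,
    // while the given test still passes.
    static int deletableSteps(List<IntUnaryOperator> steps,
                              Predicate<List<IntUnaryOperator>> test) {
        int deletable = 0;
        for (int i = 0; i < steps.size(); i++) {
            List<IntUnaryOperator> without = new ArrayList<>(steps);
            without.remove(i); // delete exactly one step
            if (test.test(without)) {
                deletable++;
            }
        }
        return deletable;
    }

    // Hypothetical program: (x + 1) * 2, followed by a step that does nothing useful.
    static List<IntUnaryOperator> sampleProgram() {
        return List.of(x -> x + 1, x -> x * 2, x -> x + 0);
    }

    // Hypothetical test suite: a single assertion on a single input.
    static boolean sampleTest(List<IntUnaryOperator> program) {
        return run(program, 3) == 8;
    }
}
```

Calling `DeletionMeasure.deletableSteps(DeletionMeasure.sampleProgram(), DeletionMeasure::sampleTest)` counts one deletable step, the useless `x + 0`. The other two steps are pinned down by the test.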
Why is this a good
measure?
• Direct translation of the
qualitative question
• Makes sense
• Minimises code written for a
set of tests
This is expensive
• Really, really expensive
• n statements/branches/methods = n(n-1) compile and test cycles
• We need something cheaper
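To put a number on that, here is a small Java sketch of the slide's n(n-1) estimate. The class and method names are made up for this example; the only input taken from the talk is the 23-statement sample mentioned in the editor's notes.

```java
// Cost of the deletion-based measure, using the n(n-1) estimate from the slide.
final class DeletionCost {

    // Number of compile and test cycles for n removable statements/branches/methods.
    static long compileAndTestCycles(long n) {
        return n * (n - 1);
    }

    public static void main(String[] args) {
        // 23 removable statements, as in the talk's sample code.
        System.out.println(compileAndTestCycles(23));
    }
}
```

For 23 statements this gives 23 × 22 = 506 cycles, and the cost grows quadratically from there.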
Downgrade from
business class to
cattle class
Find a lower cost
approximation
A (low-budget) Quantitative Measure
“How many lines* of code are executed when all of the tests are run?”
*statements, methods, branches, etc.
A (low-budget) Quantitative Measure
• Much cheaper
• Approximately* the same thing
* All the fun happens here!
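A minimal Java sketch of the execution-based measure, assuming hand-placed probes. Real coverage tools insert their probes automatically; `ExecutionCoverage`, `probe` and `doubledSuccessor` are hypothetical names used only here.

```java
import java.util.BitSet;

// Toy instrumentation (an assumption for illustration): each statement
// gets a probe that records that it ran.
final class ExecutionCoverage {
    private final BitSet hit;
    private final int totalProbes;

    ExecutionCoverage(int totalProbes) {
        this.totalProbes = totalProbes;
        this.hit = new BitSet(totalProbes);
    }

    // Marks the statement with this id as executed.
    void probe(int id) {
        hit.set(id);
    }

    // Percentage of probes that were hit at least once.
    double percent() {
        return 100.0 * hit.cardinality() / totalProbes;
    }

    // Hypothetical code under test with three instrumented statements.
    int doubledSuccessor(int x) {
        probe(0); int y = x + 1;
        probe(1); y = y * 2;
        probe(2); y = y + 0; // useless, but it still counts as executed
        return y;
    }
}
```

A single call to `doubledSuccessor` hits all three probes, so `percent()` reports 100. The useless third statement could be deleted without any test noticing, which is the kind of gap hiding behind "approximately".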
The
differences
Problem areas
• Synthetic methods
• Things you do to “make it
compile”
• Java 7 features
• Useless code
Learnings
• All code coverage engines
are flawed
• Some are profoundly flawed
• It’s possible to lower
coverage by deleting
untested code
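The arithmetic behind that last point can be sketched in a few lines of Java. The instruction counts below are invented purely to reproduce the 82% and 70% figures quoted in the editor's notes; they are not the counts from the talk's sample code, and `CoverageRatio` is a hypothetical helper.

```java
// Coverage as a plain ratio, to show how deleting code moves the number.
final class CoverageRatio {

    // Percentage covered; undefined (NaN) when there is nothing left to cover.
    static double percent(int covered, int total) {
        if (total == 0) {
            return Double.NaN;
        }
        return 100.0 * covered / total;
    }

    // Coverage after deleting code. Deleted instructions that were executed
    // leave both the covered count and the total; deleted instructions that
    // were never executed leave only the total.
    static double percentAfterDeleting(int covered, int total,
                                       int deletedExecuted, int deletedUnexecuted) {
        return percent(covered - deletedExecuted,
                       total - deletedExecuted - deletedUnexecuted);
    }

    public static void main(String[] args) {
        // Hypothetical counts: 82 of 100 instructions executed by the tests.
        System.out.println(percent(82, 100));
        // Delete 40 instructions that were executed but never asserted on.
        System.out.println(percentAfterDeleting(82, 100, 40, 0));
    }
}
```

With these made-up counts, deleting 40 executed instructions takes the ratio from 82/100 to 42/60, so the reported number falls from 82% to 70% even though the code got better. At 100% the ratio cannot move: whatever you delete, everything left is still covered. Delete everything and the ratio becomes undefined.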
Proof
Proof
Learnings
• You cannot reliably enforce:
• “X% or higher coverage”
• “Coverage always goes up”
Automatic
enforcement of
coverage levels is
troublesome
(Unless the coverage level is 100%)
What to do
instead?
• Spot check
• Manually use the more stringent
measure
• Compare to last week, not last
commit
• If the number goes down, know
why
• Separate covered and uncovered code
Two more
things
Should I test getters
and setters?
• No
• But…
• Delete getters and setters
that you can delete without
making a test fail
Should I enforce 100%
code coverage?
• It depends…
• Why you’re doing it
• Who decides
• If it feels like work
Mutation
Testing?
What is mutation
testing?
• Mutate statements instead of
deleting them
• Every mutation should make
a test fail
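A hand-rolled Java sketch of the idea, assuming a single "subtract instead of add" mutation. This is not how PIT works internally (it mutates compiled code automatically); `MutationSketch` and its members are hypothetical names for illustration only.

```java
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

// Toy mutation test (an assumption for illustration): one original,
// one hand-written mutant, two candidate tests.
final class MutationSketch {

    // Code under test and its mutant (subtract instead of add).
    static final IntBinaryOperator ORIGINAL = (a, b) -> a + b;
    static final IntBinaryOperator MUTANT = (a, b) -> a - b;

    // A test that asserts on the result.
    static boolean assertingTest(IntBinaryOperator add) {
        return add.applyAsInt(2, 3) == 5;
    }

    // A test that executes the code but asserts nothing.
    static boolean nonAssertingTest(IntBinaryOperator add) {
        add.applyAsInt(2, 3);
        return true;
    }

    // A mutant is "killed" when a test passes on the original and fails on the mutant.
    static boolean kills(Predicate<IntBinaryOperator> test) {
        return test.test(ORIGINAL) && !test.test(MUTANT);
    }
}
```

`MutationSketch.kills(MutationSketch::assertingTest)` is true, while `MutationSketch.kills(MutationSketch::nonAssertingTest)` is false. Both tests execute the same code, so an execution-based coverage tool cannot tell them apart.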
Thoughts on
mutation testing
• Seems decent for loop logic or
math logic
• Doesn’t know how to mutate a lot
of statements
• Doesn’t mutate source code, just
object code
• Based on a traditional coverage
run
UNITED KINGDOM
+44 203 603 7830
helloUK@equalexperts.com
Equal Experts UK Ltd
30 Brock Street
London NW1 3FG
INDIA
+91 20 6607 7763
helloIndia@equalexperts.com
Equal Experts India Private Ltd
Office No. 4-C
Cerebrum IT Park No. B3
Kumar City, Kalyani Nagar
Pune, 411006
CANADA
+1 403 775 4861
helloCanada@equalexperts.com
Equal Experts Devices Inc
205 - 279 Midpark way S.E.
T2X 1M2
Calgary, Alberta
PORTUGAL
+351 211 378 414
helloPortugal@equalexperts.com
Equal Experts Portugal
Avenida Dom João II, Nº35
Edificio Infante 11ºA
1990-083 Parque das Nações
Lisboa – Portugal
USA
helloUSA@equalexperts.com
Equal Experts Inc
315 Hudson Street
9th Floor
New York City, NY 10013
Thanks!

Every code coverage tool is measuring the wrong thing (on purpose)

Editor's Notes

  1. And why? Sometimes, this is “because my boss makes me”, but setting that aside…
  2. This is a thing I hope most of us can get behind. The problem is not that this is done, but how it’s done. Sometimes these are people problems (we’ll get into that later), but sometimes they aren’t.
  3. This isn’t the only measure… “can I change a statement without making a test fail?” is also a good one?
  4. The sample Java later in this presentation has 23 statements that could potentially be removed (excluding trivial things like throws clauses and import statements). That’s 23 × 22 = 506 build and test cycles.
  5. 1 instrumented test run, which is more expensive than normal, but cheaper than hundreds or thousands of test runs
  6. Synthetic methods: default constructors, methods on enums. Java 7: ARM blocks. The compiler purposefully puts in more blocks than will be executed (null checks in the finally, etc.), knowing that the JIT will optimise the extra ones away. Useless code: coverage tools don’t even attempt to detect it.
  7. Profoundly flawed = Java 7 support, etc. Delete an untested method that does nothing but was executed during a test… coverage goes down slightly. In our example, with the spurious method, instruction coverage is 82%. Without it, coverage is 70%.
  8. Note that branch coverage went from 100% to undefined!
  9. Profoundly flawed = Java 7 support, etc.
  10. If it’s 100%, and you delete a chunk of untested code, it should still be 100%… because all of the code that’s left should still be covered. This also holds for 0% coverage. I assume we’re all happy to ignore that case.
  11. Separate code: consider a module with 100% (or high) coverage, and another module without enforced coverage. Move things into the one module over time.
  12. Trivial getters and setters don’t need to be tested directly. Tests are executable documentation, and documentation isn’t needed for that.
  13. Should you enforce 100% code coverage? Twice in my career I’ve been on teams where we were close to 100% code coverage. In 2013 we were three or four statements/branches away for a while, spot checking every week or so. So finally, we put in explicit tests to cover those three or four spots. A week later, we were still at 100%. I talked to some of the guys on the team… should we fail the build if coverage isn’t 100%? Let’s try it… see what happens. We turned it on, and forgot about it for a couple of weeks. Then the first few times we tripped it, it was definitely in areas where we had forgotten to write a test… so we decided to keep it. Also, when somebody asks “what’s your code coverage?” and you can say 100% without checking anything, you feel like an absolute boss. Good for political reasons sometimes. :-)
  14. Should you enforce 100% code coverage? Twice in my career I’ve been on teams where we were close to 100% code coverage. In 2013 we were three or four statements/branches away for a while, spot checking every week-ish. So finally, we put in explicit tests to cover those three or four spots. A week later, we were still at 100%. I talked to some of the guys on the team… should we fail the build if coverage isn’t 100%? Let’s try it… see what happens. We turned it on, and forgot about it for a couple of weeks. Then the first few times we tripped it, it was definitely in areas where we had forgotten to write a test… so we decided to keep it. Also, when somebody asks “what’s your code coverage?” and you can say 100% without checking anything, you feel like an absolute boss. Good for political reasons sometimes. :-)
  15. The last two times I’ve done this talk, people have mentioned mutation testing — specifically PIT. (Which seems to be the viable option in the Java world)
  16. Example mutations: return null instead of a value, subtract instead of add, that sort of thing
  17. The class of problem PIT is really good at catching is tests that don’t assert anything. To improve performance, PIT does a single traditional coverage run, which it then uses to learn which tests to run against which mutations. That means it has a gap for statements that aren’t executed by any tests… Same old problem. Mutating object code and not source code means we can’t tell whether a mutation would have stopped the code from compiling. False positives mean that improving code can still make the coverage percentage go down. An example of this would be removing one of two duplicate methods.