Better code through making bugs


Published on

Do you know how good your tests are? Mutation testing can tell you. Unlike test coverage metrics (which only tell you how much of your application was executed, not whether the tests were any use), mutation testing lets you say something concrete about the quality of your test suite. Mutation testing has been around for years, but only recently have performant tools (such as pitest for Java) become available. In this session, Seb will look at the motivation and technology behind mutation testing and show some examples in action.

Published in: Technology, Business

Better code through making bugs

  1. Tuesday, 22 April 14
  2. Better code through making bugs. Seb Rose, Claysnow Limited, @sebrose
  3. Massive thanks to: Henry Coles (pitest), Filip van Laenen (mutant)
  4-9. How do you assure the quality of your test suite? • Only employ coding gods • TDD will protect us • Peer review • QA will catch anything that we miss • Good code coverage
  10-14. How do you assure the quality of your test suite? • Code coverage? • Line coverage? • Branch coverage? • Statement coverage?
  15-20. Does good coverage guarantee that: • I can safely refactor my tests? • I can trust a test suite I inherited? • My team are writing effective tests? • I've retrofitted enough tests to protect my legacy code? NO, IT DOESN’T
  21-22. Coverage tells us nothing about the quality of our tests. It tells us which statements have been run by our tests
  23. Coverage tells us nothing about the quality of our tests. It tells us which statements have NOT been tested
  24. Coverage tells us nothing about the quality of our tests ... but it is very useful for something else
  25-28. public class CountingClass { private int count; public void count(int i) { if ( i >= 10 ) { count++; } } public void reset() { count = 0; } }
  29. @Test public void shouldStartWithEmptyCount() { assertEquals(0,testee.currentCount()); } @Test public void shouldCountIntegersAboveTen() { testee.count(11); assertEquals(1,testee.currentCount()); } @Test public void shouldNotCountIntegersBelowTen() { testee.count(9); assertEquals(0,testee.currentCount()); }
  30. “If we want to know if a test suite has properly checked some code deliberately introduce a bug” (Lipton, “Fault diagnosis in computer programs”, 1971)
  31. Mutation: The deliberate introduction of a bug by changing the code under test
  32. Mutant: A version of the code under test that has had a single mutation
  34-37. Mutation test: • Run your test suite • If any test fails, the mutant has been killed • If no test fails, the mutant has survived
  38-41. Mutation testing: • Generate lots of mutants • Run each one through your test suite • Record and interpret the results
  42. public class CountingClass { private int count; public void count(int i) { if ( i >= 10 ) { count++; } } public void reset() { count = 0; } }
  43. The same class with the condition in count(int i) mutated to: if ( i > 10 ) {
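The boundary mutant above can be reproduced by hand to see what "killed" and "survived" mean in practice. The following is only a sketch under stated assumptions, not how pitest works internally: `Counter`, `Original`, `BoundaryMutant` and `MutationDemo` are hypothetical names, the mutant is written manually rather than generated, tests are modelled as plain predicates instead of JUnit methods, and `currentCount()` is inferred from the tests on slide 29.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical interface so the same tests can run against original and mutant.
interface Counter {
    void count(int i);
    int currentCount();
}

// The code under test from the slides (currentCount() is inferred from the tests).
class Original implements Counter {
    private int count;
    public void count(int i) { if (i >= 10) { count++; } }
    public int currentCount() { return count; }
}

// Hand-written mutant: the conditional boundary has changed from >= to >.
class BoundaryMutant implements Counter {
    private int count;
    public void count(int i) { if (i > 10) { count++; } }
    public int currentCount() { return count; }
}

class MutationDemo {
    // Each "test" returns true when it passes against a fresh instance.
    static boolean startsEmpty(Counter c) { return c.currentCount() == 0; }
    static boolean countsAboveTen(Counter c) { c.count(11); return c.currentCount() == 1; }
    static boolean ignoresBelowTen(Counter c) { c.count(9); return c.currentCount() == 0; }
    static boolean countsExactlyTen(Counter c) { c.count(10); return c.currentCount() == 1; }

    // The three tests from slide 29.
    static final List<Predicate<Counter>> SLIDE_TESTS = List.of(
            MutationDemo::startsEmpty,
            MutationDemo::countsAboveTen,
            MutationDemo::ignoresBelowTen);

    // The same suite plus a test for the boundary value itself.
    static final List<Predicate<Counter>> WITH_BOUNDARY_TEST = List.of(
            MutationDemo::startsEmpty,
            MutationDemo::countsAboveTen,
            MutationDemo::ignoresBelowTen,
            MutationDemo::countsExactlyTen);

    // A mutant is killed if at least one test fails against it.
    static boolean isKilled(Supplier<Counter> factory, List<Predicate<Counter>> tests) {
        for (Predicate<Counter> test : tests) {
            if (!test.test(factory.get())) {
                return true;   // first failing test kills the mutant
            }
        }
        return false;          // every test passed: the mutant survived
    }
}
```

With the three slide tests, `MutationDemo.isKilled(BoundaryMutant::new, MutationDemo.SLIDE_TESTS)` is false: the mutant survives because no test exercises the value 10. Adding the boundary test kills it, while the original class still passes every test.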
  44. Demo time: http://pitest.org
  45. Mutation operators: • Conditionals Boundary Mutator • Negate Conditionals Mutator • Remove Conditionals Mutator • Math Mutator • Increments Mutator • Invert Negatives Mutator • Inline Constant Mutator • ReturnValues Mutator • Void Method Calls Mutator • NonVoid Method Calls Mutator • Constructor Calls Mutator
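To make a few of these operators concrete, here is a hand-written illustration of what four of them do to small methods. It is a sketch only: the class and method names are hypothetical, the mutated versions are typed out by hand rather than produced by a tool, and the transformations shown are a simplified reading of how these operators are usually described.

```java
// Hand-written illustrations of four mutation operators; all names are hypothetical.
class OperatorExamples {
    // Original methods.
    static boolean atLeastTen(int i) { return i >= 10; }
    static int total(int a, int b) { return a + b; }
    static int next(int i) { i++; return i; }

    // Conditionals Boundary Mutator: >= becomes >
    static boolean atLeastTenBoundary(int i) { return i > 10; }

    // Negate Conditionals Mutator: >= becomes <
    static boolean atLeastTenNegated(int i) { return i < 10; }

    // Math Mutator: + becomes -
    static int totalMath(int a, int b) { return a - b; }

    // Increments Mutator: a local-variable increment becomes a decrement
    static int nextIncrements(int i) { i--; return i; }
}
```

The negated conditional disagrees with the original for every input, so almost any test will kill it. The boundary mutant disagrees only at exactly 10, which is why it tends to survive suites that skip boundary values.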
  46. “Generated mutants are similar to real faults” (Andrews, Briand, Labiche, ICSE 2005)
  47. “In practice, if the software contains a fault, there will usually be a set of mutants that can only be killed by a test case that also detects that fault.” (Geist et al., “Estimation and Enhancement of Real-time Software Reliability through Mutation Analysis,” 1992)
  48. “Complex faults are coupled to simple faults in such a way that a test data set that detects all simple faults in a program will detect most complex faults.” (K. Wah, “Fault Coupling in Finite Bijective Functions,” 1995)
  49. Why isn’t mutation testing widely used? • Poor performance • Equivalent mutations
  50-51. How bad is performance? Joda Time, consider, let us
  52-57. Joda Time is a ... • small library for dealing with dates and times • 68k lines of code • 70k lines of test code • Takes about 10 seconds to compile • Takes about 16 seconds to run the unit tests
  59-64. • Let’s use 10 mutation operators • Assume about 10k mutations • If it takes 1 second to compile each one: about 2.8 hours to generate the mutants • Run the test suite for each mutant: 10k x 16 seconds = about 44.4 hours
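The naive cost estimate is plain multiplication, spelled out below as a quick check. The inputs are the round numbers assumed on the slides (10k mutants, 1 second to compile each, a 16 second test suite); the class name `NaiveCost` is hypothetical.

```java
import java.time.Duration;

// Back-of-the-envelope cost of compiling every mutant and running the full suite on each.
class NaiveCost {
    static final int MUTANTS = 10_000;       // "about 10k mutations"
    static final int COMPILE_SECONDS = 1;    // assumed compile time per mutant
    static final int SUITE_SECONDS = 16;     // Joda Time unit test run time

    static Duration compileTime() {
        return Duration.ofSeconds((long) MUTANTS * COMPILE_SECONDS);
    }

    static Duration testTime() {
        return Duration.ofSeconds((long) MUTANTS * SUITE_SECONDS);
    }

    static double hours(Duration d) {
        return d.getSeconds() / 3600.0;
    }
}
```

`NaiveCost.hours(NaiveCost.compileTime())` comes to roughly 2.8 hours and `NaiveCost.hours(NaiveCost.testTime())` to roughly 44.4 hours, so the test runs dominate the compilation by a factor of 16.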
  65. Don’t compile! pitest manipulates the byte code directly: 10k mutants generated in < 1 second
  66-72. Run fewer tests! • Stop when a test fails (can easily halve the run time) • Choose your tests carefully (not every test can kill every mutant) • Parallelise the test runner (your tests are unit tests, right?)
  73-76. Choosing your tests well is critical • Each mutant can only ever be killed by a subset of the tests • Every other test run is waste
  77-80. How can we know which tests might kill any given mutant? • Naming conventions? • Static analysis? • COVERAGE DATA!
  81-84. public class CountingClass { private int count; public void count(int i) { if ( i >= 10 ) { count++; } } public void reset() { count = 0; } }
  85-88. The same class, annotated with the tests that cover each line: • if ( i >= 10 ): shouldCountIntegersAboveTen, shouldNotCountIntegersBelowTen, shouldCountIntegersOfExactlyTen • count++: shouldCountIntegersAboveTen, shouldCountIntegersOfExactlyTen • count = 0 (in reset()): not covered by any test • shouldStartWithEmptyCount does not cover any of the lines shown
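The idea of using coverage data to pick tests can be sketched as a lookup from source line to the tests that executed it. This is an illustration only, not pitest's actual data structure: `CoverageIndex` is a hypothetical name, and the line numbers assume `CountingClass` is laid out with `if (i >= 10)` on line 4, `count++` on line 5 and `count = 0` on line 9.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Maps each source line to the tests that executed it during a coverage run.
class CoverageIndex {
    private final Map<Integer, Set<String>> testsByLine = new HashMap<>();

    // Record that a test executed the given lines.
    void record(String testName, int... lines) {
        for (int line : lines) {
            testsByLine.computeIfAbsent(line, k -> new LinkedHashSet<>()).add(testName);
        }
    }

    // Only tests that execute the mutated line can possibly kill the mutant.
    Set<String> candidatesFor(int mutatedLine) {
        return testsByLine.getOrDefault(mutatedLine, Collections.emptySet());
    }

    // Coverage as annotated on the slides, using the assumed line numbers.
    static CoverageIndex forCountingClass() {
        CoverageIndex index = new CoverageIndex();
        index.record("shouldCountIntegersAboveTen", 4, 5);
        index.record("shouldCountIntegersOfExactlyTen", 4, 5);
        index.record("shouldNotCountIntegersBelowTen", 4);
        index.record("shouldStartWithEmptyCount");   // touches none of the numbered lines
        return index;
    }
}
```

A mutant on line 5 needs only two tests run against it, and a mutant on line 9 (inside `reset()`) has no candidate tests at all, so it can be reported as surviving without running anything.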
  89. Joda Time on my machine, with 2 threads: - Timings > scan classpath : < 1 second > coverage and dependency analysis : 59 seconds > build mutation tests : 1 seconds > run mutation analysis : 8 minutes and 21 seconds > Total : 9 minutes and 22 seconds - Statistics >> Generated 9922 mutations Killed 7833 (79%) >> Ran 117579 tests (11.85 tests per mutation)
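The summary figures in that output follow directly from the raw counts, as the short check below shows. The class name `RunStats` is hypothetical; the three constants are copied from the output above.

```java
// Derives the reported percentages from the raw counts in the run summary.
class RunStats {
    static final int GENERATED = 9_922;     // mutations generated
    static final int KILLED = 7_833;        // mutations killed
    static final int TESTS_RUN = 117_579;   // individual test executions

    static double killRatePercent() {
        return 100.0 * KILLED / GENERATED;
    }

    static double testsPerMutation() {
        return (double) TESTS_RUN / GENERATED;
    }

    // Mutants that were not killed: survivors, time outs and uncovered code.
    static int notKilled() {
        return GENERATED - KILLED;
    }
}
```

Averaging fewer than 12 test executions per mutant, rather than the whole suite each time, is where most of the saving over the naive approach comes from.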
  90-93. Why only 79% killed? • Missing test cases • Time outs • Equivalent mutations
  94-96. Equivalent mutant: public void someLogic(int i) { if (i <= 100) { throw new IllegalArgumentException(); } if (i >= 100) { doSomething(); } } Mutating the second condition to if (i > 100) { changes nothing: i can never be 100 here
  97. Demo time: http://pitest.org
  98. Maybe we should have written: public void someLogic(int i) { if (i <= 100) { throw new IllegalArgumentException(); } doSomething(); }
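Why no test can kill the `someLogic` mutant can be seen by comparing the observable outcome of both versions over a range of inputs. This is a sketch with assumptions: `EquivalentMutantDemo` is a hypothetical name, the exception and `doSomething()` are replaced by returned strings so that outcomes can be compared, and checking a finite range illustrates the equivalence rather than proving it (the proof is the first guard, which already rejects 100).

```java
import java.util.stream.IntStream;

// Compares the original someLogic with its boundary mutant by observable outcome.
class EquivalentMutantDemo {
    static String original(int i) {
        if (i <= 100) { return "threw"; }            // stands in for IllegalArgumentException
        if (i >= 100) { return "did something"; }    // stands in for doSomething()
        return "did nothing";
    }

    // Boundary mutant: the second condition is now i > 100.
    static String mutant(int i) {
        if (i <= 100) { return "threw"; }
        if (i > 100) { return "did something"; }
        return "did nothing";
    }

    // True if both versions behave identically for every value in [from, to].
    static boolean sameBehaviour(int from, int to) {
        return IntStream.rangeClosed(from, to)
                .allMatch(i -> original(i).equals(mutant(i)));
    }
}
```

Because the two versions never differ, the surviving mutant points at redundant code rather than a missing test, which is what the simplified version on the slide removes.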
  99-104. Some causes of equivalence are: • Dead/useless code • Non-functional modifications • Unsatisfiable guards • Internal state. This can help us improve our code
  105. Mutants that survive need to be checked manually to determine if they are equivalent
  106-109. Is this a show-stopper? • Not all that common in practice • TDD helps prevent equivalents • Tolerate a few survivors
  110-114. What about really large code bases? Mutation testing can take a long time: • Use fewer mutants • Filter the candidates • Run infrequently
  115-118. Get the developers to mutation test • Only on code they are working on • Faster feedback • Improves design of the code
  119-121. Get your CI server to run the mutation test • Over whole codebase if quick enough • Use SCM integration to identify subset
  122-130. Experiences and recommendations (Filip van Laenen) • Use mutation testing from day 1 • Start on a small code base • Keep number of unit tests per class low • Have small classes • Select a good tool: configurable, flexible, identifies the surviving mutant
  131-137. Experiences and recommendations (Filip van Laenen) • Believe the tool • Fix the problem • Don't turn mutation testing off • Less code • More unit tests • More intelligent unit tests
  138-142. Available tools (http://en.wikipedia.org/wiki/Mutation_testing#External_links) • Ruby: Heckle, Mutant • Java: pitest, Jumble, Jester • C#: Nester, NinjaTurtle, Cream • Python: Pester
  143. • Open source • Works with Java 5, 6, 7 • Works with all mocking frameworks including PowerMock • Integrates with Maven, Ant, Gradle & SBT • Plugins for Eclipse & IntelliJ • Plugins for Jenkins & SonarQube • Releases every 3 months or so since 2011
  144-146. • The Ladders - New York • Sky - Livingston • Insurance companies • Investment banks • Biotech companies • Norway's e-voting system
  147-148. -JVM Seb Rose, Available 2014 (hopefully)
  149. Seb Rose • Twitter: @sebrose • Blog: www.claysnow.co.uk • E-mail: seb@claysnow.co.uk