# Mutation testing with Descartes

May 17, 2018


1. Mutate and Test your Tests. Benoit Baudry (KTH, baudry@kth.se), Oscar Vera Perez (INRIA), Martin Monperrus (KTH). Spotify, Stockholm, May 2018.
2. Test Your Tests
    - What do you expect from test cases?
        - Cover requirements
        - Stress the application
        - Prevent regressions
        - Reveal bugs
4. `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }`
5. `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }`
6. Coverage: `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }`
7. Coverage: `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }` `@Test void factorialWith0Test() { assertEquals(1, fact(0)); }` `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }`
8. `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }` `@Test void factorialWith0Test() { assertEquals(1, fact(0)); }` `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` Are these test cases good at detecting bugs?
9. `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }` `@Test void factorialWith0Test() { assertEquals(1, fact(0)); }` `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` Are these test cases good at detecting bugs? Let’s mutate our code to see.
10. Mutation analysis
    - Tests are good if they can detect bugs
    - Principle: generate bugs and test the tests
11. Mutation analysis
    - Tests are good if they can detect bugs
    - Principle: generate bugs and test the tests
    - Mutation operators = types of bugs
    - Mutant = program with one seeded bug
12. Mutation analysis. Input: `P`, `TS`, `Ops`. Output: `score`, `coverage`. Algorithm: `M <- generateMutants(P, Ops); forAll (m in M) { run(TS, m); if (one-test-fail) then killed <- killed + m else alive <- alive + m }; score = size(killed) / size(M)`
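The loop on slide 12 can be sketched in a few lines of Java. This is only an illustration under stated assumptions: mutants are written by hand as methods (a tool such as PIT generates them automatically), each test is modelled as a predicate that returns `true` when it passes, and the class and method names (`MutationScoreSketch`, `mutationScore`, and so on) are hypothetical. The four mutants are a subset of the mutations shown on the later `fact` slides.

```java
import java.util.List;
import java.util.function.IntToLongFunction;
import java.util.function.Predicate;

class MutationScoreSketch {

    // Each test returns true when it passes against the given implementation.
    static final List<Predicate<IntToLongFunction>> TEST_SUITE =
        List.<Predicate<IntToLongFunction>>of(
            fact -> 5 < fact.applyAsLong(5),    // assertTrue(5 < fact(5))
            fact -> fact.applyAsLong(0) == 1);  // assertEquals(1, fact(0))

    // Mutant: n == 0 replaced by n != 0.
    static long negatedCondition(int n) {
        if (n != 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Mutant: return 1 replaced by return 1 + 1.
    static long incrementedReturn(int n) {
        if (n == 0) return 1 + 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Mutant: i <= n replaced by i < n.
    static long changedBoundary(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i < n; i++) result = result * i;
        return result;
    }

    // Mutant: result * i replaced by result / i.
    static long dividedResult(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result / i;
        return result;
    }

    static final List<IntToLongFunction> MUTANTS = List.<IntToLongFunction>of(
        MutationScoreSketch::negatedCondition,
        MutationScoreSketch::incrementedReturn,
        MutationScoreSketch::changedBoundary,
        MutationScoreSketch::dividedResult);

    // score = killed / size(M): a mutant is killed when at least one test fails on it.
    static double mutationScore(List<IntToLongFunction> mutants,
                                List<Predicate<IntToLongFunction>> tests) {
        int killed = 0;
        for (IntToLongFunction mutant : mutants) {
            boolean oneTestFails = tests.stream().anyMatch(test -> !test.test(mutant));
            if (oneTestFails) killed++;
        }
        return (double) killed / mutants.size();
    }

    public static void main(String[] args) {
        System.out.println(mutationScore(MUTANTS, TEST_SUITE));
    }
}
```

With these four mutants and the two original tests, three mutants are killed and the `i < n` mutant survives, so the score is 0.75.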
13. PITest mutation operators
    - Conditions
    - Constants
    - Return
    - Delete method calls
    - Constructor call
14. `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }`
15. `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` Mutations: `n != 0`; `return 1+1`; `<` (replacing `<=`); `--` (replacing `++`); `!(i<=n)`; `result/i`; `result+1`.
17. `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }` Mutations: `n != 0`; `return 1+1`; `<` (replacing `<=`); `--` (replacing `++`); `!(i<=n)`; `result/i`; `result+1`.
18. `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }` `@Test void factorialWith0Test() { assertEquals(1, fact(0)); }` Mutations: `n != 0`; `return 1+1`; `<` (replacing `<=`); `--` (replacing `++`); `!(i<=n)`; `result/i`; `result+1`.
19. Mutation reveals bugs in the test suite. `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }` `@Test void factorialWith0Test() { assertEquals(1, fact(0)); }` Mutations: `n != 0`; `return 1+1`; `<` (replacing `<=`); `--` (replacing `++`); `!(i<=n)`; `result/i`; `result+1`. Bugs in the test suite: weak oracle; missing input.
20. `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` `@Test void factorialWith5Test() { long obs = fact(5); assertEquals(120, obs); }` `@Test void factorialWith0Test() { assertEquals(1, fact(0)); }` `@Test void factorialWith1Test() { assertEquals(1, fact(1)); }` Mutations: `n != 0`; `return 1+1`; `<` (replacing `<=`); `--` (replacing `++`); `!(i<=n)`; `result/i`; `result+1`.
21. PIT results per project:

    | Project | #Mutants | Time (hh:mm:ss) | Score (%) |
    |---|---|---|---|
    | Amazon Web Services SDK | 2141690 | 04:25:35 | 76.28 |
    | XWiki Rendering Engine | 112609 | 01:59:55 | 50.89 |
    | Apache Commons Math | 104786 | 03:22:18 | 83.81 |
    | JFreeChart | 89592 | 00:41:28 | 58.04 |
    | Apache PdfBox | 79763 | 06:20:25 | 58.89 |
    | Java Git | 78316 | 16:02:00 | 73.86 |
    | SCIFIO | 62768 | 03:12:11 | 45.92 |
    | Joda-Time | 31233 | 00:16:32 | 81.65 |
    | Apache Commons Lang | 30361 | 00:21:02 | 86.17 |
    | Apache Commons Collections | 20394 | 00:05:41 | 85.94 |
    | Urban Airship Client Library | 17345 | 00:11:31 | 82.26 |
    | SAT4J | 17067 | 11:11:04 | 68.58 |
    | ImageJ Common | 15592 | 00:29:09 | 54.77 |
    | jsoup | 14054 | 00:12:49 | 78.34 |
    | Jaxen XPath Engine | 12210 | 00:24:40 | 67.13 |
    | Apache Commons Codec | 9233 | 00:07:57 | 87.82 |
    | Apache Commons IO | 8809 | 00:12:48 | 84.73 |
    | Google Gson | 7353 | 00:05:34 | 81.76 |
    | AuthZForce PDP Core | 7296 | 01:23:50 | 88.18 |
    | Apache Commons CLI | 2560 | 00:01:26 | 88.71 |
    | JOpt Simple | 2271 | 00:01:36 | 93.52 |
22. Descartes
    - Mutation operators: extreme mutation
    - Active, open source development
    - Pitest plugin
    - Compared to PIT
        - Fewer mutants
        - Different type of feedback
        - Same framework
23. Descartes example: `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }`
24. Descartes example: `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` Extreme mutants: `long fact(int n) { return 0; }` and `long fact(int n) { return 1; }`. PIT: 7 mutants. Descartes: 2 mutants.
25. Scalability of extreme mutation: `public static boolean isValidXmlChar(int ch) { return (ch == 0x9) || (ch == 0xA) || (ch == 0xD) || (ch >= 0x20 && ch <= 0xD7FF) || (ch >= 0xE000 && ch <= 0xFFFD) || (ch >= 0x10000 && ch <= 0x10FFFF); }` PIT: 45 mutants. Descartes: 2 mutants.
26. PIT compared with Descartes (times in hh:mm:ss):

    | Project | Mutants (PIT) | Mutants (Descartes) | Time (PIT) | Time (Descartes) |
    |---|---|---|---|---|
    | Amazon Web Services SDK | 2141690 | 161758 | 04:25:35 | 01:27:30 |
    | XWiki Rendering Engine | 112609 | 5534 | 01:59:55 | 00:10:50 |
    | Apache Commons Math | 104786 | 7150 | 03:22:18 | 00:08:30 |
    | JFreeChart | 89592 | 7210 | 00:41:28 | 00:05:26 |
    | Apache PdfBox | 79763 | 7559 | 06:20:25 | 00:42:11 |
    | Java Git | 78316 | 7152 | 16:02:00 | 00:56:07 |
    | SCIFIO | 62768 | 3627 | 03:12:11 | 00:15:26 |
    | Joda-Time | 31233 | 4525 | 00:16:32 | 00:04:13 |
    | Apache Commons Lang | 30361 | 3872 | 00:21:02 | 00:02:18 |
    | Apache Commons Collections | 20394 | 3558 | 00:05:41 | 00:01:48 |
    | Urban Airship Client Library | 17345 | 3082 | 00:11:31 | 00:09:38 |
    | SAT4J | 17067 | 2296 | 11:11:04 | 00:56:42 |
    | ImageJ Common | 15592 | 1947 | 00:29:09 | 00:04:08 |
    | jsoup | 14054 | 1566 | 00:12:49 | 00:02:45 |
    | Jaxen XPath Engine | 12210 | 1252 | 00:24:40 | 00:01:34 |
    | Apache Commons Codec | 9233 | 979 | 00:07:57 | 00:02:02 |
    | Apache Commons IO | 8809 | 1164 | 00:12:48 | 00:02:18 |
    | Google Gson | 7353 | 848 | 00:05:34 | 00:01:11 |
    | AuthZForce PDP Core | 7296 | 626 | 01:23:50 | 00:08:45 |
    | Apache Commons CLI | 2560 | 271 | 00:01:26 | 00:00:09 |
    | JOpt Simple | 2271 | 412 | 00:01:36 | 00:00:25 |
27. Coarser grain than PIT: `long fact(int n) { if (n == 0) return 1; long result = 1; for (int i = 2; i <= n; i++) result = result * i; return result; }` Extreme mutants: `long fact(int n) { return 0; }` and `long fact(int n) { return 1; }`. `@Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }` `@Test void factorialWith0Test() { assertEquals(1, fact(0)); }`
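What "coarser grain" means on slide 27 can be checked by hand. The sketch below, with hypothetical names throughout, runs the two original tests (modelled as boolean checks rather than JUnit tests) against both extreme mutants and, for contrast, against one fine-grained mutant of the kind PIT produces.

```java
import java.util.function.IntToLongFunction;

class ExtremeMutationSketch {

    // Original method from the slides.
    static long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Extreme mutants: the whole body is replaced by a constant return.
    static long factReturns0(int n) { return 0; }
    static long factReturns1(int n) { return 1; }

    // Fine-grained mutant for contrast: i <= n replaced by i < n.
    static long factChangedBoundary(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i < n; i++) result = result * i;
        return result;
    }

    // The two original tests: assertTrue(5 < fact(5)) and assertEquals(1, fact(0)).
    static boolean weakSuitePasses(IntToLongFunction impl) {
        boolean factorialWith5 = 5 < impl.applyAsLong(5);
        boolean factorialWith0 = impl.applyAsLong(0) == 1;
        return factorialWith5 && factorialWith0;
    }

    public static void main(String[] args) {
        System.out.println("original passes:        " + weakSuitePasses(ExtremeMutationSketch::fact));
        System.out.println("return 0 survives:      " + weakSuitePasses(ExtremeMutationSketch::factReturns0));
        System.out.println("return 1 survives:      " + weakSuitePasses(ExtremeMutationSketch::factReturns1));
        System.out.println("i < n mutant survives:  " + weakSuitePasses(ExtremeMutationSketch::factChangedBoundary));
    }
}
```

Both extreme mutants are killed by the weak suite, while the `i < n` mutant survives it. Extreme mutation therefore does not flag the weak oracle in this example; it trades that precision for far fewer mutants.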
28. `@Test void typical() throws NoSuchAlgorithmException { SdkTLSSocketFactory f = new SdkTLSSocketF(SSLContext.getDefault(), null); ... f.prepareSocket(new TestSSLSocket() { ... @Override public void setEnabledProtocols(String[] protocols) { assertTrue(Arrays.equals(protocols, expected)); } }); }` Extreme mutant: `protected final void prepareSocket(final SSLSocket socket) { }`
30. `@Test void typical() throws NoSuchAlgorithmException { SdkTLSSocketFactory f = new SdkTLSSocketF(SSLContext.getDefault(), null); ... f.prepareSocket(new TestSSLSocket() { ... @Override public void setEnabledProtocols(String[] protocols) { assertTrue(Arrays.equals(protocols, expected)); } }); }` Extreme mutant: `protected final void prepareSocket(final SSLSocket socket) { }` Missing oracle.
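The "missing oracle" on slide 30 arises because the only assertion lives inside a callback: if `prepareSocket` does nothing, the callback never runs and the test passes vacuously. The sketch below reproduces the pattern with hypothetical stand-ins (`FakeSocket`, `EXPECTED`, and the two test helpers are not part of the AWS SDK) and shows one possible way to strengthen the test, by also checking that the callback was invoked.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

class MissingOracleSketch {

    // Hypothetical stand-in for the socket type used on the slide.
    interface FakeSocket {
        void setEnabledProtocols(String[] protocols);
    }

    // Hypothetical expected value; the slide elides the real one.
    static final String[] EXPECTED = {"TLSv1.2"};

    // Original behaviour (assumed): configure the protocols on the socket.
    static void prepareSocket(FakeSocket socket) {
        socket.setEnabledProtocols(EXPECTED.clone());
    }

    // Extreme mutant: the body of the void method is removed.
    static void prepareSocketMutant(FakeSocket socket) { }

    // Test in the style of the slide: the assertion exists only inside the callback.
    static boolean callbackOnlyTestPasses(Consumer<FakeSocket> prepare) {
        try {
            prepare.accept(protocols -> {
                if (!Arrays.equals(protocols, EXPECTED)) throw new AssertionError("unexpected protocols");
            });
            return true;
        } catch (AssertionError e) {
            return false;
        }
    }

    // Strengthened test: additionally require that the callback actually ran.
    static boolean strengthenedTestPasses(Consumer<FakeSocket> prepare) {
        AtomicBoolean called = new AtomicBoolean(false);
        try {
            prepare.accept(protocols -> {
                called.set(true);
                if (!Arrays.equals(protocols, EXPECTED)) throw new AssertionError("unexpected protocols");
            });
        } catch (AssertionError e) {
            return false;
        }
        return called.get();
    }

    public static void main(String[] args) {
        System.out.println("mutant survives callback-only test: "
            + callbackOnlyTestPasses(MissingOracleSketch::prepareSocketMutant));
        System.out.println("mutant survives strengthened test:  "
            + strengthenedTestPasses(MissingOracleSketch::prepareSocketMutant));
    }
}
```

The empty-body mutant passes the callback-only test and fails the strengthened one, while the original passes both.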
31. `boolean equals(Object other) { return other instanceof AClass && ((AClass) other).aField == aField; }` Extreme mutants: `boolean equals(Object other) { return true; }` and `boolean equals(Object other) { return false; }`
32. `boolean equals(Object other) { return other instanceof AClass && ((AClass) other).aField == aField; }` Extreme mutants: `boolean equals(Object other) { return true; }` and `boolean equals(Object other) { return false; }` `@Test void test() { AClass a = new AClass(3); AClass b = new AClass(3); AClass c = new AClass(4); assertEquals(a, b); assertFalse(a == c); }`
34. Bug in test: `boolean equals(Object other) { return other instanceof AClass && ((AClass) other).aField == aField; }` Extreme mutants: `boolean equals(Object other) { return true; }` and `boolean equals(Object other) { return false; }` `@Test void test() { AClass a = new AClass(3); AClass b = new AClass(3); AClass c = new AClass(4); assertEquals(a, b); assertFalse(a == c); }`
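The bug on slide 34 is that `assertFalse(a == c)` compares references, so it never calls `equals` and the `return true` mutant survives. The sketch below makes this concrete; to keep it self-contained, each `equals` variant is modelled as a `BiPredicate` instead of an overridden method, and all names other than `AClass` and `aField` are hypothetical.

```java
import java.util.function.BiPredicate;

class EqualsOracleSketch {

    static final class AClass {
        final int aField;
        AClass(int aField) { this.aField = aField; }
    }

    // The original equals logic and its two extreme mutants.
    static final BiPredicate<AClass, AClass> ORIGINAL =
        (self, other) -> other != null && other.aField == self.aField;
    static final BiPredicate<AClass, AClass> ALWAYS_TRUE = (self, other) -> true;
    static final BiPredicate<AClass, AClass> ALWAYS_FALSE = (self, other) -> false;

    // Test as written on the slide: the second assertion uses reference identity.
    static boolean slideTestPasses(BiPredicate<AClass, AClass> equalsImpl) {
        AClass a = new AClass(3);
        AClass b = new AClass(3);
        AClass c = new AClass(4);
        boolean assertEqualsHolds = equalsImpl.test(a, b); // assertEquals(a, b)
        boolean assertFalseHolds = !(a == c);              // assertFalse(a == c)
        return assertEqualsHolds && assertFalseHolds;
    }

    // Repaired test: the second assertion goes through equals as well.
    static boolean fixedTestPasses(BiPredicate<AClass, AClass> equalsImpl) {
        AClass a = new AClass(3);
        AClass b = new AClass(3);
        AClass c = new AClass(4);
        return equalsImpl.test(a, b) && !equalsImpl.test(a, c); // assertFalse(a.equals(c))
    }

    public static void main(String[] args) {
        System.out.println("return true survives slide test: " + slideTestPasses(ALWAYS_TRUE));
        System.out.println("return true survives fixed test: " + fixedTestPasses(ALWAYS_TRUE));
    }
}
```

In JUnit terms, the repair amounts to replacing `assertFalse(a == c)` with `assertFalse(a.equals(c))` or `assertNotEquals(a, c)`.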
35. [Chart: number of mutants (logarithmic axis from 1 to 10000000), Pit vs. Descartes.]
36. [Chart, 0% to 100%: Tested, Weak pseudo-tested, Strong pseudo-tested.]
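The slides do not define the three categories in the legend of slide 36. One plausible reading, stated here as an assumption, is that a method is "tested" when every extreme mutant is killed, "strong pseudo-tested" when none is killed, and "weak pseudo-tested" when only some are. Under that assumption the classification is a small function (names are hypothetical):

```java
class PseudoTestedSketch {

    enum Category { TESTED, WEAK_PSEUDO_TESTED, STRONG_PSEUDO_TESTED }

    // Classifies a covered method from the outcome of its extreme mutants.
    // Assumed definitions: all killed = tested, none killed = strong
    // pseudo-tested, anything in between = weak pseudo-tested.
    static Category classify(int extremeMutants, int killed) {
        if (extremeMutants <= 0 || killed < 0 || killed > extremeMutants) {
            throw new IllegalArgumentException("inconsistent mutant counts");
        }
        if (killed == extremeMutants) return Category.TESTED;
        if (killed == 0) return Category.STRONG_PSEUDO_TESTED;
        return Category.WEAK_PSEUDO_TESTED;
    }

    public static void main(String[] args) {
        System.out.println(classify(2, 2)); // both extreme mutants killed
        System.out.println(classify(2, 1)); // one of two killed
        System.out.println(classify(2, 0)); // none killed
    }
}
```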
37. Quick journey in spotify/apollo
39. `package com.spotify.apollo.request; class TrackedOngoingRequestImpl extends ForwardingOngoingRequest { public void reply(Response<ByteString> message) { doReply(message); } private boolean doReply(Response<ByteString> message) { final boolean removed = requestTracker.remove(this); if (removed) { super.reply(message); } return removed; } }`
40. `package com.spotify.apollo.request; class TrackedOngoingRequestImpl extends ForwardingOngoingRequest { public void reply(Response<ByteString> message) { doReply(message); } private boolean doReply(Response<ByteString> message) { final boolean removed = requestTracker.remove(this); if (removed) { super.reply(message); } return removed; } }` Extreme mutants: `public void reply(Response<ByteString> message) {}`; `private boolean doReply(Response<ByteString> message) { return true; }`; `private boolean doReply(Response<ByteString> message) { return false; }` Test: `@Test void shouldNotReplyIfNotTracked() { tOR = new TrackedOngoingRequestImpl(o, t); tracker.remove(tOR); tOR.reply(Response.ok()); verifyNoMoreInteractions(ongoingRequest); }`
41. Extreme mutants: `public void reply(Response<ByteString> message) {}`; `private boolean doReply(Response<ByteString> message) { return true; }`; `private boolean doReply(Response<ByteString> message) { return false; }` `package com.spotify.apollo.request; class TrackedOngoingRequestImpl extends ForwardingOngoingRequest { public void reply(Response<ByteString> message) { doReply(message); } private boolean doReply(Response<ByteString> message) { final boolean removed = requestTracker.remove(this); if (removed) { super.reply(message); } return removed; } }` Test: `@Test void shouldNotReplyIfNotTracked() { tOR = new TrackedOngoingRequestImpl(o, t); tracker.remove(tOR); tOR.reply(Response.ok()); verifyNoMoreInteractions(ongoingRequest); }`
42. `package com.spotify.apollo.environment; public class ApolloConfig { public boolean enableMetaApi() { return optionalBoolean(apolloNode, "metaApi").orElse(true); } }`
43. Extreme mutants: `public boolean enableMetaApi() { return true; }` and `public boolean enableMetaApi() { return false; }` `package com.spotify.apollo.environment; public class ApolloConfig { public boolean enableMetaApi() { return optionalBoolean(apolloNode, "metaApi").orElse(true); } }` Covered by 8 test cases, which all expect `enableMetaApi()` to return true.
44. `package com.spotify.apollo.environment; public class ApolloConfig { public boolean enableMetaApi() { return optionalBoolean(apolloNode, "metaApi").orElse(true); } }` Extreme mutants: `public boolean enableMetaApi() { return true; }` and `public boolean enableMetaApi() { return false; }`
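Slide 43 describes a missing-input problem: if every test expects `true`, the `return true` mutant cannot be told apart from the original. The sketch below is a simplified, hypothetical model (the configuration lookup is reduced to an `Optional<Boolean>` argument, and none of these helpers exist in Apollo) showing that a single test with the option explicitly set to `false` is enough to kill that mutant.

```java
import java.util.Optional;
import java.util.function.Function;

class MissingInputSketch {

    // Simplified model of enableMetaApi(): the configured value, defaulting to true.
    static boolean enableMetaApi(Optional<Boolean> configured) {
        return configured.orElse(true);
    }

    // Extreme mutants of the boolean method.
    static boolean alwaysTrue(Optional<Boolean> configured) { return true; }
    static boolean alwaysFalse(Optional<Boolean> configured) { return false; }

    // Suite in the spirit of the slide: every test expects true.
    static boolean defaultOnlyTestPasses(Function<Optional<Boolean>, Boolean> impl) {
        return impl.apply(Optional.empty());
    }

    // Suite with one added input: the option explicitly disabled.
    static boolean withExplicitFalseTestPasses(Function<Optional<Boolean>, Boolean> impl) {
        return impl.apply(Optional.empty()) && !impl.apply(Optional.of(false));
    }

    public static void main(String[] args) {
        System.out.println("return true survives default-only suite: "
            + defaultOnlyTestPasses(MissingInputSketch::alwaysTrue));
        System.out.println("return true survives extended suite:     "
            + withExplicitFalseTestPasses(MissingInputSketch::alwaysTrue));
    }
}
```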
45. Descartes
    - Current
        - Compatible with JUnit5 and latest Pitest
        - Integrates with Maven and Gradle
    - Future: Descartes and CI
        - Incremental Descartes on a commit
        - Descartes for pull requests
46. Conclusion
    - Mutation analysis
        - Automatic generation of mutants
        - Evaluate the test suite
    - Bugs in test suites
        - Oracle
        - Input space coverage
        - Testability
        - Indirectly tested code
47. Feedback welcome!
    - https://github.com/STAMP-project/pitest-descartes
    - https://github.com/hcoles/pitest
    - http://stamp-project.eu/
    - baudry@kth.se