
Mutation testing with Descartes


Talk given at Spotify, May 2018.

Published in: Technology


  1. Mutate and Test your Tests. Benoit Baudry (KTH, baudry@kth.se), Oscar Vera Perez (INRIA), Martin Monperrus (KTH). Spotify, Stockholm, May 2018.
  2. Test Your Tests
     • What do you expect from test cases?
       • Cover requirements
       • Stress the application
       • Prevent regressions
       • Reveal bugs
  4. @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
  5. long fact(int n) {
       if (n == 0) return 1;
       long result = 1;
       for (int i = 2; i <= n; i++) result = result * i;
       return result;
     }
     @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
  6. Coverage
     long fact(int n) {
       if (n == 0) return 1;
       long result = 1;
       for (int i = 2; i <= n; i++) result = result * i;
       return result;
     }
     @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
  7. Coverage
     @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
     @Test void factorialWith0Test() { assertEquals(1, fact(0)); }
     long fact(int n) {
       if (n == 0) return 1;
       long result = 1;
       for (int i = 2; i <= n; i++) result = result * i;
       return result;
     }
  8. @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
     @Test void factorialWith0Test() { assertEquals(1, fact(0)); }
     long fact(int n) {
       if (n == 0) return 1;
       long result = 1;
       for (int i = 2; i <= n; i++) result = result * i;
       return result;
     }
     Are these test cases good at detecting bugs?
  9. @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
     @Test void factorialWith0Test() { assertEquals(1, fact(0)); }
     long fact(int n) {
       if (n == 0) return 1;
       long result = 1;
       for (int i = 2; i <= n; i++) result = result * i;
       return result;
     }
     Are these test cases good at detecting bugs? Let’s mutate our code to see.
  10. Mutation analysis
      • Tests are good if they can detect bugs
      • Principle: generate bugs and test the tests
  11. Mutation analysis
      • Tests are good if they can detect bugs
      • Principle: generate bugs and test the tests
      • Mutation operators = types of bugs
      • Mutant = program with one seeded bug
  12. Mutation analysis
      Input: P, TS, Ops
      Output: score, coverage
      M <- generateMutants(P, Ops)
      forAll (m in M)
        run(TS, m)
        if (one-test-fail)
          then add m to killed
          else add m to alive
      score = size(killed) / size(M)
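
The loop on this slide can be sketched in plain Java. This is only an illustration under stated assumptions: the three mutants are written by hand (a tool such as PIT generates them automatically), a test is modeled as a predicate that returns true when it passes, and every name here (`MutationScoreSketch`, `score`, `WEAK_SUITE`, `STRONG_SUITE`, the mutant method names) is hypothetical rather than part of any tool's API.

```java
import java.util.List;
import java.util.function.IntToLongFunction;
import java.util.function.Predicate;

// Illustrative sketch only: hand-written mutants, tests modeled as predicates.
class MutationScoreSketch {

    // Original program P (the factorial used throughout the talk).
    static long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Mutant: condition negated (n == 0 becomes n != 0).
    static long negatedCondition(int n) {
        if (n != 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Mutant: constant changed (return 1 becomes return 1 + 1).
    static long changedConstant(int n) {
        if (n == 0) return 1 + 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
    }

    // Mutant: loop boundary changed (i <= n becomes i < n).
    static long changedBoundary(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i < n; i++) result = result * i;
        return result;
    }

    static final List<IntToLongFunction> MUTANTS = List.of(
            MutationScoreSketch::negatedCondition,
            MutationScoreSketch::changedConstant,
            MutationScoreSketch::changedBoundary);

    // A test takes an implementation and returns true when it passes.
    static final List<Predicate<IntToLongFunction>> WEAK_SUITE = List.of(
            f -> 5 < f.applyAsLong(5),
            f -> f.applyAsLong(0) == 1);

    static final List<Predicate<IntToLongFunction>> STRONG_SUITE = List.of(
            f -> f.applyAsLong(5) == 120,
            f -> f.applyAsLong(0) == 1,
            f -> f.applyAsLong(1) == 1);

    // A mutant is killed when at least one test fails on it;
    // the score is the fraction of killed mutants.
    static double score(List<IntToLongFunction> mutants,
                        List<Predicate<IntToLongFunction>> suite) {
        int killed = 0;
        for (IntToLongFunction m : mutants) {
            boolean oneTestFails = suite.stream().anyMatch(t -> !t.test(m));
            if (oneTestFails) killed++;
        }
        return (double) killed / mutants.size();
    }

    public static void main(String[] args) {
        System.out.println("weak suite:   " + score(MUTANTS, WEAK_SUITE));
        System.out.println("strong suite: " + score(MUTANTS, STRONG_SUITE));
    }
}
```

With these three hand-picked mutants, the weak suite kills two of three (the boundary mutant survives because 24 is still greater than 5), while the suite with exact expected values kills all three.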
  13. PITest mutation operators
      • Conditions
      • Constants
      • Return
      • Delete method calls
      • Constructor call
  14. long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
      }
  15. long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
      }
      Mutations annotated on the code: n != 0, return 1+1, <, --, !(i<=n), result/i, result+1
  17. long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
      }
      @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
      Mutations annotated on the code: n != 0, return 1+1, <, --, !(i<=n), result/i, result+1
  18. long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
      }
      @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
      @Test void factorialWith0Test() { assertEquals(1, fact(0)); }
      Mutations annotated on the code: n != 0, return 1+1, <, --, !(i<=n), result/i, result+1
  19. Mutation reveals bugs in the test suite
      long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
      }
      @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
      @Test void factorialWith0Test() { assertEquals(1, fact(0)); }
      Bugs in the test suite:
      • Weak oracle
      • Missing input
      Mutations annotated on the code: n != 0, return 1+1, <, --, !(i<=n), result/i, result+1
  20. long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
      }
      @Test void factorialWith5Test() { long obs = fact(5); assertEquals(120, obs); }
      @Test void factorialWith0Test() { assertEquals(1, fact(0)); }
      @Test void factorialWith1Test() { assertEquals(1, fact(1)); }
      Mutations annotated on the code: n != 0, return 1+1, <, --, !(i<=n), result/i, result+1
  21. Project                       #Mutants  Time (hh:mm:ss)  Score (%)
      Amazon Web Services SDK       2141690   04:25:35         76.28
      XWiki Rendering Engine        112609    01:59:55         50.89
      Apache Commons Math           104786    03:22:18         83.81
      JFreeChart                    89592     00:41:28         58.04
      Apache PdfBox                 79763     06:20:25         58.89
      Java Git                      78316     16:02:00         73.86
      SCIFIO                        62768     03:12:11         45.92
      Joda-Time                     31233     00:16:32         81.65
      Apache Commons Lang           30361     00:21:02         86.17
      Apache Commons Collections    20394     00:05:41         85.94
      Urban Airship Client Library  17345     00:11:31         82.26
      SAT4J                         17067     11:11:04         68.58
      ImageJ Common                 15592     00:29:09         54.77
      jsoup                         14054     00:12:49         78.34
      Jaxen XPath Engine            12210     00:24:40         67.13
      Apache Commons Codec          9233      00:07:57         87.82
      Apache Commons IO             8809      00:12:48         84.73
      Google Gson                   7353      00:05:34         81.76
      AuthZForce PDP Core           7296      01:23:50         88.18
      Apache Commons CLI            2560      00:01:26         88.71
      JOpt Simple                   2271      00:01:36         93.52
  22. Descartes
      • Mutation operators: extreme mutation
      • Active, open source development
      • Pitest plugin
      • Compared to PIT
        • Fewer mutants
        • Different type of feedback
        • Same framework
  23. Descartes - example
      long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
      }
  24. Descartes - example
      long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
      }
      long fact(int n) { return 0; }
      long fact(int n) { return 1; }
      PIT: 7 mutants
      Descartes: 2 mutants
  25. Scalability of extreme mutation
      public static boolean isValidXmlChar(int ch) {
        return (ch == 0x9) || (ch == 0xA) || (ch == 0xD)
            || (ch >= 0x20 && ch <= 0xD7FF)
            || (ch >= 0xE000 && ch <= 0xFFFD)
            || (ch >= 0x10000 && ch <= 0x10FFFF);
      }
      PIT: 45 mutants
      Descartes: 2 mutants
  26. Project                       Mutants PIT  Mutants Descartes  Time PIT  Time Descartes
      Amazon Web Services SDK       2141690      161758             04:25:35  01:27:30
      XWiki Rendering Engine        112609       5534               01:59:55  00:10:50
      Apache Commons Math           104786       7150               03:22:18  00:08:30
      JFreeChart                    89592        7210               00:41:28  00:05:26
      Apache PdfBox                 79763        7559               06:20:25  00:42:11
      Java Git                      78316        7152               16:02:00  00:56:07
      SCIFIO                        62768        3627               03:12:11  00:15:26
      Joda-Time                     31233        4525               00:16:32  00:04:13
      Apache Commons Lang           30361        3872               00:21:02  00:02:18
      Apache Commons Collections    20394        3558               00:05:41  00:01:48
      Urban Airship Client Library  17345        3082               00:11:31  00:09:38
      SAT4J                         17067        2296               11:11:04  00:56:42
      ImageJ Common                 15592        1947               00:29:09  00:04:08
      jsoup                         14054        1566               00:12:49  00:02:45
      Jaxen XPath Engine            12210        1252               00:24:40  00:01:34
      Apache Commons Codec          9233         979                00:07:57  00:02:02
      Apache Commons IO             8809         1164               00:12:48  00:02:18
      Google Gson                   7353         848                00:05:34  00:01:11
      AuthZForce PDP Core           7296         626                01:23:50  00:08:45
      Apache Commons CLI            2560         271                00:01:26  00:00:09
      JOpt Simple                   2271         412                00:01:36  00:00:25
      (Times are in hh:mm:ss.)
  27. Coarser grain than PIT
      long fact(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i <= n; i++) result = result * i;
        return result;
      }
      long fact(int n) { return 0; }
      long fact(int n) { return 1; }
      @Test void factorialWith5Test() { long obs = fact(5); assertTrue(5 < obs); }
      @Test void factorialWith0Test() { assertEquals(1, fact(0)); }
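
One way to read this slide is that the two weak tests already kill both extreme mutants, even though a finer-grained mutant still slips through. The sketch below checks that reading in plain Java. It is an illustration only: the tests are modeled as a boolean check instead of JUnit methods, and all names (`ExtremeMutationSketch`, `slideSuitePasses`, `killed`, `boundaryMutant`) are hypothetical.

```java
import java.util.function.IntToLongFunction;

// Illustrative sketch: extreme mutants versus one fine-grained mutant of fact.
class ExtremeMutationSketch {

    // Extreme mutants: the whole method body is replaced by a constant return.
    static final IntToLongFunction RETURN_0 = n -> 0L;
    static final IntToLongFunction RETURN_1 = n -> 1L;

    // Fine-grained mutant: i <= n becomes i < n.
    static long boundaryMutant(int n) {
        if (n == 0) return 1;
        long result = 1;
        for (int i = 2; i < n; i++) result = result * i;
        return result;
    }

    // The two tests from the slide, modeled as one check: true when both pass.
    static boolean slideSuitePasses(IntToLongFunction f) {
        boolean factorialWith5 = 5 < f.applyAsLong(5);
        boolean factorialWith0 = f.applyAsLong(0) == 1;
        return factorialWith5 && factorialWith0;
    }

    // A mutant is killed when the suite no longer passes on it.
    static boolean killed(IntToLongFunction mutant) {
        return !slideSuitePasses(mutant);
    }

    public static void main(String[] args) {
        System.out.println("return 0 killed:        " + killed(RETURN_0));
        System.out.println("return 1 killed:        " + killed(RETURN_1));
        System.out.println("boundary mutant killed: "
                + killed(ExtremeMutationSketch::boundaryMutant));
    }
}
```

Both extreme mutants are killed, while the boundary mutant survives: the coarser analysis reports `fact` as tested and does not expose the weak oracle that the finer-grained mutant reveals.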
  28. @Test void typical() throws NoSuchAlgorithmException {
        SdkTLSSocketFactory f = new SdkTLSSocketF(SSLContext.getDefault(), null);
        ...
        f.prepareSocket(new TestSSLSocket() {
          ...
          @Override public void setEnabledProtocols(String[] protocols) {
            assertTrue(Arrays.equals(protocols, expected));
          }
        });
      }
      protected final void prepareSocket(final SSLSocket socket) { }
  30. @Test void typical() throws NoSuchAlgorithmException {
        SdkTLSSocketFactory f = new SdkTLSSocketF(SSLContext.getDefault(), null);
        ...
        f.prepareSocket(new TestSSLSocket() {
          ...
          @Override public void setEnabledProtocols(String[] protocols) {
            assertTrue(Arrays.equals(protocols, expected));
          }
        });
      }
      protected final void prepareSocket(final SSLSocket socket) { }
      Missing oracle
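
The "missing oracle" pattern can be reproduced in a few lines. In the sketch below, the only assertion lives inside a callback, so a method whose body has been emptied never triggers it and the test still passes. Everything here is a hypothetical stand-in, not the AWS SDK code: the `Socket` interface, the `EXPECTED` value, and both test helpers are assumptions made for the illustration.

```java
import java.util.Arrays;
import java.util.function.Consumer;

// Illustrative sketch: an assertion inside a callback checks nothing
// if the callback is never invoked.
class MissingOracleSketch {

    // Hypothetical stand-in for the socket type used on the slide.
    interface Socket {
        void setEnabledProtocols(String[] protocols);
    }

    // Hypothetical expected value; the slide does not show it.
    static final String[] EXPECTED = {"TLSv1.2"};

    // Stand-in for an intact prepareSocket: it configures the socket.
    static void prepareSocket(Socket socket) {
        socket.setEnabledProtocols(EXPECTED.clone());
    }

    // Extreme mutant of a void method: the body is emptied.
    static void prepareSocketEmptied(Socket socket) {
    }

    // Test in the style of the slide: the only check lives inside the callback.
    static boolean callbackOnlyTestPasses(Consumer<Socket> prepare) {
        boolean[] assertionFailed = {false};
        prepare.accept(protocols -> {
            if (!Arrays.equals(protocols, EXPECTED)) assertionFailed[0] = true;
        });
        return !assertionFailed[0];
    }

    // Strengthened test: additionally requires that the callback was invoked.
    static boolean strengthenedTestPasses(Consumer<Socket> prepare) {
        boolean[] called = {false};
        boolean[] assertionFailed = {false};
        prepare.accept(protocols -> {
            called[0] = true;
            if (!Arrays.equals(protocols, EXPECTED)) assertionFailed[0] = true;
        });
        return called[0] && !assertionFailed[0];
    }

    public static void main(String[] args) {
        System.out.println("callback-only test, emptied body: "
                + callbackOnlyTestPasses(MissingOracleSketch::prepareSocketEmptied));
        System.out.println("strengthened test, emptied body:  "
                + strengthenedTestPasses(MissingOracleSketch::prepareSocketEmptied));
    }
}
```

Recording that the callback actually ran is one possible fix; verifying the socket state after the call would be another.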
  31. boolean equals(Object other) {
        return other instanceof AClass && ((AClass) other).aField == aField;
      }
      boolean equals(Object other) { return true; }
      boolean equals(Object other) { return false; }
  32. boolean equals(Object other) {
        return other instanceof AClass && ((AClass) other).aField == aField;
      }
      boolean equals(Object other) { return true; }
      boolean equals(Object other) { return false; }
      test() {
        AClass a = new AClass(3);
        AClass b = new AClass(3);
        AClass c = new AClass(4);
        assertEquals(a, b);
        assertFalse(a == c);
      }
  34. Bug in test
      boolean equals(Object other) {
        return other instanceof AClass && ((AClass) other).aField == aField;
      }
      boolean equals(Object other) { return true; }
      boolean equals(Object other) { return false; }
      test() {
        AClass a = new AClass(3);
        AClass b = new AClass(3);
        AClass c = new AClass(4);
        assertEquals(a, b);
        assertFalse(a == c);
      }
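
The bug in this test appears to be that `a == c` compares references and never calls `equals`, so the `return true` mutant goes unnoticed. The following sketch makes that concrete under stated assumptions: the three classes stand in for the original `AClass` and its two extreme mutants, the test is modeled as a boolean check instead of JUnit assertions, and all names are hypothetical.

```java
import java.util.function.IntFunction;

// Illustrative sketch: a reference comparison in a test does not exercise equals.
class EqualsTestSketch {

    // Stand-in for the original class from the slide.
    static final class Original {
        final int aField;
        Original(int aField) { this.aField = aField; }
        @Override public boolean equals(Object other) {
            return other instanceof Original && ((Original) other).aField == aField;
        }
        @Override public int hashCode() { return aField; }
    }

    // Extreme mutant: equals always returns true.
    static final class AlwaysTrue {
        AlwaysTrue(int ignored) { }
        @Override public boolean equals(Object other) { return true; }
        @Override public int hashCode() { return 0; }
    }

    // Extreme mutant: equals always returns false.
    static final class AlwaysFalse {
        AlwaysFalse(int ignored) { }
        @Override public boolean equals(Object other) { return false; }
        @Override public int hashCode() { return 0; }
    }

    // The test from the slide: assertEquals(a, b); assertFalse(a == c);
    static boolean slideTestPasses(IntFunction<Object> make) {
        Object a = make.apply(3);
        Object b = make.apply(3);
        Object c = make.apply(4);
        boolean equalsHolds = a.equals(b);   // what assertEquals(a, b) relies on
        boolean notSame = !(a == c);         // reference comparison, equals is not called
        return equalsHolds && notSame;
    }

    // Possible fix: compare a and c through equals as well.
    static boolean fixedTestPasses(IntFunction<Object> make) {
        Object a = make.apply(3);
        Object b = make.apply(3);
        Object c = make.apply(4);
        return a.equals(b) && !a.equals(c);
    }

    public static void main(String[] args) {
        System.out.println("slide test, always-true mutant passes: "
                + slideTestPasses(AlwaysTrue::new));
        System.out.println("fixed test, always-true mutant passes: "
                + fixedTestPasses(AlwaysTrue::new));
    }
}
```

In a JUnit test, the equivalent fix would be to replace `assertFalse(a == c)` with an assertion that goes through `equals`, for example `assertNotEquals(a, c)`.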
  35. [Chart: number of mutants on a logarithmic axis (1 to 10,000,000), PIT vs. Descartes]
  36. [Stacked bar chart, 0% to 100%: shares of Tested, Weak pseudo-tested, and Strong pseudo-tested; individual bar labels are not recoverable]
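
The three categories in this chart's legend (Tested, Weak pseudo-tested, Strong pseudo-tested) can be sketched as a simple classification over the outcomes of a method's extreme mutants. The mapping below is an assumption, not something the slides state: a method is taken to be tested when all of its extreme mutants are killed, strongly pseudo-tested when none are, and weakly pseudo-tested when only some are. The class and method names are hypothetical.

```java
import java.util.List;

// Illustrative sketch; the category definitions are an assumption (see lead-in).
class PseudoTestedSketch {

    enum Category { TESTED, WEAK_PSEUDO_TESTED, STRONG_PSEUDO_TESTED }

    // killed.get(i) is true when the i-th extreme mutant of the method was killed.
    static Category classify(List<Boolean> killed) {
        if (killed.isEmpty()) {
            throw new IllegalArgumentException("method has no extreme mutants");
        }
        long killedCount = killed.stream().filter(k -> k).count();
        if (killedCount == killed.size()) return Category.TESTED;
        if (killedCount == 0) return Category.STRONG_PSEUDO_TESTED;
        return Category.WEAK_PSEUDO_TESTED;
    }

    public static void main(String[] args) {
        System.out.println(classify(List.of(true, true)));
        System.out.println(classify(List.of(true, false)));
        System.out.println(classify(List.of(false, false)));
    }
}
```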
  37. Quick journey in spotify/apollo
  38. [Stacked bar chart, 0% to 100%; bar labels and legend are not recoverable]
  39. package com.spotify.apollo.request;
      class TrackedOngoingRequestImpl extends ForwardingOngoingRequest {
        public void reply(Response<ByteString> message) { doReply(message); }
        private boolean doReply(Response<ByteString> msg) {
          final boolean removed = requestTracker.remove(this);
          if (removed) { super.reply(msg); }
          return removed;
        }
      }
  40. package com.spotify.apollo.request;
      class TrackedOngoingRequestImpl extends ForwardingOngoingRequest {
        public void reply(Response<ByteString> message) { doReply(message); }
        private boolean doReply(Response<ByteString> msg) {
          final boolean removed = requestTracker.remove(this);
          if (removed) { super.reply(msg); }
          return removed;
        }
      }
      public void reply(Response<ByteString> message) {}
      private boolean doReply(Response<ByteString> msg) { return true; }
      private boolean doReply(Response<ByteString> msg) { return false; }
      @Test void shouldNotReplyIfNotTracked() {
        tOR = new TrackedOngoingRequestImpl(o, t);
        tracker.remove(tOR);
        tOR.reply(Response.ok());
        verifyNoMoreInteractions(ongoingRequest);
      }
  41. public void reply(Response<ByteString> message) {}
      private boolean doReply(Response<ByteString> msg) { return true; }
      private boolean doReply(Response<ByteString> msg) { return false; }
      package com.spotify.apollo.request;
      class TrackedOngoingRequestImpl extends ForwardingOngoingRequest {
        public void reply(Response<ByteString> message) { doReply(message); }
        private boolean doReply(Response<ByteString> msg) {
          final boolean removed = requestTracker.remove(this);
          if (removed) { super.reply(msg); }
          return removed;
        }
      }
      @Test void shouldNotReplyIfNotTracked() {
        tOR = new TrackedOngoingRequestImpl(o, t);
        tracker.remove(tOR);
        tOR.reply(Response.ok());
        verifyNoMoreInteractions(ongoingRequest);
      }
  42. package com.spotify.apollo.environment;
      public class ApolloConfig {
        public boolean enableMetaApi() {
          return optionalBoolean(apolloNode, "metaApi").orElse(true);
        }
      }
  43. public boolean enableMetaApi() { return true; }
      public boolean enableMetaApi() { return false; }
      package com.spotify.apollo.environment;
      public class ApolloConfig {
        public boolean enableMetaApi() {
          return optionalBoolean(apolloNode, "metaApi").orElse(true);
        }
      }
      Covered by 8 test cases, which all expect enableMetaApi() to return true
  44. package com.spotify.apollo.environment;
      public class ApolloConfig {
        public boolean enableMetaApi() {
          return optionalBoolean(apolloNode, "metaApi").orElse(true);
        }
      }
      public boolean enableMetaApi() { return true; }
      public boolean enableMetaApi() { return false; }
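
If every covering test expects `enableMetaApi()` to return true, the `return true` mutant cannot be distinguished from the original. The sketch below illustrates that, along with one test that would tell them apart. It is a simplified stand-in, not the Apollo code: the configuration lookup is replaced by an `Optional<Boolean>` parameter, and all class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.function.Function;

// Illustrative sketch: tests that only exercise the default value
// cannot kill the "return true" extreme mutant.
class DefaultValueSketch {

    // Stand-in for optionalBoolean(apolloNode, "metaApi").orElse(true).
    static boolean enableMetaApi(Optional<Boolean> configured) {
        return configured.orElse(true);
    }

    // Extreme mutants of the method.
    static boolean mutantReturnTrue(Optional<Boolean> configured) { return true; }
    static boolean mutantReturnFalse(Optional<Boolean> configured) { return false; }

    // Test in the style described on the slide: expects true when nothing is configured.
    static boolean defaultOnlyTestPasses(Function<Optional<Boolean>, Boolean> impl) {
        return impl.apply(Optional.empty());
    }

    // Additional test: an explicit "false" in the configuration must be honored.
    static boolean explicitFalseTestPasses(Function<Optional<Boolean>, Boolean> impl) {
        return !impl.apply(Optional.of(false));
    }

    public static void main(String[] args) {
        System.out.println("default-only test, return-true mutant passes:   "
                + defaultOnlyTestPasses(DefaultValueSketch::mutantReturnTrue));
        System.out.println("explicit-false test, return-true mutant passes: "
                + explicitFalseTestPasses(DefaultValueSketch::mutantReturnTrue));
    }
}
```

The default-only test kills `return false` but not `return true`; adding a case where the option is explicitly disabled kills the remaining mutant.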
  45. Descartes
      • Current
        • Compatible with JUnit5 and latest Pitest
        • Integrates with Maven and Gradle
      • Future: Descartes and CI
        • Incremental Descartes on a commit
        • Descartes for pull requests
  46. Conclusion
      • Mutation analysis
        • Automatic generation of mutants
        • Evaluate the test suite
      • Bugs in test suites
        • Oracle
        • Input space coverage
        • Testability
        • Indirectly tested code
  47. Feedback welcome!
      • https://github.com/STAMP-project/pitest-descartes
      • https://github.com/hcoles/pitest
      • http://stamp-project.eu/
      baudry@kth.se
