Research overview Oct. 2018

Search-based crash reproduction, research overview, Namur, October 2018

Published in: Science

  1. 1. Oops!... I Did It Again: Search-based crash reproduction made easy. Xavier Devroey <x.d.m.devroey@tudelft.nl>
  2. 2. Search-Based Optimisation • Optimisation problem • Minimise and/or maximise • Time • User satisfaction • Impact • … • Exhaustive search is impractical • Metaheuristic • Higher-level procedure • No guarantee of a globally optimal solution
  3. 3. [Figure: a search space populated with candidate solutions S]
  4. 4. [Figure: the same search space, with one candidate x highlighted]
  5. 5. [Figure: the candidate x and its fitness value f(x)]
  6. 6-18. Hill Climbing [Animation over the same search space: start from a random solution, generate its neighbours, pick the best one, and repeat]
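The hill-climbing steps named on these slides (random solution, generate neighbours, pick best) can be sketched as follows. This is a minimal illustration, not code from the talk: the integer search space, the "x - 1 / x + 1" neighbourhood, and the class name `HillClimbingSketch` are all assumptions, and the start point is passed in rather than drawn at random so that the run is repeatable.

```java
import java.util.function.IntUnaryOperator;

/** Minimal hill-climbing sketch over an integer search space (illustrative only). */
final class HillClimbingSketch {

    /**
     * Minimises f over [lo, hi], starting from the given solution.
     * Assumption: the neighbours of x are x - 1 and x + 1.
     */
    static int climb(IntUnaryOperator f, int start, int lo, int hi) {
        int current = start;
        while (true) {
            int best = current;
            // Generate the neighbours and keep the best one seen so far.
            for (int candidate : new int[] {current - 1, current + 1}) {
                if (candidate >= lo && candidate <= hi
                        && f.applyAsInt(candidate) < f.applyAsInt(best)) {
                    best = candidate;
                }
            }
            if (best == current) {
                return current; // No better neighbour: a local optimum.
            }
            current = best;
        }
    }

    public static void main(String[] args) {
        // Single basin: the climb reaches the global minimum at x = 7.
        System.out.println(climb(x -> (x - 7) * (x - 7), 0, -100, 100)); // prints 7

        // Two basins: starting at index 0, the climb stops at index 1,
        // although the global minimum sits at index 4.
        int[] values = {5, 3, 4, 2, 0};
        System.out.println(climb(i -> values[i], 0, 0, 4)); // prints 1
    }
}
```

The second call shows why a metaheuristic gives no guarantee of a globally optimal solution: the search stops as soon as no neighbour improves on the current candidate.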
  19. 19. Search-Based Software Engineering [Figure: a software engineering problem (test cases T1, T2, …, Tn; lines of code L1, L2, …, Ln) recast as a search-based optimisation problem] Mark Harman, Phil McMinn, Jerffeson Teixeira Souza, and Shin Yoo. Search-Based Software Engineering: Techniques, Taxonomy, Tutorial. Empirical Software Engineering and Verification, Lecture Notes in Computer Science, vol. 7007, pp. 1–59, 2011.
  20. 20. Search-Based Software Testing http://www.evosuite.org
  21. 21. Software Testing Amplification
  22. 22. DevOps
  23. 23. Crash!?! java.lang.ClassCastException: […] at org…..SolrEntityReferenceResolver.getWikiReference(....java:93) at org…..SolrEntityReferenceResolver.getEntityReference(….java:70) at org…..SolrEntityReferenceResolver.resolve(….java:63) at org…..SolrDocumentReferenceResolver.resolve(….java:48) at …
  24. 24. Issue XWIKI-13031
  25. 25. Java Stack Trace (Issue XWIKI-13031) java.lang.ClassCastException: […] at org…..SolrEntityReferenceResolver.getWikiReference(....java:93) at org…..SolrEntityReferenceResolver.getEntityReference(….java:70) at org…..SolrEntityReferenceResolver.resolve(….java:63) at org…..SolrDocumentReferenceResolver.resolve(….java:48) at … [Annotations on the trace: Exception, Frames, Target]
  26. 26. [Figure: a random initial test suite of method-call sequences over a(), b(), c(), d(), e() is refined by evolutionary search, guided by the stack trace (Exception: at x(…) at y(…) at e(…)), into a crash-reproducing test case that throws the same exception] Soltani, M., Panichella, A. and van Deursen, A. 2018. Search-Based Crash Reproduction and Its Impact on Debugging. IEEE Transactions on Software Engineering (2018).
  27. 27. Crash-reproducing Test Case
 public void test0() throws Throwable {
   …
   SolrEntityReferenceResolver solrEntityReferenceResolver0 = new …();
   EntityReferenceResolver entityReferenceResolver0 = … mock(…);
   solrDocument0.put("wiki", (Object) entityType0);
   Injector.inject(solrEntityReferenceResolver0, …);
   Injector.validateBean(solrEntityReferenceResolver0, …);
   …
   // Undeclared exception!
   solrEntityReferenceResolver0.resolve(solrDocument0, entityType0, objectArray0);
 }
 Stack trace: java.lang.ClassCastException: […] at org…..SolrEntityReferenceResolver.getWikiReference(....java:93) at org…..SolrEntityReferenceResolver.getEntityReference(….java:70) at org…..SolrEntityReferenceResolver.resolve(….java:63)
  28. 28. Fitness Function • Line coverage • How far are we from the line where the exception is thrown? • Exception coverage • Is the exception thrown? • Stack trace similarity • How similar is the stack trace to the original (given) stack trace? • Weighted Sum Scalarization (to minimize): $f(t) = \begin{cases} 3 \times d_s(t) + 2 \times \max(d_{except}) + \max(d_{trace}) & \text{if the line is not reached} \\ 3 \times \min(d_s) + 2 \times d_{except}(t) + \max(d_{trace}) & \text{if the line is reached} \\ 3 \times \min(d_s) + 2 \times \min(d_{except}) + d_{trace}(t) & \text{if the exception is thrown} \end{cases}$
  29. 29. Weighted Sum Scalarization (f(t) as on slide 28) [Figure: control flow graph of resolve, with nodes S1 to S5] Target trace: java.lang.ClassCastException: […] at org…...getWikiReference(....java:93) at org…...getEntityReference(….java:70) at org…...resolve(….java:S5)
  30. 30. Weighted Sum Scalarization [Figure: same control flow graph and target trace, next animation step]
  31. 31. Weighted Sum Scalarization [Figure: same control flow graph] Trace shown next to the target: NullPointerException: […] at … Target trace: java.lang.ClassCastException: […] at org…...getWikiReference(....java:93) at org…...getEntityReference(….java:70) at org…...resolve(….java:S5)
  32. 32. Weighted Sum Scalarization [Figure: same control flow graph] Trace shown next to the target: java.lang.ClassCastException: […] at org…...getWikiReference(....java:90) at org…...getWikiReference(….java:93) at org…...resolve(….java:S5) Target trace: java.lang.ClassCastException: […] at org…...getWikiReference(....java:93) at org…...getEntityReference(….java:70) at org…...resolve(….java:S5)
  33. 33. Weighted Sum Scalarization [Figure: same control flow graph] Trace shown next to the target: java.lang.ClassCastException: […] at org…...getWikiReference(....java:93) at org…...getEntityReference(….java:70) at org…...resolve(….java:S5) Target trace: java.lang.ClassCastException: […] at org…...getWikiReference(....java:93) at org…...getEntityReference(….java:70) at org…...resolve(….java:S5)
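The three-case weighted sum from the slides above can be written down as a small function. This is a sketch under one loud assumption that the slides do not state: every distance (ds, dexcept, dtrace) is normalised to the range [0, 1], so that max(·) = 1 and min(·) = 0. The class and parameter names are hypothetical, and how the distances themselves are computed is left out.

```java
/** Sketch of the weighted-sum fitness from the slides (to be minimised). */
final class WeightedSumFitness {

    // Assumption: all three distances are normalised to [0, 1].
    static final double MAX = 1.0;
    static final double MIN = 0.0;

    /**
     * @param lineReached     whether the test reached the target line
     * @param exceptionThrown whether the target exception was thrown
     * @param ds              line distance of the test, ds(t)
     * @param dExcept         exception distance of the test, dexcept(t)
     * @param dTrace          stack trace distance of the test, dtrace(t)
     */
    static double fitness(boolean lineReached, boolean exceptionThrown,
                          double ds, double dExcept, double dTrace) {
        if (!lineReached) {
            // Line not reached: later terms are fixed at their maximum.
            return 3 * ds + 2 * MAX + MAX;
        }
        if (!exceptionThrown) {
            // Line reached: the line term drops to its minimum.
            return 3 * MIN + 2 * dExcept + MAX;
        }
        // Exception thrown: only the trace distance remains.
        return 3 * MIN + 2 * MIN + dTrace;
    }

    public static void main(String[] args) {
        System.out.println(fitness(false, false, 0.5, 1.0, 1.0)); // prints 4.5
        System.out.println(fitness(true, false, 0.0, 1.0, 1.0));  // prints 3.0
        System.out.println(fitness(true, true, 0.0, 0.0, 0.25));  // prints 0.25
    }
}
```

Under that normalisation the three cases never rank out of order: a test that reaches the line scores at most 3, a test that throws the exception scores at most 1, and a value of 0 means the stack trace matches.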
  34. 34. Genetic Algorithm [Flowchart: Initialize population → Evaluate fitness → stop if fitness == 0 or budget exhausted; otherwise Next generation: Selection → Crossover → Mutation → Reinsertion → Evaluate fitness]
  35. 35. Genetic Algorithm [Same flowchart, with illustrations of crossover (parents recombined into children) and mutation]
  36. 36. Guided Genetic Algorithm [Same flowchart as slide 34] • Guided initialization • Initial population: N test cases • Random method calls • Guarantees that a target method call is inserted in each test at least once • Direct for public and protected methods • Indirect for private methods
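Guided initialization, as described in the bullets above, can be sketched as follows. Tests are modelled as plain lists of method-call names, which is a simplification for illustration; the class name, parameters and insertion position are hypothetical, and the direct/indirect distinction for private methods is not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Sketch of guided initialization: every test calls the target at least once. */
final class GuidedInitializationSketch {

    /**
     * Builds n tests, each made of 1 to maxLength random method calls
     * (maxLength must be at least 1). If a test does not call the target
     * method, the call is inserted at a random position, so a test may
     * end up with maxLength + 1 calls.
     */
    static List<List<String>> initialPopulation(int n, int maxLength, List<String> methods,
                                                String target, Random random) {
        List<List<String>> population = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<String> test = new ArrayList<>();
            int length = 1 + random.nextInt(maxLength);
            for (int j = 0; j < length; j++) {
                test.add(methods.get(random.nextInt(methods.size())));
            }
            if (!test.contains(target)) {
                // Guarantee: the target call appears in each test at least once.
                test.add(random.nextInt(test.size() + 1), target);
            }
            population.add(test);
        }
        return population;
    }

    public static void main(String[] args) {
        List<String> methods = List.of("a()", "b()", "c()", "d()");
        for (List<String> test : initialPopulation(3, 4, methods, "e()", new Random(1))) {
            System.out.println(test);
        }
    }
}
```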
  37. 37. Guided Genetic Algorithm [Same flowchart as slide 34] • Selection • Fittest tests according to the fitness function • Guided crossover • Single-point crossover • Checks that the call to the target method is preserved • Guided mutation • Adding, changing and removing statements • Checks that the call to the target method is preserved
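The guided crossover described above (single-point crossover plus a check that the target call survives) might look like the sketch below. Tests are again modelled as lists of call names. What happens when a child loses the target call is not stated on the slide; falling back to a copy of the parent is an assumption made here, as are the class and method names.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of single-point crossover that preserves the target method call. */
final class GuidedCrossoverSketch {

    /**
     * Recombines two parents at position cut (0 <= cut).
     * Assumption: a child without the target call is replaced by a copy
     * of its first parent.
     */
    static List<List<String>> crossover(List<String> p1, List<String> p2,
                                        int cut, String target) {
        List<String> c1 = splice(p1, p2, cut);
        List<String> c2 = splice(p2, p1, cut);
        List<List<String>> children = new ArrayList<>();
        children.add(c1.contains(target) ? c1 : new ArrayList<>(p1));
        children.add(c2.contains(target) ? c2 : new ArrayList<>(p2));
        return children;
    }

    /** Head of one parent up to the cut point, followed by the tail of the other. */
    private static List<String> splice(List<String> head, List<String> tail, int cut) {
        List<String> child = new ArrayList<>(head.subList(0, Math.min(cut, head.size())));
        if (cut < tail.size()) {
            child.addAll(tail.subList(cut, tail.size()));
        }
        return child;
    }

    public static void main(String[] args) {
        List<String> p1 = List.of("a()", "b()", "e()");
        List<String> p2 = List.of("e()", "c()", "d()");
        // The first child [a(), c(), d()] loses e(), so the parent is kept.
        System.out.println(crossover(p1, p2, 1, "e()")); // prints [[a(), b(), e()], [e(), b(), e()]]
    }
}
```

A guided mutation could apply the same check after adding, changing or removing a statement.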
  38. 38. JCrashPack • 200 crashes from various open source projects • XWiki (STAMP partner) • From XWiki issue tracking system: 51 crashes • Defects4J applications • State-of-the-art fault localization benchmark • 73 crashes (with fixes) • Elasticsearch • Based on popularity • From Elasticsearch issue tracking system: 76 crashes • Filtered, verified, cleaned up, right jar versions, …
  39. 39. [Figure: average number of frames (logarithmic scale, 1 to 100) per outcome (crashed, failed, line reached, ex. thrown, reproduced) for Defects4J, XWiki and Elasticsearch, broken down by exception type (NPE, IAE, AIOOBE, CCE, SIOOBE, ISE, Oth.)]
  40. 40. 12 Key Challenges • Input data generation • For complex inputs, generic types, etc. • Environmental dependencies • Environment state hard to manage at unit level • Complex code • Long methods, with lots of nested predicates • Abstract classes and methods • Cannot be instantiated; one concrete implementation is picked at random • […]
  41. 41. Rethinking the Fitness Function • Weighted Sum Scalarization: $f(t) = \begin{cases} 3 \times d_s(t) + 2 \times \max(d_{except}) + \max(d_{trace}) & \text{if the line is not reached} \\ 3 \times \min(d_s) + 2 \times d_{except}(t) + \max(d_{trace}) & \text{if the line is reached} \\ 3 \times \min(d_s) + 2 \times \min(d_{except}) + d_{trace}(t) & \text{if the exception is thrown} \end{cases}$ • Simple Sum Scalarization: $f(t) = d_s(t) + d_{except}(t) + d_{trace}(t)$ • Multi-objectivization: $f_1(t) = d_s(t)$, $f_2(t) = d_{except}(t)$, $f_3(t) = d_{trace}(t)$
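To make the difference between the simple sum and multi-objectivization concrete, the sketch below compares candidates either by one summed value or by Pareto dominance over the three objectives. Pareto dominance is the usual comparison in multi-objective search, but the slide does not say which algorithm is used, so treat it as an assumption; the class and method names are hypothetical.

```java
/** Sketch: simple-sum scalarization versus a three-objective comparison. */
final class MultiObjectivizationSketch {

    /** Simple sum scalarization: f(t) = ds(t) + dexcept(t) + dtrace(t). */
    static double simpleSum(double ds, double dExcept, double dTrace) {
        return ds + dExcept + dTrace;
    }

    /**
     * Pareto dominance for minimisation over (f1, f2, f3) = (ds, dexcept, dtrace):
     * a dominates b if it is no worse on every objective and strictly
     * better on at least one.
     */
    static boolean dominates(double[] a, double[] b) {
        boolean strictlyBetter = false;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > b[i]) {
                return false;
            }
            if (a[i] < b[i]) {
                strictlyBetter = true;
            }
        }
        return strictlyBetter;
    }

    public static void main(String[] args) {
        double[] t1 = {0.0, 1.0, 1.0}; // reaches the line, no exception
        double[] t2 = {0.5, 0.0, 1.0}; // hypothetical trade-off candidate
        // Neither dominates the other, so a multi-objective search keeps both,
        // whereas the simple sum ranks t2 (1.5) ahead of t1 (2.0).
        System.out.println(dominates(t1, t2)); // prints false
        System.out.println(dominates(t2, t1)); // prints false
    }
}
```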
  42. 42. [Figure: Weighted Sum Scalarization compared with Simple Sum Scalarization]
  43. 43. Multi-objectivization
  44. 44. Multi-objectivization
  45. 45. Multi-objectivization
  46. 46. Multi-objectivization [Figure labels: multi-objective search solutions; multi-objectivization solution]
  47. 47. Reproduced crashes [Figure comparing Simple Sum, Weighted Sum and Multi-objectivized optimization; values as extracted from the figure: 0,0 222120]
  48. 48. [Figure: same as slide 26, from random initial test suite through evolutionary search to crash-reproducing test case] Isn’t an initial test suite close to actual usage of the classes more likely to lead to a crash reproduction?
  49. 49. Test Seeding • Use existing tests to generate the initial test suite • J. M. Rojas et al., “Seeding strategies in search-based unit test generation,” STVR, 2016. • Applied to crash replication [Figure: a subset of the existing tests is placed in the initial test suite next to randomly generated tests; stack trace: Exception: at x(…) at y(…) at e(…)]
  50. 50. Behavioral Model Seeding • Use a model of method usage to generate objects used in the initial test suite • Relies on Model-based Testing • Select abstract test cases • Concretized into objects [Figure: a model over the calls a(), b(), c(), d(), e() yields a model-driven initial test suite, shown next to the random initial test suite; stack trace: Exception: at x(…) at y(…) at e(…)]
  51. 51. Abstract test case selection [Figure: state machine with states S0 to S5 and SX, and transitions labelled size(), add(Object), iterator(), get(int), remove(int)] Abstract test case: <add(Object), add(Object)> Concrete object:
 int[] t = new int[7];
 t[3] = (-2147483647);
 EuclideanIntegerPoint ep = new […](t);
 LinkedList<[...]> lst = new LinkedList<>();
 lst.add(ep);
 lst.add(ep);
  52. 52. Test MATH-79b, Frame 2. Input stack trace: java.lang.NullPointerException at ...KMeansPlusPlusClusterer.assignPointsToClusters() at ...KMeansPlusPlusClusterer.cluster() … Crash-reproducing test:
 public void testCluster() throws Exception {
   int[] t = new int[7];
   t[3] = (-2147483647);
   EuclideanIntegerPoint ep = new EuclideanIntegerPoint(t);
   LinkedList<[...]> lst = new LinkedList<>();
   lst.add(ep);
   lst.add(ep);
   KMeansPlusPlusClusterer<[...]> kmean = new KMeansPlusPlusClusterer<>(12);
   lst.offerFirst(ep);
   kmean.cluster(lst, 1, (-1357));
 }
  53. 53. Behavioral Model Inference • Inference from sequences of method calls • Coming from • Source code (static analysis) • Test cases (dynamic analysis) • Operations logs (online analysis, future work) • N-gram inference from call sequences such as [b(), a(), e()], [c(), d(), a(), e()], [b(), a(), d(), a(), d(), a(), e()], … [Figure: the inferred model over a(), b(), c(), d(), e()]
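N-gram inference over the call sequences listed above can be sketched with n = 2: the model records, for each call, which calls have been observed directly after it. The choice of n = 2, the `<start>` marker and the class name are assumptions made for illustration; the slide does not give the value of n or the model representation used.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Sketch of 2-gram model inference from observed method-call sequences. */
final class BigramModelSketch {

    static final String START = "<start>"; // hypothetical marker for the initial state

    /** Maps each call (or START) to the set of calls observed directly after it. */
    static Map<String, Set<String>> infer(List<List<String>> sequences) {
        Map<String, Set<String>> transitions = new TreeMap<>();
        for (List<String> sequence : sequences) {
            String previous = START;
            for (String call : sequence) {
                transitions.computeIfAbsent(previous, k -> new TreeSet<>()).add(call);
                previous = call;
            }
        }
        return transitions;
    }

    public static void main(String[] args) {
        // The three call sequences shown on the slide.
        Map<String, Set<String>> model = infer(Arrays.asList(
                Arrays.asList("b()", "a()", "e()"),
                Arrays.asList("c()", "d()", "a()", "e()"),
                Arrays.asList("b()", "a()", "d()", "a()", "d()", "a()", "e()")));
        System.out.println(model.get(START)); // prints [b(), c()]
        System.out.println(model.get("a()")); // prints [d(), e()]
    }
}
```

Walking such a model from the start marker yields abstract test cases, that is, call sequences that have the same local ordering as the observed usage.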
  54. 54. Evaluation on 45 crashes [Figure: number of frames (0 to 75) per outcome (not started, failed, line reached, ex. thrown, reproduced) for each configuration: no s., test s. 0.2, test s. 0.5, test s. 0.8, test s. 1.0, model s. 0.2, model s. 0.5, model s. 0.8, model s. 1.0]
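  55. 55. Execution Time [Figure: number of fitness evaluations (logarithmic scale, 1e+01 to 1e+05) per configuration. Configurations, in plot order: no s., test s. 0.2, test s. 0.5, test s. 0.8, test s. 1.0, model s. 0.2, model s. 0.5, model s. 0.8, model s. 1.0. Values annotated on the plot, in extraction order: 653, 653, 641, 638, 629, 699, 690, 697, 704]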
  56. 56. Influencing Factors • Seeding abstract test cases • <add(Object), add(Object)> • Having dissimilar abstract test cases • Multiple information sources • Static analysis • Dynamic analysis • Prioritising abstract test case selection • Select abstract test cases for classes in the stack trace
  57. 57. https://code.fb.com/developer-tools/sapienz-intelligent-automated-software-testing-at-scale/ https://link.springer.com/chapter/10.1007/978-3-319-99241-9_1
  58. 58. https://www.stamp-project.eu https://github.com/STAMP-project Xavier Devroey <x.d.m.devroey@tudelft.nl>
