Assessing Test Case Prioritization on Real Faults and Mutants

Test Case Prioritization (TCP) is an important component of regression testing, allowing for earlier detection of faults or helping to reduce testing time and cost. While several TCP approaches exist in the research literature, a growing number of studies have evaluated them against synthetic software defects, called mutants. Hence, it is currently unclear to what extent TCP performance on mutants would be representative of the performance achieved on real faults. To answer this fundamental question, we conduct the first empirical study comparing the performance of TCP techniques applied to both real-world and mutation faults. The context of our study includes eight well-studied TCP approaches, 35k+ mutation faults, and 357 real-world faults from five Java systems in the Defects4J dataset. Our results indicate that the relative performance of the studied TCP techniques on mutants may not strongly correlate with performance on real faults, depending upon attributes of the subject programs. This suggests that, in certain contexts, the best performing technique on a set of mutants may not be the best technique in practice when applied to real faults. We also illustrate that these correlations vary for mutants generated by different operators depending on whether chosen operators reflect typical faults of a subject program. This highlights the importance, particularly for TCP, of developing mutation operators tailored for specific program domains.

Published in: Technology

  1. 1. Qi Luo, Kevin Moran, Massimiliano Di Penta, Denys Poshyvanyk. Assessing Test Case Prioritization on Real Faults and Mutants. 34th International Conference on Software Maintenance and Evolution (ICSME’18), Thursday, September 27th, 2018
  2. 2. Continuous Testing
  9. 9. REGRESSION TESTING
      [Diagram: program versions v1.0, v1.2, v2.0, …, vN, with tests t1, t2, t3, t4]
  25. 25. TEST CASE PRIORITIZATION (TCP), v1.2
      [Plot: # Faults Found (0 to 4) against # Tests Executed (0 to 4) for two test orderings]
      First test ordering: 1) t1 2) t2 3) t3 4) t4
      Second test ordering: 1) t3 2) t1 3) t2 4) t4
      APFD: Average Percentage of Faults Detected. The two orderings score APFD = 54% and APFD = 96%.
      • The Red ordering of test cases outperforms the Blue ordering in terms of APFD
      • The main goal of TCP is to prioritize test cases so as to maximize APFD
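
The APFD metric named on this slide can be sketched in a few lines of Python. This is a minimal illustration using the standard definition of APFD, not the authors' tooling. The function name `apfd` and the fault matrix `DETECTS` are hypothetical; the slides do not say which faults each test detects, so the sketch does not reproduce the 54% and 96% figures.

```python
from typing import Dict, List, Set


def apfd(ordering: List[str], detects: Dict[str, Set[str]]) -> float:
    """APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2 * n).

    n is the number of tests, m the number of faults, and TF_i the 1-based
    position of the first test in `ordering` that detects fault i.
    Assumes every fault is detected by at least one test in the ordering.
    """
    n = len(ordering)
    first_position: Dict[str, int] = {}
    for position, test in enumerate(ordering, start=1):
        for fault in detects.get(test, set()):
            # Keep only the earliest position at which each fault is found.
            first_position.setdefault(fault, position)
    m = len(first_position)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one detected fault")
    return 1 - sum(first_position.values()) / (n * m) + 1 / (2 * n)


# Hypothetical fault matrix (an assumption, not taken from the slides).
DETECTS = {"t1": {"f1"}, "t2": {"f2"}, "t3": {"f1", "f2", "f3"}, "t4": {"f4"}}

print(apfd(["t1", "t2", "t3", "t4"], DETECTS))  # original order
print(apfd(["t3", "t1", "t2", "t4"], DETECTS))  # t3 moved to the front
```

With this made-up matrix, moving t3 to the front raises APFD from 0.5 to 0.6875, because three of the four faults are then found by the first test executed.
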
  34. 34. ASSESSING TCP EFFECTIVENESS (IDEAL)
      [Diagram: versions v1.0, v1.2, v2.0, …, vN; tests t1, t2, t3, t4 → TCP Technique → Prioritized Tests → Measured APFD Values]
  41. 41. ASSESSING TCP EFFECTIVENESS (ACTUAL)
      [Diagram: Mutation Framework and version vN; tests t1, t2, t3, t4 → TCP Technique → Prioritized Tests → Measured APFD Values]
  44. 44. How well do TCP techniques perform on real faults? Is the performance of TCP techniques on mutants representative of their performance on real faults? What properties of mutants impact the representativeness of this performance?
  48. 48. RESEARCH QUESTIONS
      • RQ1: TCP performance on real faults?
      • RQ2: Representativeness of mutants for TCP?
      • RQ3: How do fault properties impact TCP performance?
  49. 49. EMPIRICAL STUDY CONTEXT: Defects4J
      Project Name           Number of Real Faults
      JFreeChart             26
      Closure compiler       133
      Apache commons-lang    65
      Apache commons-math    106
      Joda-Time              27
      Total                  357
  50. 50. EMPIRICAL STUDY CONTEXT v1.0 v2.0 vN …
  55. 55. EMPIRICAL STUDY CONTEXT: mutants seeded into vN with the PIT Mutation Framework. Repeat this process 100 times. Every mutant can be detected by at least one test case.
  56. 56. EMPIRICAL STUDY CONTEXT: Defects4J
      Project Name     Number of Real Faults    Mutants
      JFreeChart       26                       2,600
      Closure          133                      13,300
      Commons-lang     65                       6,500
      Commons-math     106                      10,600
      Joda-Time        27                       2,700
      Total            357                      35,700
  60. 60. EMPIRICAL STUDY CONTEXT Created a second set of mutants with subsumed mutants removed
  61. 61. EMPIRICAL STUDY CONTEXT: Defects4J
      Project Name     Number of Bugs    Mutants    Subsuming Mutants
      JFreeChart       26                2,600      1,796
      Closure          133               13,300     9,731
      Commons-lang     65                6,500      2,129
      Commons-math     106               10,600     5,016
      Joda-Time        27                2,700      2,700
      Total            357               35,700     21,372
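
The slides do not say how subsumed mutants were identified. A common test-based (dynamic) definition is that mutant A subsumes mutant B when A is killed by at least one test and every test that kills A also kills B. The sketch below filters a kill matrix under that assumed definition; the function name `subsuming_mutants` and the example data are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List


def subsuming_mutants(kills: Dict[str, Iterable[str]]) -> List[str]:
    """Return one representative per minimal kill set.

    `kills` maps a mutant id to the tests that kill it. A mutant is dropped
    when another mutant's kill set is a strict subset of its own (it is
    subsumed), or when it duplicates a kill set that was already seen.
    Mutants that no test kills are ignored.
    """
    representative: Dict[FrozenSet[str], str] = {}
    for mutant, tests in kills.items():
        kill_set = frozenset(tests)
        if kill_set:
            # The first mutant seen with a given kill set represents the group.
            representative.setdefault(kill_set, mutant)
    distinct = list(representative)
    return [
        representative[kill_set]
        for kill_set in distinct
        # `other < kill_set` is a strict-subset test on frozensets.
        if not any(other < kill_set for other in distinct)
    ]


# Hypothetical kill matrix (an assumption, not data from the study).
KILLS = {
    "m1": {"t1"},
    "m2": {"t1", "t2"},  # subsumed by m1
    "m3": {"t3"},
    "m4": {"t3"},        # same kill set as m3
    "m5": set(),         # never killed
}

print(subsuming_mutants(KILLS))
```

Here only m1 and m3 survive: m2 is killed by every test that kills m1, m4 is indistinguishable from m3, and m5 is not detected by any test.
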
  62. 62. EMPIRICAL STUDY CONTEXT: Studied TCP Techniques
      Type       Tag         Description
      Static     CG-Total    Call graph-based (total strategy)
      Static     CG-Add      Call graph-based (additional strategy)
      Static     Str         String distance-based
      Static     Topic       Topic model-based
      Dynamic    Total       Greedy Total (statement level)
      Dynamic    Add         Greedy Additional (statement level)
      Dynamic    Art         Adaptive Random (statement level)
      Dynamic    Search      Search-based (statement level)
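
To make the difference between the "total" and "additional" greedy strategies concrete, here is a minimal sketch over statement coverage. It follows the textbook description of the two strategies and is not the implementation used in the study. The coverage map `COVERAGE`, the function names, and the tie-breaking by test name are all assumptions.

```python
from typing import Dict, List, Set


def total_strategy(coverage: Dict[str, Set[int]]) -> List[str]:
    """Order tests by how many statements each covers, largest first."""
    return sorted(coverage, key=lambda test: (-len(coverage[test]), test))


def additional_strategy(coverage: Dict[str, Set[int]]) -> List[str]:
    """Repeatedly pick the test that covers the most not-yet-covered statements."""
    remaining = dict(coverage)
    uncovered: Set[int] = set().union(*coverage.values())
    order: List[str] = []
    while remaining:
        best = min(remaining, key=lambda t: (-len(remaining[t] & uncovered), t))
        if not (remaining[best] & uncovered):
            # No remaining test adds coverage: reset and rank the rest afresh.
            reset: Set[int] = set().union(*remaining.values())
            if not reset:
                # The leftover tests cover nothing at all; append them by name.
                order.extend(sorted(remaining))
                break
            uncovered = reset
            continue
        order.append(best)
        uncovered -= remaining.pop(best)
    return order


# Hypothetical statement coverage per test (an assumption for illustration).
COVERAGE = {"t1": {1, 2, 3, 4}, "t2": {1, 2, 3}, "t3": {5, 6}, "t4": {4, 7}}

print(total_strategy(COVERAGE))
print(additional_strategy(COVERAGE))
```

On this example the total strategy keeps t2 in second place because it covers three statements, while the additional strategy defers it to last because t1 already covers everything t2 does.
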
  67. 67. METHODOLOGY RQ1
      • Run tests at the test-method level
      • APFD = rate of fault detection
      • APFDc (cost-cognizant APFD) = fault detection rate & efficiency
      • ANOVA & Tukey HSD tests
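
The contrast between APFD and APFDc can be illustrated with the usual cost-cognizant formula, in which each test has a cost and each fault a severity. The sketch below is an assumption-laden illustration, not the study's code: the function name `apfd_c`, the fault matrix, and the costs (read here as execution times) are hypothetical, and all severities are set to 1.

```python
from typing import Dict, List, Set


def apfd_c(ordering: List[str], detects: Dict[str, Set[str]],
           cost: Dict[str, float], severity: Dict[str, float]) -> float:
    """Cost-cognizant APFD.

    Each fault contributes its severity times the cost of the tests from its
    first detecting test to the end of the ordering, minus half the cost of
    that detecting test; the sum is normalised by total cost times total
    severity. Assumes every fault in `severity` is detected by some test.
    """
    n = len(ordering)
    first_index: Dict[str, int] = {}
    for index, test in enumerate(ordering):
        for fault in detects.get(test, set()):
            first_index.setdefault(fault, index)
    # suffix_cost[i] is the summed cost of ordering[i:], with suffix_cost[n] = 0.
    suffix_cost = [0.0] * (n + 1)
    for index in range(n - 1, -1, -1):
        suffix_cost[index] = suffix_cost[index + 1] + cost[ordering[index]]
    numerator = sum(
        severity[fault] * (suffix_cost[index] - 0.5 * cost[ordering[index]])
        for fault, index in first_index.items()
    )
    return numerator / (suffix_cost[0] * sum(severity.values()))


# Hypothetical data: t3 finds three faults but is ten times slower to run.
DETECTS = {"t1": {"f1"}, "t2": {"f2"}, "t3": {"f1", "f2", "f3"}, "t4": {"f4"}}
SEVERITY = {"f1": 1.0, "f2": 1.0, "f3": 1.0, "f4": 1.0}
UNIT_COST = {"t1": 1.0, "t2": 1.0, "t3": 1.0, "t4": 1.0}
REAL_COST = {"t1": 1.0, "t2": 1.0, "t3": 10.0, "t4": 1.0}

print(apfd_c(["t1", "t2", "t3", "t4"], DETECTS, UNIT_COST, SEVERITY))
print(apfd_c(["t1", "t2", "t3", "t4"], DETECTS, REAL_COST, SEVERITY))
print(apfd_c(["t3", "t1", "t2", "t4"], DETECTS, REAL_COST, SEVERITY))
```

With unit costs and unit severities the value coincides with plain APFD. Once t3 is made expensive, running it first scores lower than the original order, which is the kind of efficiency effect that APFD alone cannot show.
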
  71. 71. METHODOLOGY RQ2
      • Examine absolute performance
      • Examine relative performance
      • Kendall rank correlation analysis
  75. 75. METHODOLOGY RQ3
      • Examine mutants based on real fault coupling
      • Examine mutants based on operator
      • Kendall rank correlation analysis
  81. 81. RQ1 RESULTS: REAL FAULT PERFORMANCE
      Type       TCP Technique    APFD     APFDc
      Static     Topic            0.700    0.635
      Static     Str              0.696    0.594
      Static     CG-Add           0.597    0.591
      Static     CG-Total         0.594    0.480
      Dynamic    Art              0.657    0.677
      Dynamic    Total            0.610    0.419
      Dynamic    Search           0.600    0.556
      Dynamic    Add              0.583    0.454
      Summary
      • All techniques perform better according to APFD
      • Static TCP techniques tend to perform better overall
      • Total outperforms Add for APFD
  82. 82. RQ2 RESULTS: COMPARING MUTANTS & REAL FAULTS
                        Real Faults         All Mutants
      Type       Tech     APFD     APFDc    APFD     APFDc
      Static     Topic    0.700    0.635    0.832    0.802
      Static     Str      0.696    0.594    0.834    0.788
      Static     CG-A     0.597    0.591    0.818    0.835
      Static     CG-T     0.594    0.480    0.743    0.598
      Dynamic    Art      0.657    0.677    0.800    0.841
      Dynamic    Total    0.610    0.419    0.757    0.549
      Dynamic    Search   0.600    0.556    0.784    0.725
      Dynamic    Add      0.583    0.454    0.897    0.829
  90. 90. RQ2 RESULTS: COMPARING MUTANTS & REAL FAULTS
                        Real Faults         Subsuming Mutants
      Type       Tech     APFD     APFDc    APFD     APFDc
      Static     Topic    0.700    0.635    0.612    0.570
      Static     Str      0.696    0.594    0.620    0.572
      Static     CG-A     0.597    0.591    0.612    0.639
      Static     CG-T     0.594    0.480    0.561    0.407
      Dynamic    Art      0.657    0.677    0.622    0.671
      Dynamic    Total    0.610    0.419    0.534    0.305
      Dynamic    Search   0.600    0.556    0.578    0.508
      Dynamic    Add      0.583    0.454    0.664    0.565
      Summary
      • Metrics computed on the full mutant set tend to overestimate performance on real faults
      • APFDc values on mutants correlated more strongly with APFDc values on real faults
      • The relative ordering of techniques differs between mutants and real faults
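
The claim that the relative ordering of techniques differs can be checked with a Kendall rank correlation. The sketch below is a plain-Python tau-b (the tie-corrected variant), applied to the aggregate APFD values shown on the slides for real faults and for all mutants. Treat it as an illustration only: the slides do not show how the study grouped the data for its correlation analysis, and the function name `kendall_tau_b` is hypothetical.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-b: (C - D) / sqrt((C + D + Tx) * (C + D + Ty))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences of length >= 2")
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both variables: counted nowhere
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    denominator = sqrt((pairs + ties_x) * (pairs + ties_y))
    if denominator == 0:
        raise ValueError("tau-b is undefined when one variable is constant")
    return (concordant - discordant) / denominator


# Aggregate APFD values copied from the slides, in a fixed technique order.
TECHNIQUES = ["Topic", "Str", "CG-A", "CG-T", "Art", "Total", "Search", "Add"]
REAL_APFD = [0.700, 0.696, 0.597, 0.594, 0.657, 0.610, 0.600, 0.583]
ALL_MUTANTS_APFD = [0.832, 0.834, 0.818, 0.743, 0.800, 0.757, 0.784, 0.897]

print(kendall_tau_b(REAL_APFD, ALL_MUTANTS_APFD))
```

On these eight aggregate values there are 16 concordant and 12 discordant pairs, so tau is about 0.14, a weak agreement between the two rankings. Add is the clearest outlier: it ranks first on all mutants and last on real faults.
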
  94. 94. RQ3 RESULTS: EXAMINING FAULT PROPERTIES
      • High performance correlation for real faults that were highly coupled to mutation operators
      • TCP techniques perform differently across mutants seeded by different operators
      • TCP performance for mutants seeded by different operators varies widely across subject programs
  98. 98. LEARNED LESSONS
      • Lesson 1: Relative TCP performance can differ between mutants and real faults
      • Lesson 2: The metrics utilized in TCP evaluations impact mutant representativeness
      • Lesson 3: Mutation operators must be carefully selected in order for mutation-based TCP performance to represent performance on real faults
  99. 99. Any Questions? Thank you! Kevin Moran, Post-Doctoral Fellow, College of William & Mary. @kevpmo, kpmoran@cs.wm.edu, https://www.kpmoran.com
  100. 100. ADDITIONAL SLIDES
  101. 101. SEEDING MUTANTS INTO THE LATEST VERSION
