Mutation Testing
  Hernán Wilkinson             Nicolás Chillo     Gabriel Brunstein
     UBA - 10Pines                  U...
What is Mutation Testing?



Technique to verify the quality of the tests
What is Mutation Testing?

        Verify Quality of…           Verify Quality of…




Source Code                  Tests ...
How does it work?
     1st Step: Create the Mutant

                    Mutation
                    Process




The Sourc...
Examples
DebitCard>>= anotherDebitCard
 ^(type = anotherDebitCard type)
  and: [ number = anotherDebitCard number ]



   ...
Examples
Purchase>>netPaid
 ^self totalPaid – self totalRefunded



                       Change #- with #+



Purchase>>...
Why?
How does it help?
How does it work?
  2nd Step: Try to Kill the Mutant




                                  A Killer
  The “Mutant”        ...
Meaning…


The Mutant Survives  The case generated by the mutant
                     is not tested



The Mutant Dies  ...
Example: The mutant survives
DebitCard>>= anotherDebitCard
 ^(type = anotherDebitCard type) and: [ number = anotherDebitCa...
Example: The mutant dies
DebitCard>>= anotherDebitCard
 ^(type = anotherDebitCard type) and: [ number = anotherDebitCard n...
Example: The mutant survives
Purchase>>netPaid
 ^self totalPaid – self totalRefunded

                                    ...
Example: The mutant dies
Purchase>>netPaid
 ^self totalPaid – self totalRefunded

                                        ...
How does it work? - Summary
• Changes the original source code with
  special “operators” to generate “Mutants”
• Run the ...
MuTalk



Mutation Testing Tool for Smalltalk (Pharo
              and Squeak)
Demo
MuTalk – How does it work?
•       Runs the test to be sure that all run
•       For each method m
    •       For each op...
MuTalk – Operators
•       Boolean messages
    •    Remove #not
    •    Replace #and: with #eqv:
    •    Replace #and: ...
MuTalk – Operators
•       Magnitude messages
    •    Replace #'<=' with #<
    •    Replace #'<=' with #=
    •    Repla...
MuTalk – Operators
•       Collection messages
    •     Remove at:ifAbsent:
    •     Replace #reject: with #select:
    ...
MuTalk – Operators
•       Number messages
    •    Replace #* with #/
    •    Replace #+ with #-
    •    Replace #- wit...
MuTalk – Operators
•       Flow control messages
    •     Remove Exception Handler Operator
    •     Replace #ifFalse: r...
Why is not widely used?
Is not new … - History

Begins in 1971, R. Lipton, “Fault Diagnosis of
            Computer Programs”

 Generally accepted...
Why is not widely used?


Maturity Problem: Because Testing is not
            widely used YET!
        (Although it is in...
Why is not widely used?


Integration Problem: Inability to successfully
  integrate it into the software development
    ...
Why is not widely used?



Technical Problem: It is a Brute Force
             technique!
Technical Problems
• Brute force technique



                   NxM
      N = number of tests
      M = number of mutants
Aconcagua
•   Number of Tests: 666
•   Number of Mutants: 1005
•   Time to create a mutant/compile/link/run:
    10 secs. ...
Another way of doing it…
CreditCard>>= anotherCreditCard
 ^(anotherCreditCard isKindOf: self class) and: [ number =
  anot...
Aconcagua
•   Number of Tests: 666
•   Number of Mutants: 1005
•   Time to create the
    metamutant/compile/link: 2 minut...
MuTalk Optimizations
              Running Strategies
Mutate all methods, run all tests per       Mutate covered methods, ...
MuTalk - Aconcagua Statistics
•       Mutate All, Run All: 1 minute, 6 seconds
•       Mutate Covered, Run Covering: 36
  ...
More Statistics
MuTalk Optimizations
Terminated Mutants




       Try to kill the Mutant!

       The killer has to be
       “Terminated...
MuTalk - Terminated Mutants


• Take the time it runs each test the first
  time
• If the test takes more thant 3 times,
 ...
Let’s redefine MuTalk as…

  Mutation Testing Tool for Smalltalk (Pharo
  and Squeak) that uses meta-facilities to
run fas...
Work in progress


• Operators Categorization based on how
  useful they are to detect errors
• Filter Operators on View
•...
Future work

• Make Operators more “inteligent”
  • a = b ifTrue: [ … ]
    • a = b ifFalse: [] is equivalent to a ~= b if...
Why does it work?


 “Complex faults are coupled to simple faults
in such a way that a test data set that detects
all simp...
Why does it work?


 “In practice, if the software contains a fault,
there will usually be a set of mutants that can
only ...
More Statistics…
How does it compare to
               coverage?
•       Does not replaces coverage because
        some methods do not gen...
Questions?
MuTalk - Mutation
    Testing for Smalltalk

  Hernán Wilkinson             Nicolás Chillo     Gabriel Brunstein
     UBA ...
Upcoming SlideShare
Loading in...5
×

Mutation Testing

2,746

Published on

Mutation Testing by Hernán Wilkinson

Published in: Technology, Business
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total Views
2,746
On Slideshare
0
From Embeds
0
Number of Embeds
3
Actions
Shares
0
Downloads
101
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

Mutation Testing

  1. 1. Mutation Testing Hernán Wilkinson Nicolás Chillo Gabriel Brunstein UBA - 10Pines UBA UBA hernan.wilkinson@gmail.com nchillo@gmail.com gaboto@gmail.com
  2. 2. What is Mutation Testing? Technique to verify the quality of the tests
  3. 3. What is Mutation Testing? Verify Quality of… Verify Quality of… Source Code Tests Mutation Testing
  4. 4. How does it work? 1st Step: Create the Mutant Mutation Process The Source Code The “Mutant” The Mutation “Operator”
  5. 5. Examples DebitCard>>= anotherDebitCard ^(type = anotherDebitCard type) and: [ number = anotherDebitCard number ] Operator: Change #and: by #or: CreditCard>>= anotherDebitCard ^(type = anotherDebitCard type) or: [ number = anotherDebitCard number ]
  6. 6. Examples Purchase>>netPaid ^self totalPaid – self totalRefunded Change #- with #+ Purchase>>netPaid ^self totalPaid + self totalRefunded
  7. 7. Why? How does it help?
  8. 8. How does it work? 2nd Step: Try to Kill the Mutant A Killer The “Mutant” tries to kill the Mutant! All tests run  The Mutant Survives!!! The Test Suite A test fails or errors  The Mutant Dies
  9. 9. Meaning… The Mutant Survives  The case generated by the mutant is not tested The Mutant Dies  The case generated by the mutant is tested
  10. 10. Example: The mutant survives DebitCard>>= anotherDebitCard ^(type = anotherDebitCard type) and: [ number = anotherDebitCard number ] Operator: Change #and: by #or: DebitCard>>= anotherDebitCard ^(type = anotherDebitCard type) or: [ number = anotherDebitCard number ] DebitCardTest>>testDebitCardWithSameNumberShouldBeEqual self assert: (DebitCard visaNumbered: 123) = (DebitCard visaNumbered: 123).
  11. 11. Example: The mutant dies DebitCard>>= anotherDebitCard ^(type = anotherDebitCard type) and: [ number = anotherDebitCard number ] Operator: Change #and: by #or: DebitCard>>= anotherDebitCard ^(type = anotherDebitCard type) or: [ number = anotherDebitCard number ] DebitCardTest>>testDebitCardWithSameNumberShouldBeEqual self assert: (DebitCard visaNumbered: 123) = (DebitCard visaNumbered: 123). DebitCardTest >>testDebitCardWithDifferentNumberShouldBeDifferent self deny: (DebitCard visaNumbered: 123) = (DebitCard visaNumbered: 789).
  12. 12. Example: The mutant survives Purchase>>netPaid ^self totalPaid – self totalRefunded Change #- with #+ Purchase>>netPaid ^self totalPaid + self totalRefunded Purchase>>testNetPaid | purchase | purchase := Purchase for: 20 * euros. self assert: purchase netPaid = (purchase totalPaid – purchase totalRefunded)
  13. 13. Example: The mutant dies Purchase>>netPaid ^self totalPaid – self totalRefunded Change #- with #+ Purchase>>netPaid ^self totalPaid + self totalRefunded Purchase>>testNetPaidWithOutRefunds  Renamed! | purchase | purchase := Purchase for: 20 * euros. self assert: purchase netPaid = (purchase totalPaid – purchase totalRefunded) Purchase>>testNetPaidWithRefunds | purchase | purchase := Purchase for: 20 * euros. purchase addRefundFor: 10 * euros. self assert: purchase netPaid = (purchase totalPaid – purchase totalRefunded)
  14. 14. How does it work? - Summary • Changes the original source code with special “operators” to generate “Mutants” • Run the test suite related to the changed code • If a test errors or fails  Kills the mutant • If all tests run  The Mutant survives • Surviving Mutants show not tested cases The Important Thing!
  15. 15. MuTalk Mutation Testing Tool for Smalltalk (Pharo and Squeak)
  16. 16. Demo
  17. 17. MuTalk – How does it work? • Runs the test to be sure that all run • For each method m • For each operator o • Changes m AST using o • Compiles mutated code • Changes method dictionary • Run the tests
  18. 18. MuTalk – Operators • Boolean messages • Remove #not • Replace #and: with #eqv: • Replace #and: with #nand: • Replace #and: with #or: • Replace #and: with #secondArgResult: • Replace #and: with false • Replace #or: First Condition with false • Replace #or: Second Condition with false • Replace #or: with #and: • Replace #or: with #xor:
  19. 19. MuTalk – Operators • Magnitude messages • Replace #'<=' with #< • Replace #'<=' with #= • Replace #'<=' with #> • Replace #'>=' with #= • Replace #'>=' with #> • Replace #'~=' with #= • Replace #< with #> • Replace #= with #'~=' • Replace #> with #< • Replace #max: with #min: • Replace #min: with #max:
  20. 20. MuTalk – Operators • Collection messages • Remove at:ifAbsent: • Replace #reject: with #select: • Replace #select: with #reject: • Replace Reject block with [:each | false] • Replace Reject block with [:each | true] • Replace Select block with [:each | false] • Replace Select block with [:each | true] • Replace detect: block with [:each | false] when #detect:ifNone: • Replace detect: block with [:each | true] when #detect:ifNone: • Replace do block with [:each |] • Replace ifNone: block with [] when #detect:ifNone: • Replace inject:aValue into:aBlock with aValue • Replace sortBlock:aBlock with sortBlock:[:a :b| true]
  21. 21. MuTalk – Operators • Number messages • Replace #* with #/ • Replace #+ with #- • Replace #- with #+ • Replace #/ with #*
  22. 22. MuTalk – Operators • Flow control messages • Remove Exception Handler Operator • Replace #ifFalse: receiver with false • Replace #ifFalse: receiver with true • Replace #ifFalse: with #ifTrue: • Replace #ifFalse:IfTrue: receiver with false • Replace #ifFalse:IfTrue: receiver with true • Replace #ifTrue: receiver with false • Replace #ifTrue: receiver with true • Replace #ifTrue: with #ifFalse: • Replace #ifTrue:ifFalse: receiver with false • Replace #ifTrue:ifFalse: receiver with true
  23. 23. Why is not widely used?
  24. 24. Is not new … - History Begins in 1971, R. Lipton, “Fault Diagnosis of Computer Programs” Generally accepted in 1978, R. Lipton et al, “Hints on test data selection: Help for the practicing programmer”
  25. 25. Why is not widely used? Maturity Problem: Because Testing is not widely used YET! (Although it is increasing)
  26. 26. Why is not widely used? Integration Problem: Inability to successfully integrate it into the software development process (TDD plays a key role now)
  27. 27. Why is not widely used? Technical Problem: It is a Brute Force technique!
  28. 28. Technical Problems • Brute force technique NxM N = number of tests M = number of mutants
  29. 29. Aconcagua • Number of Tests: 666 • Number of Mutants: 1005 • Time to create a mutant/compile/link/run: 10 secs. each aprox.? • Total time: – 6693300 seconds – 1859 hours, 15 minutes
  30. 30. Another way of doing it… CreditCard>>= anotherCreditCard ^(anotherCreditCard isKindOf: self class) and: [ number = anotherCreditCard number ] CreditCard>>= anotherCreditCard MutantId = 12 ifTrue: [ ^(anotherCreditCard isKindOf: self class) or: [ number = anotherCreditCard number ]. MutantId = 13 ifTrue: [ ^(anotherCreditCard isKindOf: self class) nand: [ number = anotherCreditCard number ]. MutantId = 14 ifTrue: [ ^(anotherCreditCard isKindOf: self class) eqv: [ number = anotherCreditCard number ].
  31. 31. Aconcagua • Number of Tests: 666 • Number of Mutants: 1005 • Time to create the metamutant/compile/link: 2 minutes? • Time to run the tests per mutant: 1 sec • Total time: – 1125 seconds – 18 minutes 45 seconds
  32. 32. MuTalk Optimizations Running Strategies Mutate all methods, run all tests per Mutate covered methods, run all mutant tests per mutant – Create a mutant for each method – Takes coverage running all tests – Run all the test for each mutant – Mutate only covered methods – Disadvantage: Slower strategy – Run all methods per mutant – Relies on coverage Mutate all methods, run only test Mutate covered methods, run only test that cover mutated method that covered mutated methods – Run coverage keeping for each – Run coverage keeping for each method the tests that covered it method the tests that covered it – Create a mutant for each method – Create a mutant for only covered – For each mutant, run only the methods tests that covered the original – For each mutant, run only the tests method that covered the original method
  33. 33. MuTalk - Aconcagua Statistics • Mutate All, Run All: 1 minute, 6 seconds • Mutate Covered, Run Covering: 36 seconds • Result: • 545 Killed • 6 Terminated • 83 Survived
  34. 34. More Statistics
  35. 35. MuTalk Optimizations Terminated Mutants Try to kill the Mutant! The killer has to be “Terminated” The Test Suite
  36. 36. MuTalk - Terminated Mutants • Take the time it runs each test the first time • If the test takes more thant 3 times, terminate it
  37. 37. Let’s redefine MuTalk as… Mutation Testing Tool for Smalltalk (Pharo and Squeak) that uses meta-facilities to run faster and provide inmediate feedback
  38. 38. Work in progress • Operators Categorization based on how useful they are to detect errors • Filter Operators on View • Cancel process
  39. 39. Future work • Make Operators more “inteligent” • a = b ifTrue: [ … ] • a = b ifFalse: [] is equivalent to a ~= b ifTrue: [] • Suggest tests using not killed mutants • Use MuTalk to test MuTalk?
  40. 40. Why does it work? “Complex faults are coupled to simple faults in such a way that a test data set that detects all simple faults in a program will detect most complex faults” (Coupling effect) Demonstrated in 1995, K. Wah, “Fault coupling in finite bijective functions”
  41. 41. Why does it work? “In practice, if the software contains a fault, there will usually be a set of mutants that can only be killed by a test case that also detects that fault” Geist et al, “Estimation and enhancement of real-time software reliability through mutation analysis”, 1992
  42. 42. More Statistics…
  43. 43. How does it compare to coverage? • Does not replaces coverage because some methods do not generate mutants • But: • Mutants on not covered methods will survive • It provides better insight than coverage • Method Coverage fails with long methods/conditions/loops/etc.
  44. 44. Questions?
  45. 45. MuTalk - Mutation Testing for Smalltalk Hernán Wilkinson Nicolás Chillo Gabriel Brunstein UBA - 10Pines UBA UBA hernan.wilkinson@gmail.com nchillo@gmail.com gaboto@gmail.com
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×