Can You Trust Your Tests? (Agile Tour 2015 Kaunas)

Can You Trust Your Tests?
2015 Vaidas Pilkauskas & Tadas Ščerbinskas

Agenda
1. Test quality & code coverage
2. Mutation testing in theory
3. Mutation testing in practice

Prod vs. Test code quality
Code has bugs.
Tests are code.
Tests have bugs.

Test quality
Readable
Focused
Concise
Well named

“Program testing can be used to show
the presence of bugs, but never to
show their absence!”
- Edsger W. Dijkstra

Types of Code Coverage
Lines
Branches
Instructions
Cyclomatic Complexity
Methods
& more

Lines
string foo() {
  return "a" + "b"
}
assertThat(a.foo(), is("ab"))

Lines
string foo(boolean arg) {
  return arg ? "a" : "b"
}
assertThat(a.foo(true), is("a"))

Branches
string foo(boolean arg) {
  return arg ? "a" : "b"
}
assertThat(a.foo(true), is("a"))
assertThat(a.foo(false), is("b"))

SUCCESS: 26/26 (100%) Tests passed

Can you trust 100% coverage?
Code coverage can only show what is not tested.
For interpreted languages 100% code coverage is
kind of like full compilation.

Code Coverage can be gamed
On purpose or by accident
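
A minimal sketch of how coverage can be gamed, assuming a hypothetical `discount` function and plain Python asserts (neither is from the slides). The first test executes every line and both branches but checks nothing, so a coverage tool would report it as fully covering the function.

```python
def discount(price, is_member):
    """Hypothetical function under test: members get 10% off."""
    if is_member:
        return price - price // 10
    return price


def test_gamed():
    # Executes every line and both branches, but verifies nothing.
    discount(100, True)
    discount(100, False)


def test_honest():
    # Same lines and branches executed, but a wrong result fails the test.
    assert discount(100, True) == 90
    assert discount(100, False) == 100


test_gamed()
test_honest()
```

Both tests produce the same coverage numbers; only the second would notice if `discount` returned the wrong value.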

Mutation testing
Changes your program code and
expects your tests to fail.

What exactly is a mutation?
// Original
def isFoo(a) {
  return a == foo
}
// Mutants
def isFoo(a) {
  return a != foo
}
def isFoo(a) {
  return true
}
def isFoo(a) {
  return null
}

Terminology
Applying a mutation to some code creates a mutant.
If the tests still pass, the mutant has survived.
If a test fails, the mutant is killed.
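
The terminology can be sketched as a tiny hand-rolled harness. This is an illustration only: the mutants are written by hand instead of being generated by a tool, and the names `suite_passes` and `classify` are made up for this example.

```python
def is_foo(a):
    """Original implementation."""
    return a == "foo"


# Hand-written mutants standing in for what a mutation tool would generate.
def mutant_negated(a):
    return a != "foo"


def mutant_always_true(a):
    return True


def mutant_returns_none(a):
    return None


def suite_passes(impl, cases):
    """A test suite here is a list of (input, expected) pairs."""
    return all(impl(arg) == expected for arg, expected in cases)


def classify(mutants, cases):
    """Label each mutant as killed or survived for the given suite."""
    return {
        m.__name__: "survived" if suite_passes(m, cases) else "killed"
        for m in mutants
    }


mutants = [mutant_negated, mutant_always_true, mutant_returns_none]
weak_suite = [("foo", True)]
strong_suite = [("foo", True), ("bar", False)]

print(classify(mutants, weak_suite))
print(classify(mutants, strong_suite))
```

With the weak suite the "always true" mutant survives; adding the `"bar"` case kills all three.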

array = [a, b, c]
max(array) == ???

// test
max([0]) == 0 ✔
// implementation
max(a) {
  return 0
}

// test
max([0]) == 0 ✔
max([1]) == 1 ✘
// implementation
max(a) {
  return 0
}

// test
max([0]) == 0 ✔
max([1]) == 1 ✔
// implementation
max(a) {
  return a.first
}

// test
max([0]) == 0 ✔
max([1]) == 1 ✔
max([0, 2]) == 2 ✘
// implementation
max(a) {
  return a.first
}

// test
max([0]) == 0 ✔
max([1]) == 1 ✔
max([0, 2]) == 2 ✔
// implementation
max(a) {
  m = a.first
  for (e in a)
    if (e > m)
      m = e
  return m
}

Mutation
// test
max([0]) == 0 ✔
max([1]) == 1 ✔
max([0, 2]) == 2 ✔
// implementation
max(a) {
  m = a.first
  for (e in a)
    if (e > m)
      m = e
  return m
}

Mutation
// test
max([0]) == 0 ✔
max([1]) == 1 ✔
max([0, 2]) == 2 ✔
// implementation
max(a) {
  m = a.first
  for (e in a)
    if (true)
      m = e
  return m
}

// test
max([0]) == 0 ✔
max([1]) == 1 ✔
max([0, 2]) == 2 ✔
// implementation
max(a) {
  return a.last
}

// test
max([0]) == 0 ✔
max([1]) == 1 ✔
max([0, 2]) == 2 ✔
max([2, 1]) == 2 ✘
// implementation
max(a) {
  return a.last
}

// implementation
max(a) {
  m = a.first
  for (e in a)
    if (e > m)
      m = e
  return m
}
// test
max([0]) == 0 ✔
max([1]) == 1 ✔
max([0, 2]) == 2 ✔
max([2, 1]) == 2 ✔
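
The walkthrough above can be replayed as a runnable sketch. The pseudocode is translated to Python, and the helper `passes` and the two test lists are names introduced for this illustration.

```python
def max_original(a):
    m = a[0]
    for e in a:
        if e > m:
            m = e
    return m


def max_mutant(a):
    m = a[0]
    for e in a:
        if True:  # mutation: the condition `e > m` is replaced with True
            m = e
    return m


def passes(impl, cases):
    """True if impl returns the expected value for every (input, expected) pair."""
    return all(impl(list(arg)) == expected for arg, expected in cases)


three_tests = [([0], 0), ([1], 1), ([0, 2], 2)]
four_tests = three_tests + [([2, 1], 2)]

print(passes(max_mutant, three_tests))   # mutant survives the first three tests
print(passes(max_mutant, four_tests))    # the [2, 1] case kills it
print(passes(max_original, four_tests))  # the original still passes
```

The mutant effectively returns the last element, which is why the first three tests cannot tell it apart from the real implementation.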

A test suite’s effectiveness is measured
by the number of mutants it kills.
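
One common way to turn that count into a single number is a mutation score: killed mutants divided by the mutants that could be killed. A minimal sketch, with a hypothetical function name and made-up example numbers:

```python
def mutation_score(killed, total, equivalent=0):
    """Fraction of non-equivalent mutants killed by the test suite."""
    killable = total - equivalent
    if killable <= 0:
        raise ValueError("no killable mutants")
    return killed / killable


# Made-up numbers: 30 mutants generated, 27 killed, none known to be equivalent.
print(mutation_score(27, 30))
```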

It’s like hiring a white-hat hacker to try to break into
your server and making sure you detect it.

What if a mutant survives?
● Simplify your code
● Add additional tests
● TDD - minimal amount of code to pass the test

Challenges
1. High computation cost - slow
2. Equivalent mutants - cannot be killed by any test, so they show up as false alarms
3. Infinite loops

Equivalent mutations
// Original
int i = 0;
while (i != 10) {
  doSomething();
  i += 1;
}
// Mutant
int i = 0;
while (i < 10) {
  doSomething();
  i += 1;
}
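
A small sketch of why this pair is equivalent, translated to Python with a call counter standing in for `doSomething()`. Because `i` starts at 0 and grows by exactly 1, both conditions become false at the same moment, so no test can observe a difference.

```python
def run_original():
    calls = 0
    i = 0
    while i != 10:
        calls += 1  # stands in for doSomething()
        i += 1
    return calls


def run_mutant():
    calls = 0
    i = 0
    while i < 10:  # mutation: != replaced with <
        calls += 1
        i += 1
    return calls


print(run_original() == run_mutant())
```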

Infinite Runtime
// Original
while (expression)
  doSomething();
// Mutant
while (true)
  doSomething();
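
One way to cope with such mutants is to run each one under a time limit and treat a timeout as a kill. The sketch below is an assumption about how this could be done, not a description of any particular tool: it runs each code snippet in a separate Python process via the standard `subprocess` module, and `run_with_timeout` is a name made up for this example.

```python
import subprocess
import sys

ORIGINAL_SOURCE = """
count = 0
while count < 3:
    count += 1
assert count == 3
"""

MUTANT_SOURCE = """
count = 0
while True:  # mutation: loop condition replaced with True
    count += 1
"""


def run_with_timeout(source, timeout_seconds):
    """Run the source in a child process; report passed, failed or timed out."""
    try:
        result = subprocess.run(
            [sys.executable, "-c", source],
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing keeps spinning.
        return "timed out"
    return "passed" if result.returncode == 0 else "failed"


print(run_with_timeout(ORIGINAL_SOURCE, 30))
print(run_with_timeout(MUTANT_SOURCE, 1))
```

The time limit has to be generous enough that a slow but correct run is not mistaken for an infinite loop.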

Disadvantages
● Can slow down your TDD rhythm
● May be very noisy

Let’s say we have a codebase with:
● 300 classes
● around 10 tests per class
● 1 test runs in around 1ms
● total test suite runtime is about 3s
Is it really slow?
Let’s do 10 mutations per class
● We get 3000 (300 * 10) mutations
● runtime with all mutations is 150 minutes (3s * 3000)

Speeding it up
Run only tests that cover the mutation
● 300 classes
● 10 tests per class
● 10 mutations per class
● 1ms test runtime
● total mutation runtime
10 * 10 * 1ms * 300 = 30,000ms = 30s
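
The arithmetic for both strategies, spelled out. The variable names are made up; the numbers are the ones from the slides, and the sketch assumes that the tests covering a mutation are the 10 tests of the mutated class.

```python
classes = 300
tests_per_class = 10
mutants_per_class = 10
test_ms = 1

# Full suite: every test of every class.
suite_ms = classes * tests_per_class * test_ms

# Naive strategy: run the whole suite once per mutant.
mutants = classes * mutants_per_class
naive_minutes = mutants * suite_ms / 1000 / 60

# Targeted strategy: per mutant, run only the tests of the mutated class.
targeted_seconds = classes * mutants_per_class * tests_per_class * test_ms / 1000

print(suite_ms, mutants, naive_minutes, targeted_seconds)
```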

Speeding it up
During development run tests that
cover only your current changes

Usage scenarios
● Continuous integration
● TDD with mutation testing only on new changes
● Add mutation testing to your legacy project, but do not fail the build - produce a warning report

Tools
● Ruby - Mutant
● Java - PIT
● And many tools for other languages

Summary
● Code coverage highlights code that is definitely not tested
● Mutation testing highlights code that definitely is tested
● Given non-equivalent mutations, a good test suite should work the same as a hash function

About us
Vaidas Pilkauskas
@liucijus
● Vilnius JUG co-founder
● Vilnius Scala leader
● Coderetreat facilitator
● Mountain bicycle rider
● Snowboarder
Tadas Ščerbinskas
@tadassce
● VilniusRB co-organizer
● RubyConfLT co-organizer
● RailsGirls Vilnius & Berlin coach
● Various board sports’ enthusiast

Credits
A lot of presentation content is based on
work by these guys
● Markus Schirp - author of Mutant
● Henry Coles - author of PIT
● Filip Van Laenen - working on a book
