Beyond Coverage: 
What Lurks in Test Suites? 
Patrick Lam, @uWaterlooSE 
(and Felix Fang) 
University of Waterloo
Test Suites: Myths vs Realities.
Subjects: Open-Source Test Suites
Basic Test Suite Properties 
Benchmark sizes: 
30 kLOC (google-visualization) to 
495 kLOC (weka) 
% of system represented by tests: 
5.3% (weka) to 50.4% (joda-time)
Static Test Suite Properties
Test suite versus benchmark size 
(scatter plot; regression slopes m = 0.3002 and m = 0.03514)
# test cases versus # test methods
apache-commons-collections tests 
Consider map.TestFlat3Map: 
contains 14 test methods 
yet, 156 test cases 
superclass tests: 42 tests 
+ 4 Apache Commons Collections “bulk tests”
Run-time Test Suite Properties
Test suites run quickly 
joda-time 4.9s 
jdom 5.0s 
google-vis 5.1s 
jgrapht 16.9s 
weka 28.9s 
apache-cc 34.0s 
poi 36.5s 
jmeter 53.0s 
jfreechart 241.0s
Failing tests 
(one value per suite, in the order of the timing slide above) 
76/384, 0, n/a 0, 1, 0, 3/1109, 0, 0, 0
Continuous Integration: Daily Builds
Continuous Integration: Daily Tests 
(via SonarQube, 
Travis CI, Surefire)
Myth #1: 
Coverage is a key property 
of test suites.
Coverage is central in textbooks 
Ammann and Offutt, Introduction to Software Testing
Coverage metrics from EclEmma
Coverage metrics
Reality #1 
Coverage sometimes important, 
but tools only give limited data.
Guideline #1 
Consider metrics beyond 
reported coverage results: 
- weka uses peer review for QA 
- not measured by tools: 
input space coverage
Myth #2 
Tests are simple. 
- test complexity 
- test dependencies
Static Code Complexity
Test methods with at least 5 asserts 
e.g. from Joda-Time: 
public void testEquality() { 
    assertSame(getInstance(TOKYO), getInstance(TOKYO)); 
    assertSame(getInstance(LONDON), getInstance(LONDON)); 
    assertSame(getInstance(PARIS), getInstance(PARIS)); 
    assertSame(getInstanceUTC(), getInstanceUTC()); 
    assertSame(getInstance(), getInstance(LONDON)); 
}
% Test methods with ≥ 5 asserts
Test Methods with Branches 
if (isAllowNullKey() == false) { 
    try { 
        assertEquals(null, o.nextKey(null)); 
    } catch (NullPointerException ex) {} 
} else { 
    assertEquals(null, o.nextKey(null)); 
} 
// from apache-cc
Test Methods with Loops 
counter = 0; 
while (this.complexPerm.hasNext()) { 
    this.complexPerm.getNext(); 
    counter++; 
} 
assertEquals(maxPermNum, counter); 
// from jgrapht
% Test Methods with Control-Flow
Tests Which Use the Filesystem
Filesystem Usage Details 
new File(tempDir, "tzdata"); 
verifies vs canonical forms 
of serialized collections on disk
More Filesystem Usage Details 
resources, serialization 
creates charts, tests their existence 
some comparisons vs test data
Tests Which Use the Network 
Network Usage Details 
connects to 
http://sc.openoffice.org 
tests HTTP mirror server 
at localhost
flip side: Mocks and Stubs 
True mocks only in Google Visualization. 
Found stubs/fakes in 
4 other suites.
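To make the stub/mock distinction concrete, here is an illustrative hand-written stub in the style found in the suites. This is a sketch, not code from any studied project; all names (TimeSource, StubTimeSource, ExpiryChecker) are hypothetical. A stub supplies canned answers; a true mock would additionally verify how it was called.

```java
// Collaborator interface the code under test depends on (hypothetical).
interface TimeSource {
    long nowMillis();
}

// Stub: returns a canned, fixed time instead of the real clock.
class StubTimeSource implements TimeSource {
    private final long fixed;
    StubTimeSource(long fixed) { this.fixed = fixed; }
    public long nowMillis() { return fixed; }
}

// Code under test, written against the interface so a stub can be injected.
class ExpiryChecker {
    private final TimeSource time;
    ExpiryChecker(TimeSource time) { this.time = time; }
    boolean isExpired(long deadlineMillis) {
        return time.nowMillis() > deadlineMillis;
    }
}
```

Injecting StubTimeSource makes the expiry test deterministic without mocking-framework machinery, which matches what the stubs/fakes found in the suites do.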
Reality #2 
Test cases are mostly simple. 
few asserts, little branching 
some filesystem/net usage
Consequence #2 
Many tests don’t need 
high expertise to write, 
but some do!
Myth #3 
Test cases are written by hand.
Types of reuse (standard Java) 
1. test class setUp()/tearDown() 
2. inheritance: e.g. in apache-cc, 
TestFastHashMap extends AbstractTestMap 
3. composition: e.g. in jfreechart, 
helper class RendererChangeDetector
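The inheritance style of reuse can be sketched as follows. This is an illustrative skeleton in the spirit of apache-cc's AbstractTestMap, not code from that project; the class and method names here are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Abstract superclass holding shared test logic (hypothetical names).
abstract class AbstractMapContract {
    // Subclasses supply the concrete Map implementation under test.
    abstract Map<String, Integer> makeEmptyMap();

    // Shared check, inherited unchanged by every subclass.
    boolean putThenGetWorks() {
        Map<String, Integer> m = makeEmptyMap();
        m.put("k", 42);
        return Integer.valueOf(42).equals(m.get("k"));
    }
}

// Each concrete test class only provides the factory method.
class HashMapContract extends AbstractMapContract {
    Map<String, Integer> makeEmptyMap() { return new HashMap<>(); }
}

class TreeMapContract extends AbstractMapContract {
    Map<String, Integer> makeEmptyMap() { return new TreeMap<>(); }
}
```

One superclass test then runs against every implementation, which is how a class like TestFlat3Map ends up with far more test cases than test methods.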
JUnit setup/tearDown usage
Inheritance is heavily used 
(> 50% test classes inherit functionality)
Test Classes with Custom Superclasses
Helper Classes Example 
from poi: 
/** Test utility class to get Records 
 *  out of HSSF objects. */ 
public final class RecordInspector { 
    public static Record[] getRecords(...) {} 
}
Helper Class Count 
weka 1 
google-vis 3 
jdom 6 
joda-time 7 
jfreechart 7 
jmeter 12 
jgrapht 15 
apache-cc 22 
hsqldb 31 
poi 54
Test Clone Example 
public void testNominalFiltering() { 
    m_Filter = getFilter(Attribute.NOMINAL); 
    Instances r = useFilter(); 
    for (int i = 0; i < r.numAttributes(); i++) 
        assertTrue(r.attribute(i).type() != Attribute.NOMINAL); 
} 
public void testStringFiltering() { 
    m_Filter = getFilter(Attribute.STRING); 
    Instances r = useFilter(); 
    for (int i = 0; i < r.numAttributes(); i++) 
        assertTrue(r.attribute(i).type() != Attribute.STRING); 
}
Assertion Fingerprints 
detect clones 
by identifying 
similar tests
Incidence of cloning
How to Refactor? 
● setUp/tearDown/subclassing 
● JUnit 4: 
Parameterized Unit Tests 
● Test Theories
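The clone pair from weka above could collapse into one data-driven test body. A minimal JUnit-free sketch of the parameterized idea follows; the filter and its attribute-type tags are hypothetical stand-ins, not weka's API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TypeFilterCheck {
    // Hypothetical filter: drops every attribute tagged with the given type.
    static List<String> filterOut(List<String> attrs, String type) {
        List<String> out = new ArrayList<>();
        for (String a : attrs)
            if (!a.equals(type))
                out.add(a);
        return out;
    }

    // One parameterized body replaces the per-type test clones:
    // pass the type as data instead of copy-pasting the method.
    static boolean noneOfTypeRemains(String type) {
        List<String> attrs = Arrays.asList("NOMINAL", "STRING", "NUMERIC");
        return !filterOut(attrs, type).contains(type);
    }
}
```

With JUnit 4's Parameterized runner, the type values would become the parameter set and the body a single @Test method.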
apache-cc: Bulk tests 
public BulkTest bulkTestKeySet() { 
    return new TestSet(makeFullMap().keySet()); 
} 
● runs all tests in the TestSet class 
  with the object returned from makeFullMap().keySet()
jdom: Generated Test Case Stubs 
class ClassGenerator makes e.g.: 
class TestDocument { 
    void test_TCC__List(); 
    void test_TCM__int_hashCode(); 
}
Developer still needs to populate tests.
Automated Testing Technology 
In our test suites, 
the principal automation technology 
was cut-and-paste.
Reality #3 
Automated test generation 
is uncommon in our test suites.
Guideline 
Maximize reuse: 
setUp/tearDown, 
inheritance, 
parameterized tests, 
whatever works for you!
Suggestion 
Use automated test generation tools! 
Some examples: 
● Korat (structurally complex tests) 
● Randoop (random testing) 
● CERT Basic Fuzzing Framework 
http://mit.bme.hu/~micskeiz/pages/code_based_test_generation.html
Summary 
Myths: 
1. Coverage is a key property 
of test suites. ≈ 
2. Tests are simple. ✓ 
3. Tests are written by hand. ✓
Data 
https://docs.google.com/spreadsheets/d/1xAsdk35tJAOM4WGbGloliS4ovDJ8_MDn6_Gzk0DXEZQ

GTAC 2014: What lurks in test suites?