This talk demonstrates advanced testing practices coming from the STAMP research project and applied to the XWiki open source project:
- Testing for coverage with Jacoco and defining a viable strategy for slowly improving the situation
- Testing the quality of your tests with Descartes Mutation testing
- Automatically enriching your test suite with DSpot
- Testing various configurations with Docker containers and Jenkins
- Generating tests automatically from production stack traces
1. New types of tests for Java projects
Vincent Massol, October 2018
2. Agenda
•Context & Current status quo
•Coverage testing
•Mutation testing
•Environment testing
•Crash reproduction
3. Context: XWiki
• Open source wiki
• 14 years
• 10-15 active committers
• Very extensible, scripting in wiki pages
• Platform for developing ad-hoc web applications
• Strong build practices using Maven and lots of “Quality” plugins
• Using Jenkins & custom pipeline library for the CI
https://xwiki.org
4. Context: STAMP
• Automatic Test Amplification
• XWiki SAS participating
• Experiments on XWiki project
[Diagram: Mutation testing, Environment testing, Production testing]
5. Current Testing Status
• 10815 automated tests (in 2.5 hours):
• Unit tests (using Mockito)
• Integration tests (using Mockito)
• Functional (UI) tests (using Selenium/Webdriver)
6. New questions
• Are my tests testing enough? → Coverage
• How good are my tests? → Mutation testing
• Does my software work in various setups? → Environment testing
• How can I reproduce bugs found in production? → Production testing
7. Test Coverage - Local
• Using Jacoco and Clover
• Strategy - “Ratchet effect”:
  • Each Maven module has a threshold
  • Jacoco Maven plugin fails the build if new code brings the module's coverage (in %) below what it was before
  • Devs are allowed to increase the threshold
Of course TPC (Test Percentage Coverage) is not a panacea: you could have 100% and an app that doesn't work. Functional tests are needed too. Aim for 80%.
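The ratchet rule can be sketched in a few lines of Java. This is only an illustration of the idea, not how the Jacoco Maven plugin implements its check; the class name `CoverageRatchet` and its methods are hypothetical.

```java
/**
 * Hypothetical sketch of the "ratchet effect": a module's coverage may never
 * drop below its recorded threshold, and the threshold may only go up.
 */
final class CoverageRatchet {
    private double threshold; // minimum accepted coverage ratio, between 0.0 and 1.0

    CoverageRatchet(double initialThreshold) {
        this.threshold = initialThreshold;
    }

    /** Returns true when the build should pass for the measured coverage. */
    boolean passes(double measuredCoverage) {
        return measuredCoverage >= threshold;
    }

    /** Raises the threshold to the measured coverage; never lowers it. */
    void ratchet(double measuredCoverage) {
        threshold = Math.max(threshold, measuredCoverage);
    }

    double threshold() {
        return threshold;
    }
}
```

A module at threshold 0.70 that reaches 0.72 can be ratcheted up to 0.72; a later build measuring 0.71 would then fail.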
8. Test Coverage - Global
• Issue: local coverage can increase while global coverage decreases. Possible causes:
  • Removed code with high TPC
  • Code tested indirectly by functional tests, where refactoring led to different paths being used
  • New module with lower TPC than average
[Chart: Global TPC evolution]
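A small worked example shows how every existing module can improve while the global figure drops. The numbers are made up and the helper `GlobalCoverage` is hypothetical; it assumes global coverage is simply covered elements divided by total elements across all modules.

```java
/** Hypothetical helper: global coverage is covered elements over total elements. */
final class GlobalCoverage {
    /** Each row is {coveredElements, totalElements} for one module. */
    static double ratio(long[][] modules) {
        long covered = 0;
        long total = 0;
        for (long[] module : modules) {
            covered += module[0];
            total += module[1];
        }
        return total == 0 ? 0.0 : (double) covered / total;
    }

    public static void main(String[] args) {
        // Before: two modules at 80% and 70% coverage (made-up numbers).
        long[][] before = { {800, 1000}, {700, 1000} };
        // After: both modules improved, but a new module at 40% was added.
        long[][] after = { {810, 1000}, {710, 1000}, {400, 1000} };
        System.out.println("before=" + ratio(before) + " after=" + ratio(after));
    }
}
```

With these numbers the global ratio goes from 1500/2000 = 0.75 to 1920/3000 = 0.64, even though no module lost coverage.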
9. Test Coverage - Global
• Strategy:
  • Global Clover TPC computed automatically every night on Jenkins for all repos combined, using a pipeline
  • Report emailed to developers (see next slide)
  • Developers fix the modules they have been working on
  • Release Manager (RM) ensures that the report passes before a release; a step for this was added to our Release Plan checklist
Source: http://massol.myxwiki.org/xwiki/bin/view/Blog/ComparingCloverReports
11. Mutation Testing
• Using PIT/Gregor, PIT/Descartes
• Concepts of PIT
• Modify code under test (mutants) and run tests
• Good tests kill mutants
• Generates a mutation score similar to the coverage %
• Descartes = extreme mutations (a whole method body is replaced) that execute fast and have high value
https://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
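To make "good tests kill mutants" concrete, here is a made-up example that is not taken from XWiki or PIT. The mutant changes a boundary operator, the kind of small change PIT's default engine applies; Descartes would replace the whole method body instead. All names are hypothetical.

```java
import java.util.function.IntPredicate;
import java.util.function.Predicate;

/** Made-up example: an original method, one mutant, and two tests of different strength. */
final class MutationDemo {
    // Code under test: is the person an adult?
    static final IntPredicate ORIGINAL = age -> age >= 18;
    // Mutant: the boundary operator was changed from >= to >.
    static final IntPredicate MUTANT = age -> age > 18;

    /** Weak test: only checks values far from the boundary. */
    static boolean weakTestPasses(IntPredicate isAdult) {
        return isAdult.test(30) && !isAdult.test(5);
    }

    /** Stronger test: also checks the boundary value. */
    static boolean strongTestPasses(IntPredicate isAdult) {
        return weakTestPasses(isAdult) && isAdult.test(18);
    }

    /** A mutant is killed when a test passes on the original but fails on the mutant. */
    static boolean kills(Predicate<IntPredicate> test) {
        return test.test(ORIGINAL) && !test.test(MUTANT);
    }
}
```

Here `MutationDemo.kills(MutationDemo::weakTestPasses)` is false (the mutant survives), while `MutationDemo.kills(MutationDemo::strongTestPasses)` is true. Both tests give the same line coverage, which is why a mutation score says more about test quality than coverage alone.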
15. Mutation - Example
    @Test
    public void testEquality()
    {
        MacroId id1 = new MacroId("id", Syntax.XWIKI_2_0);
        MacroId id2 = new MacroId("id", Syntax.XWIKI_2_0);
        MacroId id3 = new MacroId("otherid", Syntax.XWIKI_2_0);
        MacroId id4 = new MacroId("id", Syntax.XHTML_1_0);
        MacroId id5 = new MacroId("otherid", Syntax.XHTML_1_0);
        MacroId id6 = new MacroId("id");
        MacroId id7 = new MacroId("id");

        Assert.assertEquals(id2, id1);
        // Equal objects must have equal hashcode
        Assert.assertTrue(id1.hashCode() == id2.hashCode());

        // Note: == compares references, so these four checks never call equals()
        Assert.assertFalse(id3 == id1);
        Assert.assertFalse(id4 == id1);
        Assert.assertFalse(id5 == id3);
        Assert.assertFalse(id6 == id1);

        Assert.assertEquals(id7, id6);
        // Equal objects must have equal hashcode
        Assert.assertTrue(id6.hashCode() == id7.hashCode());
    }
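Not testing for inequality! The assertFalse(a == b) checks compare object references, which are always different here, so equals() is never exercised for unequal ids.
Improved thanks to Descartes!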
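The difference between the two kinds of check can be shown with a small self-contained sketch. `SimpleId` is a hypothetical stand-in for a value class such as MacroId (it is not the XWiki class), and `InequalityChecks` only exists to contrast a reference comparison with a call to equals().

```java
import java.util.Objects;

/** Hypothetical stand-in for a value class such as MacroId (not the XWiki class). */
final class SimpleId {
    private final String id;
    private final String syntax;

    SimpleId(String id, String syntax) {
        this.id = id;
        this.syntax = syntax;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SimpleId)) {
            return false;
        }
        SimpleId that = (SimpleId) other;
        return id.equals(that.id) && Objects.equals(syntax, that.syntax);
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, syntax);
    }
}

final class InequalityChecks {
    /** Reference comparison: true for any two distinct objects, whatever equals() does. */
    static boolean weakCheck(Object a, Object b) {
        return !(a == b);
    }

    /** Value comparison: actually calls equals(), so a broken equals() is detected. */
    static boolean strongCheck(Object a, Object b) {
        return !a.equals(b);
    }
}
```

The weak check reports "different" even for two equal ids created separately, so it would keep passing if the body of equals() were replaced by `return true`. The strong check (or JUnit's `Assert.assertNotEquals`) would fail in that case.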
16. Mutation - Limitations
• Takes time to find interesting things to look at and to decide whether each one is an issue worth handling. Needs better categorisation in the report (now reported by Descartes):
  • Strong pseudo-tested methods: the worst! No matter what the return values are, the tests always pass
  • Pseudo-tested methods: grey area. The tests pass with at least one modified value.
• Multi-module support - PITmp
• But slow on large projects (e.g. 7+ hours just for xwiki-rendering)
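The two categories can be expressed as a small classifier. This sketch reads "strong pseudo-tested" as "the test suite still passes for every extreme mutant of the method" and "pseudo-tested" as "it still passes for at least one". The enum and method names are hypothetical and this is not Descartes' own code.

```java
import java.util.List;

/** Hypothetical classifier using the slide's categories; not Descartes' own code. */
final class MethodClassifier {
    enum Category { TESTED, PSEUDO_TESTED, STRONG_PSEUDO_TESTED }

    /**
     * @param suitePassedPerMutant one entry per extreme mutant of a method:
     *        true if the test suite still passed with that mutant in place
     */
    static Category classify(List<Boolean> suitePassedPerMutant) {
        if (suitePassedPerMutant.isEmpty()) {
            throw new IllegalArgumentException("at least one mutant is required");
        }
        boolean allSurvived = suitePassedPerMutant.stream().allMatch(Boolean::booleanValue);
        boolean anySurvived = suitePassedPerMutant.stream().anyMatch(Boolean::booleanValue);
        if (allSurvived) {
            return Category.STRONG_PSEUDO_TESTED;
        }
        return anySurvived ? Category.PSEUDO_TESTED : Category.TESTED;
    }
}
```

Sorting a report this way puts the methods whose result no test checks at the top, which is where the review time is best spent.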
17. Mutation - Strategy
• Work in progress, not enough feedback yet!
• Fail the build when the mutation score of a given module is below a defined threshold in the pom.xml
• The idea is that new tests should, on average, be of equal or better quality than past tests.
• Other idea: hook on CI to run it only on modified code/tests.
General goal with coverage + mutation: maintain quality
18. Mutation: Going further
• Using DSpot
• Uses PIT/Descartes and feeds its results back in to generate new tests
• Adds assertions to existing tests
• Generates new test methods
• Selector can be PIT/Gregor, PIT/Descartes, Jacoco (instruction coverage), Clover (branch coverage)
https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot
19. Mutation: Dspot Example 1
Generated test (new test):
    public void escapeAttributeValue2() {
        String escapedText = XMLUtils.escapeAttributeValue("a < a' && a' < a\" => a < a\" {");
        // AssertGenerator add assertion
        // (the expected value is the escaped form of the input; its entities appear decoded in this transcript)
        Assert.assertEquals("a < a' && a' < a\" => a < a\" {", escapedText);
        // AssertGenerator create local variable with return value of invocation
        boolean o_escapeAttributeValue__3 = escapedText.contains("<");
        // AssertGenerator add assertion
        Assert.assertFalse(o_escapeAttributeValue__3);
        // AssertGenerator create local variable with return value of invocation
        boolean o_escapeAttributeValue__4 = escapedText.contains(">");
        // AssertGenerator add assertion
        Assert.assertFalse(o_escapeAttributeValue__4);
        // AssertGenerator create local variable with return value of invocation
        boolean o_escapeAttributeValue__5 = escapedText.contains("'");
        // AssertGenerator add assertion
        Assert.assertFalse(o_escapeAttributeValue__5);
        // AssertGenerator create local variable with return value of invocation
        boolean o_escapeAttributeValue__6 = escapedText.contains("\"");
        // AssertGenerator create local variable with return value of invocation
        boolean o_escapeAttributeValue__7 = escapedText.contains("&&");
        // AssertGenerator add assertion
        Assert.assertFalse(o_escapeAttributeValue__7);
        // AssertGenerator create local variable with return value of invocation
        boolean o_escapeAttributeValue__8 = escapedText.contains("{");
        // AssertGenerator add assertion
        Assert.assertFalse(o_escapeAttributeValue__8);
    }
Original test:
    @Test
    public void escapeAttributeValue()
    {
        String escapedText = XMLUtils.escapeAttributeValue("a < a' && a' < a\" => a < a\" {");
        assertFalse("Failed to escape <", escapedText.contains("<"));
        assertFalse("Failed to escape >", escapedText.contains(">"));
        assertFalse("Failed to escape '", escapedText.contains("'"));
        assertFalse("Failed to escape \"", escapedText.contains("\""));
        assertFalse("Failed to escape &", escapedText.contains("&&"));
        assertFalse("Failed to escape {", escapedText.contains("{"));
    }
20. Mutation: Dspot Example 2
[Screenshots: generated test, original test]
• Also increases coverage: before 70.5%, after 71.2%
21. Mutation: Dspot Strategy
• DSpot is very slow to execute (between 3 and 20 minutes on small modules)
• One strategy is to run it on CI and have the pipeline commit the generated tests to a different source root.
• And run it only on the tests affected by the commit changeset
• Configure Maven to add a new test source directory using the Maven Build Helper plugin.
• Work in progress: small coverage and mutation score improvements on XWiki so far.
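Restricting the run to tests affected by a changeset could look like the sketch below. It assumes the naming convention that class `Foo` is tested by `FooTest`, which is an assumption and not something DSpot or the XWiki pipeline is known to rely on; `AffectedTests` is a hypothetical helper.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical selector: maps changed files to test classes by naming convention. */
final class AffectedTests {
    /** Assumes (for illustration only) that class Foo is tested by FooTest. */
    static List<String> select(List<String> changedPaths) {
        List<String> tests = new ArrayList<>();
        for (String path : changedPaths) {
            if (!path.endsWith(".java")) {
                continue; // ignore non-Java files such as pom.xml
            }
            String fileName = path.substring(path.lastIndexOf('/') + 1);
            String className = fileName.substring(0, fileName.length() - ".java".length());
            // A changed test is selected as-is; a changed class selects its test.
            String testName = className.endsWith("Test") ? className : className + "Test";
            if (!tests.contains(testName)) {
                tests.add(testName);
            }
        }
        return tests;
    }
}
```

The resulting list would then be handed to the slow tool, so that a commit touching two classes costs two amplification runs rather than one per test in the module.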
22. Environment Testing
• Environment = combination of Servlet container & version, DB & version, OS, Browser & version
• Future: cluster mode, LibreOffice integration, external Solr, etc.
• Need: be able to run/debug functional tests on local dev machines as well as on CI
• Using Docker / TestContainers
25. Environment Testing
• Feedback: takes about 3 minutes to deploy everything (and 1 minute for the test)
• Strategy
  • Run on CI (Jenkins)
  • Round robin over the various environments, since there are not enough agents to run all variations
• Idea: run CAMP to generate various configurations. CAMP can mutate Docker Compose files and TestContainers provides a DockerComposeContainer.
• Future: IE/Edge + Docker in Docker
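The round-robin idea can be sketched without any Docker code: enumerate the combinations, then let each CI build pick the next one. The class `EnvironmentRoundRobin`, the use of a build number as the counter, and the environment names in the usage note are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: enumerate environment combinations and pick one per build. */
final class EnvironmentRoundRobin {
    /** Builds every "container / database / browser" combination. */
    static List<String> combinations(List<String> containers, List<String> databases,
            List<String> browsers) {
        List<String> result = new ArrayList<>();
        for (String container : containers) {
            for (String database : databases) {
                for (String browser : browsers) {
                    result.add(container + " / " + database + " / " + browser);
                }
            }
        }
        return result;
    }

    /** Picks the environment for a given build number, cycling through all of them. */
    static String pick(List<String> environments, long buildNumber) {
        if (environments.isEmpty()) {
            throw new IllegalArgumentException("no environment to pick from");
        }
        int index = (int) Math.floorMod(buildNumber, (long) environments.size());
        return environments.get(index);
    }
}
```

With two containers, two databases and one browser there are four environments, so every environment is exercised once every four builds while each build only pays for one deployment.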
26. Crash Reproduction
• Tool: EvoCrash / Botsing
• Concept: take a stack trace and generate a test that, when executed, leads to this stack trace
• i.e. find the conditions that lead to the problem
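What "leads to this stack trace" means can be sketched as an acceptance check: run a candidate test and compare the exception it throws with the target crash. This is not how EvoCrash or Botsing are implemented, and it leaves out the hard part, which is searching for a candidate that passes the check. `CrashCheck` and `nameLength` are hypothetical.

```java
/** Hypothetical check: does running a candidate test reproduce a target crash? */
final class CrashCheck {
    /**
     * Runs the candidate and reports whether it throws an exception of the
     * expected type whose stack trace contains the expected class and method.
     */
    static boolean reproduces(Runnable candidate, Class<? extends Throwable> expectedType,
            String expectedClass, String expectedMethod) {
        try {
            candidate.run();
            return false; // no crash at all
        } catch (Throwable thrown) {
            if (!expectedType.isInstance(thrown)) {
                return false; // a different kind of failure
            }
            for (StackTraceElement frame : thrown.getStackTrace()) {
                if (frame.getClassName().equals(expectedClass)
                        && frame.getMethodName().equals(expectedMethod)) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Made-up code under test: fails when given a null name. */
    static int nameLength(String name) {
        return name.length();
    }
}
```

A candidate calling `CrashCheck.nameLength(null)` reproduces a NullPointerException raised in `nameLength`, whereas a candidate calling it with "abc" does not crash and is rejected.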
28. EvoCrash - Feedback
• Can take a long time to reproduce, doesn’t always succeed
• Generates a test that reproduces the problem, not the fix!
• Often you’d write a test at a different level (usually up in the call
chain, to be more meaningful to the use case)
• Is useful for newcomers who don’t know the codebase well as it
helps pinpoint the problem. Acts as a timesaver.
• Future work is being done in Botsing. Work is planned to generate regression tests.
29. Parting words
• Experiment, push the limit!
• Some other types of tests were not covered here and also need automation:
• Backward compatibility testing
• Performance/Stress testing
• Usability testing
• others?