1.
New types of tests for
Java projects
Vincent Massol, CTO XWiki SAS, December 2019
2.
Agenda
•Context & Current status quo
•Coverage testing
•Mutation testing
•Environment testing
•Crash reproduction
3.
Context: XWiki
• Open source wiki
• 15 years
• 10-15 active committers
• Very extensible, scripting in wiki pages
• Platform for developing ad-hoc web applications
• Strong build practices using Maven and lots of “Quality” plugins
• Using Jenkins & custom pipeline library for the CI
https://xwiki.org
4.
Context: STAMP
• Automatic Test Amplification
• XWiki SAS participating, 3 years
• Experiments on XWiki project
Mutation testing, Environment testing, Production testing
5.
Current Testing Status
• 10986 automated tests (in 2.5 hours):
• Unit tests (using Mockito)
• Integration tests (using Mockito)
• Functional (UI) tests (using Selenium/Webdriver)
6.
New questions
• Are my tests testing enough? Coverage
• How good are my tests? Mutation testing
• Does my software work in various setups?
Environment testing
• How can I reproduce bugs found in production?
Production testing
(Legend: in place with strategy; in progress)
7.
Test Coverage - Local
• Using Jacoco and Clover
• Strategy - “Ratchet effect”:
• Each Maven module has a threshold
• Jacoco Maven plugin fails if new code
has less coverage than before in %
• Dev is allowed to increase threshold
Of course TPC (Test Percentage Coverage) is not a
panacea. You could have 100% and the app not
working. Functional tests are also needed.
Aim for 80%.
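The ratchet idea can be sketched in plain Java, independently of how the Jacoco Maven plugin is actually configured. The class and method names below are hypothetical and only illustrate the rule "the threshold can go up, never down":

```java
// Hypothetical sketch of the "ratchet effect"; this is not the Jacoco plugin API.
final class CoverageRatchet {
    private double threshold; // minimum accepted coverage ratio, e.g. 0.80

    CoverageRatchet(double initialThreshold) {
        this.threshold = initialThreshold;
    }

    /** Returns true when the measured coverage meets the current threshold. */
    boolean check(double measuredCoverage) {
        return measuredCoverage >= threshold;
    }

    /** The threshold may only move up, never down. */
    void raiseTo(double newThreshold) {
        if (newThreshold < threshold) {
            throw new IllegalArgumentException("Threshold can only increase");
        }
        threshold = newThreshold;
    }

    public static void main(String[] args) {
        CoverageRatchet ratchet = new CoverageRatchet(0.75);
        System.out.println(ratchet.check(0.78)); // coverage above threshold: build passes
        ratchet.raiseTo(0.78);                   // developer locks in the gain
        System.out.println(ratchet.check(0.76)); // coverage below new threshold: build fails
    }
}
```

In a real build the threshold would live in each module's pom.xml rather than in code.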
8.
Test Coverage - Global
• Issue: Local coverage can increase and
global decrease
• Removed code with high TPC
• Code tested indirectly by functional
tests and code refactoring led to
different paths used
• New module with lower TPC than
average
Global TPC evolution
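A small worked example, with made-up numbers, shows how every module can keep or improve its own coverage while the global figure drops. Here half of a well-covered module is removed: the global value goes from 75% to about 71.3%.

```java
// Illustrative arithmetic only; the module sizes are invented for the example.
final class GlobalCoverage {
    /** Global coverage = total covered elements / total elements, over all modules. */
    static double global(long[][] modules) {
        long covered = 0;
        long total = 0;
        for (long[] m : modules) {
            covered += m[0]; // covered elements in this module
            total += m[1];   // all elements in this module
        }
        return (double) covered / total;
    }

    public static void main(String[] args) {
        // Before: module A at 90% (900/1000), module B at 60% (600/1000).
        double before = global(new long[][] {{900, 1000}, {600, 1000}});
        // After: half of A removed, still 90% (450/500); B improved to 62% (620/1000).
        double after = global(new long[][] {{450, 500}, {620, 1000}});
        System.out.println("before=" + before + " after=" + after);
    }
}
```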
9.
Test Coverage - Global
• Strategy:
• Global Clover TPC computed automatically every night on
Jenkins for all repos combined, using a pipeline
• Email sent to developers with report in email (see next slide)
• Developers fix module they have been working on
• Release Manager (RM) ensures that the report passes before
release; a step for this was added to our Release Plan checklist.
Source: http://massol.myxwiki.org/xwiki/bin/view/Blog/ComparingCloverReports
11.
Mutation Testing
• Using PIT/Gregor, PIT/Descartes
• Concepts of PIT
• Modify code under test (mutants) and run tests
• Good tests kill mutants
• Generates a mutation score similar to the coverage %
• Descartes = extreme mutations that execute fast and have high
value
https://massol.myxwiki.org/xwiki/bin/view/Blog/MutationTestingDescartes
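The concepts above can be illustrated without PIT. In this minimal sketch a "test" is a predicate over an implementation, mutants are hand-written variants of a `max` function, and the mutation score is the fraction of mutants that make the test fail. All names are hypothetical; PIT generates and runs mutants at the bytecode level instead.

```java
import java.util.List;
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

// Hand-rolled illustration of mutants and mutation score; not how PIT is implemented.
final class MutationSketch {
    /** A "test" returns true when it passes for the given implementation. */
    static final Predicate<IntBinaryOperator> WEAK_TEST =
        max -> max.applyAsInt(3, 3) == 3; // only checks equal inputs
    static final Predicate<IntBinaryOperator> GOOD_TEST =
        max -> max.applyAsInt(2, 5) == 5 && max.applyAsInt(5, 2) == 5;

    /** Variants of the original code "(a, b) -> a > b ? a : b". */
    static List<IntBinaryOperator> mutants() {
        return List.of(
            (a, b) -> a < b ? a : b, // condition flipped: returns the minimum
            (a, b) -> a,             // body replaced: always the first argument
            (a, b) -> 0);            // body replaced: constant return value
    }

    /** Fraction of mutants for which the test fails, i.e. mutants killed. */
    static double mutationScore(Predicate<IntBinaryOperator> test, List<IntBinaryOperator> mutants) {
        long killed = mutants.stream().filter(m -> !test.test(m)).count();
        return (double) killed / mutants.size();
    }

    public static void main(String[] args) {
        System.out.println("weak test score: " + mutationScore(WEAK_TEST, mutants()));
        System.out.println("good test score: " + mutationScore(GOOD_TEST, mutants()));
    }
}
```

Both tests pass on the original code, but the weak test kills only one of the three mutants while the good test kills all of them.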
12.
Mutation - Descartes
Image courtesy of Oscar Luis Vera Perez / INRIA / STAMP project
15.
Mutation - Example
@Test
public void testEquality()
{
MacroId id1 = new MacroId("id", Syntax.XWIKI_2_0);
MacroId id2 = new MacroId("id", Syntax.XWIKI_2_0);
MacroId id3 = new MacroId("otherid", Syntax.XWIKI_2_0);
MacroId id4 = new MacroId("id", Syntax.XHTML_1_0);
MacroId id5 = new MacroId("otherid", Syntax.XHTML_1_0);
MacroId id6 = new MacroId("id");
MacroId id7 = new MacroId("id");
Assert.assertEquals(id2, id1);
// Equal objects must have equal hashcode
Assert.assertTrue(id1.hashCode() == id2.hashCode());
Assert.assertFalse(id3 == id1);
Assert.assertFalse(id4 == id1);
Assert.assertFalse(id5 == id3);
Assert.assertFalse(id6 == id1);
Assert.assertEquals(id7, id6);
// Equal objects must have equal hashcode
Assert.assertTrue(id6.hashCode() == id7.hashCode());
}
Not testing
for inequality! (== compares object references, not equals())
Improved thanks to Descartes!
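Why this kind of test lets an extreme mutation survive can be shown with a small sketch. `Id` below is a hypothetical stand-in, not XWiki's `MacroId`, and `MutantId` imitates a mutation that replaces the body of `equals()` with `return true`:

```java
import java.util.Objects;

// Hypothetical classes for illustration; they are not part of XWiki.
final class EqualitySketch {
    static class Id {
        final String name;
        Id(String name) { this.name = name; }
        @Override public boolean equals(Object o) {
            return o instanceof Id && Objects.equals(name, ((Id) o).name);
        }
        @Override public int hashCode() { return Objects.hashCode(name); }
    }

    /** Extreme mutant: the equals() body is replaced by "return true". */
    static class MutantId extends Id {
        MutantId(String name) { super(name); }
        @Override public boolean equals(Object o) { return true; }
    }

    /** Reference comparison: true for any two distinct objects, equals() is never called. */
    static boolean weakInequalityCheck(Id a, Id b) { return !(a == b); }

    /** Value comparison: actually exercises equals(). */
    static boolean strongInequalityCheck(Id a, Id b) { return !a.equals(b); }

    public static void main(String[] args) {
        Id a = new MutantId("id");
        Id b = new MutantId("otherid");
        System.out.println(weakInequalityCheck(a, b));   // check passes: mutant survives
        System.out.println(strongInequalityCheck(a, b)); // check fails: mutant killed
    }
}
```

In a JUnit test the stronger form would typically be written with `Assert.assertNotEquals`.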
16.
Mutation - Limitations
• Takes time to find interesting things to look at and decide if that’s an issue
to handle or not. Need better categorisation in report (now reported by
Descartes):
• Strong pseudo-tested methods: The worst! No matter what the return
values are, the tests always pass
• Pseudo-tested methods: Grey area. The tests pass with at least one
modified value.
• Multi module support - PITmp
• But slow on large projects (e.g. 7+ hours just for xwiki-rendering)
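The two report categories can be expressed as a simple rule over the outcome of each extreme mutation applied to a method. This is a sketch using the category names from the slide; the enum and method are hypothetical and do not reflect Descartes' actual report format.

```java
import java.util.List;

// Hypothetical classification helper; names follow the slide, not the Descartes API.
final class PseudoTestedSketch {
    enum Category { TESTED, PSEUDO_TESTED, STRONG_PSEUDO_TESTED }

    /**
     * @param mutantKilled one entry per extreme mutation of a method:
     *        true if at least one test failed for that mutation
     */
    static Category classify(List<Boolean> mutantKilled) {
        if (mutantKilled.isEmpty()) {
            throw new IllegalArgumentException("At least one mutation is required");
        }
        long killed = mutantKilled.stream().filter(Boolean::booleanValue).count();
        if (killed == 0) {
            return Category.STRONG_PSEUDO_TESTED; // tests pass whatever the method returns
        }
        if (killed < mutantKilled.size()) {
            return Category.PSEUDO_TESTED; // tests pass with at least one modified value
        }
        return Category.TESTED; // every mutation made some test fail
    }

    public static void main(String[] args) {
        System.out.println(classify(List.of(false, false)));
        System.out.println(classify(List.of(true, false)));
        System.out.println(classify(List.of(true, true)));
    }
}
```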
17.
Mutation - Strategy
• Seems to be working ok so far (6+ months of feedback now)
• But still young and not enough data about evolution
• Fail the build when the mutation score of a given module is below
a defined threshold in the pom.xml
• The idea is that new tests should, on average, be of quality equal to or
better than past tests.
• Other idea: hook on CI to run it only on modified code/tests.
General goal with coverage + mutation: maintain quality
18.
Mutation: Going further
• Using DSpot
• Uses PIT/Descartes but injects
results to generate new tests
• Adds assertions to existing tests
• Generates new test methods
• Selector can be PIT/Gregor, PIT/
Descartes, Jacoco (instruction
coverage), Clover (Branch
coverage)
https://massol.myxwiki.org/xwiki/bin/view/Blog/TestGenerationDspot
19.
Mutation: DSpot Example 1
public void escapeAttributeValue2() {
String escapedText = XMLUtils.escapeAttributeValue("a < a' && a' < a\" => a < a\" {");
// AssertGenerator add assertion
Assert.assertEquals("a &#60; a&#39; &#38;&#38; a&#39; &#60; a&#34; =&#62; a &#60; a&#34; &#123;", escapedText);
// AssertGenerator create local variable with return value of invocation
boolean o_escapeAttributeValue__3 = escapedText.contains("<");
// AssertGenerator add assertion
Assert.assertFalse(o_escapeAttributeValue__3);
// AssertGenerator create local variable with return value of invocation
boolean o_escapeAttributeValue__4 = escapedText.contains(">");
// AssertGenerator add assertion
Assert.assertFalse(o_escapeAttributeValue__4);
// AssertGenerator create local variable with return value of invocation
boolean o_escapeAttributeValue__5 = escapedText.contains("'");
// AssertGenerator add assertion
Assert.assertFalse(o_escapeAttributeValue__5);
// AssertGenerator create local variable with return value of invocation
boolean o_escapeAttributeValue__6 = escapedText.contains("\"");
// AssertGenerator create local variable with return value of invocation
boolean o_escapeAttributeValue__7 = escapedText.contains("&&");
// AssertGenerator add assertion
Assert.assertFalse(o_escapeAttributeValue__7);
// AssertGenerator create local variable with return value of invocation
boolean o_escapeAttributeValue__8 = escapedText.contains("{");
// AssertGenerator add assertion
Assert.assertFalse(o_escapeAttributeValue__8);
}
Generated test
New test
@Test
public void escapeAttributeValue()
{
String escapedText = XMLUtils.escapeAttributeValue("a < a' && a' < a\" => a < a\" {");
assertFalse("Failed to escape <", escapedText.contains("<"));
assertFalse("Failed to escape >", escapedText.contains(">"));
assertFalse("Failed to escape '", escapedText.contains("'"));
assertFalse("Failed to escape \"", escapedText.contains("\""));
assertFalse("Failed to escape &", escapedText.contains("&&"));
assertFalse("Failed to escape {", escapedText.contains("{"));
}
Original test
20.
Mutation: DSpot Example 2
Generated test
Original test
Also increase coverage
Before: 70.5%
After: 71.2%
21.
Mutation: DSpot Strategy
• DSpot is very slow to execute (between 3 and 20 minutes on
small modules)
• One strategy is to run it on CI and have the pipeline commit the
generated tests in a different source root.
• And run it only on tests affected by the commit changeset
• Configure Maven to add a new test directory source using
the Maven Build Helper plugin.
• Work in progress: small coverage and mutation score
improvements on XWiki so far.
22.
Environment Testing
• Environment = combination of Servlet
container & version, DB & version, OS,
Browser & version
• Future: cluster mode, LibreOffice
integration, external SOLR, etc.
• Need: Be able to run/debug functional
tests on local dev machines as well as on
CI
• Using Docker / TestContainers
25.
Environment Testing
• Feedback: takes about 3 minutes to deploy all (and 1 minute for the
test)
• Strategy
• Run on CI (Jenkins)
• 3 jobs
• “latest”: latest versions of all elements (DB, Servlet Container,
Browser, etc). Once per day
• “all”: all supported versions. Once per week
• “unsupported”: what we want to support in the future. Once
per month.
• Future: IE/Edge + Docker in Docker
• Some instability with Docker and DinD/DooD.
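The "all" job has to cover every supported combination of the environment dimensions. A minimal sketch of how such a matrix could be enumerated is shown below; the class is hypothetical and the product and version names are illustrative values, not XWiki's actual support matrix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; the listed products and versions are examples only.
final class EnvironmentMatrix {
    /** Builds every combination of one value per dimension (cartesian product). */
    static List<List<String>> combinations(List<List<String>> dimensions) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>()); // start with one empty combination
        for (List<String> dimension : dimensions) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> partial : result) {
                for (String value : dimension) {
                    List<String> extended = new ArrayList<>(partial);
                    extended.add(value);
                    next.add(extended);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> all = combinations(List.of(
            List.of("mysql:5.7", "postgres:11"), // database
            List.of("tomcat:8", "jetty:9"),      // servlet container
            List.of("firefox", "chrome")));      // browser
        all.forEach(System.out::println);        // 2 x 2 x 2 = 8 environments
    }
}
```

The number of combinations grows multiplicatively with each dimension, which is one reason to run the full matrix only weekly and the "latest" combination daily.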
26.
Crash Reproduction
• Tool: Botsing
• Concept: Take a stack trace
and generate a test that,
when executed, leads to this
stack trace
• i.e. find the conditions that
lead to the problem
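What "a test that leads to this stack trace" means can be sketched as a check on a candidate input: run it, and see whether it throws the target exception type with the target method in the trace. The code below is a hand-written illustration with hypothetical names. Botsing itself searches for such inputs automatically, and that search is not shown here.

```java
import java.util.Arrays;

// Hypothetical sketch of the reproduction criterion; not Botsing's implementation.
final class CrashReproductionSketch {
    /** Hypothetical code under test: fails when the title is null. */
    static int titleLength(String title) {
        return title.length(); // throws NullPointerException when title is null
    }

    /** True when the candidate throws the target exception with the target method in its trace. */
    static boolean reproduces(Runnable candidate, Class<? extends Throwable> type, String method) {
        try {
            candidate.run();
            return false; // no crash, so the stack trace is not reproduced
        } catch (Throwable t) {
            return type.isInstance(t)
                && Arrays.stream(t.getStackTrace())
                         .anyMatch(frame -> frame.getMethodName().equals(method));
        }
    }

    public static void main(String[] args) {
        // Candidate inputs a search could try; only the null title reproduces the crash.
        System.out.println(reproduces(() -> titleLength("home"), NullPointerException.class, "titleLength"));
        System.out.println(reproduces(() -> titleLength(null), NullPointerException.class, "titleLength"));
    }
}
```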
28.
Botsing - Feedback
• Can take a long time to reproduce, doesn’t always succeed
• Generates a test that reproduces the problem, not the fix!
• Often you’d write a test at a different level (usually up in the call
chain, to be more meaningful to the use case)
• Is useful for newcomers who don’t know the codebase well as it
helps pinpoint the problem. Acts as a timesaver.
29.
Parting words
• Experiment, push the limit!
• Some other types of tests not covered and that also need
automation
• Backward compatibility testing
• Performance/Stress testing
• Usability testing
• others?