PHANTA: Diversified Test Code Quality Measurement for Modern Software Development
Susumu Tokumoto, Kuniharu Takayama
FUJITSU LABORATORIES LTD., Japan
ASE 2019 Industry Showcase
Copyright 2019 FUJITSU LABORATORIES LTD.
2. Background
How should we assure the quality of test code?
The quality of production code is assured in various ways, such as testing and review,
but there are few tools or standard practices for assuring the quality of test code.
Production Code:
int foo(int x){
    int ret = 1;
    while(x > 0){
        ret *= x;
        x--;
    }
    return ret;
}
Test Code:
TEST(foo, one){
    int x = foo(1);
    assert(x == 1);
}
TEST(foo, two){
    int x = foo(2);
    assert(x == 2);
}
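These two tests exercise every line of foo, yet they are weak: each expected value happens to equal the input, so the same suite also passes against code that simply returns x. A minimal sketch of that blind spot (illustrative only; fooBroken and suitePasses are hypothetical names, not from the slides):

```java
import java.util.function.IntUnaryOperator;

public class WeakTestDemo {
    // The slide's production code: factorial of x.
    static int foo(int x) {
        int ret = 1;
        while (x > 0) { ret *= x; x--; }
        return ret;
    }

    // Hypothetical buggy implementation: returns the input unchanged.
    static int fooBroken(int x) {
        return x;
    }

    // The slide's two tests: foo(1) == 1 and foo(2) == 2.
    static boolean suitePasses(IntUnaryOperator f) {
        return f.applyAsInt(1) == 1 && f.applyAsInt(2) == 2;
    }

    public static void main(String[] args) {
        // Both print true: the suite cannot tell factorial from the identity function.
        System.out.println("correct passes: " + suitePasses(WeakTestDemo::foo));
        System.out.println("broken passes: " + suitePasses(WeakTestDemo::fooBroken));
    }
}
```

Full coverage with passing tests therefore says little by itself; this is exactly the gap the rest of the deck addresses.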
[Figure: example of a manual Test Specification, a Japanese "Test Case Table" for an order-history search program (program ID L57-005g20) in an order-management subsystem (code L57). Each row lists a test case name (L57-0001 through L57-0008), the test case (functional tests of the business specification and control specifications, a boundary-value test such as "the number of retrieved payment records exceeds the maximum for name aggregation", and a threshold test marked "no applicable processing"), input/output data files (e.g., a600100101.in1, a600100101.ot1), the expected result, the number of test items, the verification date (2003/4/21-22), and the verifier.]
☑Manual Testing — driven by a Test Specification
☑Automated Testing — driven by Test Code
Coverages:
Function   Line    Branch
foo        100%    100%
bar         54%     20%
baz         75%     51%
Total       82%     73%
[Diagram: an Engineer reviews production code (☑Review) and measures its coverage, but whether and how to review the test code itself is an open question (?Review).]
For production code there are various ways to assure the quality; for test code, how do we assure the quality? We need technology to check the quality of test code.
Automated quality measurement for test code!
• Improves the maintainability of test code, which in turn improves the quality of production code
• Reduces cost by automating the manual review of test code
• Adds value to Fujitsu's system integration business
3. What is Good Test Code?
Does 100% Code Coverage Mean Perfect Test Code?
Is Automation of Testing the Goal of Creating Test Code?
4. Position of Test Code in the Testing Quadrants
Feel confident about the code
Refactor frequently with confidence
Release with confidence
Support programming
Manage technical debt
Test code serves as a safety net rather than producing business value directly.
Lisa Crispin and Janet Gregory. 2009. Agile Testing: A Practical Guide for Testers and Agile Teams (1st ed.). Addison-Wesley Professional.
5. Multiple Forces of Influence Related to Tests Directly or Indirectly Affect Productivity
Lasse Koskela, "Effective Unit Testing: A Guide for Java Developers", p. 8, Figure 1.3
6. Don’t Do Everything from the Beginning
You should NOT stick to
• doing everything from
the beginning
• test-driven development
• test-first development
• “unit” test
• code coverage
• test speed
You should stick to
• reproducible and
repeatable tests
• isolated tests
Takuto Wada, “Working Effectively with Legacy Code” https://speakerdeck.com/twada/working-effectively-with-legacy-code
7. Good Test Code
Good test code plays the role of a safety net and positively affects productivity, with diverse and prioritized quality metrics; this is achieved by monitoring those quality metrics.
8. Maturity Levels of Test Code
No Test Code → Isolated and Repeatable Tests → Requirements-covered Tests → Code-covered Tests → High-Speed Tests → Maintainable Tests
From an unsteady defense, to a steady defense and a turn to attack, to spending enough time on attack.
We need a tool to measure and improve these qualities.
10. Quality of Test Code PHANTA Measures
Bug Detectability
• How many bugs the test code can detect
Maintainability
• Flexibility for fixing
• Readability
Speed
• Test duration
Measured by four technologies:
• Technology 1: Mutation Analysis
• Technology 2: Active Assertion Analysis
• Technology 3: Test Code Clone Detection
• Technology 4: Scoring Test Duration
11. Technology 1: Mutation Analysis
Inject various bugs (mutants) and automatically check whether the tests detect the injected bugs.
Program Under Test:
1: int abs(int x){
2:   if(x <= 0){
3:     return -x;
4:   }
5:   return x;
6: }
Mutant notation: (location of mutation, mutation operator)
mutant1 (line 2, <= → >):
int abs(int x){
  if(x > 0){
    return -x;
  }
  return x;
}
mutant2 (line 2, 0 → 1):
int abs(int x){
  if(x <= 1){
    return -x;
  }
  return x;
}
mutant3 (line 3, - → --):
int abs(int x){
  if(x <= 0){
    return --x;
  }
  return x;
}
Results:
          Test1: abs(2)==2   Test2: abs(-2)==2   Verdict
mutant1   fail               fail                killed
mutant2   pass               pass                unkilled
mutant3   pass               fail                killed
Mutation score: 2/3 = 0.667 — the ratio of killed mutants, i.e., the bug detectability.
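The analysis above can be sketched in a few lines. This is an illustrative toy, not PHANTA's implementation: each mutant of abs is hand-written rather than generated, and a mutant counts as killed when at least one of the slide's two tests fails against it:

```java
import java.util.List;
import java.util.function.IntUnaryOperator;

public class MutationDemo {
    // Hand-written mutants of abs (real tools generate these automatically).
    static int mutant1(int x) { if (x > 0)  return -x;  return x; } // line 2: <= -> >
    static int mutant2(int x) { if (x <= 1) return -x;  return x; } // line 2: 0 -> 1
    static int mutant3(int x) { if (x <= 0) return --x; return x; } // line 3: - -> --

    // The slide's two tests: abs(2) == 2 and abs(-2) == 2.
    // A mutant is killed if at least one test fails against it.
    static boolean killed(IntUnaryOperator m) {
        return m.applyAsInt(2) != 2 || m.applyAsInt(-2) != 2;
    }

    public static void main(String[] args) {
        List<IntUnaryOperator> mutants =
            List.of(MutationDemo::mutant1, MutationDemo::mutant2, MutationDemo::mutant3);
        long killedCount = mutants.stream().filter(MutationDemo::killed).count();
        // mutant1 and mutant3 are killed, mutant2 survives -> score 2/3
        System.out.printf("mutation score: %d/%d%n", killedCount, mutants.size());
    }
}
```

Running it reports a mutation score of 2/3, matching the slide's table.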
12. Technology 2: Active Assertion Analysis
Assertions that don't contribute to bug detection can make the test code less readable.
→ Assertions that don't kill any mutants are regarded as inactive, and this information is used to improve the test code's readability.
An assertion that kills no mutant does not contribute to bug detection.
Method under test:
1: public Bar method(){
2:   b = 1;
3:   c = 2;
4: }
mutant1 (line 2, 1 → 0):
1: public Bar method(){
2:   b = 0;
3:   c = 2;
4: }
mutant2 (line 3, 2 → 0):
1: public Bar method(){
2:   b = 1;
3:   c = 0;
4: }
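The per-assertion analysis can be sketched as follows (illustrative only, not PHANTA's code; Bar is reduced to a record and the assertion labels are hypothetical): each assertion is evaluated separately against each mutant, and an assertion that fails on no mutant is flagged as inactive:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

public class ActiveAssertionDemo {
    // Bar reduced to a record holding the two fields set by method().
    record Bar(int b, int c) {}

    static Bar original() { return new Bar(1, 2); }
    static Bar mutant1()  { return new Bar(0, 2); } // line 2: 1 -> 0
    static Bar mutant2()  { return new Bar(1, 0); } // line 3: 2 -> 0

    public static void main(String[] args) {
        // Each assertion modeled as a predicate over the method's result.
        Map<String, Predicate<Bar>> assertions = new LinkedHashMap<>();
        assertions.put("assertEquals(1, bar.b)", bar -> bar.b() == 1);
        assertions.put("assertEquals(2, bar.c)", bar -> bar.c() == 2);
        assertions.put("assertNotNull(bar)",     bar -> bar != null); // kills nothing

        for (var e : assertions.entrySet()) {
            // An assertion is active if it fails (i.e., kills) on some mutant.
            boolean kills = !e.getValue().test(mutant1()) || !e.getValue().test(mutant2());
            System.out.println(e.getKey() + " -> " + (kills ? "active" : "inactive"));
        }
    }
}
```

The first two assertions each kill one mutant and are active; the assertNotNull-style assertion kills neither and would be flagged as a readability cost.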
13. Technology 3: Test Code Clone Detection
Code clone: matching or similar chunks of source code.
Example of a test code clone:
@Test
public void testFoo1() {
    Foo foo = new Foo();
    assertEquals(12, foo.methodA(0));
}
@Test
public void testFoo2() {
    Foo foo = new Foo();
    assertEquals(34, foo.methodA(1));
}
The two tests match except for the method names and literals → they are regarded as a code clone.
Code clones are considered harmful for maintenance, so detecting them can be used to improve maintainability.
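A toy version of the "match except method names and literals" idea (illustrative assumptions: real clone detectors work on token streams or ASTs, not regexes, and the testFoo-name rule below is purely for this example):

```java
public class CloneDemo {
    // Replace literals and the varying method names with placeholders,
    // keeping keywords and punctuation, then compare the normalized text.
    static String normalize(String code) {
        return code
            .replaceAll("\"[^\"]*\"", "LIT")        // string literals -> LIT
            .replaceAll("\\b\\d+\\b", "LIT")        // number literals -> LIT
            .replaceAll("\\btestFoo\\d+\\b", "ID")  // method names -> ID (toy rule)
            .replaceAll("\\s+", " ").trim();        // ignore whitespace differences
    }

    public static void main(String[] args) {
        String t1 = "public void testFoo1() { Foo foo = new Foo(); assertEquals(12, foo.methodA(0)); }";
        String t2 = "public void testFoo2() { Foo foo = new Foo(); assertEquals(34, foo.methodA(1)); }";
        // The two tests normalize to identical text -> reported as a clone pair.
        System.out.println("clone: " + normalize(t1).equals(normalize(t2)));
    }
}
```

Both tests normalize to the same string, so the pair is reported as a clone, as on the slide.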
14. Technology 4: Scoring Test Duration
We aggregated the test duration per test case from 225 open source projects, which lets us calculate the relative speed of the target tests as a normalized score.
[Figure: the distribution of test duration per test case in OSS is transformed to a normal distribution to build a score model; e.g., a duration of 0.489 sec maps to a score of 50 (the average). Axes: score vs. number of projects.]
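The scoring step might look like the following sketch. The slides only state that the OSS duration distribution is transformed to a normal distribution and that 0.489 sec maps to the average score of 50; the log transform and the 10-points-per-standard-deviation scale below are assumptions for illustration, not the published model:

```java
public class DurationScoreDemo {
    // Assumed parameters of the score model: mean of log-durations anchored so
    // that 0.489 sec is average; sigma = 1.0 is a placeholder, not a real fit.
    static final double MU = Math.log(0.489);
    static final double SIGMA = 1.0;

    // Deviation-value style score: 50 is average, higher = faster test,
    // 10 points per standard deviation of the log-duration.
    static double score(double seconds) {
        double z = (Math.log(seconds) - MU) / SIGMA;
        return 50.0 - 10.0 * z;
    }

    public static void main(String[] args) {
        System.out.printf("0.489 sec -> %.0f%n", score(0.489)); // the average case
        System.out.printf("5.000 sec -> %.0f%n", score(5.0));   // slower -> below 50
    }
}
```

By construction, 0.489 sec scores exactly 50 and slower tests score below 50, mirroring the figure's example.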
15. Case Study: Factory Automation Application
16. Subject Application
Feature:
• RESTful API server in Factory Automation
Size of Code:
• Source Code: 37 KLOC
• Test Code: 21 KLOC
• Over 1,000 test cases
Application requires:
• Java 8
• DBMS
• Apache Maven
Architecture:
[Diagram: a Client calls the Application Server, where a Servlet dispatches through Authentication and a Validator to the Services, which use a Data Access layer backed by a DBMS.]
17. Install and Run PHANTA
Add the internal Maven repository to ~/.m2/settings.xml:
<settings>
  <mirrors>
    <mirror>
      <id>freiburg-nexus</id>
      <mirrorOf>*</mirrorOf>
      <url>http://freiburg.dyn.soft.flab.fujitsu.co.jp:8081/repository/maven-public/</url>
    </mirror>
  </mirrors>
</settings>
Add the plugin to build/plugins in your pom.xml:
<plugin>
  <groupId>com.fujitsu.labs.phanta</groupId>
  <artifactId>phanta-maven</artifactId>
  <version>0.1.0-SNAPSHOT</version>
</plugin>
Run the plugin from the command line:
$ mvn com.fujitsu.labs:phanta-maven:analyze
18. Analysis Summary
Analysis duration: 3 h 8 min (sequential, but could be parallelized)
[Report legend: Yellow = Caution, Red = Danger]
19. Mutation Analysis Report
[Report shows per-module results for the Servlet, Authentication, Services, and Data Access layers.]
Moderate line coverage and mutation coverage, with an acceptable gap between the two → caused by well-written assertions.
20. Active Assertion Analysis Report
Moderate number of active assertions, with low variance.
21. Proposals for Improving the Test Code
Repeatability
• Some test cases look flaky because their expected values are written as order-sensitive even though order-sensitivity is not needed.
Coverage
• Test cases should be added for frequently updated code with low coverage, especially zero coverage.
Speed, Maintainability
• Use an in-memory DB or stubs/mocks to relieve the DB access overhead.
• Turn test code clones into parameterized tests.
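The last proposal applies directly to the clone pair from Technology 3. The sketch below stays dependency-free with a plain data table and loop; with JUnit 5 the same shape would be a @ParameterizedTest with a @CsvSource of (input, expected) pairs. Foo and its return values are stand-ins, not the case study's real code:

```java
public class ParameterizedDemo {
    // Hypothetical class under test, chosen so the slide's two cases pass.
    static class Foo {
        int methodA(int x) { return x == 0 ? 12 : 34; }
    }

    public static void main(String[] args) {
        // One table replaces the two cloned test methods: {input, expected}.
        int[][] cases = { {0, 12}, {1, 34} };
        for (int[] c : cases) {
            int actual = new Foo().methodA(c[0]);
            if (actual != c[1])
                throw new AssertionError("methodA(" + c[0] + ") = " + actual);
        }
        System.out.println("all " + cases.length + " cases passed");
    }
}
```

Adding a new input/expected pair now means adding one table row instead of one more cloned method, which is the maintainability win the proposal is after.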
22. Feedback from the Test Code Developers
Overall, the result of the analysis is reasonable and acceptable.
Average values are not very meaningful
• The code outside Services is out of the tests' scope because it was derived from another past project and had already been tested well.
• This is why the code outside Services has much lower coverage than Services.
Equivalent mutants confused the developers
• They wondered why the mutation coverages in Services are not 100%.
• There were equivalent mutants (not killable) in the Services code, but this was not evident from the report.
Understanding the report is difficult
• They found it difficult to understand what the report means and what action items the various metrics imply.
Maintainability is more important than we assumed
• They considered the maintainability of test code more important than some other qualities, such as coverage.
23. Lessons Learned
Make the report understandable for developers
• Remove equivalent mutants
• Lead developers to the next action
Make the tool flexible and customizable
• Let developers select the targets to be measured
Automated test code refactoring could be helpful
• The developers want more maintainable test code