1. Software Verification & Validation Lab
Automating System Test Case
Classification and Prioritization
for Use Case-Driven Testing
in Product Lines
Ines Hajri1, Arda Goknil2, Fabrizio Pastore1, Lionel Briand1,3
Journal First, Empirical Software Engineering Journal (2020)
1 SnT Centre/University of Luxembourg, Luxembourg
2 SINTEF Digital, Norway
3 University of Ottawa, Canada
ESEC/FSE 2021
2. Motivation
• In several product lines:
  Ø it is infeasible to define all system-level test cases beforehand, at the PL level
  Ø it is infeasible to simply reuse and execute all test cases of previous products
• Current test practice is "opportunistic reuse of test assets" [Da Mota 2011]
  Ø system test cases are manually selected for reuse
    - not systematic: many test cases are rewritten from scratch
  Ø no guidance to prioritize test cases
    - related work mostly relies on source code analysis, but source code is only partially available in industrial contexts
    - this delays failure detection; risky and critical test cases are potentially executed last
3. Objective
Support the definition and prioritization of the test suite for a new product by maximizing the reuse of the test suites of existing products in a product line, relying on requirements analysis rather than source code analysis.
4. Product Line Use Case Modeling [Hajri 2015]
[Figure: overview of the modeling approach. Variability is modeled in the PL use case and domain models (PL use case diagram, PL use case specifications, PL domain model). An interactive configuration step resolves the variation points (VP1-VP3) into a decision model, from which the product-specific artifacts are automatically configured: PS use case diagram, PS use case specifications, and PS domain model. Change impact analysis supports incremental reconfiguration and produces an impact analysis report. The example PL use case diagram involves the actors Sensors, Tester, and STO Controller, the use cases Recognize Gesture, Identify System Operating Status, and Provide System Operating Status, and the variation points Storing Error Status and Clearing Error Status with the <<Variant>> use cases Store Error Status and Clear Error Status.]
PL: Product Line; PS: Product Specific
5. Product Line Use Case Modeling (cont.)
[Figure: same overview as the previous slide, highlighting the configuration of PS use case specifications.]
Use Case: Provide System User Data via Standard Mode

Basic Flow
1. OPTIONAL STEP: The system requests the move capacitance from the sensors
2. INCLUDE VARIATION POINT: INCLUDE USE CASE Identify System Operating Status
3. The system VALIDATES THAT the operating status is valid
4. The system VALIDATES THAT the movement is a valid kick
5. The system SENDS the valid kick status TO the STO Controller

Alternative Flow (RFS 3)
1. ABORT

Alternative Flow (RFS 4)
1. The system increments the OveruseCounter by the increment step
2. ABORT

Configuration decisions per product:

Flow / Step                          | Product 1 | Product 2
Basic Flow, step 1                   | SELECTED  | NOT SELECTED
Basic Flow, step 2                   | SELECTED  | NOT SELECTED
Basic Flow, step 3                   | SELECTED  | SELECTED
Basic Flow, step 4                   | SELECTED  | SELECTED
Basic Flow, step 5                   | SELECTED  | SELECTED
Alternative Flow (RFS 3), step 1     | SELECTED  | SELECTED
Alternative Flow (RFS 4), steps 1-2  | SELECTED  | SELECTED
6. Product Line Use Case Modeling (cont.)
[Figure: same overview as slide 4.]
7. Test Case Classification and Prioritization: Overview
Step 1 - Classify system test cases for the new product
  Inputs: PS models and decision model for the new product; test cases, PS models and their traces, and decision models for previous product(s)
  Outputs: partial test suite for the new product; guidance to update test cases
Step 2 - Create new test cases using the guidance
  Output: test suite for the new product
Step 3 - Prioritize system test cases for the new product
  Inputs: test execution history, variability information, size of use case scenarios, classification of test cases
  Output: prioritized test suite for the new product
8. Test Case Classification and Prioritization: Step 1
[Figure: the classification step takes as input the use case specifications and decision model for the new product, plus the test cases, trace links, use case specifications, and decision models for previous products (with variation points VP1-VP3). It outputs a partial test suite for the new product and guidance to update test cases.]
9. Test Case Classification
• Three classes [Briand 2009]:
  • reusable (may be re-run)
  • retestable (shall be re-run)
  • obsolete (shall be deleted)
• Determined by comparing the sequence of use case steps traversed by a test case in the old and in the new product
  • relies on a use case scenario model
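The comparison behind the three classes can be sketched in a few lines of Python. This is a simplified illustration, not the paper's exact rule set: the `Step` type, the step kinds, and the example steps are all hypothetical, and the full rules (see the paper) distinguish more situations.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    text: str
    kind: str  # "input", "output", or "internal"

def classify(old_path: list, new_path: list) -> str:
    """Classify a test case by comparing the sequence of use case
    steps it traverses in the old and the new product (simplified)."""
    if old_path == new_path:
        return "reusable"      # unchanged: may be re-run as-is
    # Keep only the steps that represent system-actor interactions.
    io = lambda path: [s for s in path if s.kind in ("input", "output")]
    if io(old_path) == io(new_path):
        return "retestable"    # inputs/outputs still valid: shall re-run
    return "obsolete"          # interaction sequence invalidated: shall delete

# Hypothetical example: an internal step changed, inputs/outputs intact.
old = [Step("request capacitance", "input"),
       Step("store status", "internal"),
       Step("send kick status", "output")]
new = [Step("request capacitance", "input"),
       Step("log status", "internal"),
       Step("send kick status", "output")]
print(classify(old, new))  # retestable
```

An identical path yields "reusable"; a change to an input or output step yields "obsolete", matching the examples on the following slides.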
10. Use Case Scenario Model
[Figure: the use case specification as a scenario model, a graph whose nodes (labeled A-H) are use case steps and whose edges follow the control flow, with true/false branches at the VALIDATES THAT steps. Basic flow: the system requests the move capacitance from the sensors; INCLUDE USE CASE Identify System Operating Status; the system VALIDATES THAT the operating status is valid; the system VALIDATES THAT the movement is a valid kick; the system SENDS the valid kick status TO the STO Controller. Alternative flows: ABORT when the operating status is invalid; increment the OveruseCounter by the increment step and ABORT when the movement is not a valid kick.]
11. Use Case Scenario Model (cont.)
[Figure: the same scenario model, with the paths covered by test cases TC1, TC2, and TC3 highlighted.]
13. Test Case Classification: Retestable Tests
[Figure: scenario models of the old and the new product (steps A-H). An internal step, i.e., a step representing internal system operations, changed between the products, so the test case is classified as retestable.]
The sequence of inputs and outputs remains valid.
14. Test Case Classification: Obsolete Tests
[Figure: scenario models of the old and the new product (steps A-H). An input step, i.e., a step representing system-actor interactions, changed between the products, so the test case is classified as obsolete.]
The sequence of test inputs and outputs might be invalidated by the change.
Refer to our paper for the complete list of rules.
15. Identification of New Scenarios
[Figure: use case scenario model for the new product, including new steps X and Z.]
• Perform a depth-first traversal guided by the use case scenarios of previous products
  - We analyze the scenarios covered by retestable and obsolete test cases
• Any traversed path that includes steps that were not exercised by any test case of a previous product is considered a new scenario
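The traversal above can be sketched as a plain depth-first search. This is a minimal illustration under simplifying assumptions (acyclic scenario model, steps as strings); the graph, the step names X and Z, and the coverage set are hypothetical.

```python
def new_scenarios(graph, start, covered_steps):
    """Depth-first traversal of the new product's scenario model.
    A complete path containing any step never exercised by a previous
    product's test cases is reported as a new scenario.
    graph: dict mapping a step to its successors; terminal steps map to [].
    covered_steps: steps exercised by previous products' test cases."""
    found = []
    def dfs(step, path):
        path = path + [step]
        if not graph[step]:  # terminal step: a complete scenario
            if any(s not in covered_steps for s in path):
                found.append(path)
            return
        for nxt in graph[step]:
            dfs(nxt, path)
    dfs(start, [])
    return found

# Hypothetical scenario model: X and Z are steps added in the new product.
graph = {"A": ["B"], "B": ["C", "X"], "C": ["E"], "X": ["Z"], "Z": ["E"], "E": []}
covered = {"A", "B", "C", "E"}  # exercised in previous products
print(new_scenarios(graph, "A", covered))  # [['A', 'B', 'X', 'Z', 'E']]
```

The path through C reaches only covered steps and is skipped; the path through X and Z contains uncovered steps and is reported as a new scenario.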
16. Guidance Generation
• We compare old and new scenarios and determine their differences
• We provide a set of suggestions for adding, removing, and updating test case steps, corresponding to the use case steps that were added, removed, or updated between the old and the new scenario
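The scenario comparison can be sketched with a standard sequence diff. This is an illustrative stand-in for the approach's own comparison, using Python's `difflib`; the scenarios, the suggestion wording, and the test case ID are hypothetical.

```python
import difflib

def guidance(old_scenario, new_scenario, test_case_id):
    """Suggest test case updates from the differences between the old
    and new scenario (each a list of use case step descriptions)."""
    suggestions = []
    matcher = difflib.SequenceMatcher(a=old_scenario, b=new_scenario)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "delete":      # step removed from the scenario
            for step in old_scenario[i1:i2]:
                suggestions.append(
                    f"Update {test_case_id}: step '{step}' was deleted from the scenario.")
        elif op == "insert":    # step added to the scenario
            for step in new_scenario[j1:j2]:
                suggestions.append(
                    f"Update {test_case_id}: step '{step}' was added to the scenario.")
        elif op == "replace":   # step updated in the scenario
            for o, n in zip(old_scenario[i1:i2], new_scenario[j1:j2]):
                suggestions.append(
                    f"Update {test_case_id}: step '{o}' was changed to '{n}'.")
    return suggestions

old = ["request capacitance", "validate status", "send kick status"]
new = ["request capacitance", "send kick status"]
for s in guidance(old, new, "TC2"):
    print(s)
```

For the example, the diff reports the single deleted step, mirroring the guidance message shown on the next slide.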
17. Guidance Example Outputs
[Figure: the scenario of the previous product (steps A-D, G, H), with the deleted step highlighted in red.]
Generated suggestion: "Please update the existing test case 'TC2' to account for the fact that the step in red was deleted from the scenario of the use case specifications of the previous product."
18. Test Case Classification and Prioritization: Step 2
[Figure: the overall process after step 2. Classifying the previous test cases yields a partial test suite for the new product; creating new test cases using the guidance completes the test suite for the new product.]
19. Test Case Classification and Prioritization: Step 3
[Figure: the overall process after step 3. The prioritization step takes the test suite for the new product together with the test execution history, variability information, size of use case scenarios, and classification of test cases, and outputs a prioritized test suite for the new product.]
20. Test Case Prioritization Approach
Step 1 - Identifying significant factors
• Candidate factors: degree of variability, whether the test case is retestable, number of failing products, number of failing versions, number of steps in the use case scenario
• We rely on logistic regression; significance is determined by the p-value of each parameter
Step 2 - Prioritizing test cases based on the significant factors
• The regression model predicts the probability of failure of each test case
• Test cases are sorted in descending order of predicted probability
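The scoring and sorting in step 2 can be sketched as follows. The coefficients, factor names, and test cases are hypothetical; in the approach they come from a logistic regression fitted on previous products' execution data, with only the significant factors retained.

```python
import math

# Hypothetical coefficients; in the approach they are fitted with
# logistic regression on the previous products' test execution history.
COEFFS = {"intercept": -2.0, "retestable": 1.2,
          "failing_products": 0.8, "scenario_steps": 0.05}

def failure_probability(tc):
    """Logistic model: p = 1 / (1 + exp(-z)), where z is a linear
    combination of the test case's significant factors."""
    z = (COEFFS["intercept"]
         + COEFFS["retestable"] * tc["retestable"]
         + COEFFS["failing_products"] * tc["failing_products"]
         + COEFFS["scenario_steps"] * tc["scenario_steps"])
    return 1.0 / (1.0 + math.exp(-z))

def prioritize(test_cases):
    """Sort test cases in descending order of predicted failure probability."""
    return sorted(test_cases, key=failure_probability, reverse=True)

suite = [
    {"id": "TC1", "retestable": 0, "failing_products": 0, "scenario_steps": 5},
    {"id": "TC2", "retestable": 1, "failing_products": 3, "scenario_steps": 12},
    {"id": "TC3", "retestable": 1, "failing_products": 1, "scenario_steps": 8},
]
print([tc["id"] for tc in prioritize(suite)])  # ['TC2', 'TC3', 'TC1']
```

The retestable test case with the richest failure history scores highest, so it runs first.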
21. Empirical Evaluation
• RQ1: Does the proposed approach provide correct test case
classification results?
• RQ2: Does the proposed approach accurately identify new
scenarios that are relevant for testing a new product?
• RQ3: Does the proposed approach successfully prioritize test
cases?
• RQ4: Can the proposed approach significantly reduce test
development cost compared to current industrial practice?
23. Case Study Subject
• Smart Trunk Opener (STO) system developed by our industry partner IEE
• 14 variant use cases, 8 variation points, 13 optional alternative flows, and 27 optional steps

Product ID | # Versions | # Use Cases | # Use Case Flows | # Use Case Steps | # Test Cases
P1         | 22         | 28          | 236              | 689              | 110
P2         | 8          | 25          | 169              | 568              | 86
P3         | 10         | 28          | 234              | 685              | 96
P4         | 5          | 26          | 212              | 618              | 83
P5         | 9          | 28          | 238              | 695              | 113
24. RQ3: Effectiveness of Test Case Prioritization

Our approach identifies more than 80% of the failures by executing less than 50% of the test cases.

Classified Test Suites | Product to be Tested | % Test Cases Executed to Identify All Failures | % Test Cases Executed to Identify 80% of Failures
P1                     | P2                   | 72.09 | 38.37
P1, P2                 | P3                   | 41.66 | 22.91
P1, P2, P3             | P4                   | 51.80 | 22.89
P1, P2, P3, P4         | P5                   | 26.54 | 18.58

In the paper, we also compare results with the ideal situation where all failing test cases are executed first. Result: when data is available for at least two products, we achieve the same results with half of the test cases.
25. RQ3: Effectiveness of Test Case Prioritization (cont.)
• We compared results from our approach with the ideal situation where all failing test cases are executed first
• For both results (ideal and that of our approach):
  - We computed the Area Under Curve (AUC) of the cumulative percentage of failures triggered by executed test cases
  - We computed the AUC ratio
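The AUC comparison can be sketched with the trapezoidal rule over the cumulative failure curve. The curves below are hypothetical, and reading the AUC ratio as observed over ideal is an assumption of this sketch.

```python
def auc(cumulative_failures):
    """Area under the cumulative-percentage-of-failures curve,
    computed with the trapezoidal rule (x = index of the executed
    test case, y = cumulative fraction of failures found so far)."""
    total = 0.0
    for i in range(1, len(cumulative_failures)):
        total += (cumulative_failures[i - 1] + cumulative_failures[i]) / 2.0
    return total

# Hypothetical 4-test-case suite with 2 failing tests.
ideal = [0.5, 1.0, 1.0, 1.0]      # failing tests executed first
observed = [0.0, 0.5, 1.0, 1.0]   # prioritized order finds them slightly later
ratio = auc(observed) / auc(ideal)
print(round(ratio, 2))  # 0.73
```

A ratio close to 1 means the prioritized order triggers failures almost as early as the ideal order, which is what the plots on the next slide show for the STO products.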
26. [Figure: four plots of the cumulative percentage of failures against the number of executed test cases, comparing the ideal and observed prioritization for each new product. AUC ratios: 0.98, 0.99, 0.97, and 0.95.]
27. RQ4: Reducing Test Development Cost
• Number of test cases inherited from previous products
• Number of test cases that need to be implemented

Classified Test Suites | New Product | Test Cases Inherited from Previous Products (Reusable/Retestable) | New Test Cases | Test Suite Size for the New Product
P1                     | P2          | 83  | 3  | 86
P1, P2                 | P3          | 95  | 1  | 96
P1, P2, P3             | P4          | 83  | 0  | 83
P1, P2, P3, P4         | P5          | 110 | 14 | 113
28. RQ4: Reducing Test Development Cost (cont.)
[Same table as the previous slide.]
95% of the test cases can be inherited from the test suites of previous products.
29. Summary
Objective: support the definition and prioritization of the test suite for a new product by maximizing the reuse of the test suites of existing products in a product line, relying on requirements rather than source code analysis.
Approach: (1) classify previous test cases, (2) create new test cases using guidance, (3) prioritize the system test cases for the new product.
Results: our approach identifies more than 80% of the failures by executing less than 50% of the test cases, and 95% of the test cases can be inherited from the test suites of previous products.
30. Automating System Test Case Classification and Prioritization for Use Case-Driven Testing in Product Lines
Ines Hajri1, Arda Goknil2, Fabrizio Pastore1, Lionel Briand1,3
1 SnT Centre/University of Luxembourg, Luxembourg
2 SINTEF Digital, Norway
3 University of Ottawa, Canada