Search-Based Software Testing in Industry
---
Research Collaborations and Lessons Learned
Lionel Briand
Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT)
University of Luxembourg, Luxembourg
SBST, Hyderabad, 2014
SnT Software Verification and Validation Lab
•  SnT centre, Est. 2009: Interdisciplinary,
ICT security-reliability-trust
•  200 scientists and Ph.D. candidates, 20
industry partners
•  SVV Lab: Established January 2012,
www.svv.lu
•  25 scientists (Research scientists,
associates, and PhD candidates)
•  Industry-relevant research on system
dependability: security, safety, reliability
•  Six partners: Cetrel, CTIE, Delphi, SES,
IEE, Hitec …
•  And we are always hiring!
An Effective, Collaborative Model of Research
and Innovation
[Figure: Basic Research, Applied Research, Innovation & Development]
•  Basic and applied research take place in a rich context
•  Basic Research is also driven by problems raised by applied
research, which is itself fed by innovation and development
•  Publishable research results and focused practical solutions that
serve an existing market.
Schneiderman, 2013
Collaboration in Practice
•  Well-defined problems in context
•  Realistic evaluation
•  Long term industrial collaborations
[Figure: collaboration cycle between Industry Partners and Research Groups, in eight steps: 1. Problem Identification, 2. Problem Formulation, 3. State of the Art Review, 4. Candidate Solution(s), 5. Initial Validation, 6. Training, 7. Realistic Validation, 8. Solution Release]
Outline
•  Four projects:
–  Testing PID controllers in the automotive industry (Delphi)
–  Robustness testing of a video conference system (Cisco)
–  Environment-based testing of a seismic acquisition system
(WesternGeco)
–  Schedulability analysis and stress testing of safety-critical
drivers in the oil and gas industry (Kongsberg)
•  Lessons learned, patterns, discussions
•  Meant to be an interactive talk – I am also here to learn
Acknowledgements
PhD students:
•  Marwa Shousha
•  Shaukat Ali
•  Zohaib Iqbal
•  Hadi Hemmati
•  Reza Matinnejad
•  Stefano Di Alesio
Research Associates/Scientists, former colleagues:
•  Shiva Nejati
•  Andrea Arcuri
•  Arnaud Gotlieb
•  Yvan Labiche
Testing PID Controllers (Delphi)
References:
•  R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, “MiL Testing of Highly Configurable
Continuous Controllers: Scalable Search Using Surrogate Models”, Submitted (2014)
•  R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, C. Poull, “Search-Based Automated
Testing of Continuous Controllers: Framework, Tool Support, and Case Studies”,
forthcoming in Information and Software Technology (2014)
•  R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, C. Poull, “Automated Model-in-the-Loop
Testing of Continuous Controllers using Search”, in 5th Symposium on Search-Based
Software Engineering (SSBSE 2013), Springer Lecture Notes in Computer Science
(August 2013)
Dynamic continuous controllers are present in
many embedded systems
Development Process
[Figure: three development stages]
•  Model-in-the-Loop Stage: Simulink Modeling, Generic Functional Model, MiL Testing
•  Software-in-the-Loop Stage: Code Generation and Integration, Software Running on ECU, SiL Testing
•  Hardware-in-the-Loop Stage: Software Release, HiL Testing
Controllers at MIL
[Figure: closed-loop PID controller (P, I, and D terms) feeding a Plant Model]
$$e(t) = desired(t) - actual(t)$$
$$output(t) = K_P \, e(t) + K_I \int e(t) \, dt + K_D \, \frac{de(t)}{dt}$$
Inputs: Time-dependent variables
Configuration Parameters
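The P, I, and D terms acting on e(t) can be sketched as a small discrete-time simulation. This is a minimal sketch only: the first-order plant, the gains, and the function name `simulate_pid` are illustrative assumptions and do not come from the Delphi models.

```python
def simulate_pid(kp, ki, kd, desired, steps=2000, dt=0.01):
    """Closed-loop simulation of a PID controller on a toy plant.

    The plant (d actual/dt = -actual + output) is a hypothetical
    first-order system chosen for illustration only.
    """
    actual = 0.0
    integral = 0.0
    prev_error = desired(0.0) - actual
    trace = []
    for k in range(steps):
        t = k * dt
        error = desired(t) - actual             # e(t) = desired(t) - actual(t)
        integral += error * dt                  # approximates the integral of e(t)
        derivative = (error - prev_error) / dt  # approximates de(t)/dt
        output = kp * error + ki * integral + kd * derivative
        actual += (-actual + output) * dt       # forward-Euler step of the toy plant
        prev_error = error
        trace.append(actual)
    return trace


# Constant desired value of 1.0; the gains are arbitrary illustrative values.
trace = simulate_pid(kp=2.0, ki=1.0, kd=0.1, desired=lambda t: 1.0)
print(round(trace[-1], 3))
```

In MiL testing the plant would be the Simulink plant model rather than a one-line equation, which is why each fitness evaluation is expensive.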
Inputs, Outputs, Test Objectives
[Figure: over a simulation of length T, the Desired Value (input) goes from InitialDesired (ID) to FinalDesired (FD) at time T/2; the Actual Value (output) is assessed for Smoothness, Responsiveness, and Stability]
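To make the three test objectives concrete, the following sketch computes one possible measure for each from a sampled output signal. The exact definitions (tolerance band, tail window) are assumptions made for illustration; the slides do not give the formulas used in the study.

```python
from statistics import pstdev


def objectives(actual, final_desired, dt, tol=0.05):
    """Illustrative objective functions over the second half of a run.

    `actual` holds samples of the actual value taken after the desired
    value changed to `final_desired`. The definitions below are
    assumptions, not the ones used in the study.
    """
    band = tol * abs(final_desired) if final_desired else tol
    # Responsiveness: time until the output first enters the tolerance band.
    reach = next((i for i, a in enumerate(actual)
                  if abs(a - final_desired) <= band), len(actual))
    responsiveness = reach * dt
    # Smoothness: largest deviation from the target after first reaching it.
    after = actual[reach:]
    smoothness = max((abs(a - final_desired) for a in after), default=0.0)
    # Stability: spread of the output over the last quarter of the window.
    tail = actual[-max(1, len(actual) // 4):]
    stability = pstdev(tail)
    return {"responsiveness": responsiveness,
            "smoothness": smoothness,
            "stability": stability}


samples = [0.0, 0.5, 0.9, 1.0, 1.2, 1.1, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
result = objectives(samples, final_desired=1.0, dt=0.1)
print(result)
```

A search would then maximize one of these values to look for worst-case scenarios.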
Process and Technology
[Figure: Continuous Controller Tester. Inputs: Controller-plant model and Objective Functions based on Requirements. Step 1, Exploration, produces a HeatMap Diagram, from which the Domain Expert derives a List of Critical Regions. Step 2, Single-State Search, produces Worst-Case Scenarios. Example heatmaps: (a) Liveness, (b) Smoothness]
Testing in the Configuration Space
•  MIL testing for all feasible configurations
•  The search space is much larger
•  The search is much slower (Simulations of Simulink models are
expensive)
•  Not all configuration parameters matter for all objective functions
•  Results are harder to visualize
Modified Process and Technology
[Figure: modified process. Inputs: Controller Model (Simulink) and Objective Functions. Step 1, Exploration with Dimensionality Reduction, produces a Regression Tree, from which the Domain Expert derives a List of Critical Partitions. Step 2, Search with Surrogate Modeling, produces Worst-Case Scenarios]
•  Visualization of the 8-dimension space using regression trees
•  Dimensionality reduction to identify the significant variables
•  Surrogate modeling to predict the objective function and speed up the search
Dimensionality Reduction
•  Sensitivity Analysis:
Elementary Effect Analysis
(EEA)
•  Identify non-influential
inputs in computationally
costly mathematical
models
•  Requires fewer data points
than other techniques
•  Observations are
simulations generated
during the Exploration step
•  Compute sample mean
and standard deviation for
each dimension of the
distribution of elementary
effects
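The computation in the last bullet can be sketched as follows. This is a simplified illustration: it samples fresh base points instead of reusing simulations from the Exploration step, and `toy_model` is a hypothetical stand-in for an expensive simulation.

```python
import random
from statistics import mean, stdev


def elementary_effects(f, dim, n_points=50, delta=0.1, seed=0):
    """One-at-a-time elementary effects on the unit hypercube.

    Simplified sketch: each base point is sampled independently rather
    than along trajectories, which costs more model evaluations.
    """
    rng = random.Random(seed)
    effects = [[] for _ in range(dim)]
    for _ in range(n_points):
        # Keep x + delta inside [0, 1] for every coordinate.
        x = [rng.uniform(0.0, 1.0 - delta) for _ in range(dim)]
        fx = f(x)
        for i in range(dim):
            shifted = list(x)
            shifted[i] += delta
            effects[i].append((f(shifted) - fx) / delta)
    # (sample mean, sample standard deviation) per input dimension
    return [(mean(e), stdev(e)) for e in effects]


def toy_model(x):
    # Hypothetical model in which input 2 has no influence at all.
    return 3.0 * x[0] + x[0] * x[1]


stats = elementary_effects(toy_model, dim=3)
for i, (m, s) in enumerate(stats):
    print(f"input {i}: mean={m:.3f} std={s:.3f}")
```

Inputs whose mean and standard deviation are both close to zero are candidates for removal from the search space.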
[Figure: scatter plot of the sample standard deviation against the sample mean of the elementary effects, one point per input: ID, FD, Cal1, Cal2, Cal3, Cal4, Cal5, Cal6]
Visualization in Inputs & Configuration Space
Regression Tree (each node: Count, Mean, Std Dev)
All Points: 1000, 0.007822, 0.0049497
  FD>=0.43306: 574, 0.0059513, 0.0040003
    ID>=0.64679: 373, 0.0047594, 0.0034346
    ID<0.64679: 201, 0.0081631, 0.0040422
      Cal5>=0.014827: 70, 0.0106795, 0.0052045
      Cal5<0.014827: 131, 0.0068185, 0.0023515
  FD<0.43306: 426, 0.0103425, 0.0049919
    Cal5>=0.020847: 182, 0.0134555, 0.0052883
    Cal5<0.020847: 244, 0.0080206, 0.0031751
Surrogate Modeling
•  Any supervised learning or
statistical technique
providing fitness predictions
with confidence intervals
1.  Predict higher fitness with
high confidence: Move to
new position, no simulation
2.  Predict lower fitness with
high confidence: Do not
move to new position, no
simulation
3.  Low confidence in
prediction: Simulation
[Figure: Surrogate Model and Real Function, Fitness plotted against x]
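The three cases can be sketched as a single decision function for a maximizing search. `predict` and `simulate` are hypothetical callables supplied by the caller; how the confidence interval is obtained depends on the chosen surrogate technique and is not shown here.

```python
def surrogate_step(current_fitness, candidate, predict, simulate):
    """Decide whether to move to `candidate` in a maximizing search.

    `predict(candidate)` returns a (low, high) confidence interval for the
    fitness; `simulate(candidate)` runs the expensive model.
    Returns (move, fitness, simulated).
    """
    low, high = predict(candidate)
    if low > current_fitness:
        # Case 1: confidently better, accept using the predicted fitness.
        return True, (low + high) / 2.0, False
    if high < current_fitness:
        # Case 2: confidently worse, reject without simulating.
        return False, current_fitness, False
    # Case 3: the interval straddles the current fitness, so simulate.
    actual = simulate(candidate)
    return actual > current_fitness, max(actual, current_fitness), True


# Toy usage: the surrogate is confident here, so no simulation is run.
move, fit, simulated = surrogate_step(
    current_fitness=1.0, candidate=0.3,
    predict=lambda x: (1.2, 1.4), simulate=lambda x: 0.0)
print(move, simulated)
```

The speed-up comes from cases 1 and 2, where the expensive simulation is skipped entirely.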
Results
•  Search yielded worst-case scenarios that were much worse than
known and expected scenarios
•  Surrogate modeling: Polynomial regression yielded best fit and
predictive power so far
•  Dimensionality reduction helps generate better surrogate models
•  Surrogate modeling can yield up to an eight-fold increase in search
speed
•  Surrogate modeling can help find more critical requirements violations
•  By accounting for variations in configurations, we found more critical
requirements violations than just with the HIL configuration
Robustness Testing of a Video Conference System
(Cisco)
References:
•  S. Ali, L. Briand, H. Hemmati, “Modeling Robustness Behavior Using Aspect-Oriented
Modeling to Support Robustness Testing of Industrial Systems”, Journal of Software
and Systems Modeling (Springer), 2011
•  S. Ali, M. Z. Iqbal, A. Arcuri, L. Briand, “Generating Test Data from OCL Constraints
with Search Techniques”, IEEE Transactions on Software Engineering, 2013
Video Conference System
Core Functionality
[Figure: endpoints EP1, EP2, and EP3 connected by calls; labels: Call, Outgoing channel, Incoming channel, Audio Channel, Video Channel, Presentation Channel]
Robustness
•  Robustness is the degree to which a software
component functions correctly in the presence of
exceptional inputs or stressful environmental
conditions (IEEE Std 610.12-1990)
•  Significant additional complexity lies with handling the
robustness properties
–  Network communication faults
–  Media quality faults in media streams
–  Faults in the endpoints
Cross-Cutting Concern
[Figure: state machine of the base model with a cross-cutting robustness concern]
•  Base model
–  States: Idle [#calls=0], NotFull [0<#calls<max], Full [#calls=max]
–  Transitions: dial(), dial() [#calls<max-1], dial() [#calls=max-1], disconnect(), disconnect() [#calls>1], disconnect() [#calls=1]
•  Cross-cutting concern
–  State: Recovery […]
–  Transitions: After(time), DisconnectAll()
–  Guards: PL>0 or PacketDelay>0 or ReorderDelay>0 or corrupt>0 or Duplicate>0;
PL=0 && PacketDelay=0 && ReorderDelay=0 && corrupt=0 && Duplicate=0
Model-Based Testing (MBT)
•  Goals: Scalability, complete automation
•  Model-based Testing (MBT) uses models of the system for test case
and oracle generation
–  The models typically describe some aspects of system under test
–  Increasingly used for complete test automation, e.g., aerospace,
automotive, banking
•  Often using well-established standards for modeling and their
extensions: UML (profiles), OCL, etc.
•  Requirements:
–  Test-ready models
–  Appropriate test strategies, e.g., path selection
–  Test data generation
–  Oracles
Model-Based Testing: Process and Technology
Test Data Generation for MBT
•  Test data is needed to execute program paths as required by a
coverage criterion during testing
•  For MBT, test data is typically an instance of a class diagram
•  Instances must fulfill invariants
•  Paths in state machines carry constraints (guards) on conditions
•  To generate test data for UML/OCL models, we need to solve
OCL constraints written on the models
context Student inv ageConstraint:
self.age > 15 and self.age < 80
Example OCL expression in VC Model
context Saturn inv synchronizationConstraint:
  self.systemUnit.NumberOfActiveCalls > 1 and
  self.systemUnit.NumberOfActiveCalls <=
    self.systemUnit.MaximumNumberOfActiveCalls
  and
  self.media.synchronizationMismatch.unit = TimeUnitKind::s and
  (
    self.media.synchronizationMismatch.value >= 0 and
    self.media.synchronizationMismatch.value <=
      self.media.synchronizationMismatchThreshold.value
  ) and
  self.conference.PresentationMode = Mode::Off and
  self.conference.call->select(call |
    call.incomingPresentationChannel.Protocol <> VideoProtocol::Off)->size()=2
  and
  self.conference.call->select(call |
    call.outgoingPresentationChannel.Protocol <> VideoProtocol::Off)->size()=2
OCL Constraint Solvers
•  A number of approaches exist for OCL constraint solving
•  Not complete
–  Support subset of OCL
•  Lack of proper tool support
–  A number of approaches are not automated
•  Not scalable
–  Often based on translation (e.g., to CSP)
–  Combinatorial explosion
A Search Problem
•  We used an alternative approach, applying search-based testing
(SBT) concepts to solve OCL constraints
•  The process of generating test data can be seen as a search process
–  There is a huge number of possible instances that can be
generated for a particular model
–  We need to select instances that solve the constraint
•  Fitness defined as a distance function d()
–  d() returns 0 if the constraint is solved
–  otherwise a value that heuristically estimates how far the constraint
was from being evaluated as true
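A distance function of this kind can be sketched for the simple `ageConstraint` invariant shown earlier. The rules below (a constant penalty `K`, sum for `and`, minimum for `or`) are common branch-distance conventions assumed here for illustration; the helper names are hypothetical and the rules used in the actual tool are more elaborate.

```python
K = 1.0  # penalty added when a comparison fails (an assumed constant)


def d_eq(a, b):
    return 0.0 if a == b else abs(a - b) + K


def d_lt(a, b):
    return 0.0 if a < b else (a - b) + K


def d_gt(a, b):
    return d_lt(b, a)


def d_and(*parts):
    # Every conjunct must hold, so the distances add up.
    return sum(parts)


def d_or(*parts):
    # One disjunct is enough, so the closest one counts.
    return min(parts)


def age_constraint_distance(age):
    # context Student inv ageConstraint: self.age > 15 and self.age < 80
    return d_and(d_gt(age, 15), d_lt(age, 80))


for age in (10, 15, 40, 95):
    print(age, age_constraint_distance(age))
```

The search minimizes this value: an age of 15 is closer to satisfying the invariant than an age of 10, which gives the search a gradient to follow.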
Challenges
•  Primitive Types, Boolean Operators
•  Operations on Collections, Iterators
•  Fine grained fitness functions for iterators using size, oclInState
•  Consider a collection C = {1, 2, 3} and a constraint C->forAll(x | x = 0):
d(C->forAll(x | x = 0)) = (sum over i of d(C.at(i) = 0)) / C->size()
= (d(1 = 0) + d(2 = 0) + d(3 = 0)) / 3
= (2 + 3 + 4) / 3
= 3
•  Many complex rules for the computations of fitness functions based on
OCL expressions
•  Fine grained heuristics -> maximum guidance
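The forAll example can be reproduced in a few lines. This sketch assumes d(x = y) = |x - y| + 1 for unequal values, which is consistent with the numbers in the example; the function names are hypothetical.

```python
def d_eq(a, b, k=1):
    # Distance for an equality: 0 if it holds, otherwise the gap plus k.
    return 0 if a == b else abs(a - b) + k


def d_for_all(collection, d_item):
    """Average the per-element distances; 0 only if every element satisfies the body."""
    items = list(collection)
    if not items:
        return 0  # forAll over an empty collection holds
    return sum(d_item(x) for x in items) / len(items)


print(d_for_all([1, 2, 3], lambda x: d_eq(x, 0)))  # (2 + 3 + 4) / 3 = 3.0
```

Averaging over the elements, rather than returning a single true/false value, is what makes the heuristic fine grained: fixing one element already lowers the distance.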
VC Model and Results
•  UML Class diagram, state machines, OCL
•  20 subsystems, on average 5 states and 11
transitions (largest: 22 states – 63 transitions)
•  OCL: 144 constraints as guards, 100 invariants, and
57 change events
•  Results:
–  All constraints were resolved
–  Maximum time: ~ 2 minutes on laptop
Environment-Based Testing of a Seismic Acquisition
System (WesternGeco)
References:
•  Z. Iqbal, A. Arcuri, L. Briand, “Empirical Investigation of Search Algorithms for Environment
Model-Based Testing of Real-Time Embedded Software”, ACM ISSTA, 2012
•  Z. Iqbal, A. Arcuri, L. Briand, “Environment Modeling and Simulation for Automated Testing
of Soft Real-Time Embedded Software”, Software and System Modeling (Springer), 2014
Objectives
•  Model-based System testing
–  Black-box
–  Environment models
[Figure: Environment Models are used to derive the Environment Simulator, Test cases, and the Test oracle]
Environment: “Domain” Model
Environment: “Behavioral” Model
Test Case Generation
•  Test objectives: Reach “error” states (critical environment states)
•  Test Case: (1) Environment and (2) Simulation Configuration
–  (1) Number of instances for each component in domain model,
e.g., number of items on conveyor belt
–  (2) Setting non-deterministic properties of the environment, e.g.,
speed of sorter’s left and right arms
•  Oracle: Reaching an “error” state
•  SBST: Heuristics
–  Distance from error state
–  Distance from satisfying OCL guards
–  Time distance
–  Time in “risky” states
–  …
Schedulability Analysis and Stress Testing of Safety-
Critical Drivers (Kongsberg Maritime)
References:
•  L. Briand, Y. Labiche, and M. Shousha, “Using genetic algorithms for early schedulability
analysis and stress testing in real-time systems”, Genetic Programming and Evolvable
Machines, vol. 7, no. 2, pp. 145-170, 2006
•  S. Nejati, S. Di Alesio, M. Sabetzadeh, and L. Briand, “Modeling and analysis of CPU usage in
safety-critical embedded systems to support stress testing”, in Model Driven Engineering
Languages and Systems, Springer, 2012, pp. 759–775
•  S. Di Alesio, S. Nejati, L. Briand, A. Gotlieb, “Stress Testing of Task Deadlines: A Constraint
Programming Approach”, ISSRE 2013, San Jose, USA
•  S. Di Alesio, S. Nejati, L. Briand, A. Gotlieb, “Worst-Case Scheduling of Software Tasks – A
Constraint Optimization Model to Support Performance Testing”, Constraint Programming (CP),
2014
Fire/Gas Detection and Emergency Shutdown
[Figure: Drivers (Software-Hardware Interface) sit between Control Modules and Alarm Devices (Hardware), on a Real Time Operating System and a multicore architecture]
Monitor gas leaks and fire in oil
extraction platforms
Performance Requirements are Hard to Verify
•  They constrain the entire system’s behavior
and thus can’t be checked locally
•  They depend on the environment the
software interacts with (hardware devices)
•  They depend on the computing platform
on which the software runs
Schedulability Analysis and Testing
•  RTES have concurrent interdependent tasks which have to
finish before their deadlines
•  Each task has a deadline (i.e., latest finishing time) w.r.t. its
arrival time
•  Some task properties depend on the environment, some are
design choices
•  Tasks can trigger other tasks, and can share computational
resources with other tasks
•  Schedulability analysis encompasses techniques that try to
predict whether all (critical) tasks are schedulable, i.e., meet
their deadlines
•  Stress testing runs carefully selected test cases that have a high
probability of leading to deadline misses
Arrival Times Determine Deadline Misses
[Figure: two timelines over time units 0 to 9, each showing jobs $j_0$, $j_1$, $j_2$ with their arrival times $at_0$, $at_1$, $at_2$, their deadlines $dl_0$, $dl_1$, $dl_2$, and the time $T$]
$j_0, j_1, j_2$ arrive at $at_0, at_1, at_2$ and must
finish before $dl_0, dl_1, dl_2$
$j_1$ can miss its deadline $dl_1$ depending on
when $at_2$ occurs!
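The effect of arrival times on deadline misses can be reproduced with a toy scheduler. This sketch assumes preemptive fixed-priority scheduling on a single core with unit time steps; the durations, deadlines, and priorities are made-up values, not taken from the Kongsberg system.

```python
def simulate(jobs, horizon=20):
    """Preemptive fixed-priority scheduling on one core, in unit time steps.

    Each job is a dict with arrival, duration, deadline (absolute) and
    priority (lower number = higher priority). Returns finish times.
    """
    remaining = {name: j["duration"] for name, j in jobs.items()}
    finish = {}
    for t in range(horizon):
        ready = [n for n, j in jobs.items()
                 if j["arrival"] <= t and remaining[n] > 0]
        if not ready:
            continue
        running = min(ready, key=lambda n: jobs[n]["priority"])
        remaining[running] -= 1
        if remaining[running] == 0:
            finish[running] = t + 1
    return finish


def deadline_misses(jobs):
    finish = simulate(jobs)
    return sorted(n for n, j in jobs.items()
                  if finish.get(n, float("inf")) > j["deadline"])


def scenario(at2):
    # Only the arrival time of j2 varies between the two runs below.
    return {
        "j0": {"arrival": 0, "duration": 3, "deadline": 9, "priority": 3},
        "j1": {"arrival": 2, "duration": 2, "deadline": 5, "priority": 2},
        "j2": {"arrival": at2, "duration": 3, "deadline": at2 + 4, "priority": 1},
    }


print(deadline_misses(scenario(at2=6)))  # j2 arrives late: no miss
print(deadline_misses(scenario(at2=2)))  # j2 preempts j1: j1 misses dl1
```

With identical task properties, moving the arrival of the high-priority job from 6 to 2 is enough to make j1 miss its deadline.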
Search-Based Approaches
•  This problem can be tackled as a search problem in the space
of arrival times for aperiodic tasks
•  Identify worst-case scenarios for testing
•  No assumptions
•  Genetic algorithms: Briand et al., 2003-2006
•  Constraint Programming (e.g., OPL, ILOG CP Optimizer)
–  Nejati et al., 2012
–  Di Alesio et al., 2013-2014
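Searching the space of arrival times can be sketched as follows. Plain random search is used here as a simple stand-in for the genetic algorithm, and the task set and the single-core fixed-priority scheduler are illustrative assumptions.

```python
import random


def finish_times(jobs, horizon=40):
    """Preemptive fixed-priority scheduling on one core, unit time steps."""
    remaining = {n: j["duration"] for n, j in jobs.items()}
    finish = {}
    for t in range(horizon):
        ready = [n for n, j in jobs.items()
                 if j["arrival"] <= t and remaining[n] > 0]
        if ready:
            run = min(ready, key=lambda n: jobs[n]["priority"])
            remaining[run] -= 1
            if remaining[run] == 0:
                finish[run] = t + 1
    return finish


def max_lateness(arrivals):
    """Fitness: worst (finish - deadline) over all jobs; > 0 means a miss."""
    at1, at2 = arrivals
    jobs = {
        "j0": {"arrival": 0, "duration": 3, "deadline": 9, "priority": 3},
        "j1": {"arrival": at1, "duration": 2, "deadline": at1 + 3, "priority": 2},
        "j2": {"arrival": at2, "duration": 3, "deadline": at2 + 4, "priority": 1},
    }
    finish = finish_times(jobs)
    return max(finish[n] - j["deadline"] for n, j in jobs.items())


def random_search(fitness, budget=200, seed=0):
    """Sample arrival times for the two aperiodic jobs and keep the worst case."""
    rng = random.Random(seed)
    best, best_fit = None, float("-inf")
    for _ in range(budget):
        cand = (rng.randint(0, 9), rng.randint(0, 9))
        fit = fitness(cand)
        if fit > best_fit:
            best, best_fit = cand, fit
    return best, best_fit


best, best_fit = random_search(max_lateness)
print("arrival times:", best, "max lateness:", best_fit)
```

The arrival times returned by the search are the stress test case: they are the inputs most likely to expose a deadline miss on the real platform.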
Constraint Optimization
Constraint Optimization Problem:
•  Static Properties of Tasks (Constants)
•  Dynamic Properties of Tasks (Variables)
•  Performance Requirement (Objective Function)
•  OS Scheduler Behaviour (Constraints)
Process and Technologies
[Figure: INPUT: System Design and System Platform, captured through UML Modeling as a Design Model (Time and Concurrency Information). The model is turned into an Optimization Problem (find arrival times that maximize the chance of deadline misses), solved by Automated Search with Genetic Algorithms (GA) or Constraint Programming (CP). OUTPUT: Solutions (task arrival times likely to lead to deadline misses, e.g., $at_0 = 1$, $at_1 = 3$, $at_2 = 4$), used for Deadline Misses Analysis and as Stress Test Cases]
Results and Current Work
•  GA tends to be more efficient but less effective than CP
–  More efficient: Find deadline misses quicker
–  More effective: Find worse deadline misses
•  CP is deterministic, evolutionary search is randomized
•  For testing we want a diverse set of stress test cases
•  Combining GA and CP (Di Alesio’s dissertation):
–  Achieve an efficiency close to GA and an effectiveness close to
CP
–  Use GA first and improve worst solutions found by GA by
performing a CP complete search in the neighborhood of
solutions
–  Results on five case studies are very encouraging
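The two-stage idea can be sketched on a toy problem. Random sampling stands in for the GA and exhaustive enumeration of a small neighborhood stands in for the CP complete search; the fitness function is a hypothetical placeholder for the severity of deadline misses.

```python
import random
from itertools import product


def fitness(arrivals):
    # Hypothetical stand-in for "severity of deadline misses"; peak at (7, 3).
    x, y = arrivals
    return -((x - 7) ** 2 + (y - 3) ** 2)


def randomized_stage(budget=30, keep=3, seed=1):
    """Stage 1: cheap randomized sampling (a stand-in for the GA)."""
    rng = random.Random(seed)
    sampled = {(rng.randint(0, 20), rng.randint(0, 20)) for _ in range(budget)}
    return sorted(sampled, key=fitness, reverse=True)[:keep]


def complete_neighborhood_search(center, radius=2):
    """Stage 2: enumerate every point near `center` (a stand-in for CP)."""
    ranges = [range(max(0, c - radius), min(20, c + radius) + 1) for c in center]
    return max(product(*ranges), key=fitness)


seeds = randomized_stage()
refined = [complete_neighborhood_search(s) for s in seeds]
best = max(refined, key=fitness)
print(seeds, "->", best, fitness(best))
```

Because each neighborhood contains its own center, the second stage can only keep or improve the solutions found by the first stage.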
SBST in Industry: Discussion
•  Scalability
•  Applicability
•  Variety of heuristics as a function of test objectives, available
information, assumptions, etc.
•  Search as a piece of the solution: multidisciplinarity
•  Combining search with other techniques: Likely candidates
Scalability
•  Search spaces are huge in practice
•  Fitness computation is often computationally-intensive
•  Test execution can be expensive
–  Web applications or phone apps versus embedded systems
with HIL
–  Models, simulation to guide the search
•  Simulation is always expensive
–  Simulink models, e.g., 31s for a 2s simulation
–  Surrogate modeling?
•  In many situations, models of the system can help guide the search
Applicability
•  Many academic solutions are not applicable in practice
•  Context matters
•  Scalability -> applicability
•  But also inputs required for guiding the search
•  Integration into the rest of the development process
–  E.g., design models, WCET analysis, Simulink development
A Large Variety of Heuristics
•  Test objectives differ a great deal depending on context
–  Performance, robustness, critical environment states …
•  Available information also differs, both for guiding test generation
and oracles
–  Purely black-box testing
–  Design information, e.g., through models
•  Working assumptions
–  About process, technology, …
–  E.g., availability of plant/environment models in Simulink
•  In a given context, some degree of tailoring is usually required for
applying SBST
Multidisciplinarity
•  Typically, meta-heuristic search is only part of a solution to a
testing problem
•  Dedicated system or environment modeling, e.g., in Cisco and
WesternGeco studies
•  Machine learning, e.g., regression trees in Delphi study
•  Statistical analysis, e.g., EEA and non-linear regression in
Delphi study
•  Constraint programming, e.g., in Kongsberg study
SVV lab: svv.lu
SnT: www.securityandtrust.lu

Keynote SBST 2014 - Search-Based Testing

  • 1.
    Search-Based Software Testingin Industry --- Research collaborations and Lessons Learned Lionel Briand Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT) University of Luxembourg, Luxembourg SBST, Hyderabad, 2014
  • 2.
    SnT Software Verificationand Validation Lab •  SnT centre, Est. 2009: Interdisciplinary, ICT security-reliability-trust •  200 scientists and Ph.D. candidates, 20 industry partners •  SVV Lab: Established January 2012, www.svv.lu •  25 scientists (Research scientists, associates, and PhD candidates) •  Industry-relevant research on system dependability: security, safety, reliability •  Six partners: Cetrel, CTIE, Delphi, SES, IEE, Hitec … •  And we are always hiring! 2
  • 3.
    An Effective, CollaborativeModel of Research and Innovation Basic  Research   Applied  Research   Innova3on  &  Development   •  Basic and applied research take place in a rich context •  Basic Research is also driven by problems raised by applied research, which is itself fed by innovation and development •  Publishable research results and focused practical solutions that serve an existing market. 3 Schneiderman, 2013
  • 4.
    Collaboration in Practice • Well-defined problems in context •  Realistic evaluation •  Long term industrial collaborations 4 Problem Formulation Problem Identification State of the Art Review Candidate Solution(s) Initial Validation Training Realistic Validation Industry Partners Research Groups 1 2 3 4 5 7 Solution Release 8 6
  • 5.
    Outline •  Four projects: – Testing PID controllers in the automotive industry (Delphi) –  Robustness testing of a video conference system (Cisco) –  Environment-based testing of a seismic acquisition system (WesternGeco) –  Schedulability analysis and stress testing of safety-critical drivers in the oil&gas industry (Kongsberg) •  Lessons learned, patterns, discussions •  Meant to be an interactive talk – I am also here to learn 5
  • 6.
    Acknowledgements PhD. Students: •  MarwaShousha •  Shaukat Ali •  Zohaib Iqbal •  Hadi Hemmati •  Reza Matinnejad •  Stefano Di Alesio Research Associates/Scientists, former colleagues: •  Shiva Nejati •  Andrea Arcuri •  Arnaud Gotlieb •  Yvan Labiche 6
  • 7.
    Testing PID Controllers(Delphi) References: 7 •  R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, “MiL Testing of Highly Configurable Continuous Controllers: Scalable Search Using Surrogate Models”, Submitted (2104) •  R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, C. Poull, , “Search-Based Automated Testing of Continuous Controllers: Framework, Tool Support, and Case Studies”, forthcoming in Information and Software Technology (2014) •  R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, C. Poull , “Automated Model-in-the-Loop Testing of Continuous Controllers using Search”, in 5th Symposium on Search-Based Software Engineering (SSBSE 2013), Springer Lecture Notes in Computer Science (2013, August)
  • 8.
    Dynamic continuous controllersare present in many embedded systems 8
  • 9.
    Development Process 9 Hardware-in-the-Loop Stage Model-in-the-Loop Stage Simulink Modeling Generic Functional Model MiLTesting Software-in-the-Loop Stage Code Generation and Integration Software Running on ECU SiL Testing Software Release HiL Testing
  • 10.
    Controllers at MIL 10 PlantModel + + + ⌃ + - e(t) actual(t) desired(t) ⌃ KP e(t) KD de(t) dt KI R e(t) dt P I D output(t) Inputs: Time-dependent variables Configuration Parameters
  • 11.
    Inputs, Outputs, TestObjectives 11 InitialDesired (ID) Desired ValueI (input) Actual Value (output) FinalDesired (FD) time T/2 T Smoothness Responsiveness Stability
  • 12.
    Process and Technology 12 HeatMap Diagram 1.Exploration List of Critical RegionsDomain Expert Worst-Case Scenarios + Controller- plant model Objective Functions based on Requirements 2. Single-State Search Continuous Controller Tester (a) Liveness (b) Smoothness
  • 13.
    Testing in theConfiguration Space •  MIL testing for all feasible configurations •  The search space is much larger •  The search is much slower (Simulations of Simulink models are expensive) •  Not all configuration parameters matter for all objective functions •  Results are harder to visualize 13
  • 14.
    Modified Process andTechnology 14 + Controller Model (Simulink) Worst-Case Scenarios List of Critical PartitionsRegression Tree 1.Exploration with Dimensionality Reduction 2.Search with Surrogate Modeling Objective Functions Domain Expert Visualization of the 8-dimension space using regression treesDimensionality reduction to identify the significant variables Surrogate modeling to predict the objective function and speed up the search
  • 15.
    Dimensionality Reduction •  SensitivityAnalysis: Elementary Effect Analysis (EEA) •  Identify non-influential inputs in computationally costly mathematical models •  Requires less data points than other techniques •  Observations are simulations generated during the Exploration step •  Compute sample mean and standard deviation for each dimension of the distribution of elementary effects 15 Cal5 ID Cal3 FD Cal4 Cal6 Cal1,Cal2 0.6 0.4 0.2 0.0 SampleStandardDeviation() -0.6 -0.4 -0.2 0.0 0.2 Sample Mean ( ) ⇤10 2 ⇤10 2 Si i
  • 16.
    Visualization in Inputs& Configuration Space 16 All Points FD>=0.43306 Count Mean Std Dev Count Mean Std Dev FD<0.43306 Count Mean Std Dev ID>=0.64679 Count Mean Std Dev Count Mean Std Dev Cal5>=0.020847 Cal5>0.020847 Count Mean Std Dev Count Mean Std Dev Cal5>=0.014827 Cal5<0.014827 Count Mean Std Dev Count Mean Std Dev 1000 0.007822 0.0049497 ID<0.64679 574 0.0059513 0.0040003 426 0.0103425 0.0049919 373 0.0047594 0.0034346 201 0.0081631 0.0040422 182 0.0134555 0.0052883 244 0.0080206 0.0031751 70 0.0106795 0.0052045 131 0.0068185 0.0023515 Regression Tree
  • 17.
    Surrogate Modeling 17 •  Anysupervised learning or statistical technique providing fitness predictions with confidence intervals 1.  Predict higher fitness with high confidence: Move to new position, no simulation 2.  Predict lower fitness with high confidence: Do not move to new position, no simulation 3.  Low confidence in prediction: Simulation Surrogate Model Real Function x Fitness
  • 18.
    Results •  Search yieldedworst-case scenarios that were much worse than known and expected scenarios •  Surrogate modeling: Polynomial regression yielded best fit and predictive power so far •  Dimensionality reduction helps generate better surrogate models •  Surrogate modeling can yield up to an eight-fold increase in search speed •  Surrogate modeling can help find more critical requirements violations •  By accounting for variations in configurations, we found more critical requirements violations than just with the HIL configuration 18
  • 19.
    Robustness Testing ofa Video Conference System (Cisco) References: 19 •  S. Ali, Briand, H. Hemmati, “Modeling Robustness Behavior Using Aspect-Oriented Modeling to Support Robustness Testing of Industrial Systems”, Journal of Software and Systems Modeling (Springer), 2011 •  S. Ali, M. Z. Iqbal, A. Arcuri, L. Briand, “Generating Test Data from OCL Constraints with Search Techniques”, IEEE Transactions on Software Engineering, 2013
  • 20.
  • 21.
    Core Functionality 21 EP1   EP3   EP2   Call   Outgoing     channel   Incoming     channel   Audio   Channel   Presentation   Channel  Video   Channel  
  • 22.
    Robustness •  Robustness isthe degree to which a software component functions correctly in the presence of exceptional inputs or stressful environmental conditions (IEEE Std 610.12-1990) •  Significant additional complexity lies with handling the robustness properties –  Network communication faults –  Media quality faults in media streams –  Faults in the endpoints 22
  • 23.
    Cross-Cutting Concern 23 NotFull [0<#call s<max] Full [#calls= max] dial() dial()[#calls=max-1] dial() [#calls<max-1] disconnect() disconnect() [#calls=1] disconnect() [#calls>1] Idle [#calls=0 ] Recovery […] After(time) DisconnectAll() PL>0or PacketDelay>0 or ReorderDelay>0 or corrupt>0 or Duplicate>0 PL=0 && PacketDelay=0 && ReorderDelay=0 && corrupt=0 && Duplicate=0 Cross-cutting concern Base model
  • 24.
    Model-Based Testing (MBT) 24 • Goals: Scalability, complete automation •  Model-based Testing (MBT) uses models of the system for test case and oracle generation –  The models typically describe some aspects of system under test –  Increasingly used for complete test automation, e.g., aerospace, automotive, banking •  Often using well-established standards for modeling and their extensions: UML (profiles), OCL, etc. •  Requirements: –  Test-ready models –  Appropriate test strategies, e.g., path selection –  Test data generation –  Oracles
  • 25.
  • 26.
    Test Data Generationfor MBT •  Test data is needed to execute program paths as required by a coverage criterion during testing •  For MBT, test data is typically an instance of a class diagram •  Instances must fulfill invariants •  Paths in state machines carry constraints (guards) on conditions •  To generate test data for UML/OCL models, we need to solve OCL constraints written on the models 26 context Student inv ageConstraint: self.age > 15 and self.age < 80
  • 27.
    Example OCL expressionin VC Model 27 context Saturn inv synchronizationConstraint: ! !self.systemUnit.NumberOfActiveCalls > 1 and ! !self.systemUnit.NumberOfActiveCalls <= ! ! ! ! ! ! ! !self.systemUnit.MaximumNumberOfActiveCalls ! !and ! !self.media.synchronizationMismatch.unit = TimeUnitKind::s and !! !(! ! !self.media.synchronizationMismatch.value >= 0 and ! ! !self.media.synchronizationMismatch.value <= ! ! ! !! ! ! !self.media.synchronizationMismatchThreshold.value! !) and ! !self.conference.PresentationMode = Mode::Off and ! !self.conference.call→select(call | ! ! !call.incomingPresentationChannel.Protocol <> VideoProtocol::Off)→size()=2 ! !and !! !self.conference.call→select(call | ! ! ! call.outgoingPresentationChannel.Protocol <> VideoProtocol::Off)→size()=2!
  • 28.
    OCL Constraint Solvers • A number approaches for OCL constraint solving •  Not complete –  Support subset of OCL •  Lack of proper tool support –  A number of approaches are not automated •  Not scalable –  Often based on translation (e.g., to CSP) –  Combinatorial explosion 28
  • 29.
    A Search Problem • We used an alternate approach by applying the search-based testing (SBT) concepts to solve OCL constraint •  The process of generating test data can be seen as a search process –  There is a huge number of possible instances that can be generated for a particular model –  We need to select instances that solve the constraint •  Fitness defined as a distance function d() –  d() returns 0 if the constraint is solved –  otherwise a value that heuristically estimates how far the constraint was from being evaluated as true 29
  • 30.
    Challenges •  Primitive Types,Boolean Operators •  Operations on Collections, Iterators •  Fine grained fitness functions for iterators using size, oclInState •  Consider a collection C = {1, 2, 3} and a constraint C→forAll(x|x= 0) d(C->forAll(x|x=0)) ! d(C.at(i) = 0)/C->size() ! (d(1 = 0) + d(2=0) + d(3=0))/3 ! (2 + 3 + 4)/3 ! 3 •  Many complex rules for the computations of fitness functions based on OCL expressions •  Fine grained heuristics -> maximum guidance 30
  • 31.
    VC Model andResults •  UML Class diagram, state machines, OCL •  20 subsystems, on average 5 states and 11 transitions (largest: 22 states – 63 transitions) •  OCL: 144 constraints as guards, 100 invariants, and 57 change events •  Results: –  All constraints were resolved –  Maximum time: ~ 2 minutes on laptop 31
  • 32.
    Environment-Based Testing ofa Seismic Acquisition System (WesternGeco) References: 32 •  Z. Iqbal, A. Arcuri, L. Briand, “Empirical Investigation of Search Algorithms for Environment Model-Based Testing of Real-Time Embedded Software”, ACM ISSTA, 2012 •  Z. Iqbal, A. Arcuri, L. Briand, “Environment Modeling and Simulation for Automated Testing of Soft Real-Time Embedded Software”, Software and System Modeling (Springer), 2014
  • 33.
    Objectives •  Model-based Systemtesting –  Black-box –  Environment models 33 Environment Simulator Test cases Environment Models Test oracle
  • 34.
  • 35.
  • 36.
    Test Case Generation • Test objectives: Reach “error” states (critical environment states) •  Test Case: (1) Environment and (2) Simulation Configuration –  (1) Number of instances for each component in domain model, e.g., number of items on conveying belt –  (2) Setting non-deterministic properties of the environment, e.g., speed of sorter’s left and right arms •  Oracle: Reaching an “error” state •  SBST: Heuristics –  Distance from error state –  Distance from satisfying OCL guards –  Time distance –  Time in “risky” states –  … 36
  • 37.
    Schedulability Analysis andStress Testing of Safety- Critical Drivers (Kongsberg Maritime) References: 37 •  L. Briand, Y. Labiche, and M. Shousha, “Using genetic algorithms for early schedulability analysis and stress testing in real-time systems”, Genetic Programming and Evolvable Machines, vol. 7 no. 2, pp. 145-170, 2006 •  S. Nejati, S. Di Alesio, M. Sabetzadeh, and L. Briand, “Modeling and analysis of cpu usage in safety-critical embedded systems to support stress testing,” in Model Driven Engineering Languages and Systems. Springer, 2012, pp. 759–775. •  S. Di Alesio, S. Nejati, L. Briand. A. Gotlieb, “Stress Testing of Task Deadlines: A Constraint Programming Approach”, ISSRE 2013, San Jose, USA! •  S. Di Alesio, S. Nejati, L. Briand. A. Gotlieb, “Worst-Case Scheduling of Software Tasks – A Constraint Optimization Model to Support Performance Testing, Constraint Programming (CP), 2014
  • 38.
    Fire/Gas Detection and Emergency Shutdown •  Monitor gas leaks and fire in oil extraction platforms [Diagram labels: Drivers (Software-Hardware Interface), Control Modules, Alarm Devices (Hardware), Multicore Architecture, Real-Time Operating System]
  • 39.
    Performance Requirements are Hard to Verify •  They constrain the entire system’s behavior and thus can’t be checked locally •  They depend on the environment the software interacts with (hardware devices) •  They depend on the computing platform on which the software runs
  • 40.
    Schedulability Analysis and Testing •  RTES have concurrent interdependent tasks which have to finish before their deadlines •  Each task has a deadline (i.e., latest finishing time) w.r.t. its arrival time •  Some task properties depend on the environment, some are design choices •  Tasks can trigger other tasks, and can share computational resources with other tasks •  Schedulability analysis encompasses techniques that try to predict whether all (critical) tasks are schedulable, i.e., meet their deadlines •  Stress testing runs carefully selected test cases that have a high probability of leading to deadline misses
  • 41.
    Arrival Times Determine Deadline Misses •  $j_0, j_1, j_2$ arrive at $at_0, at_1, at_2$ and must finish before $dl_0, dl_1, dl_2$ •  $j_1$ can miss its deadline $dl_1$ depending on when $at_2$ occurs! [Figure: two timelines over $T = 0, \dots, 9$, each showing $j_0$, $j_1$, $j_2$ with their arrival times $at_0$, $at_1$, $at_2$ and deadlines $dl_0$, $dl_1$, $dl_2$]
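The effect described on this slide can be reproduced with a small simulation. The sketch below assumes a single core, fixed-priority preemptive scheduling, and integer time units; the durations, deadlines, and priorities are made-up values chosen for illustration, not figures from the Kongsberg system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    arrival: int   # arrival time (at)
    duration: int  # execution time, assumed > 0
    deadline: int  # absolute deadline (dl)
    priority: int  # higher value preempts lower value


def simulate(jobs):
    """Return the finish time of each job under preemptive priority scheduling."""
    remaining = {j.name: j.duration for j in jobs}
    finish = {}
    t = 0
    while len(finish) < len(jobs):
        ready = [j for j in jobs if j.arrival <= t and remaining[j.name] > 0]
        if ready:
            running = max(ready, key=lambda j: j.priority)
            remaining[running.name] -= 1
            if remaining[running.name] == 0:
                finish[running.name] = t + 1
        t += 1
    return finish


def deadline_misses(jobs):
    """Names of the jobs that finish after their absolute deadline."""
    finish = simulate(jobs)
    return sorted(j.name for j in jobs if finish[j.name] > j.deadline)


def scenario(at2):
    """Hypothetical job set in which only the arrival time of j2 varies."""
    return [
        Job("j0", arrival=0, duration=3, deadline=9, priority=1),
        Job("j1", arrival=2, duration=2, deadline=5, priority=2),
        Job("j2", arrival=at2, duration=2, deadline=at2 + 3, priority=3),
    ]


print(deadline_misses(scenario(at2=6)))  # []
print(deadline_misses(scenario(at2=3)))  # ['j1']
```

When `j2` arrives at time 6 it does not interfere with `j1`, whereas arriving at time 3 preempts `j1` long enough to push its finish time past `dl_1`.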
  • 42.
    Search-Based Approaches •  This problem can be tackled as a search problem in the space of arrival times for aperiodic tasks •  Identify worst-case scenarios for testing •  No assumptions •  Genetic algorithms: Briand et al., 2003-2006 •  Constraint Programming (e.g., OPL, ILOG CP Optimizer) –  Nejati et al., 2012 –  Di Alesio et al., 2013-2014
  • 43.
    Constraint Optimization •  Constraint Optimization Problem: –  Static Properties of Tasks (Constants) –  Dynamic Properties of Tasks (Variables) –  Performance Requirement (Objective Function) –  OS Scheduler Behaviour (Constraints)
  • 44.
    Process and Technologies [Diagram labels: INPUT: System Design, System Platform; UML Modeling; Design Model (Time and Concurrency Information); Optimization Problem (Find arrival times that maximize the chance of deadline misses); Automated Search: Genetic Algorithms (GA), Constraint Programming (CP); OUTPUT: Solutions (Task arrival times likely to lead to deadline misses), e.g., $at_0 = 1$, $at_1 = 3$, $at_2 = 4$; Stress Test Cases; Deadline Misses Analysis]
  • 45.
    Results and Current Work •  GA tends to be more efficient but less effective than CP –  More efficient: Find deadline misses quicker –  More effective: Find worse deadline misses •  CP is deterministic, evolutionary search is randomized •  For testing we want a diverse set of stress test cases •  Combining GA and CP (Di Alesio’s dissertation): –  Achieve an efficiency close to GA and an effectiveness close to CP –  Use GA first and improve worst solutions found by GA by performing a CP complete search in the neighborhood of solutions –  Results on five case studies are very encouraging
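The two-phase idea (randomized search first, then a complete search around what it found) can be sketched as below. This is only an illustration under strong assumptions: a simple mutation-based random search stands in for the GA, exhaustive enumeration with `itertools.product` stands in for the CP solver, and the task set, `HORIZON`, and all function names are hypothetical.

```python
import itertools
import random

# Hypothetical tasks: (name, duration, relative deadline, priority).
TASKS = [("j0", 3, 9, 1), ("j1", 2, 3, 2), ("j2", 2, 3, 3)]
HORIZON = 10  # arrival times are searched in [0, HORIZON)


def worst_lateness(arrivals):
    """Largest (finish time - absolute deadline); > 0 means a deadline miss.

    Assumes one core and fixed-priority preemptive scheduling.
    """
    remaining = [duration for _, duration, _, _ in TASKS]
    finish = [None] * len(TASKS)
    t = 0
    while None in finish:
        ready = [i for i in range(len(TASKS))
                 if arrivals[i] <= t and remaining[i] > 0]
        if ready:
            i = max(ready, key=lambda k: TASKS[k][3])
            remaining[i] -= 1
            if remaining[i] == 0:
                finish[i] = t + 1
        t += 1
    return max(finish[i] - (arrivals[i] + TASKS[i][2])
               for i in range(len(TASKS)))


def randomized_search(iterations, rng):
    """Phase 1 (stand-in for the GA): mutate one arrival time at a time."""
    best = [rng.randrange(HORIZON) for _ in TASKS]
    best_fit = worst_lateness(best)
    for _ in range(iterations):
        cand = list(best)
        i = rng.randrange(len(cand))
        cand[i] = min(HORIZON - 1, max(0, cand[i] + rng.choice([-2, -1, 1, 2])))
        fit = worst_lateness(cand)
        if fit >= best_fit:
            best, best_fit = cand, fit
    return best, best_fit


def neighborhood_search(center, radius):
    """Phase 2 (stand-in for CP): enumerate every point within `radius`."""
    ranges = [range(max(0, a - radius), min(HORIZON, a + radius + 1))
              for a in center]
    return list(max(itertools.product(*ranges), key=worst_lateness))


def hybrid(iterations=50, radius=2, seed=0):
    rng = random.Random(seed)
    start, start_fit = randomized_search(iterations, rng)
    refined = neighborhood_search(start, radius)
    return start_fit, refined, worst_lateness(refined)
```

Because the neighborhood always contains the phase-1 solution, the refined result is never worse than the one it started from; how much it improves depends on the radius and on how rugged the fitness landscape is.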
  • 46.
    SBST in Industry: Discussion •  Scalability •  Applicability •  Variety of heuristics as a function of test objectives, available information, assumptions, etc. •  Search as a piece of the solution: multidisciplinarity •  Combining search with other techniques: Likely candidates
  • 47.
    Scalability •  Search spaces are huge in practice •  Fitness computation is often computationally intensive •  Test execution can be expensive –  Web applications or phone apps versus embedded systems with HIL –  Models, simulation to guide the search •  Simulation is always expensive –  Simulink models, e.g., 31s for a 2s simulation –  Surrogate modeling? •  In many situations, models of the system can help guide the search
  • 48.
    Applicability •  Many academic solutions are not applicable in practice •  Context matters •  Scalability -> applicability •  But also inputs required for guiding the search •  Integration with the rest of the development process –  E.g., design models, WCET analysis, Simulink development
  • 49.
    A Large Variety of Heuristics •  Test objectives differ a great deal depending on context –  Performance, robustness, critical environment states … •  Available information also differs, both for guiding test generation and oracles –  Purely black-box testing –  Design information, e.g., through models •  Working assumptions –  About process, technology, … –  E.g., availability of plant/environment models in Simulink •  In a given context, some degree of tailoring is usually required for applying SBST
  • 50.
    Multidisciplinarity •  Typically, meta-heuristic search is only part of a solution to a testing problem •  Dedicated system or environment modeling, e.g., in Cisco and WesternGeco studies •  Machine learning, e.g., regression trees in Delphi study •  Statistical analysis, e.g., EEA and non-linear regression in Delphi study •  Constraint programming, e.g., in Kongsberg study
  • 51.
    Search-Based Software Testing in Industry --- Research collaborations and Lessons Learned Lionel Briand Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT) University of Luxembourg, Luxembourg SBST, Hyderabad, 2014 SVV lab: svv.lu SnT: www.securityandtrust.lu