SlideShare a Scribd company logo
1 of 113
Download to read offline
Scalable Software Testing and Verification
of Non-Functional Properties through
Heuristic Search and Optimization
Lionel Briand
Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT)
University of Luxembourg, Luxembourg
ITEQS, March 13, 2017
Collaborative Research @ SnT Centre
• Research in context
• Addresses actual needs
• Well-defined problem
• Long-term collaborations
• Our lab is the industry
2
Scalable Software Testing and
Verification Through Heuristic
Search and Optimization
3
With a focus on non-functional
properties
Verification, Testing
• The term “verification” is used in its wider sense:
Defect detection.
• Testing is, in practice, the most common verification
technique.
• Other forms of verifications are important too (e.g.,
design time, run-time), but much less present in
practice.
4
Decades of V&V research have
not yet significantly and widely
impacted engineering practice
5
Cyber-PhysicalSystems
• Increasingly complex and critical
systems
• Complex environment
• Combinatorial and state
explosion
• Dynamic behavior
• Complex requirements, e.g.,
temporal, timing, resource
usage
• Uncertainty, e.g., about the
environment
6
Scalable?Practical?
• Scalable: Can a technique be applied on large
artifacts (e.g., models, data sets, input spaces) and
still provide useful support within reasonable effort,
CPU and memory resources?
• Practical: Can a technique be efficiently and
effectively applied by engineers in realistic
conditions?
– realistic ≠ universal
– feasibility and cost of inputs to be provided?
7
Metaheuristics
• Heuristic search (Metaheuristics): Hill climbing, Tabu
search, Simulated Annealing, Genetic algorithms, Ant
colony optimisation ….
• Stochastic optimization: General class of algorithms and
techniques which employ some degree of randomness to
find optimal (or as optimal as possible) solutions to hard
problems
• Many verification and testing problems can be re-
expressed as optimization problems
• Goal: Address scalability and practicality issues
8
Talk Outline
• Selected project examples, with industry
collaborations
• Similarities and patterns
• Lessons learned
9
Testing Software Controllers
References:
10
• R. Matinnejad et al., “Automated Test Suite Generation for Time-continuous Simulink
Models“, IEEE/ACM ICSE 2016
• R. Matinnejad et al., “EffectiveTest Suites for Mixed Discrete-Continuous Stateflow
Controllers”, ACM ESEC/FSE 2015 (Distinguished paper award)
• R. Matinnejad et al., “MiL Testing of Highly Configurable Continuous Controllers:
Scalable Search Using Surrogate Models”, IEEE/ACM ASE 2014 (Distinguished
paper award)
• R. Matinnejad et al., “Search-Based Automated Testing of Continuous Controllers:
Framework, Tool Support, and Case Studies”, Information and Software Technology,
Elsevier (2014)
Electronic Control Units (ECUs)
More functions
Comfort and variety
Safety and reliability
Faster time-to-market
Less fuel consumption
Greenhouse gas emission laws
11
A Taxonomy of Automotive Functions
ControllingComputation
State-Based ContinuousTransforming Calculating
unit convertors calculating positions,
duty cycles, etc
State machine
controllers
Closed-loop
controllers (PID)
12
Dynamic Continuous Controllers
13
Development Process
14
Hardware-in-the-Loop
Stage
Model-in-the-Loop
Stage
Simulink Modeling
Generic
Functional
Model
MiL Testing
Software-in-the-Loop
Stage
Code Generation
and Integration
Software Running
on ECU
SiL Testing
Software
Release
HiL Testing
MATLAB/Simulinkmodel
+
+
0.051
FuelLevelSensor
-0.05
100
0.8
+
-
Gain
Gain1
Add1
Add
1
FuelLevel
Continuous
Integrator
15
• Data flow oriented
• Blocks and lines
• Time continuous and discrete behavior
• Input and outputs signals
Automotive Example
• Supercharger bypass flap controller
• Flap position is boundedwithin [0..1]
• 34 sub-components decomposedinto 6
abstraction levels
• Compressor blowingto the engine
Supercharger
Bypass Flap
Supercharger
Bypass Flap
Flap position = 0 (open) Flap position = 1 (closed)
16
Testing Controllers at MIL
Initial
Desired Value
Final
Desired Value
time time
Desired Value
Actual Value
T/2 T T/2 T
Test Input Test Output
Plant
Model
Controller
(SUT)
Desired value Error
Actual value
System output+
-
17
Configurable Controllers at MIL
18
Plant Model
+
+
+
⌃
+
-
e(t)
actual(t)
desired(t)
⌃
KP e(t)
KD
de(t)
dt
KI
R
e(t) dt
P
I
D
output(t)
Time-dependent variables
Configuration Parameters
Requirements and Test Oracles
19
InitialDesired
(ID)
Desired ValueI (input)
Actual Value (output)
FinalDesired
(FD)
time
T/2 T
Smoothness
Responsiveness
Stability
Test Strategy: A Search-Based Approach
20
Initial Desired (ID)
FinalDesired(FD)
Worst Case(s)?
• Continuous behavior
• Controller’s behavior can
be complex
• Meta-heuristic search in
(large) input space:
Finding worst case inputs
• Possible because of
automated oracle
(feedback loop)
• Different worst cases for
different requirements
• Worst cases may or may
not violate requirements
Search-Based Software Testing
• Express test generation
problem as a search
problem
• Search for test input data
with certain properties, i.e.,
constraints
• Non-linearity of software
(if, loops, …): complex,
discontinuous, non-linear
search spaces (Baresel)
• Many search algorithms
(metaheuristics), from
local search to global
search, e.g., Hill Climbing,
Simulated Annealing and
Genetic Algorithms
imbing starts at a random point in the
in the search space neighbouring the
uated for fitness. If a better candidate
l Climbing moves to that new point,
hbourhood of that candidate solution.
until the neighbourhood of the current
ce offers no better candidate solutions;
ima’. If the local optimum is not the
n Figure 3a), the search may benefit
and performing a climb from a new
landscape (Figure 3b).
simple Hill Climbing is Simulated
h by Simulated Annealing is similar to
movement around the search space is
may be made to points of lower fitness
with the aim of escaping local optima.
probability value that is dependent
d the ‘temperature’, which decreases
ch progresses (Figure 4). The lower
ess likely the chances of moving to a
search space, until ‘freezing point’ is
point the algorithm behaves identically
ulated Annealing is named so because
he physical process of annealing in
Hill Climbing. From a random starting point, the algorithm follows the
curve of the fitness landscape until a local optimum is found. The final
position may not represent the global optimum (part (a)), and restarts may
be required (part (b))
Fitness
Input domain
Figure 4. Simulated Annealing may temporarily move to points of poorer
fitness in the search space
Fitness
Input domain
Figure 5. Genetic Algorithms are global searches, sampling many points
in the fitness landscape at once
Search-Based SoftwareTesting: Past, Presentand Future
Phil McMinn
Genetic Algorithm
21
discusses future directions for Search-Based
Testing, comprising issues involving execution
ts, testability, automated oracles, reduction of
le cost and multi-objective optimisation. Finally,
concludes with closing remarks.
ARCH-BASED OPTIMIZATION ALGORITHMS
plest form of an optimization algorithm, and
to implement, is random search. In test data
inputs are generated at random until the goal of
r example, the coverage of a particular program
r branch) is fulfilled. Random search is very poor
olutions when those solutions occupy a very small
overall search space. Such a situation is depicted
where the number of inputs covering a particular
arget are very few in number compared to the
input domain. Test data may be found faster
reliably if the search is given some guidance.
eurstic searches, this guidance can be provided
m of a problem-specific fitness function, which
rent points in the search space with respect to
ness’ or their suitability for solving the problem
Input domain
portion of
input domain
denoting required
test data
randomly-generated
inputs
Figure 2. Random search may fail to fulfil low-probability test goals
Fitness
Input domain
(a) Climbing to a local optimum
s
22
Search Elements
• Search Space:
• Initial and desired values, configuration parameters
• Search Technique:
• (1+1) EA, variants of hill climbing, GAs …
• Search Objective:
• Objective/fitness function for each requirement
• Evaluation of Solutions
• Simulation of Simulink model => fitness computation
• Result:
• Worst case scenarios or input signals that (are more likely to)
break the requirement at MiL level
22
Smoothness Objective Functions: OSmoothness
Test Case A Test Case B
OSmoothness(Test Case A) > OSmoothness(Test Case B)
We want to find test scenarios which maximize OSmoothness
23
Solution Overview (SimplifiedVersion)
24
HeatMap
Diagram
1. Exploration
List of
Critical
RegionsDomain
Expert
Worst-Case
Scenarios
+
Controller-
plant
model
Objective
Functions
based on
Requirements
2. Single-State
Search
time
Desired Value
Actual Value
0 1 2
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
Initial Desired
Final Desired
Finding Seeded Faults
Inject Fault
25
Analysis – Fitness increase over iterations
26
Number of Iterations
Fitness
Analysis II – Search over different regions
27
0.315
0.316
0.317
0.319
0.321
0.323
0.324
0.326
0.327
0.329
0.330
0 10 20 30 40 50 60 70 80 90 100
0.328
0.325
0.320
0.318
0 10 20 30 40 50 60 70 80 90 1000 10 20 30 40 50 60 70 80 90 100
Random Search
(1+1) EA
0 10 20 30 40 50 60 70 80 90 100
0.0166
0.0168
0.0170
0.0176
0.0180
0.0178
0.0172
0.0160
0.0162
0.0164 Random Search
(1+1) EA
0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100
0.0174
Average (1+1) EA Distribution Random Search Distribution
Number of Iterations
• We found much worse scenarios during MiL testing than our
partner had found so far, and much worse than random
search (baseline)
• These scenarios are also run at the HiL level, where testing is
much more expensive: MiL results -> test selection for HiL
• But further research was needed:
– Simulations are expensive
– Configuration parameters
– Dynamically adjust search algorithms in different
subregions (exploratory <-> exploitative)
Conclusions
i.e., 31s. Hence, the horizontal axis of the diagrams in Figure 8 shows the number of
iterations instead of the computation time. In addition, we start both random search and
(1+1) EA from the same initial point, i.e., the worst case from the exploration step.
Overall in all the regions, (1+1) EA eventually reaches its plateau at a value higher
than the random search plateau value. Further, (1+1) EA is more deterministic than ran-
dom, i.e., the distribution of (1+1) EA has a smaller variance than that of random search,
especially when reaching the plateau (see Figure 8). In some regions (e.g., Figure 8(d)),
however, random reaches its plateau slightly faster than (1+1) EA, while in some other
regions (e.g. Figure 8(a)), (1+1) EA is faster. We will discuss the relationship between
the region landscape and the performance of (1+1) EA in RQ3.
RQ3. We drew the landscape for the 11 regions in our experiment. For example, Fig-
ure 9 shows the landscape for two selected regions in Figures 7(a) and 7(b). Specifically,
Figure 9(a) shows the landscape for the region in Figure 7(b) where (1+1) EA is faster
than random, and Figure 9(b) shows the landscape for the region in Figure 7(a) where
(1+1) EA is slower than random search.
0.30
0.31
0.32
0.33
0.34
0.35
0.36
0.37
0.38
0.39
0.40
0.70 0.71 0.72 0.73 0.74 0.75 0.76 0.77 0.78 0.79 0.80
0.10
0.11
0.12
0.13
0.14
0.15
0.16
0.17
0.18
0.19
0.20
0.90 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1.00
(a) (b)
Fig. 9. Diagrams representing the landscape for two representative HeatMap regions: (a) Land-
scape for the region in Figure 7(b). (b) Landscape for the region in Figure 7(a).
28
Testing in the ConfigurationSpace
• MIL testing for all feasible configurations
• The search space is much larger
• The search is much slower (Simulations of Simulink
models are expensive)
• Results are harder to visualize
• But not all configuration parameters matter for all
objective functions
29
Modified Process and Technology
30
+
Controller
Model
(Simulink)
Worst-Case
Scenarios
List of
Critical
PartitionsRegression
Tree
1.Exploration with
Dimensionality
Reduction
2.Search with
Surrogate
Modeling
Objective
Functions
Domain
Expert
Visualization of the
8-dimension space
using regression treesDimensionality
reduction to identify
the significant variables
(Elementary Effect Analysis)
Surrogate modeling
to predict the objective
function and
speed up the search
(Machine learning)
Dimensionality Reduction
• Sensitivity Analysis:
Elementary Effect Analysis
(EEA)
• Identify non-influential
inputs in computationally
costly mathematical
models
• Requires less data points
than other techniques
• Observations are
simulations generated
during the Exploration step
• Compute sample mean
and standard deviation for
each dimension of the
distribution of elementary
effects
31
Cal5
ID
Cal3
FD
Cal4
Cal6
Cal1,Cal2
0.6
0.4
0.2
0.0
SampleStandardDeviation()
-0.6 -0.4 -0.2 0.0 0.2
Sample Mean ( )
⇤10 2
⇤10 2
Si
i
Elementary Effects Analysis Method
ü Imagine function F with 2 inputs, x and y:
A x
y
A1
A2
C x
y
C1
C2
B x
y
B1
B2
X
Y
Elementary Effects
for X for Y
F(A1)-F(A)
F(B1)-F(B)
F(C1)-F(C)
…
F(A2)-F(A)
F(B2)-F(B)
F(C2)-F(C)
…
32
Visualizationin Inputs & Configuration Space
33
All Points
FD>=0.43306
Count
Mean
Std Dev
Count
Mean
Std Dev
FD<0.43306
Count
Mean
Std Dev
ID>=0.64679
Count
Mean
Std Dev
Count
Mean
Std Dev
Cal5>=0.020847 Cal5>0.020847
Count
Mean
Std Dev
Count
Mean
Std Dev
Cal5>=0.014827 Cal5<0.014827
Count
Mean
Std Dev
Count
Mean
Std Dev
1000
0.007822
0.0049497
ID<0.64679
574
0.0059513
0.0040003
426
0.0103425
0.0049919
373
0.0047594
0.0034346
201
0.0081631
0.0040422
182
0.0134555
0.0052883
244
0.0080206
0.0031751
70
0.0106795
0.0052045
131
0.0068185
0.0023515 Regression Tree
Surrogate Modeling During Search
• Goal: To predict the value of the objective functions within a
critical partition, given a number of observations, and use that to
avoid as many simulations as possible and speed up the search
34
A
B
Surrogate Modeling During Search
35
• Any supervised learning or
statistical technique
providing fitness predictions
with confidence intervals
1. Predict higher fitness with
high confidence: Move to
new position, no simulation
2. Predict lower fitness with
high confidence: Do not
move to new position, no
simulation
3. Low confidence in
prediction: Simulation
Surrogate Model
Real Function
x
Fitness
Experiments Results (RQ1)
ü The best regression technique to build Surrogate models
for all of our three objective functions is Polynomial
Regression with n = 3
ü Other supervised learning techniques, such as SVM
Mean of R2/MRPE values for different surrogate modeling techniques
Fst
Fsm
Fr
PR(n=3)
R2/MRPE
0.66/0.0526 0.95/0.0203
0.78/0.0295
0.26/0.2043
0.98/0.0129
0.85/0.0247 0.85/0.0245
0.46/0.1755 0.54/0.1671
0.44/0.0791
0.49/1.2281
0.22/1.2519
LR
R2/MRPE
ER
R2/MRPE
PR(n=2)
R2/MRPE
36
Experiments Results (RQ2)
ü Dimensionality reduction helps generate better surrogate
models for Smoothness and Responsiveness
requirements
0.0
0.02
0.04
0.06
0.08 0.05
0.01
0.02
0.03
0.04
0.1
0.2
0.3
DR No DR DR No DR DR No DR
Smoothness( )Fsm Responsiveness( )Fr Stability( )Fst
MeanRelativePredictionErrors
(MRPEValues)
37
ü For responsiveness, the search with SM was 8 times faster
ü For smoothness, the search with SM was much more effective
Experiments Results (RQ3)
SearchOutput
Values
SearchOutput
Values
0.215
SM
After 800 seconds After 2500 seconds After 3000 seconds
NoSM
0.220
0.225
0.230
0.235
SM NoSMSM NoSM
After 200 seconds
0.160
0.164
0.168
After 300 seconds After 3000 seconds
NoSM NoSM SM NoSMSM SM
38
ü Our approach is able to identify critical violations of the
controller requirements that had neither been found by
our earlier work nor by manual testing.
MiL-Testing
different configurations
Stability
Smoothness
Responsiveness
MiL-Testing
fixed configurations Manual MiL-Testing
- -2.2% deviation
24% over/undershoot 20% over/undershoot 5% over/undershoot
170 ms response time 80 ms response time 50 ms response time
Experiments Results (RQ4)
39
A Taxonomy of Automotive Functions
ControllingComputation
State-Based ContinuousTransforming Calculating
unit convertors calculating positions,
duty cycles, etc
State machine
controllers
Closed-loop
controllers (PID)
Different testing strategies are required for
different types of functions
40
Open-LoopControllers
41
ave complex internal structures
, making them less amenable to
ve rich time-continuous outputs.
wing contributions:
f testing Stateflows with mixed
urs. We propose two new test
t stability and output continuity
est inputs that are likely to pro-
hibiting instability and disconti-
verage and the blackbox output
o Stateflows, and evaluate their
ntinuous behaviours. The former
nal state and transition coverage
atter is defined based on the re-
rion [?].
of our newly proposed and the
applying them to three Stateflow
dustrial and one public domain.
LT.
(c) Engaging state of SCC -- mixed discrete-continuous behaviou
Disengaging
Engaged
[disengageReq]/time := 0
[tim
>5]
time + +;
OnMoving OnSlipping
OnCompleted
time + +;
ctrlSig := f(time)
Engaging
time + +;
ctrlSig := g(time)
time + +;
ctrlSig := 1.0
[¬(vehspd = 0)
time > 2]
[(vehspd = 0)
time > 3]
[time > 4]
Figure 1: Supercharge Clutch Controller (SCC) Stateflo
transient states [?], engaging and disengaging, specifying that m
ing from the engaged to the disengaged state and vice versa t
six milisec. Since this model is simplified, it does not show
• No feedback loop -> no automated
oracle
• No plant model: Much quicker
simulation time
• Mixed discrete-continuous behavior:
Simulink stateflows
• The main testing cost is the manual
analysis of output signals
• Goal: Minimize test suites
• Challenge:Test selection
• Entirely different approach to testing On
Off
CtrlSig
Selection Strategies Based on Search
• Input diversity
• White-box Structural
Coverage
• State Coverage
• Transition Coverage
• Output Diversity
• Failure-Based Selection
Criteria
• Domain specific failure
patterns
• Output Stability
• Output Continuity
42
S3
t
S3
t
Failure-basedTest Generation
4
3
Instability Discontinuity
0.0 1.0 2.0
-1.0
-0.5
0.0
0.5
1.0
Time
CtrlSigOutput
• Search: Maximizing the likelihood of presence of specific
failure patterns in output signals
• Domain-specific failure patterns elicited from engineers
0.0 1.0 2.0
Time
0.0
0.25
0.50
0.75
1.0
CtrlSigOutput
Summary of Results
• The test cases resulting from state/transition
coverage algorithms cover the faulty parts of
the models
• However, they fail to generate output signals
that are sufficiently distinct from the oracle
signal, hence yieldinga low fault revealing
rate
• Output-based algorithmsare more effective
44
Automated Testing of Driver Assistance
Systems Through Simulation
Reference:
45
R. Ben Abdessalem et al., "Testing Advanced DriverAssistance Systems
Using Multi-Objective Search and Neural Networks”, ACM ESEC/FSE
2016
Pedestrian Detection Vision System (PeVi)
46
• The PeVi system is a camera-based
collision-warning system providing
improved vision
Testing DA Systems
• Testing DA systems requires complex and
comprehensive simulation environments
– Static objects: roads, weather, etc.
– Dynamic objects: cars, humans, animals, etc.
• A simulationenvironment captures the behavior of
dynamic objects as well as constraints and
relationships between dynamic and static objects
47
Approach
48
Generation of Test
specifications
Static
[ranges/values/
resolution]
Dynamic
[ranges/
resolution]
(2)
test case specification
Specification Documents
(Simulation Environment and PeVi System)
Domain
model
Requirements
model
(1)Development of Requirements
and domain models
49
- simulationTime:
Real
- timeStep: Real
Test Scenario
- v0: Real
Vehicle
- x0: Real
- y0: Real
- θ: Real
- v0: Real
Pedestrian
- simulationTime:
Real
- timeStep: Real
Test Scenario
PeVi
1
1
1
1
«positioned»
Dynamic
Object
- v0: Real
Vehicle
- x0: Real
- y0: Real
- θ: Real
- v0: Real
Pedestrian
- x: Real
- y: Real
Position
*
1
1
- state: Boolean
Collision
PeVi
- state: Boolean
Detection
11
1
1
- AWA
Output
Trajectory
Static inputs
Dynamic inputs
Outputs
- intensity: Real
SceneLight
1
- weatherType:
Condition
Weather
- fog
- rain
- snow
- normal
«enumeration»
Condition
- field of view:
Real
Camera
Sensor
RoadSide
Object
- roadType: RT
Road
1 - curved
- straight
- ramped
«enumeration»
RT
1
*
1
Parked
Cars
Trees
- simulationTime:
Real
- timeStep: Real
Test Scenario
«uses»
1 1
PeVi
PeVi and Environment Domain Model
Requirements Model
50
<<trace>> <<trace>>
Speed
Profile
Path
1 1
Slot Path
Segment
1..**
1
Trajectory
Human
1*
trajectory
Warning
Sensors
posx1, posx2
posy1, posy2
AWACar/Motor/
Truck/Bus
sensor
has
has
awa
1
1
1
*
human
appears
posx1 posx2
posy1
posy2
The NiVi system shall detect any person
located in the Acute Warning Area of a vehicle
MiL Testing via Search
51
Simulator + NiVi
Environment
Settings
(Roads, weather,
vehicle type, etc.)
Fixed during Search Manipulated by Search
Human Simulator
(initial position,
speed, orientation)
Car Simulator
(speed)
PeVi
Meta-heuristic Search
(multi-objective)
Generate
scenarios
Detection
or not?
Collision
or not?
5
2
Type of Road Type of vehicle Type of actor
Situation 1 Straight Car Male
Situation 2 Straight Car Child
Situation 3 Straight Car Cow
Situation 4 Straight Truck Male
Situation 5 Straight Truck Child
Situation 6 Straight Truck Cow
Situation 7 Curved Car Male
Situation 8 Curved Car Child
Situation 9 Curved Car Cow
Situation 10 Curved Truck Male
Situation 11 Curved Truck Child
Situation 12 Curved Track Cow
Situation 13 Ramp Car Male
Situation 14 Ramp Car Child
Situation 15 Ramp Car Cow
Situation 16 Ramp Truck Male
Situation 17 Ramp Truck Child
Situation 18 Ramp Truck Cow
Situation 19
Situation 20
Straight Car+ Cars in parking
Car + buildings
Male
Test Case Specification: Static
(combinatorial)
Test Case Specification: Dynamic
53
Start locationX = 74
Start locationY = 37.72
Start locationZ = 0
Orientation = 0
trajectoryPerson : Trajectory
PositionX= 74
Position Y= 37.72
Position Z = 0
OrientationHeading = 93.33
Acceleration = 0
MaxWalkingSpeed =14
height=1.75
person :Actor
UniqueId
profilePerson :
Speed Profile
StartPointX = 74
StartPointY = 37.72
StartPointY = 0
StartAngle = 93.33
End Angle = 0
Length = 60
pathPerson : Path
Length = 60
Type = Straight
MaxSpeedLimit = 14
segmentPerson : Path Segment
ID
slotPerson : Slot
Time = 0
Speed = 12.59
startPerson :
StartState
Start locationX = 10
Start locationY = 50.125
Start locationZ = 0.56
Orientation = 0
trajectoryCar : TrajectoryPositionX=10
Position Y= 50.125
Position Z = 0.56
OrientationHeading = 0
Acceleration = 0
MaxWalkingSpeed =100
car : Actor
UniqueId
profileCar : Speed
Profile
StartPointX = 10
StartPointY = 50.125
StartPointZ =0.56
StartAngle = 0
End Angle = 0
Length = 100
pathCar : Path
Length = 100
Type = Straight
MaxSpeedLimit = 100
segmentCar : Path Segment
ID
slotCar : Slot
Time = 0
Speed = 60.66
startCar :
StartState
MinTTC=0.3191
Collision
Choice of Surrogate Model
• Neural networks (NN) have been trained to learn
complex functions predicting fitness values
• NN can be trained using different algorithms such as:
– LM: Levenberg-Marquardt
– BR: Bayesian regularization backpropagation
– SCG: Scaled conjugate gradient backpropagation
• R2 (coefficient of determination) indicates how well
data fit a statistical model
• Computed R2 for LM, BR and SCG è BR has the
highest R2
54
Multi-Objective Search
• Input space: car-speed, person-
speed, person-position (x,y),
person-orientation
• Search algorithm need objective
or fitness functions for guidance
• In our case several independent
functions could be interesting:
– Minimum distance between car and
pedestrian
– Minimum distance between
pedestrian and AWA
– Minimum time to collision
• NSGA II algorithm
55
posx1 posx2
posy1
posy2
Pareto Front
56
Individual A Pareto
dominates individual B if
A is at least as good as B
in every objective
and better than B in at
least one objective.
Dominated by x
O1
O2
Pareto front
x
• A multi-objective optimization algorithm must achieve:
• Guide the search towards the global Pareto-Optimal front.
• Maintain solution diversity in the Pareto-Optimal front.
MO Search with NSGA-II
57
Non-Dominated
Sorting
Selection based on rank
and crowding distance
Size: 2*N Size: 2*N Size: N
• Based on Genetic Algorithm
• N: Archive and population size
• Non-Dominated sorting: Solutions are ranked according to
how far they are from the Pareto front, fitness is based on
rank.
• Crowding Distance: Individuals in the archive are being spread
more evenly across the front (forcing diversity)
• Runs simulations for close to N new solutions
Pareto Front Results
58
Pareto Front Projection
59
SimulationScenario Execution
• Straight road with parking
• The person appears in the AWA, but is not detected
60
Improving Time Performance
• Individual simulations take on average more than
1min
• It takes 10 hours to run our search-based test
generation (≈ 500 simulations)
• We use surrogate modeling to improve the search
• Neural networks are used to predict fitness values
within a confidence interval
• During the search, we use prediction values &
confidence intervals to run simulations only
for the solutions likely to be selected
61
Search with Surrogate Models
62
Non-Dominated
Sorting
Selection based on rank
and crowding distance
Size: 2*N Size: 2*N Size: N
Original Algorithm
- Runs simulations for all
new solutions
Our Algorithm
- Uses prediction values
& intervals to run
simulations only
for the solutions likely to
be selected
NSGA II
Results – Surrogate Modeling
63
0.00
0.25
0.50
0.75
1.00
Time (min)
HV
50 100 15010
(a) Comparing HV values obtained
by NSGAII and NSGAII-SM
NSGAII (mean)
NSGAII-SM (mean)
1.00
(b) Comparing HV values obtained
by RS and NSGAII-SM
execution
the worst
among ou
NSGAII
worst run
be unluc
run simil
the wors
worst run
safer algo
constrain
values do
This is be
distance
from the
tions in t
We fur
and NSG
min. Sim
Results – Random Search
64
Time (min)
0.00
0.25
0.50
0.75
1.00
Time (min)
50 100 15010
(b) Comparing HV values obtained
by RS and NSGAII-SM
HV
RS (mean)
NSGAII-SM (mean)
(c) HV values for worst runs of NSGAII,
NSGAII-SM and RS
dis
fro
tio
an
mi
alg
is
Fu
an
tha
res
tio
SM
gen
ten
NS
tha
in
Results – Worst Runs
65
Time (min)
50 100 15010
0.00
0.25
0.50
0.75
1.00
Time (min)
HV
50 100 15010
(c) HV values for worst runs of NSGAII,
NSGAII-SM and RS
RS
NSGAII-SM
NSGAII
Figure 6: Comparing HV values obtained by (a) 20
runs of NSGAII and NSGAII-SM (cl=.95); (b) 20
g
t
N
t
i
w
G
o
p
N
G
g
a
A
t
s
c
G
b
Minimizing CPU Shortage Risks
During Integration
References:
66
• S. Nejati et al., ‘‘Minimizing CPU Time Shortage Risks in Integrated Embedded
Software’’, in 28th IEEE/ACM International Conference on Automated Software
Engineering (ASE 2013), 2013
• S. Nejati, L. Briand, “Identifying Optimal Trade-Offs between CPU Time Usage and
Temporal Constraints Using Search”, ACM International Symposium on Software
Testing and Analysis (ISSTA 2014), 2014
Automotive: Distributed Development
67
Software Integration
68
• Develop software optimized for
their specific hardware
• Provide integrator with runnables
• Integrate car makers software
with their own platform
• Deploy final software on ECUs
and send them to car makers
Car Makers Integrator
Stakeholders
69
• Objective: Effective execution and
synchronization of runnables
• Some runnables should execute
simultaneously or in a certain order
• Objective: Effective usage of
CPU time
• Max CPU time used by all the
runnables should remain as low
as possible over time
Car Makers Integrator
Different Objectives
70
An overview of an integration process in the
automotive domain
AUTOSAR Models
sw runnables
sw runnables
AUTOSAR Models
Glue
71
72
CPU time shortage
• Static cyclic scheduling: predictable, analyzable
• Challenge
– Many OS tasks and their many runnables run within a limited
available CPU time
• The execution time of the runnables may exceed their time slot
• Goal
– Reducing the maximum CPU time used per time slot to be
able to
• Minimize the hardware cost
• Reduce the probability of overloading the CPU in practice
• Enable addition of new functions incrementally
72
5ms 10ms 15ms 20ms 25ms 30ms 35ms 40ms
✗
5ms 10ms 15ms 20ms 25ms 30ms 35ms 40ms
✔
(a)
(b)
Fig. 4. Two possible CPU time usage simulations for an OS task with a 5ms
cycle: (a) Usage with bursts, and (b) Desirable usage.
to dead
run pas
by OS
divisibl
offset v
synchro
73
Using runnable offsets (delay times)
5ms 10ms 15ms 20ms 25ms 30ms 35ms 40ms
5ms 10ms 15ms 20ms 25ms 30ms 35ms 40ms
✗
✔
Insertingrunnables’offsets
Offsets have to be chosen such that
the maximum CPU usage per time slot is minimized, and further,
the runnables respect their period
the runnables respect their time slot
the runnables satisfy their synchronization constraints
73
5.34ms 5.34ms
5 ms
Time
CPUtimeusage(ms)
CPU time usage exceeds the size of the slot (5ms)
Without optimization
74
CPU time usage always remains less than 2.13ms, so
more than half of each slot is guaranteed to be free
2.13ms
5 ms
Time
CPUtimeusage(ms)
With Optimization
75
Single-objectiveSearch algorithms
Hill Climbing and Tabu Search and their variations
Solution Representation
a vector of offset values: o0=0, o1=5, o2=5, o3=0
Tweak operator
o0=0, o1=5, o2=5, o3=0 à o0=0, o1=5, o2=10, o3=0
Synchronization Constraints
offset values are modified to satisfy constraints
Fitness Function
max CPU time usage per time slot
76
Summary of Problem and Solution
Optimization
while satisfying
synchronization/temporal
constraints
Explicit Time
Model
for real-time embedded systems
Search
meta-heuristic single objective
search algorithms
10^27
an industrial case study with a
large search space
77
78
Search Solution and Results
Case Study: an automotive software system with 430 runnables,
search space = 10^27
Running the system without offsets
Simulation for the runnables in our case study and
corresponding to the lowest max CPU usage found by HC
5.34 ms
Optimized offset assignment
2.13 ms
- The objective function is the max CPU usage of a 2s-simulation of
runnables
- The search modifies one offset at a time, and updates other offsets
only if timing constraints are violated
- Single-state search algorithms for discrete spaces (HC, Tabu)
78
79
Comparingdifferent search algorithms
(ms)(s)
Best CPU usage
Time to find
Best CPU usage
79
80
Comparingour best search algorithm with
random search
(a) (b) (c)
(a)
Lowest max CPU usage values computed by HC within 70 ms
over 100 different runs
Lowest max CPU usage values computed by Random
within 70 ms over 100 different runs
Comparing average behavior of Random and HC in computing
lowest max CPU usage values within 70 s and over 100 different runs
80
HC Random Average
0ms 5ms 10ms 15ms 20ms 25ms 30ms
0ms 5ms 10ms 15ms 20ms 25ms 30ms
0ms 5ms 10ms 15ms 20ms 25ms 30ms
4ms
3ms
2ms
Car Makers Integratorr0 r1 r2 r3
Minimize CPU time usage
1 slot
2 slots
3 slots
Execute r0 to r3 close to one another.
Trade-off between Objectives
81
Trade-off curve
#ofslots
CPU time usage (ms) 2.041.45
12
21
14
1.56
1
2
3
Boundary Trade Offs
Interesting
Solutions
82
Multi-objective search
• Multi-objective genetic algorithms (NSGA II)
• Pareto optimality
• Supporting decision making and negotiation between
stakeholders
83
Report: GraphRandom-NSGAII-25 Page 1 of 2
CPU Time Usage-NSGAII & CPU Time Usage-Random vs. Number
of Slots-NSGAII & Number of Slots-Random
Number of Slots-NSGAII & Number of Slots-Random
10 15 20 25 30 35 40 45 50
CPUTimeUsage-NSGAII&CPUTimeUsage-Random
1.5
2.0
2.5
3.0
3.5
Graph Builder
Total Number of Time Slots
MaxCPUTimeUsage(ms)
Random(25,000)
NSGA-II(25,000)
A
B
12
1.45
C
Objectives:
• (1) Max CPU time
• (2) Maximum time
slots between
“dependent” tasks
Input.csv:
- runnables
- Periods
- CETs
- Groups
- # of slots per
groups
Search
A list of solutions:
- objective 1 (CPU usage)
- objective 2 (# of slots)
- vector of group slots
- vector of offsets
Visualization/
Query Analysis
- Visualize solutions
- Retrieve/visualize
simulations
- Visualize Pareto Fronts
- Apply queries to the
solutions
Trade-Off Analysis Tool
84
85
Conclusions
- Search algorithms to compute
offset values that reduce the
max CPU time needed
- Generate reasonably good
results for a large automotive
system and in a small amount
of time
- Used multi-objective search à
tool for establishing trade-off
between relaxing
synchronization constraints and
maximum CPU time usage
85
Schedulability Analysis and Stress
Testing
References:
86
• S. Di Alesio et al., “Worst-Case Scheduling of Software Tasks – A Constraint
Optimization Model to Support Performance Testing, Constraint Programming (CP),
2014
• S. Di Alesio et al. “Combining Genetic Algorithms and Constraint Programming to
Support Stress Testing”, ACM TOSEM, 25(1), 2015
Real-time, concurrent systems (RTCS)
• Real-time, concurrent systems (RTCS) have
concurrent interdependent tasks which have
to finish before their deadlines
• Some task properties depend on the
environment, some are design choices
• Tasks can trigger other tasks, and can share
computational resources with other tasks
• How can we determine whether tasks meet
their deadlines?
87
Problem
• Schedulability analysis encompasses techniques
that try to predict whether all (critical) tasks are
schedulable, i.e., meet their deadlines
• Stress testing runs carefully selected test cases
that have a high probability of leading to deadline
misses
• Stress testing is complementary to schedulability
analysis
• Testing is typically expensive, e.g., hardware in
the loop
• Finding stress test cases is difficult
88
Finding Stress Test Cases is Difficult
89
0
1
2
3
4
5
6
7
8
9
j0, j1 , j2 arrive at at0 , at1 , at2 and must
finish before dl0 , dl1 , dl2
J1 can miss its deadline dl1 depending on
when at2 occurs!
0
1
2
3
4
5
6
7
8
9
j0 j1 j2 j0 j1 j2
at0
dl0
dl1
at1 dl2
at2
T
T
at0
dl0 dl1
at1
at2
dl2
Challengesand Solutions
• Ranges for arrival times form a very large input space
• Task interdependencies and properties constrain
what parts of the space are feasible
• We re-expressed the problem as a constraint
optimisation problem
• Constraint programming (e.g., IBM CPLEX)
90
Constraint Optimization
91
Constraint Optimization Problem
Static Properties of Tasks
(Constants)
Dynamic Properties of Tasks
(Variables)
Performance Requirement
(Objective Function)
OS Scheduler Behaviour
(Constraints)
Process and Technologies
92
UML Modeling (e.g.,
MARTE)
Constraint Optimization
Optimization Problem
(Find arrival times that maximize the
chance of deadline misses)
System Platform
Solutions
(Task arrival times likely to
lead to deadline misses)
Deadline Misses
Analysis
System Design Design Model (Time
and Concurrency
Information)
INPUT
OUTPUT
Stress Test Cases
Constraint
Programming
(CP)
Context
93
Drivers
(Software-Hardware Interface)
Control Modules
Alarm Devices
(Hardware)
Multicore Architecture
Real-Time Operating System
System monitors gas leaks and fire in
oil extraction platforms
Challengesand Solutions
• CP effective on small problems
• Scalability problem: Constraint programming (e.g.,
IBM CPLEX) cannot handle large input spaces (CPU,
memory)
• Solution: Combine metaheuristic search and
constraint programming
– metaheuristic search (GA) identifies high risk regions in the
input space
– constraint programming finds provably worst-case schedules
within these (limited) regions
– Achieve (nearly) GA efficiency and CP effectiveness
• Our approach can be used both for stress testing and
schedulability analysis (assumption free)
94
Combining GA and CP
95
A:12 S. Di Alesio et al.
Fig. 3: Overview of GA+CP: the solutions x1, y1 and z1 in the initial population of GA evolve into
x6, y6, and z6, then CP searches in their neighborhood for the optimal solutions x⇤
, y⇤
and z⇤
.
in the schedule generated by the arrival times in x:
Process and Technologies
96
UML Modeling (e.g.,
MARTE)
Constraint Optimization
Optimization Problem
(Find arrival times that maximize the
chance of deadline misses)
System Platform
Solutions
(Task arrival times likely to
lead to deadline misses)
Deadline Misses
Analysis
System Design Design Model (Time
and Concurrency
Information)
INPUT
OUTPUT
Genetic
Algorithms
(GA)
Stress Test Cases
Constraint
Programming
(CP)
V&V Topics Addressed by Search
• Many projects over the last 15 years
• Design-time verification
– Schedulability
– Concurrency
– Resource usage
• Testing
– Stress/load testing, e.g., task deadlines
– Robustness testing, e.g., data errors
– Reachability of safety or business critical states, e.g.,
collision and no warning
– Security testing, e.g., XML and SQLi injections
97
Publicity!
• Chunhui Wang et al., “System Testing of Timing
Requirements based on Use Cases and Timed
Automata”. Session R09 @ ICST 2017, Tuesday, 2
pm
• Sadeeq Jan et al., “A Search-based Testing
Approach for XML Injection Vulnerabilities in Web
Applications”. Session R11 @ ICST 2017, Thursday
11 am
98
99
Objective
Function
Search
Space
Search
Technique
n Problem = fault model
n Model = system or
environment
n Search to optimize
objective function(s)
n Metaheuristics
n Scalability: A small part
of the search space is
traversed
n Model: Guidance to
worst case, high-risk
scenarios across space
n Reasonable modeling
effort based on
standards or extension
n Heuristics: Extensive
empirical studies are
required
General Pattern: Using Metaheuristic Search
100
Objective
Function
Search
Space
Search
Technique
n Model simulation can be
time consuming
n Makes the search
impractical or ineffective
n Surrogate modeling
based on machine
learning
n Simulator dedicated to
search
General Pattern: Using Metaheuristic Search
Simulator
101
Objective
Function
Search
Space
Search
Technique
n Use techniques such as
sensitivity analysis to
minimize dimensionality
before running search
n Predict parts of the
space worth searching
in
General Pattern: Using Metaheuristic Search
Large
102
Objective
Function
Search
Space
Search
Technique
n Combine with solvers
and optimization
engines
n Need heuristic
strategies to determine
when to use what
General Pattern: Using Metaheuristic Search
Multiple
techniques?
Scalability
103
Project examples
• Scalability is the most common verification challenge in
practice
• Testing closed-loop controllers, DA system
– Large input and configuration space
– Expensive simulations
– Smart heuristics to avoid simulations (machine learning
to predict fitness)
• Schedulability analysis and stress testing
– Large space of possible arrival times
– Constraint programming cannot scale by itself
– CP was carefully combined with genetic algorithms
104
Scalability: Lessons Learned
• Scalability must be part of the problem definition and
solution from the start, not a refinement or an after-
thought
• Meta-heuristic search, by necessity, has been an essential
part of the solutions, along with, in some cases, machine
learning, statistics, etc.
• Scalability often leads to solutions that offer “best
answers” within time constraints, but no guarantees
• Scalability analysis should be a component of every
research project – otherwise it is unlikely to be adopted in
practice
• How many papers research papers do include even a
minimal form of scalability analysis?
105
Practicality
106
Project examples
• Practicality requires to account for the domain and context
• Testing controllers
– Relies on Simulink only
– No additional modeling or complex translation
– Differences between open versus closed loop
controllers
• Minimizing risks of CPU shortage
– Trade-off between between effective synchronization
and CPU usage
– Trade-off achieved through multiple-objective GA
search and appropriate decision tool
107
Practicality: Lessons Learned
• In software engineering, and verification in particular,
just understanding the real problems in context is
difficult
• What are the inputs required by the proposed
technique?
• How does it fit in development practices?
• Is the output what engineers require to make
decisions?
• There is no unique solution to a problem as they tend
to be context dependent, but a context is rarely
unique and often representative of a domain or type
of system
108
Discussion
• Metaheuristic search for verification and testing
– Tends to be versatile, tailorable to new problems and
contexts
– Particularly suited to the verification of non-functional
properties
– Entails acceptable modeling requirements
– Can provide “best” answers at any time
– Scalable, practical
But
– Not a proof, no certainty
– Effectiveness of search guidance is key and must be
experimentally evaluated
– Models are key to provide adequate guidance
– Search must often be combined with other techniques, e.g.,
machine learning, constraint programming 109
DiscussionII
• Constraint solvers (e.g., Comet, ILOG CPLEX, SICStus)
– Is there an efficient constraintmodel for the problem at hand?
– Can effective heuristics be found to order the search?
– Better if there is a match to a known standard problem, e.g., job
shop scheduling
– Tend to be strongly affected by small changes in the problem, e.g.,
allowing task pre-emption
– Often not scalable, e.g., memory
• Model checking
– Detailed operationalmodels (e.g., state models), involving
(complex) temporal properties (e.g., CTL)
– Enough details to analyze statically or execute symbolically
– These modeling requirements are usually not realistic in actual
system development.State explosion problem.
– Originally designed for checking temporal properties through
reachability analysis,as opposed to explicit timing properties
– Often not scalable 110
Talk Summary
• Focus: Meta-heuristic Search to enable scalable
verification and testing.
• Scalability is the main challenge in practice.
• We drew lessons learned from example projects in
collaboration with industry, on real systems and in real
verification contexts.
• Results show that meta-heuristic search contributes to
mitigate the scalability problem.
• It has also shown to lead to practical solutions.
• Solutions are very context dependent.
• Solutions tend to be multidisciplinary: system modeling,
constraint solving, machine learning, statistics.
111
Acknowledgements
PhD. Students:
• Vahid Garousi
• Marwa Shousha
• Zohaib Iqbal
• Reza Matinnejad
• Stefano Di Alesio
• Raja Ben Abdessalem
Scientists:
• Shiva Nejati
• Andrea Arcuri
112
Scalable Software Testing and Verification
of Non-Functional Properties through
Heuristic Search and Optimization
Lionel Briand
Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT)
University of Luxembourg, Luxembourg
ITEQS, March 13, 2017
SVV lab: svv.lu
SnT: www.securityandtrust.lu
We are hiring!

More Related Content

What's hot

Analyzing Natural-Language Requirements: The Not-too-sexy and Yet Curiously D...
Analyzing Natural-Language Requirements: The Not-too-sexy and Yet Curiously D...Analyzing Natural-Language Requirements: The Not-too-sexy and Yet Curiously D...
Analyzing Natural-Language Requirements: The Not-too-sexy and Yet Curiously D...
Lionel Briand
 
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCLOCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
Lionel Briand
 
Evaluating Model Testing and Model Checking for Finding Requirements Violatio...
Evaluating Model Testing and Model Checking for Finding Requirements Violatio...Evaluating Model Testing and Model Checking for Finding Requirements Violatio...
Evaluating Model Testing and Model Checking for Finding Requirements Violatio...
Lionel Briand
 
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
Lionel Briand
 
Keynote SBST 2014 - Search-Based Testing
Keynote SBST 2014 - Search-Based TestingKeynote SBST 2014 - Search-Based Testing
Keynote SBST 2014 - Search-Based Testing
Lionel Briand
 
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical SystemsTest Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Lionel Briand
 
Can we predict the quality of spectrum-based fault localization?
Can we predict the quality of spectrum-based fault localization?Can we predict the quality of spectrum-based fault localization?
Can we predict the quality of spectrum-based fault localization?
Lionel Briand
 
A practical guide for using Statistical Tests to assess Randomized Algorithms...
A practical guide for using Statistical Tests to assess Randomized Algorithms...A practical guide for using Statistical Tests to assess Randomized Algorithms...
A practical guide for using Statistical Tests to assess Randomized Algorithms...
Lionel Briand
 

What's hot (20)

Analyzing Natural-Language Requirements: The Not-too-sexy and Yet Curiously D...
Analyzing Natural-Language Requirements: The Not-too-sexy and Yet Curiously D...Analyzing Natural-Language Requirements: The Not-too-sexy and Yet Curiously D...
Analyzing Natural-Language Requirements: The Not-too-sexy and Yet Curiously D...
 
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCLOCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
OCLR: A More Expressive, Pattern-Based Temporal Extension of OCL
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
 
Evaluating Model Testing and Model Checking for Finding Requirements Violatio...
Evaluating Model Testing and Model Checking for Finding Requirements Violatio...Evaluating Model Testing and Model Checking for Finding Requirements Violatio...
Evaluating Model Testing and Model Checking for Finding Requirements Violatio...
 
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
HITECS: A UML Profile and Analysis Framework for Hardware-in-the-Loop Testing...
 
Scalable and Cost-Effective Model-Based Software Verification and Testing
Scalable and Cost-Effective Model-Based Software Verification and TestingScalable and Cost-Effective Model-Based Software Verification and Testing
Scalable and Cost-Effective Model-Based Software Verification and Testing
 
Functional Safety in ML-based Cyber-Physical Systems
Functional Safety in ML-based Cyber-Physical SystemsFunctional Safety in ML-based Cyber-Physical Systems
Functional Safety in ML-based Cyber-Physical Systems
 
Research-Based Innovation with Industry: Project Experience and Lessons Learned
Research-Based Innovation with Industry: Project Experience and Lessons LearnedResearch-Based Innovation with Industry: Project Experience and Lessons Learned
Research-Based Innovation with Industry: Project Experience and Lessons Learned
 
Keynote SBST 2014 - Search-Based Testing
Keynote SBST 2014 - Search-Based TestingKeynote SBST 2014 - Search-Based Testing
Keynote SBST 2014 - Search-Based Testing
 
Supporting Change in Product Lines within the Context of Use Case-driven Deve...
Supporting Change in Product Lines within the Context of Use Case-driven Deve...Supporting Change in Product Lines within the Context of Use Case-driven Deve...
Supporting Change in Product Lines within the Context of Use Case-driven Deve...
 
Metamorphic Security Testing for Web Systems
Metamorphic Security Testing for Web SystemsMetamorphic Security Testing for Web Systems
Metamorphic Security Testing for Web Systems
 
Automated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance SystemsAutomated Testing of Autonomous Driving Assistance Systems
Automated Testing of Autonomous Driving Assistance Systems
 
Artificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software TestingArtificial Intelligence for Automated Software Testing
Artificial Intelligence for Automated Software Testing
 
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical SystemsTest Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
 
Can we predict the quality of spectrum-based fault localization?
Can we predict the quality of spectrum-based fault localization?Can we predict the quality of spectrum-based fault localization?
Can we predict the quality of spectrum-based fault localization?
 
Search-Based Robustness Testing of Data Processing Systems
Search-Based Robustness Testing of Data Processing SystemsSearch-Based Robustness Testing of Data Processing Systems
Search-Based Robustness Testing of Data Processing Systems
 
SSBSE 2020 keynote
SSBSE 2020 keynoteSSBSE 2020 keynote
SSBSE 2020 keynote
 
A practical guide for using Statistical Tests to assess Randomized Algorithms...
A practical guide for using Statistical Tests to assess Randomized Algorithms...A practical guide for using Statistical Tests to assess Randomized Algorithms...
A practical guide for using Statistical Tests to assess Randomized Algorithms...
 
Search-driven String Constraint Solving for Vulnerability Detection
Search-driven String Constraint Solving for Vulnerability DetectionSearch-driven String Constraint Solving for Vulnerability Detection
Search-driven String Constraint Solving for Vulnerability Detection
 
STAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash ReproductionSTAR: Stack Trace based Automatic Crash Reproduction
STAR: Stack Trace based Automatic Crash Reproduction
 

Viewers also liked

Viewers also liked (17)

Bangladesh arts and sculpture
Bangladesh arts and sculptureBangladesh arts and sculpture
Bangladesh arts and sculpture
 
Evaluación del estado nutricio
Evaluación del estado nutricioEvaluación del estado nutricio
Evaluación del estado nutricio
 
Hábitos de higiene personal
Hábitos de higiene personalHábitos de higiene personal
Hábitos de higiene personal
 
Digital economy
Digital economyDigital economy
Digital economy
 
Asuhan primer pada bayi 6 minggu pertama
Asuhan primer pada bayi 6 minggu pertama  Asuhan primer pada bayi 6 minggu pertama
Asuhan primer pada bayi 6 minggu pertama
 
Bayi baru lahir normal
Bayi baru lahir normalBayi baru lahir normal
Bayi baru lahir normal
 
Internship Report
Internship ReportInternship Report
Internship Report
 
Copywriting w kontekście segmentacji SEO. Jak opisać produkt, aby wyświetlał ...
Copywriting w kontekście segmentacji SEO. Jak opisać produkt, aby wyświetlał ...Copywriting w kontekście segmentacji SEO. Jak opisać produkt, aby wyświetlał ...
Copywriting w kontekście segmentacji SEO. Jak opisać produkt, aby wyświetlał ...
 
Laporan Kegiatan Rapat Umum Pengurus dan Ulang Tahun KWP VII
Laporan Kegiatan Rapat Umum Pengurus dan Ulang Tahun KWP VIILaporan Kegiatan Rapat Umum Pengurus dan Ulang Tahun KWP VII
Laporan Kegiatan Rapat Umum Pengurus dan Ulang Tahun KWP VII
 
Планирование производства в системе Omega Production
Планирование производства в системе Omega ProductionПланирование производства в системе Omega Production
Планирование производства в системе Omega Production
 
Arquitectura islamica. ADRIANA POLLY
Arquitectura islamica. ADRIANA POLLYArquitectura islamica. ADRIANA POLLY
Arquitectura islamica. ADRIANA POLLY
 
Conceptos básicos de la probabilidad
Conceptos básicos de la probabilidadConceptos básicos de la probabilidad
Conceptos básicos de la probabilidad
 
Transport of critically ill patients
Transport of critically ill patientsTransport of critically ill patients
Transport of critically ill patients
 
Αφρική (Δήμος Ιωάννου)
Αφρική   (Δήμος Ιωάννου)Αφρική   (Δήμος Ιωάννου)
Αφρική (Δήμος Ιωάννου)
 
Dallascad BPP Exempt FP Web
Dallascad BPP Exempt FP WebDallascad BPP Exempt FP Web
Dallascad BPP Exempt FP Web
 
Тесты по ПДД
Тесты по ПДДТесты по ПДД
Тесты по ПДД
 
"Νοιάζομαι για μένα και τους άλλους"( 2)
"Νοιάζομαι για μένα και τους άλλους"( 2)"Νοιάζομαι για μένα και τους άλλους"( 2)
"Νοιάζομαι για μένα και τους άλλους"( 2)
 

Similar to Scalable Software Testing and Verification of Non-Functional Properties through Heuristic Search and Optimization

Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Lionel Briand
 
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Lionel Briand
 
Testing survey by_directions
Testing survey by_directionsTesting survey by_directions
Testing survey by_directions
Tao He
 
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive SystemsTesting the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Lionel Briand
 

Similar to Scalable Software Testing and Verification of Non-Functional Properties through Heuristic Search and Optimization (20)

Enabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial IntelligenceEnabling Automated Software Testing with Artificial Intelligence
Enabling Automated Software Testing with Artificial Intelligence
 
Presentation by Lionel Briand
Presentation by Lionel BriandPresentation by Lionel Briand
Presentation by Lionel Briand
 
Achieving Scalability in Software Testing with Machine Learning and Metaheuri...
Achieving Scalability in Software Testing with Machine Learning and Metaheuri...Achieving Scalability in Software Testing with Machine Learning and Metaheuri...
Achieving Scalability in Software Testing with Machine Learning and Metaheuri...
 
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...Making Model-Driven Verification Practical and Scalable: Experiences and Less...
Making Model-Driven Verification Practical and Scalable: Experiences and Less...
 
AI in SE: A 25-year Journey
AI in SE: A 25-year JourneyAI in SE: A 25-year Journey
AI in SE: A 25-year Journey
 
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
Testing Autonomous Cars for Feature Interaction Failures using Many-Objective...
 
Testing survey by_directions
Testing survey by_directionsTesting survey by_directions
Testing survey by_directions
 
Simulation
SimulationSimulation
Simulation
 
Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.Software Engineering Research: Leading a Double-Agent Life.
Software Engineering Research: Leading a Double-Agent Life.
 
Combinatorial testing ppt
Combinatorial testing pptCombinatorial testing ppt
Combinatorial testing ppt
 
The Automation Firehose: Be Strategic and Tactical by Thomas Haver
The Automation Firehose: Be Strategic and Tactical by Thomas HaverThe Automation Firehose: Be Strategic and Tactical by Thomas Haver
The Automation Firehose: Be Strategic and Tactical by Thomas Haver
 
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive SystemsTesting the Untestable: Model Testing of Complex Software-Intensive Systems
Testing the Untestable: Model Testing of Complex Software-Intensive Systems
 
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...Automated and Scalable Solutions for Software Testing: The Essential Role of ...
Automated and Scalable Solutions for Software Testing: The Essential Role of ...
 
'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015 'A critique of testing' UK TMF forum January 2015
'A critique of testing' UK TMF forum January 2015
 
Combinatorial testing
Combinatorial testingCombinatorial testing
Combinatorial testing
 
Generation of Search Based Test Data on Acceptability Testing Principle
Generation of Search Based Test Data on Acceptability Testing PrincipleGeneration of Search Based Test Data on Acceptability Testing Principle
Generation of Search Based Test Data on Acceptability Testing Principle
 
D017642026
D017642026D017642026
D017642026
 
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program RepairIt Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair
 
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...
On Parameter Tuning in Search-Based Software Engineering: A Replicated Empiri...
 
Automated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise ApplicationsAutomated Discovery of Performance Regressions in Enterprise Applications
Automated Discovery of Performance Regressions in Enterprise Applications
 

More from Lionel Briand

Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Lionel Briand
 
Data-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical SystemsData-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical Systems
Lionel Briand
 
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled SystemsMany-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Lionel Briand
 
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
Lionel Briand
 
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Lionel Briand
 
Revisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software TestingRevisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software Testing
Lionel Briand
 
Applications of Search-based Software Testing to Trustworthy Artificial Intel...
Applications of Search-based Software Testing to Trustworthy Artificial Intel...Applications of Search-based Software Testing to Trustworthy Artificial Intel...
Applications of Search-based Software Testing to Trustworthy Artificial Intel...
Lionel Briand
 
Autonomous Systems: How to Address the Dilemma between Autonomy and Safety
Autonomous Systems: How to Address the Dilemma between Autonomy and SafetyAutonomous Systems: How to Address the Dilemma between Autonomy and Safety
Autonomous Systems: How to Address the Dilemma between Autonomy and Safety
Lionel Briand
 
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Lionel Briand
 
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Lionel Briand
 
On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...
Lionel Briand
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Lionel Briand
 
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Lionel Briand
 
A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...
Lionel Briand
 

More from Lionel Briand (20)

Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Metamorphic Testing for Web System Security
Metamorphic Testing for Web System SecurityMetamorphic Testing for Web System Security
Metamorphic Testing for Web System Security
 
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-...
 
Fuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation TestingFuzzing for CPS Mutation Testing
Fuzzing for CPS Mutation Testing
 
Data-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical SystemsData-driven Mutation Analysis for Cyber-Physical Systems
Data-driven Mutation Analysis for Cyber-Physical Systems
 
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled SystemsMany-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
Many-Objective Reinforcement Learning for Online Testing of DNN-Enabled Systems
 
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
ATM: Black-box Test Case Minimization based on Test Code Similarity and Evolu...
 
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
Black-box Safety Analysis and Retraining of DNNs based on Feature Extraction ...
 
PRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System LogsPRINS: Scalable Model Inference for Component-based System Logs
PRINS: Scalable Model Inference for Component-based System Logs
 
Revisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software TestingRevisiting the Notion of Diversity in Software Testing
Revisiting the Notion of Diversity in Software Testing
 
Applications of Search-based Software Testing to Trustworthy Artificial Intel...
Applications of Search-based Software Testing to Trustworthy Artificial Intel...Applications of Search-based Software Testing to Trustworthy Artificial Intel...
Applications of Search-based Software Testing to Trustworthy Artificial Intel...
 
Autonomous Systems: How to Address the Dilemma between Autonomy and Safety
Autonomous Systems: How to Address the Dilemma between Autonomy and SafetyAutonomous Systems: How to Address the Dilemma between Autonomy and Safety
Autonomous Systems: How to Address the Dilemma between Autonomy and Safety
 
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
Mathematicians, Social Scientists, or Engineers? The Split Minds of Software ...
 
Reinforcement Learning for Test Case Prioritization
Reinforcement Learning for Test Case PrioritizationReinforcement Learning for Test Case Prioritization
Reinforcement Learning for Test Case Prioritization
 
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
Mutation Analysis for Cyber-Physical Systems: Scalable Solutions and Results ...
 
On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...On Systematically Building a Controlled Natural Language for Functional Requi...
On Systematically Building a Controlled Natural Language for Functional Requi...
 
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
Efficient Online Testing for DNN-Enabled Systems using Surrogate-Assisted and...
 
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...Guidelines for Assessing the Accuracy of Log Message Template Identification ...
Guidelines for Assessing the Accuracy of Log Message Template Identification ...
 
A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...A Theoretical Framework for Understanding the Relationship between Log Parsin...
A Theoretical Framework for Understanding the Relationship between Log Parsin...
 

Recently uploaded

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
Health
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
VishalKumarJha10
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
mohitmore19
 

Recently uploaded (20)

Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
Crypto Cloud Review - How To Earn Up To $500 Per DAY Of Bitcoin 100% On AutoP...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdfThe Top App Development Trends Shaping the Industry in 2024-25 .pdf
The Top App Development Trends Shaping the Industry in 2024-25 .pdf
 
%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban%in Durban+277-882-255-28 abortion pills for sale in Durban
%in Durban+277-882-255-28 abortion pills for sale in Durban
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
call girls in Vaishali (Ghaziabad) 🔝 >༒8448380779 🔝 genuine Escort Service 🔝✔️✔️
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisions
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 

Scalable Software Testing and Verification of Non-Functional Properties through Heuristic Search and Optimization

  • 1. Scalable Software Testing and Verification of Non-Functional Properties through Heuristic Search and Optimization Lionel Briand Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT) University of Luxembourg, Luxembourg ITEQS, March 13, 2017
  • 2. Collaborative Research @ SnT Centre • Research in context • Addresses actual needs • Well-defined problem • Long-term collaborations • Our lab is the industry 2
  • 3. Scalable Software Testing and Verification Through Heuristic Search and Optimization 3 With a focus on non-functional properties
  • 4. Verification, Testing • The term “verification” is used in its wider sense: Defect detection. • Testing is, in practice, the most common verification technique. • Other forms of verifications are important too (e.g., design time, run-time), but much less present in practice. 4
  • 5. Decades of V&V research have not yet significantly and widely impacted engineering practice 5
  • 6. Cyber-PhysicalSystems • Increasingly complex and critical systems • Complex environment • Combinatorial and state explosion • Dynamic behavior • Complex requirements, e.g., temporal, timing, resource usage • Uncertainty, e.g., about the environment 6
  • 7. Scalable?Practical? • Scalable: Can a technique be applied on large artifacts (e.g., models, data sets, input spaces) and still provide useful support within reasonable effort, CPU and memory resources? • Practical: Can a technique be efficiently and effectively applied by engineers in realistic conditions? – realistic ≠ universal – feasibility and cost of inputs to be provided? 7
  • 8. Metaheuristics • Heuristic search (Metaheuristics): Hill climbing, Tabu search, Simulated Annealing, Genetic algorithms, Ant colony optimisation …. • Stochastic optimization: General class of algorithms and techniques which employ some degree of randomness to find optimal (or as optimal as possible) solutions to hard problems • Many verification and testing problems can be re- expressed as optimization problems • Goal: Address scalability and practicality issues 8
  • 9. Talk Outline • Selected project examples, with industry collaborations • Similarities and patterns • Lessons learned 9
  • 10. Testing Software Controllers References: 10 • R. Matinnejad et al., “Automated Test Suite Generation for Time-continuous Simulink Models“, IEEE/ACM ICSE 2016 • R. Matinnejad et al., “EffectiveTest Suites for Mixed Discrete-Continuous Stateflow Controllers”, ACM ESEC/FSE 2015 (Distinguished paper award) • R. Matinnejad et al., “MiL Testing of Highly Configurable Continuous Controllers: Scalable Search Using Surrogate Models”, IEEE/ACM ASE 2014 (Distinguished paper award) • R. Matinnejad et al., “Search-Based Automated Testing of Continuous Controllers: Framework, Tool Support, and Case Studies”, Information and Software Technology, Elsevier (2014)
  • 11. Electronic Control Units (ECUs) More functions Comfort and variety Safety and reliability Faster time-to-market Less fuel consumption Greenhouse gas emission laws 11
  • 12. A Taxonomy of Automotive Functions ControllingComputation State-Based ContinuousTransforming Calculating unit convertors calculating positions, duty cycles, etc State machine controllers Closed-loop controllers (PID) 12
  • 14. Development Process 14 Hardware-in-the-Loop Stage Model-in-the-Loop Stage Simulink Modeling Generic Functional Model MiL Testing Software-in-the-Loop Stage Code Generation and Integration Software Running on ECU SiL Testing Software Release HiL Testing
  • 15. MATLAB/Simulinkmodel + + 0.051 FuelLevelSensor -0.05 100 0.8 + - Gain Gain1 Add1 Add 1 FuelLevel Continuous Integrator 15 • Data flow oriented • Blocks and lines • Time continuous and discrete behavior • Input and outputs signals
  • 16. Automotive Example • Supercharger bypass flap controller • Flap position is boundedwithin [0..1] • 34 sub-components decomposedinto 6 abstraction levels • Compressor blowingto the engine Supercharger Bypass Flap Supercharger Bypass Flap Flap position = 0 (open) Flap position = 1 (closed) 16
  • 17. Testing Controllers at MIL Initial Desired Value Final Desired Value time time Desired Value Actual Value T/2 T T/2 T Test Input Test Output Plant Model Controller (SUT) Desired value Error Actual value System output+ - 17
  • 18. Configurable Controllers at MIL 18 Plant Model + + + ⌃ + - e(t) actual(t) desired(t) ⌃ KP e(t) KD de(t) dt KI R e(t) dt P I D output(t) Time-dependent variables Configuration Parameters
  • 19. Requirements and Test Oracles 19 InitialDesired (ID) Desired ValueI (input) Actual Value (output) FinalDesired (FD) time T/2 T Smoothness Responsiveness Stability
  • 20. Test Strategy: A Search-Based Approach 20 Initial Desired (ID) FinalDesired(FD) Worst Case(s)? • Continuous behavior • Controller’s behavior can be complex • Meta-heuristic search in (large) input space: Finding worst case inputs • Possible because of automated oracle (feedback loop) • Different worst cases for different requirements • Worst cases may or may not violate requirements
  • 21. Search-Based Software Testing • Express test generation problem as a search problem • Search for test input data with certain properties, i.e., constraints • Non-linearity of software (if, loops, …): complex, discontinuous, non-linear search spaces (Baresel) • Many search algorithms (metaheuristics), from local search to global search, e.g., Hill Climbing, Simulated Annealing and Genetic Algorithms imbing starts at a random point in the in the search space neighbouring the uated for fitness. If a better candidate l Climbing moves to that new point, hbourhood of that candidate solution. until the neighbourhood of the current ce offers no better candidate solutions; ima’. If the local optimum is not the n Figure 3a), the search may benefit and performing a climb from a new landscape (Figure 3b). simple Hill Climbing is Simulated h by Simulated Annealing is similar to movement around the search space is may be made to points of lower fitness with the aim of escaping local optima. probability value that is dependent d the ‘temperature’, which decreases ch progresses (Figure 4). The lower ess likely the chances of moving to a search space, until ‘freezing point’ is point the algorithm behaves identically ulated Annealing is named so because he physical process of annealing in Hill Climbing. From a random starting point, the algorithm follows the curve of the fitness landscape until a local optimum is found. The final position may not represent the global optimum (part (a)), and restarts may be required (part (b)) Fitness Input domain Figure 4. Simulated Annealing may temporarily move to points of poorer fitness in the search space Fitness Input domain Figure 5. Genetic Algorithms are global searches, sampling many points in the fitness landscape at once Search-Based SoftwareTesting: Past, Presentand Future Phil McMinn Genetic Algorithm 21 discusses future directions for Search-Based Testing, comprising issues involving execution ts, testability, automated oracles, reduction of le cost and multi-objective optimisation. Finally, concludes with closing remarks. ARCH-BASED OPTIMIZATION ALGORITHMS plest form of an optimization algorithm, and to implement, is random search. In test data inputs are generated at random until the goal of r example, the coverage of a particular program r branch) is fulfilled. Random search is very poor olutions when those solutions occupy a very small overall search space. Such a situation is depicted where the number of inputs covering a particular arget are very few in number compared to the input domain. Test data may be found faster reliably if the search is given some guidance. eurstic searches, this guidance can be provided m of a problem-specific fitness function, which rent points in the search space with respect to ness’ or their suitability for solving the problem Input domain portion of input domain denoting required test data randomly-generated inputs Figure 2. Random search may fail to fulfil low-probability test goals Fitness Input domain (a) Climbing to a local optimum s
  • 22. 22 Search Elements • Search Space: • Initial and desired values, configuration parameters • Search Technique: • (1+1) EA, variants of hill climbing, GAs … • Search Objective: • Objective/fitness function for each requirement • Evaluation of Solutions • Simulation of Simulink model => fitness computation • Result: • Worst case scenarios or input signals that (are more likely to) break the requirement at MiL level 22
  • 23. Smoothness Objective Functions: OSmoothness Test Case A Test Case B OSmoothness(Test Case A) > OSmoothness(Test Case B) We want to find test scenarios which maximize OSmoothness 23
  • 24. Solution Overview (SimplifiedVersion) 24 HeatMap Diagram 1. Exploration List of Critical RegionsDomain Expert Worst-Case Scenarios + Controller- plant model Objective Functions based on Requirements 2. Single-State Search time Desired Value Actual Value 0 1 2 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 Initial Desired Final Desired
  • 26. Analysis – Fitness increase over iterations 26 Number of Iterations Fitness
  • 27. Analysis II – Search over different regions 27 0.315 0.316 0.317 0.319 0.321 0.323 0.324 0.326 0.327 0.329 0.330 0 10 20 30 40 50 60 70 80 90 100 0.328 0.325 0.320 0.318 0 10 20 30 40 50 60 70 80 90 1000 10 20 30 40 50 60 70 80 90 100 Random Search (1+1) EA 0 10 20 30 40 50 60 70 80 90 100 0.0166 0.0168 0.0170 0.0176 0.0180 0.0178 0.0172 0.0160 0.0162 0.0164 Random Search (1+1) EA 0 10 20 30 40 50 60 70 80 90 100 0 10 20 30 40 50 60 70 80 90 100 0.0174 Average (1+1) EA Distribution Random Search Distribution Number of Iterations
  • 28. • We found much worse scenarios during MiL testing than our partner had found so far, and much worse than random search (baseline) • These scenarios are also run at the HiL level, where testing is much more expensive: MiL results -> test selection for HiL • But further research was needed: – Simulations are expensive – Configuration parameters – Dynamically adjust search algorithms in different subregions (exploratory <-> exploitative) Conclusions i.e., 31s. Hence, the horizontal axis of the diagrams in Figure 8 shows the number of iterations instead of the computation time. In addition, we start both random search and (1+1) EA from the same initial point, i.e., the worst case from the exploration step. Overall in all the regions, (1+1) EA eventually reaches its plateau at a value higher than the random search plateau value. Further, (1+1) EA is more deterministic than ran- dom, i.e., the distribution of (1+1) EA has a smaller variance than that of random search, especially when reaching the plateau (see Figure 8). In some regions (e.g., Figure 8(d)), however, random reaches its plateau slightly faster than (1+1) EA, while in some other regions (e.g. Figure 8(a)), (1+1) EA is faster. We will discuss the relationship between the region landscape and the performance of (1+1) EA in RQ3. RQ3. We drew the landscape for the 11 regions in our experiment. For example, Fig- ure 9 shows the landscape for two selected regions in Figures 7(a) and 7(b). Specifically, Figure 9(a) shows the landscape for the region in Figure 7(b) where (1+1) EA is faster than random, and Figure 9(b) shows the landscape for the region in Figure 7(a) where (1+1) EA is slower than random search. 0.30 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.40 0.70 0.71 0.72 0.73 0.74 0.75 0.76 0.77 0.78 0.79 0.80 0.10 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.20 0.90 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1.00 (a) (b) Fig. 9. Diagrams representing the landscape for two representative HeatMap regions: (a) Land- scape for the region in Figure 7(b). (b) Landscape for the region in Figure 7(a). 28
  • 29. Testing in the ConfigurationSpace • MIL testing for all feasible configurations • The search space is much larger • The search is much slower (Simulations of Simulink models are expensive) • Results are harder to visualize • But not all configuration parameters matter for all objective functions 29
  • 30. Modified Process and Technology 30 + Controller Model (Simulink) Worst-Case Scenarios List of Critical PartitionsRegression Tree 1.Exploration with Dimensionality Reduction 2.Search with Surrogate Modeling Objective Functions Domain Expert Visualization of the 8-dimension space using regression treesDimensionality reduction to identify the significant variables (Elementary Effect Analysis) Surrogate modeling to predict the objective function and speed up the search (Machine learning)
  • 31. Dimensionality Reduction • Sensitivity Analysis: Elementary Effect Analysis (EEA) • Identify non-influential inputs in computationally costly mathematical models • Requires less data points than other techniques • Observations are simulations generated during the Exploration step • Compute sample mean and standard deviation for each dimension of the distribution of elementary effects 31 Cal5 ID Cal3 FD Cal4 Cal6 Cal1,Cal2 0.6 0.4 0.2 0.0 SampleStandardDeviation() -0.6 -0.4 -0.2 0.0 0.2 Sample Mean ( ) ⇤10 2 ⇤10 2 Si i
  • 32. Elementary Effects Analysis Method ü Imagine function F with 2 inputs, x and y: A x y A1 A2 C x y C1 C2 B x y B1 B2 X Y Elementary Effects for X for Y F(A1)-F(A) F(B1)-F(B) F(C1)-F(C) … F(A2)-F(A) F(B2)-F(B) F(C2)-F(C) … 32
  • 33. Visualizationin Inputs & Configuration Space 33 All Points FD>=0.43306 Count Mean Std Dev Count Mean Std Dev FD<0.43306 Count Mean Std Dev ID>=0.64679 Count Mean Std Dev Count Mean Std Dev Cal5>=0.020847 Cal5>0.020847 Count Mean Std Dev Count Mean Std Dev Cal5>=0.014827 Cal5<0.014827 Count Mean Std Dev Count Mean Std Dev 1000 0.007822 0.0049497 ID<0.64679 574 0.0059513 0.0040003 426 0.0103425 0.0049919 373 0.0047594 0.0034346 201 0.0081631 0.0040422 182 0.0134555 0.0052883 244 0.0080206 0.0031751 70 0.0106795 0.0052045 131 0.0068185 0.0023515 Regression Tree
  • 34. Surrogate Modeling During Search • Goal: To predict the value of the objective functions within a critical partition, given a number of observations, and use that to avoid as many simulations as possible and speed up the search 34 A B
  • 35. Surrogate Modeling During Search 35 • Any supervised learning or statistical technique providing fitness predictions with confidence intervals 1. Predict higher fitness with high confidence: Move to new position, no simulation 2. Predict lower fitness with high confidence: Do not move to new position, no simulation 3. Low confidence in prediction: Simulation Surrogate Model Real Function x Fitness
  • 36. Experiments Results (RQ1) ü The best regression technique to build Surrogate models for all of our three objective functions is Polynomial Regression with n = 3 ü Other supervised learning techniques, such as SVM Mean of R2/MRPE values for different surrogate modeling techniques Fst Fsm Fr PR(n=3) R2/MRPE 0.66/0.0526 0.95/0.0203 0.78/0.0295 0.26/0.2043 0.98/0.0129 0.85/0.0247 0.85/0.0245 0.46/0.1755 0.54/0.1671 0.44/0.0791 0.49/1.2281 0.22/1.2519 LR R2/MRPE ER R2/MRPE PR(n=2) R2/MRPE 36
  • 37. Experiments Results (RQ2) ü Dimensionality reduction helps generate better surrogate models for Smoothness and Responsiveness requirements 0.0 0.02 0.04 0.06 0.08 0.05 0.01 0.02 0.03 0.04 0.1 0.2 0.3 DR No DR DR No DR DR No DR Smoothness( )Fsm Responsiveness( )Fr Stability( )Fst MeanRelativePredictionErrors (MRPEValues) 37
  • 38. ü For responsiveness, the search with SM was 8 times faster ü For smoothness, the search with SM was much more effective Experiments Results (RQ3) SearchOutput Values SearchOutput Values 0.215 SM After 800 seconds After 2500 seconds After 3000 seconds NoSM 0.220 0.225 0.230 0.235 SM NoSMSM NoSM After 200 seconds 0.160 0.164 0.168 After 300 seconds After 3000 seconds NoSM NoSM SM NoSMSM SM 38
  • 39. ü Our approach is able to identify critical violations of the controller requirements that had neither been found by our earlier work nor by manual testing. MiL-Testing different configurations Stability Smoothness Responsiveness MiL-Testing fixed configurations Manual MiL-Testing - -2.2% deviation 24% over/undershoot 20% over/undershoot 5% over/undershoot 170 ms response time 80 ms response time 50 ms response time Experiments Results (RQ4) 39
  • 40. A Taxonomy of Automotive Functions ControllingComputation State-Based ContinuousTransforming Calculating unit convertors calculating positions, duty cycles, etc State machine controllers Closed-loop controllers (PID) Different testing strategies are required for different types of functions 40
  • 41. Open-LoopControllers 41 ave complex internal structures , making them less amenable to ve rich time-continuous outputs. wing contributions: f testing Stateflows with mixed urs. We propose two new test t stability and output continuity est inputs that are likely to pro- hibiting instability and disconti- verage and the blackbox output o Stateflows, and evaluate their ntinuous behaviours. The former nal state and transition coverage atter is defined based on the re- rion [?]. of our newly proposed and the applying them to three Stateflow dustrial and one public domain. LT. (c) Engaging state of SCC -- mixed discrete-continuous behaviou Disengaging Engaged [disengageReq]/time := 0 [tim >5] time + +; OnMoving OnSlipping OnCompleted time + +; ctrlSig := f(time) Engaging time + +; ctrlSig := g(time) time + +; ctrlSig := 1.0 [¬(vehspd = 0) time > 2] [(vehspd = 0) time > 3] [time > 4] Figure 1: Supercharge Clutch Controller (SCC) Stateflo transient states [?], engaging and disengaging, specifying that m ing from the engaged to the disengaged state and vice versa t six milisec. Since this model is simplified, it does not show • No feedback loop -> no automated oracle • No plant model: Much quicker simulation time • Mixed discrete-continuous behavior: Simulink stateflows • The main testing cost is the manual analysis of output signals • Goal: Minimize test suites • Challenge:Test selection • Entirely different approach to testing On Off CtrlSig
  • 42. Selection Strategies Based on Search • Input diversity • White-box Structural Coverage • State Coverage • Transition Coverage • Output Diversity • Failure-Based Selection Criteria • Domain specific failure patterns • Output Stability • Output Continuity 42 S3 t S3 t
  • 43. Failure-basedTest Generation 4 3 Instability Discontinuity 0.0 1.0 2.0 -1.0 -0.5 0.0 0.5 1.0 Time CtrlSigOutput • Search: Maximizing the likelihood of presence of specific failure patterns in output signals • Domain-specific failure patterns elicited from engineers 0.0 1.0 2.0 Time 0.0 0.25 0.50 0.75 1.0 CtrlSigOutput
  • 44. Summary of Results • The test cases resulting from state/transition coverage algorithms cover the faulty parts of the models • However, they fail to generate output signals that are sufficiently distinct from the oracle signal, hence yieldinga low fault revealing rate • Output-based algorithmsare more effective 44
  • 45. Automated Testing of Driver Assistance Systems Through Simulation Reference: 45 R. Ben Abdessalem et al., "Testing Advanced DriverAssistance Systems Using Multi-Objective Search and Neural Networks”, ACM ESEC/FSE 2016
  • 46. Pedestrian Detection Vision System (PeVi) 46 • The PeVi system is a camera-based collision-warning system providing improved vision
  • 47. Testing DA Systems • Testing DA systems requires complex and comprehensive simulation environments – Static objects: roads, weather, etc. – Dynamic objects: cars, humans, animals, etc. • A simulationenvironment captures the behavior of dynamic objects as well as constraints and relationships between dynamic and static objects 47
  • 48. Approach 48 Generation of Test specifications Static [ranges/values/ resolution] Dynamic [ranges/ resolution] (2) test case specification Specification Documents (Simulation Environment and PeVi System) Domain model Requirements model (1)Development of Requirements and domain models
  • 49. 49 - simulationTime: Real - timeStep: Real Test Scenario - v0: Real Vehicle - x0: Real - y0: Real - θ: Real - v0: Real Pedestrian - simulationTime: Real - timeStep: Real Test Scenario PeVi 1 1 1 1 «positioned» Dynamic Object - v0: Real Vehicle - x0: Real - y0: Real - θ: Real - v0: Real Pedestrian - x: Real - y: Real Position * 1 1 - state: Boolean Collision PeVi - state: Boolean Detection 11 1 1 - AWA Output Trajectory Static inputs Dynamic inputs Outputs - intensity: Real SceneLight 1 - weatherType: Condition Weather - fog - rain - snow - normal «enumeration» Condition - field of view: Real Camera Sensor RoadSide Object - roadType: RT Road 1 - curved - straight - ramped «enumeration» RT 1 * 1 Parked Cars Trees - simulationTime: Real - timeStep: Real Test Scenario «uses» 1 1 PeVi PeVi and Environment Domain Model
  • 50. Requirements Model 50 <<trace>> <<trace>> Speed Profile Path 1 1 Slot Path Segment 1..** 1 Trajectory Human 1* trajectory Warning Sensors posx1, posx2 posy1, posy2 AWACar/Motor/ Truck/Bus sensor has has awa 1 1 1 * human appears posx1 posx2 posy1 posy2 The NiVi system shall detect any person located in the Acute Warning Area of a vehicle
  • 51. MiL Testing via Search 51 Simulator + NiVi Environment Settings (Roads, weather, vehicle type, etc.) Fixed during Search Manipulated by Search Human Simulator (initial position, speed, orientation) Car Simulator (speed) PeVi Meta-heuristic Search (multi-objective) Generate scenarios Detection or not? Collision or not?
  • 52. 5 2 Type of Road Type of vehicle Type of actor Situation 1 Straight Car Male Situation 2 Straight Car Child Situation 3 Straight Car Cow Situation 4 Straight Truck Male Situation 5 Straight Truck Child Situation 6 Straight Truck Cow Situation 7 Curved Car Male Situation 8 Curved Car Child Situation 9 Curved Car Cow Situation 10 Curved Truck Male Situation 11 Curved Truck Child Situation 12 Curved Track Cow Situation 13 Ramp Car Male Situation 14 Ramp Car Child Situation 15 Ramp Car Cow Situation 16 Ramp Truck Male Situation 17 Ramp Truck Child Situation 18 Ramp Truck Cow Situation 19 Situation 20 Straight Car+ Cars in parking Car + buildings Male Test Case Specification: Static (combinatorial)
  • 53. Test Case Specification: Dynamic 53 Start locationX = 74 Start locationY = 37.72 Start locationZ = 0 Orientation = 0 trajectoryPerson : Trajectory PositionX= 74 Position Y= 37.72 Position Z = 0 OrientationHeading = 93.33 Acceleration = 0 MaxWalkingSpeed =14 height=1.75 person :Actor UniqueId profilePerson : Speed Profile StartPointX = 74 StartPointY = 37.72 StartPointY = 0 StartAngle = 93.33 End Angle = 0 Length = 60 pathPerson : Path Length = 60 Type = Straight MaxSpeedLimit = 14 segmentPerson : Path Segment ID slotPerson : Slot Time = 0 Speed = 12.59 startPerson : StartState Start locationX = 10 Start locationY = 50.125 Start locationZ = 0.56 Orientation = 0 trajectoryCar : TrajectoryPositionX=10 Position Y= 50.125 Position Z = 0.56 OrientationHeading = 0 Acceleration = 0 MaxWalkingSpeed =100 car : Actor UniqueId profileCar : Speed Profile StartPointX = 10 StartPointY = 50.125 StartPointZ =0.56 StartAngle = 0 End Angle = 0 Length = 100 pathCar : Path Length = 100 Type = Straight MaxSpeedLimit = 100 segmentCar : Path Segment ID slotCar : Slot Time = 0 Speed = 60.66 startCar : StartState MinTTC=0.3191 Collision
  • 54. Choice of Surrogate Model • Neural networks (NN) have been trained to learn complex functions predicting fitness values • NN can be trained using different algorithms such as: – LM: Levenberg-Marquardt – BR: Bayesian regularization backpropagation – SCG: Scaled conjugate gradient backpropagation • R2 (coefficient of determination) indicates how well data fit a statistical model • Computed R2 for LM, BR and SCG è BR has the highest R2 54
  • 55. Multi-Objective Search • Input space: car-speed, person- speed, person-position (x,y), person-orientation • Search algorithm need objective or fitness functions for guidance • In our case several independent functions could be interesting: – Minimum distance between car and pedestrian – Minimum distance between pedestrian and AWA – Minimum time to collision • NSGA II algorithm 55 posx1 posx2 posy1 posy2
  • 56. Pareto Front 56 Individual A Pareto dominates individual B if A is at least as good as B in every objective and better than B in at least one objective. Dominated by x O1 O2 Pareto front x • A multi-objective optimization algorithm must achieve: • Guide the search towards the global Pareto-Optimal front. • Maintain solution diversity in the Pareto-Optimal front.
  • 57. MO Search with NSGA-II 57 Non-Dominated Sorting Selection based on rank and crowding distance Size: 2*N Size: 2*N Size: N • Based on Genetic Algorithm • N: Archive and population size • Non-Dominated sorting: Solutions are ranked according to how far they are from the Pareto front, fitness is based on rank. • Crowding Distance: Individuals in the archive are being spread more evenly across the front (forcing diversity) • Runs simulations for close to N new solutions
  • 60. SimulationScenario Execution • Straight road with parking • The person appears in the AWA, but is not detected 60
  • 61. Improving Time Performance • Individual simulations take on average more than 1min • It takes 10 hours to run our search-based test generation (≈ 500 simulations) • We use surrogate modeling to improve the search • Neural networks are used to predict fitness values within a confidence interval • During the search, we use prediction values & confidence intervals to run simulations only for the solutions likely to be selected 61
  • 62. Search with Surrogate Models 62 Non-Dominated Sorting Selection based on rank and crowding distance Size: 2*N Size: 2*N Size: N Original Algorithm - Runs simulations for all new solutions Our Algorithm - Uses prediction values & intervals to run simulations only for the solutions likely to be selected NSGA II
  • 63. Results – Surrogate Modeling 63 0.00 0.25 0.50 0.75 1.00 Time (min) HV 50 100 15010 (a) Comparing HV values obtained by NSGAII and NSGAII-SM NSGAII (mean) NSGAII-SM (mean) 1.00 (b) Comparing HV values obtained by RS and NSGAII-SM execution the worst among ou NSGAII worst run be unluc run simil the wors worst run safer algo constrain values do This is be distance from the tions in t We fur and NSG min. Sim
  • 64. Results – Random Search 64 Time (min) 0.00 0.25 0.50 0.75 1.00 Time (min) 50 100 15010 (b) Comparing HV values obtained by RS and NSGAII-SM HV RS (mean) NSGAII-SM (mean) (c) HV values for worst runs of NSGAII, NSGAII-SM and RS dis fro tio an mi alg is Fu an tha res tio SM gen ten NS tha in
  • 65. Results – Worst Runs 65 Time (min) 50 100 15010 0.00 0.25 0.50 0.75 1.00 Time (min) HV 50 100 15010 (c) HV values for worst runs of NSGAII, NSGAII-SM and RS RS NSGAII-SM NSGAII Figure 6: Comparing HV values obtained by (a) 20 runs of NSGAII and NSGAII-SM (cl=.95); (b) 20 g t N t i w G o p N G g a A t s c G b
  • 66. Minimizing CPU Shortage Risks During Integration References: 66 • S. Nejati et al., ‘‘Minimizing CPU Time Shortage Risks in Integrated Embedded Software’’, in 28th IEEE/ACM International Conference on Automated Software Engineering (ASE 2013), 2013 • S. Nejati, L. Briand, “Identifying Optimal Trade-Offs between CPU Time Usage and Temporal Constraints Using Search”, ACM International Symposium on Software Testing and Analysis (ISSTA 2014), 2014
  • 69. • Develop software optimized for their specific hardware • Provide integrator with runnables • Integrate car makers software with their own platform • Deploy final software on ECUs and send them to car makers Car Makers Integrator Stakeholders 69
  • 70. • Objective: Effective execution and synchronization of runnables • Some runnables should execute simultaneously or in a certain order • Objective: Effective usage of CPU time • Max CPU time used by all the runnables should remain as low as possible over time Car Makers Integrator Different Objectives 70
  • 71. An overview of an integration process in the automotive domain AUTOSAR Models sw runnables sw runnables AUTOSAR Models Glue 71
  • 72. 72 CPU time shortage • Static cyclic scheduling: predictable, analyzable • Challenge – Many OS tasks and their many runnables run within a limited available CPU time • The execution time of the runnables may exceed their time slot • Goal – Reducing the maximum CPU time used per time slot to be able to • Minimize the hardware cost • Reduce the probability of overloading the CPU in practice • Enable addition of new functions incrementally 72 5ms 10ms 15ms 20ms 25ms 30ms 35ms 40ms ✗ 5ms 10ms 15ms 20ms 25ms 30ms 35ms 40ms ✔ (a) (b) Fig. 4. Two possible CPU time usage simulations for an OS task with a 5ms cycle: (a) Usage with bursts, and (b) Desirable usage. to dead run pas by OS divisibl offset v synchro
  • 73. 73 Using runnable offsets (delay times) 5ms 10ms 15ms 20ms 25ms 30ms 35ms 40ms 5ms 10ms 15ms 20ms 25ms 30ms 35ms 40ms ✗ ✔ Insertingrunnables’offsets Offsets have to be chosen such that the maximum CPU usage per time slot is minimized, and further, the runnables respect their period the runnables respect their time slot the runnables satisfy their synchronization constraints 73
  • 74. 5.34ms 5.34ms 5 ms Time CPUtimeusage(ms) CPU time usage exceeds the size of the slot (5ms) Without optimization 74
  • 75. CPU time usage always remains less than 2.13ms, so more than half of each slot is guaranteed to be free 2.13ms 5 ms Time CPUtimeusage(ms) With Optimization 75
  • 76. Single-objectiveSearch algorithms Hill Climbing and Tabu Search and their variations Solution Representation a vector of offset values: o0=0, o1=5, o2=5, o3=0 Tweak operator o0=0, o1=5, o2=5, o3=0 à o0=0, o1=5, o2=10, o3=0 Synchronization Constraints offset values are modified to satisfy constraints Fitness Function max CPU time usage per time slot 76
  • 77. Summary of Problem and Solution Optimization while satisfying synchronization/temporal constraints Explicit Time Model for real-time embedded systems Search meta-heuristic single objective search algorithms 10^27 an industrial case study with a large search space 77
  • 78. 78 Search Solution and Results Case Study: an automotive software system with 430 runnables, search space = 10^27 Running the system without offsets Simulation for the runnables in our case study and corresponding to the lowest max CPU usage found by HC 5.34 ms Optimized offset assignment 2.13 ms - The objective function is the max CPU usage of a 2s-simulation of runnables - The search modifies one offset at a time, and updates other offsets only if timing constraints are violated - Single-state search algorithms for discrete spaces (HC, Tabu) 78
  • 79. 79 Comparingdifferent search algorithms (ms)(s) Best CPU usage Time to find Best CPU usage 79
  • 80. 80 Comparingour best search algorithm with random search (a) (b) (c) (a) Lowest max CPU usage values computed by HC within 70 ms over 100 different runs Lowest max CPU usage values computed by Random within 70 ms over 100 different runs Comparing average behavior of Random and HC in computing lowest max CPU usage values within 70 s and over 100 different runs 80 HC Random Average
  • 81. 0ms 5ms 10ms 15ms 20ms 25ms 30ms 0ms 5ms 10ms 15ms 20ms 25ms 30ms 0ms 5ms 10ms 15ms 20ms 25ms 30ms 4ms 3ms 2ms Car Makers Integratorr0 r1 r2 r3 Minimize CPU time usage 1 slot 2 slots 3 slots Execute r0 to r3 close to one another. Trade-off between Objectives 81
  • 82. Trade-off curve #ofslots CPU time usage (ms) 2.041.45 12 21 14 1.56 1 2 3 Boundary Trade Offs Interesting Solutions 82
  • 83. Multi-objective search • Multi-objective genetic algorithms (NSGA II) • Pareto optimality • Supporting decision making and negotiation between stakeholders 83 Report: GraphRandom-NSGAII-25 Page 1 of 2 CPU Time Usage-NSGAII & CPU Time Usage-Random vs. Number of Slots-NSGAII & Number of Slots-Random Number of Slots-NSGAII & Number of Slots-Random 10 15 20 25 30 35 40 45 50 CPUTimeUsage-NSGAII&CPUTimeUsage-Random 1.5 2.0 2.5 3.0 3.5 Graph Builder Total Number of Time Slots MaxCPUTimeUsage(ms) Random(25,000) NSGA-II(25,000) A B 12 1.45 C Objectives: • (1) Max CPU time • (2) Maximum time slots between “dependent” tasks
  • 84. Input.csv: - runnables - Periods - CETs - Groups - # of slots per groups Search A list of solutions: - objective 1 (CPU usage) - objective 2 (# of slots) - vector of group slots - vector of offsets Visualization/ Query Analysis - Visualize solutions - Retrieve/visualize simulations - Visualize Pareto Fronts - Apply queries to the solutions Trade-Off Analysis Tool 84
  • 85. 85 Conclusions - Search algorithms to compute offset values that reduce the max CPU time needed - Generate reasonably good results for a large automotive system and in a small amount of time - Used multi-objective search à tool for establishing trade-off between relaxing synchronization constraints and maximum CPU time usage 85
  • 86. Schedulability Analysis and Stress Testing References: 86 • S. Di Alesio et al., “Worst-Case Scheduling of Software Tasks – A Constraint Optimization Model to Support Performance Testing, Constraint Programming (CP), 2014 • S. Di Alesio et al. “Combining Genetic Algorithms and Constraint Programming to Support Stress Testing”, ACM TOSEM, 25(1), 2015
  • 87. Real-time, concurrent systems (RTCS) • Real-time, concurrent systems (RTCS) have concurrent interdependent tasks which have to finish before their deadlines • Some task properties depend on the environment, some are design choices • Tasks can trigger other tasks, and can share computational resources with other tasks • How can we determine whether tasks meet their deadlines? 87
  • 88. Problem • Schedulability analysis encompasses techniques that try to predict whether all (critical) tasks are schedulable, i.e., meet their deadlines • Stress testing runs carefully selected test cases that have a high probability of leading to deadline misses • Stress testing is complementary to schedulability analysis • Testing is typically expensive, e.g., hardware in the loop • Finding stress test cases is difficult 88
  • 89. Finding Stress Test Cases is Difficult 89 0 1 2 3 4 5 6 7 8 9 j0, j1 , j2 arrive at at0 , at1 , at2 and must finish before dl0 , dl1 , dl2 J1 can miss its deadline dl1 depending on when at2 occurs! 0 1 2 3 4 5 6 7 8 9 j0 j1 j2 j0 j1 j2 at0 dl0 dl1 at1 dl2 at2 T T at0 dl0 dl1 at1 at2 dl2
  • 90. Challengesand Solutions • Ranges for arrival times form a very large input space • Task interdependencies and properties constrain what parts of the space are feasible • We re-expressed the problem as a constraint optimisation problem • Constraint programming (e.g., IBM CPLEX) 90
  • 91. Constraint Optimization 91 Constraint Optimization Problem Static Properties of Tasks (Constants) Dynamic Properties of Tasks (Variables) Performance Requirement (Objective Function) OS Scheduler Behaviour (Constraints)
  • 92. Process and Technologies 92 UML Modeling (e.g., MARTE) Constraint Optimization Optimization Problem (Find arrival times that maximize the chance of deadline misses) System Platform Solutions (Task arrival times likely to lead to deadline misses) Deadline Misses Analysis System Design Design Model (Time and Concurrency Information) INPUT OUTPUT Stress Test Cases Constraint Programming (CP)
  • 93. Context 93 Drivers (Software-Hardware Interface) Control Modules Alarm Devices (Hardware) Multicore Architecture Real-Time Operating System System monitors gas leaks and fire in oil extraction platforms
  • 94. Challengesand Solutions • CP effective on small problems • Scalability problem: Constraint programming (e.g., IBM CPLEX) cannot handle large input spaces (CPU, memory) • Solution: Combine metaheuristic search and constraint programming – metaheuristic search (GA) identifies high risk regions in the input space – constraint programming finds provably worst-case schedules within these (limited) regions – Achieve (nearly) GA efficiency and CP effectiveness • Our approach can be used both for stress testing and schedulability analysis (assumption free) 94
  • 95. Combining GA and CP 95 A:12 S. Di Alesio et al. Fig. 3: Overview of GA+CP: the solutions x1, y1 and z1 in the initial population of GA evolve into x6, y6, and z6, then CP searches in their neighborhood for the optimal solutions x⇤ , y⇤ and z⇤ . in the schedule generated by the arrival times in x:
  • 96. Process and Technologies 96 UML Modeling (e.g., MARTE) Constraint Optimization Optimization Problem (Find arrival times that maximize the chance of deadline misses) System Platform Solutions (Task arrival times likely to lead to deadline misses) Deadline Misses Analysis System Design Design Model (Time and Concurrency Information) INPUT OUTPUT Genetic Algorithms (GA) Stress Test Cases Constraint Programming (CP)
  • 97. V&V Topics Addressed by Search • Many projects over the last 15 years • Design-time verification – Schedulability – Concurrency – Resource usage • Testing – Stress/load testing, e.g., task deadlines – Robustness testing, e.g., data errors – Reachability of safety or business critical states, e.g., collision and no warning – Security testing, e.g., XML and SQLi injections 97
  • 98. Publicity! • Chunhui Wang et al., “System Testing of Timing Requirements based on Use Cases and Timed Automata”. Session R09 @ ICST 2017, Tuesday, 2 pm • Sadeeq Jan et al., “A Search-based Testing Approach for XML Injection Vulnerabilities in Web Applications”. Session R11 @ ICST 2017, Thursday 11 am 98
  • 99. 99 Objective Function Search Space Search Technique n Problem = fault model n Model = system or environment n Search to optimize objective function(s) n Metaheuristics n Scalability: A small part of the search space is traversed n Model: Guidance to worst case, high-risk scenarios across space n Reasonable modeling effort based on standards or extension n Heuristics: Extensive empirical studies are required General Pattern: Using Metaheuristic Search
  • 100. 100 Objective Function Search Space Search Technique n Model simulation can be time consuming n Makes the search impractical or ineffective n Surrogate modeling based on machine learning n Simulator dedicated to search General Pattern: Using Metaheuristic Search Simulator
  • 101. 101 Objective Function Search Space Search Technique n Use techniques such as sensitivity analysis to minimize dimensionality before running search n Predict parts of the space worth searching in General Pattern: Using Metaheuristic Search Large
  • 102. 102 Objective Function Search Space Search Technique n Combine with solvers and optimization engines n Need heuristic strategies to determine when to use what General Pattern: Using Metaheuristic Search Multiple techniques?
  • 104. Project examples • Scalability is the most common verification challenge in practice • Testing closed-loop controllers, DA system – Large input and configuration space – Expensive simulations – Smart heuristics to avoid simulations (machine learning to predict fitness) • Schedulability analysis and stress testing – Large space of possible arrival times – Constraint programming cannot scale by itself – CP was carefully combined with genetic algorithms 104
  • 105. Scalability: Lessons Learned • Scalability must be part of the problem definition and solution from the start, not a refinement or an after- thought • Meta-heuristic search, by necessity, has been an essential part of the solutions, along with, in some cases, machine learning, statistics, etc. • Scalability often leads to solutions that offer “best answers” within time constraints, but no guarantees • Scalability analysis should be a component of every research project – otherwise it is unlikely to be adopted in practice • How many papers research papers do include even a minimal form of scalability analysis? 105
  • 107. Project examples • Practicality requires to account for the domain and context • Testing controllers – Relies on Simulink only – No additional modeling or complex translation – Differences between open versus closed loop controllers • Minimizing risks of CPU shortage – Trade-off between between effective synchronization and CPU usage – Trade-off achieved through multiple-objective GA search and appropriate decision tool 107
  • 108. Practicality: Lessons Learned • In software engineering, and verification in particular, just understanding the real problems in context is difficult • What are the inputs required by the proposed technique? • How does it fit in development practices? • Is the output what engineers require to make decisions? • There is no unique solution to a problem as they tend to be context dependent, but a context is rarely unique and often representative of a domain or type of system 108
  • 109. Discussion • Metaheuristic search for verification and testing – Tends to be versatile, tailorable to new problems and contexts – Particularly suited to the verification of non-functional properties – Entails acceptable modeling requirements – Can provide “best” answers at any time – Scalable, practical But – Not a proof, no certainty – Effectiveness of search guidance is key and must be experimentally evaluated – Models are key to provide adequate guidance – Search must often be combined with other techniques, e.g., machine learning, constraint programming 109
  • 110. DiscussionII • Constraint solvers (e.g., Comet, ILOG CPLEX, SICStus) – Is there an efficient constraintmodel for the problem at hand? – Can effective heuristics be found to order the search? – Better if there is a match to a known standard problem, e.g., job shop scheduling – Tend to be strongly affected by small changes in the problem, e.g., allowing task pre-emption – Often not scalable, e.g., memory • Model checking – Detailed operationalmodels (e.g., state models), involving (complex) temporal properties (e.g., CTL) – Enough details to analyze statically or execute symbolically – These modeling requirements are usually not realistic in actual system development.State explosion problem. – Originally designed for checking temporal properties through reachability analysis,as opposed to explicit timing properties – Often not scalable 110
  • 111. Talk Summary • Focus: Meta-heuristic Search to enable scalable verification and testing. • Scalability is the main challenge in practice. • We drew lessons learned from example projects in collaboration with industry, on real systems and in real verification contexts. • Results show that meta-heuristic search contributes to mitigate the scalability problem. • It has also shown to lead to practical solutions. • Solutions are very context dependent. • Solutions tend to be multidisciplinary: system modeling, constraint solving, machine learning, statistics. 111
  • 112. Acknowledgements PhD. Students: • Vahid Garousi • Marwa Shousha • Zohaib Iqbal • Reza Matinnejad • Stefano Di Alesio • Raja Ben Abdessalem Scientists: • Shiva Nejati • Andrea Arcuri 112
  • 113. Scalable Software Testing and Verification of Non-Functional Properties through Heuristic Search and Optimization Lionel Briand Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT) University of Luxembourg, Luxembourg ITEQS, March 13, 2017 SVV lab: svv.lu SnT: www.securityandtrust.lu We are hiring!