ICST/SBFT Tool Competition 2025
UAV Testing Track
The 18th Workshop on Search-Based and Fuzz Testing
27 April
Sajad Khatiri, Prasun Saurabh, Tahereh Zohdinasab, Dmytro Humeniuk, Sebastiano Panichella
Unmanned Aerial Vehicles (UAVs)
Field Testing vs. Simulation-based Testing
Field Testing
o Reliable
o Not Reproducible
o Limited Test Scenarios
o Expensive & Time Consuming
o Unsafe
Simulation-based Testing
o Reality Gap
o Reproducible
o Scalable & Automatable
o Affordable
o Safe
o Simulation-based Test Generation
Sample PX4 Flight Log
Aerialist
UAV Test Bench
"Simulation-based testing of unmanned aerial vehicles with Aerialist", ICSE 2024
Sajad Khatiri, Sebastiano Panichella, and Paolo Tonella
Test Description
• UAV Config.
• Env. Config.
• Commands
• Expectation
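The four parts of a test description can be pictured as a small data structure. The sketch below is illustrative only: the class name `TestDescription`, its field names, and all example values are hypothetical and do not reproduce Aerialist's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class TestDescription:
    """Hypothetical container mirroring the four parts named on the slide."""
    uav_config: dict = field(default_factory=dict)   # UAV Config.
    env_config: dict = field(default_factory=dict)   # Env. Config.
    commands: list = field(default_factory=list)     # Commands
    expectation: dict = field(default_factory=dict)  # Expectation


# All keys and values below are invented for illustration.
test = TestDescription(
    uav_config={"mission_file": "mission.plan"},
    env_config={"obstacles": []},
    commands=["takeoff", "start_mission"],
    expectation={"min_obstacle_distance_m": 1.5},
)
print(test)
```

Keeping the environment separate from the UAV configuration means a generator can vary only the obstacles while the mission stays fixed.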
Test Generation
Given an autonomous UAV flight mission:
Generate test cases that violate the safety distance to the obstacles by placing obstacles in the environment
"Simulation-based Test Case Generation for Unmanned Aerial Vehicles in the Neighborhood of real flights", ICST 2023
Sajad Khatiri, Sebastiano Panichella, and Paolo Tonella
First Edition
SBFT@ICSE 2024
"SBFT Tool Competition 2024 - CPS-UAV Test Case Generation Track", SBFT@ICSE 2024
Sajad Khatiri, et al.
• 6 competing test generation tools
• 1 baseline approach
• Generated tests for 6 flight missions
Joint ICST/SBFT call for the second edition
• 5 competing test generation tools
  • 3 at ICST
  • 2 at SBFT
Competition Rules
• Use the provided platform for test definition
• Use a test generation approach
• Place up to 3 box-shaped obstacles
  • Size (length, width, height)
  • Position (x, y, z)
  • Orientation (r)
• Obstacles should
  • Keep the mission physically possible
  • Fit in the predefined area
  • Be taller than the flight altitude (10 m)
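The mechanically checkable rules above can be sketched as a small validator. This is a minimal sketch under stated assumptions, not the competition's own checker: the area bounds in `AREA` are hypothetical, (x, y) is assumed to be the box center, z is assumed to be the elevation of the box base, and r is assumed to be a rotation in degrees about the vertical axis. The rule that the mission must stay physically possible is not checked here.

```python
import math
from dataclasses import dataclass

MAX_OBSTACLES = 3          # from the rules: up to 3 obstacles
FLIGHT_ALTITUDE_M = 10.0   # from the rules: obstacles must be taller than this
# Hypothetical bounds of the predefined area: (min_x, max_x, min_y, max_y) in meters.
AREA = (-40.0, 30.0, 10.0, 40.0)


@dataclass
class Obstacle:
    length: float
    width: float
    height: float
    x: float
    y: float
    z: float
    r: float  # assumed: rotation in degrees about the vertical axis


def footprint_corners(o):
    """Return the four ground-plane corners of the rotated box."""
    a = math.radians(o.r)
    c, s = math.cos(a), math.sin(a)
    hl, hw = o.length / 2, o.width / 2
    offsets = [(hl, hw), (hl, -hw), (-hl, hw), (-hl, -hw)]
    return [(o.x + dx * c - dy * s, o.y + dx * s + dy * c) for dx, dy in offsets]


def violations(obstacles, area=AREA):
    """List the rule violations found; an empty list means none were found."""
    min_x, max_x, min_y, max_y = area
    problems = []
    if len(obstacles) > MAX_OBSTACLES:
        problems.append("more than 3 obstacles")
    for i, o in enumerate(obstacles):
        if o.z + o.height <= FLIGHT_ALTITUDE_M:
            problems.append(f"obstacle {i} is not taller than the flight altitude")
        inside = all(min_x <= cx <= max_x and min_y <= cy <= max_y
                     for cx, cy in footprint_corners(o))
        if not inside:
            problems.append(f"obstacle {i} does not fit in the area")
    return problems


print(violations([Obstacle(10, 5, 20, 0, 20, 0, 0)]))
```

Checking the rotated corners rather than the unrotated extent matters because a long box that fits at r = 0 can leave the area once it is turned.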
github.com/skhatiri/UAV-Testing-Competition
github.com/skhatiri/Aerialist
mazras@usi.ch
Sajad Khatiri
Competing Tools
• Pseudo-Random
• OptObstacles
• Surrealist
Evaluation
• 5 competing test generation tools (ICST: 3, SBFT: 2)
• 1 baseline approach
• Generated tests for 3 flight missions
• With a budget of 100 simulations
• Using our K8S evaluation platform
• Each tool reported a ranked test suite drawn from its failing tests
Evaluation Metrics
• Top 20 tests from each test suite were evaluated
• Failure Score
  • Simulated 3 times
  • Assign points to each execution
  • Assign a score to each test case
  • Calculate the failure score for the test suite
• Diversity Score
  • Similarity of the area covered by the obstacles
  • Assign a similarity score to each pair of tests
  • Calculate the diversity score for the test suite
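The slides name the steps of both metrics but not their formulas, so the sketch below only illustrates the shape of the computation. Every concrete choice in it is an assumption: the 1.5 m threshold, the one-point-per-failing-execution scheme, averaging over executions, summing over tests, Jaccard similarity on 1 m grid cells, and diversity as one minus the mean pairwise similarity. None of these should be read as the competition's actual scoring.

```python
from itertools import combinations

SAFETY_DISTANCE_M = 1.5  # hypothetical safety threshold


def execution_points(min_distance_m):
    """Assumed scheme: one point if the execution violates the safety distance."""
    return 1.0 if min_distance_m < SAFETY_DISTANCE_M else 0.0


def score_test(min_distances):
    """Assumed: a test's score is the mean of the points of its executions."""
    return sum(execution_points(d) for d in min_distances) / len(min_distances)


def failure_score(suite):
    """Assumed: the suite's failure score is the sum of its test scores."""
    return sum(score_test(runs) for runs in suite)


def covered_cells(boxes):
    """Cells of a 1 m grid covered by axis-aligned boxes (x0, y0, x1, y1)."""
    return {(i, j) for x0, y0, x1, y1 in boxes
            for i in range(x0, x1) for j in range(y0, y1)}


def similarity(a, b):
    """Assumed: Jaccard index of the areas covered by two tests' obstacles."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def diversity_score(tests):
    """Assumed: one minus the mean similarity over all pairs of tests."""
    pairs = list(combinations(tests, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(similarity(a, b) for a, b in pairs) / len(pairs)


# Two tests, each simulated 3 times; values are minimum obstacle distances in meters.
suite_runs = [[0.4, 0.9, 2.0], [3.0, 2.5, 2.1]]
# Obstacle footprints of three tests on the grid.
suite_areas = [covered_cells([(0, 0, 2, 2)]),
               covered_cells([(1, 0, 3, 2)]),
               covered_cells([(10, 10, 12, 12)])]
print(failure_score(suite_runs), diversity_score(suite_areas))
```

Repeating each simulation and aggregating the points is one way to keep a single lucky or unlucky run from deciding a test's score.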
Case Studies
Case Study 2
Tool           # tests  # failures  failure score  failure rank  diversity score  diversity rank
Evolv-1        57       12          9.02           1             0.88             3
Pseudo-Random  31       7           5.25           2             0.87             4
Surrealist     52       11          4.17           3             0.21             6
TGen-UQ        5        1           0.50           4             1.00             1
OptObstacles   3        1           0.44           5             0.94             2
PALM           58       1           0.14           6             0.58             5
[Figure panels: Surrealist, OptObstacles, Pseudo-Random]
Case Study 4
Tool           # tests  # failures  failure score  failure rank  diversity score  diversity rank
Pseudo-Random  66       12          9.30           1             0.88             3
PALM           82       20          3.59           2             0.42             5
Surrealist     9        3           1.94           3             0.21             6
TGen-UQ        6        1           1.45           4             0.95             1
Evolv-1        22       2           1.08           5             0.83             4
OptObstacles   7        0           0.00           6             0.92             2
[Figure panels: Surrealist, OptObstacles, Pseudo-Random]
Case Study 5
Tool           # tests  # failures  failure score  failure rank  diversity score  diversity rank
Evolv-1        37       6           2.94           1             0.86             3
Pseudo-Random  17       5           2.72           2             0.83             4
PALM           76       19          0.80           3             0.47             5
Surrealist     9        1           0.12           4             0.18             6
TGen-UQ        6        0           0.00           5             0.90             2
OptObstacles   2        0           0.00           5             0.90             1
[Figure panels: Surrealist, OptObstacles, Pseudo-Random]
Ranking
[Chart: Test Suite Size, with # Reported Tests and # Failed Tests per tool (Pseudo-Random, Evolv-1, PALM, TGen-UQ, Surrealist, OptObstacles); axis from 0 to 250]
[Chart: Test Suite Score, with Failure Score and Diversity Score per tool (Pseudo-Random, Evolv-1, PALM, TGen-UQ, Surrealist, OptObstacles); axis from 0 to 20]
Simulator non-determinism
