Keynote
On the Effectiveness of SBSE
Techniques through Instance
Space Analysis
Aldeida Aleti
Monash University, Australia
@AldeidaAleti aldeida.aleti@monash.edu
Effectiveness of SBSE - Status Quo
A large focus of SBSE research is on introducing new SBSE approaches
As part of the evaluation process, a set of experiments is usually conducted
- A benchmark is selected, e.g., Defects4J
- The new approach is compared against the state of the art
- Averages/medians are reported
- Some statistical tests are conducted
Instance Space Analysis
1. to understand and visualise the strengths and weaknesses of different approaches
2. to help with the objective assessment of different approaches
a. Scrutinising how approaches perform under different conditions, and stress testing them
Motivation 1: Are the problem instances adequate?
Problem 1: How were the problem instances selected?
Common benchmark problems are important for fair comparison, but are they
- demonstrably diverse
- unbiased
- representative of a range of real-world contexts
- challenging
- discriminating
ICSE 2022 review criteria
Motivation 2: Reporting averages/medians obscures
important information
A. Perera, A. Aleti, M. Böhme and B. Turhan, "Defect Prediction Guided Search-Based
Software Testing," 2020 35th IEEE/ACM International Conference on Automated Software
Engineering (ASE), 2020, pp. 448-460.
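To see how a single summary statistic can hide instance-level behaviour, here is a minimal sketch with made-up branch-coverage numbers. The approach names, class names and values are hypothetical and are not taken from the cited study. Both approaches have the same mean, yet each one clearly wins on a different subset of instances.

```python
from statistics import mean

# Hypothetical branch coverage (%) per class under test; values are illustrative only.
coverage = {
    "approach_A": {"C1": 90, "C2": 88, "C3": 30, "C4": 32},
    "approach_B": {"C1": 60, "C2": 58, "C3": 60, "C4": 62},
}


def summarise(results):
    """Return the mean coverage of each approach."""
    return {name: mean(per_instance.values()) for name, per_instance in results.items()}


def winners(results):
    """Return, for each instance, the approach with the highest coverage."""
    instances = next(iter(results.values())).keys()
    return {i: max(results, key=lambda name: results[name][i]) for i in instances}


means = summarise(coverage)
per_instance = winners(coverage)
print(means)         # identical means for both approaches
print(per_instance)  # but a different winner depending on the instance
```

Reporting only `means` would suggest the two approaches are interchangeable, while `per_instance` shows that the choice depends on the instance.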
Problem 2: Performance is often problem dependent
(No Free Lunch theorem, NFT)
- What are the strengths and weaknesses of the approaches?
- Which are the problem instances where an approach performs really well and
why?
- Which are the problem instances where an approach struggles and why?
- How do features of the problem instances affect the performance of the
approaches?
- Which features give an algorithm competitive advantage?
- Given a problem instance with particular features, which approach should I use?
Which algorithm is suitable for future problems?
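The last question can be read as an algorithm selection problem. One simple way to sketch it, assuming we already know which approach performed best on a set of instances, is a nearest-neighbour vote in feature space. The feature values, approach names and the choice of k-nearest neighbours are all assumptions made for illustration; the talk does not prescribe a particular selector.

```python
import math

# Hypothetical metadata: (feature vector, best-performing approach) per known instance.
known_instances = [
    ((2.0, 5.0), "approach_A"),
    ((3.0, 6.0), "approach_A"),
    ((9.0, 40.0), "approach_B"),
    ((8.0, 35.0), "approach_B"),
]


def recommend(features, metadata, k=3):
    """Recommend an approach by majority vote among the k nearest known instances."""
    by_distance = sorted(metadata, key=lambda row: math.dist(row[0], features))
    votes = {}
    for _, approach in by_distance[:k]:
        votes[approach] = votes.get(approach, 0) + 1
    return max(votes, key=votes.get)


print(recommend((2.5, 5.5), known_instances))
print(recommend((8.5, 38.0), known_instances))
```

In practice the features would need to be normalised before distances are compared, and the selector is only as good as the diversity of the known instances.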
Example
Which approach is better? SF110
C. Oliveira, A. Aleti, L. Grunske and K. Smith-Miles, "Mapping the Effectiveness of Automated Test Suite Generation
Techniques," in IEEE Transactions on Reliability, vol. 67, no. 3, pp. 771-785, Sept. 2018, doi: 10.1109/TR.2018.2832072.
Open Questions
● What impacts the effectiveness of SBSE techniques?
○ How can features of problem instances help us infer the strengths and weaknesses of
different SBSE approaches?
○ How can we objectively assess different SBSE techniques?
● How easy or hard are existing benchmarks? How diverse are they? Are they biased
towards a particular technique?
● Can we select the most suitable SBSE technique given a problem with particular
features?
Empirical Review of Program Repair Tools: A Large-Scale Experiment on 2 141 Bugs and 23 551
Repair Attempts. T. Durieux, F. Madeiral, M. Martinez, R. Abreu. ESEC/FSE Foundations of Software
Engineering (2019) doi: 10.1145/3338906.3338911.
ISA
K. Smith-Miles et al. / Computers & Operations Research 45 (2014) 12–24
Steps of ISA
1. Create the metadata
a. Features
b. SBSE performances
2. Create instance space
3. Visualise footprints
4. Explain strengths/weaknesses
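The four steps can be sketched end to end on toy data. This is a deliberately simplified illustration and not the ISA implementation: the instance names, feature values and performances are made up, the 2D projection weights are simply assumed rather than learned from the metadata, and a footprint is reduced to the list of projected points where an approach performs best, whereas real footprints are regions of the instance space.

```python
from collections import defaultdict

# Step 1: metadata = features plus per-approach performance for each instance.
# All names and numbers are hypothetical.
metadata = {
    "inst1": {"features": [1.0, 0.0, 2.0], "performance": {"A": 0.9, "B": 0.6}},
    "inst2": {"features": [0.5, 1.0, 1.5], "performance": {"A": 0.8, "B": 0.7}},
    "inst3": {"features": [4.0, 3.0, 0.0], "performance": {"A": 0.3, "B": 0.7}},
    "inst4": {"features": [5.0, 2.5, 0.5], "performance": {"A": 0.2, "B": 0.9}},
}

# Step 2: project each feature vector to 2D with a fixed linear map.
# The weights below are assumed for illustration only.
PROJECTION = [
    [1.0, 0.5, 0.0],  # first coordinate of the instance space
    [0.0, 0.5, 1.0],  # second coordinate of the instance space
]


def project(features):
    """Map a feature vector to a 2D point using PROJECTION."""
    return tuple(sum(w * f for w, f in zip(row, features)) for row in PROJECTION)


# Step 3: a crude footprint = the projected points where an approach performs best.
def footprints(meta):
    regions = defaultdict(list)
    for name, row in meta.items():
        best = max(row["performance"], key=row["performance"].get)
        regions[best].append((name, project(row["features"])))
    return dict(regions)


# Step 4: explain strengths and weaknesses by inspecting where each footprint lies.
for approach, points in footprints(metadata).items():
    print(approach, points)
```

With these toy numbers, approach A's points sit at low values of the first coordinate and approach B's at high values, which is the kind of pattern step 4 would then try to explain in terms of the underlying features.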
Features (56)
What makes the problem easy or hard?
Problem instances SF110
Performance measure
● Branch coverage.
● An approach is considered superior if its branch coverage is at least 1% higher than
the other techniques; otherwise, we use the label “Equal.”
Approaches
● Whole Test Suite with Archive (WSA)
● Many Objective Sorting Algorithm (MOSA)
● Random Testing (RT)
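The labelling rule under "Performance measure" can be written down as a small function. This is a sketch under one assumption: "1% higher" is read as one percentage point of branch coverage, which the slide does not spell out. The coverage values in the usage lines are made up.

```python
def label_instance(coverage, margin=1.0):
    """Label an instance with the superior approach, or "Equal".

    coverage maps approach name -> branch coverage in percent.
    margin is interpreted as percentage points (an assumption).
    """
    ranked = sorted(coverage.items(), key=lambda item: item[1], reverse=True)
    (best, best_cov), (_, runner_up_cov) = ranked[0], ranked[1]
    # The best approach must beat every other one by the margin,
    # so comparing against the runner-up is sufficient.
    return best if best_cov - runner_up_cov >= margin else "Equal"


print(label_instance({"WSA": 80.0, "MOSA": 80.5, "RT": 60.0}))
print(label_instance({"WSA": 70.0, "MOSA": 85.0, "RT": 50.0}))
```

The first call yields "Equal" because the top two approaches are within the margin; the second yields "MOSA".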
Significant features
● coupling between object classes
○ the number of classes coupled to a given class (method calls, field accesses, inheritance,
arguments, return types, and exceptions)
● response for a class
○ the number of different methods that can be executed when a method is invoked on an object
of that class
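Both features can be computed from a simple model of a class. The sketch below uses a toy, hand-written description of one class (all names are hypothetical) and follows the common formulation of response for a class that counts the class's own methods plus the methods they call directly; whether the study used exactly this variant is an assumption.

```python
# Toy model of a class: the classes it references and, per method, the methods it calls.
order_class = {
    "coupled_to": ["Customer", "Invoice", "Logger", "Customer"],
    "methods": {
        "total": {"Invoice.sum", "Logger.debug"},
        "ship": {"Customer.address", "Logger.debug"},
    },
}


def cbo(cls):
    """Coupling between object classes: number of distinct classes this class is coupled to."""
    return len(set(cls["coupled_to"]))


def rfc(cls):
    """Response for a class: own methods plus the distinct methods they call directly."""
    own = set(cls["methods"])
    called = set().union(*cls["methods"].values())
    return len(own | called)


print(cbo(order_class))
print(rfc(order_class))
```

Here the class is coupled to three distinct classes, and five distinct methods can run in response to a call (two of its own and three external ones).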
SBST Footprints
SBST selection
E-APR
Metadata
Features (146)
Observation-based features (Yu et al. 2019)
Significant Features (9)
(F1) MOA: Measure of Aggregation.
(F2) CAM: Cohesion Among Methods
(F3) AMC: Average Method Complexity
(F4) PMC: Private Method Count
(F5) AECSL: Atomic Expression Comparison Same Left indicates the number of statements
with a binary expression that have more than an atomic expression (e.g., variable access).
(F6) SPTWNG: Similar Primitive Type With Normal Guard indicates the number of
statements that contain a variable (local or global) that is also used in another statement
contained inside a guard (i.e., an If condition).
(F7) CVNI: Compatible Variable Not Included is the number of local primitive type variables
within the scope of a statement that involves primitive variables that are not part of that
statement.
(F8) VCTC: Variable Compatible Type in Condition measures the number of variables within
an If condition that are compatible with another variable in the scope.
(F9) PUIA: Primitive Used In Assignment - the number of primitive variables in assignments.
● Little overlap between
IntroClassJava/Defects4J and the other
datasets
● Bugs.jar has the most diverse bugs
APR selection
For ISA to reveal useful insights
● Diverse features
● Diverse instances
● Diverse approaches
● A good performance measure
So what?
We have a responsibility to find the weaknesses of the approaches we develop
We need to make sure that the chosen problem instances are demonstrably diverse,
unbiased, representative of a range of real-world contexts, challenging, and
discriminating of approach performance
To understand which approach is suitable for future problems, we must understand
which features impact its performance

Instance Space Analysis for Search Based Software Engineering

  • 1. Keynote On the Effectiveness of SBSE Techniques through Instance Space Analysis Aldeida Aleti Monash University, Australia @AldeidaAleti aldeida.aleti@monash.edu
  • 2. Effectiveness of SBSE - Status Quo A large focus of SBSE research is in introducing new SBSE approaches As part of the evaluation process, usually a set of experiments are conducted - A benchmark is selected, e..g., Defects4J - The new approach is compared against the state of the art - Averages/medians are reported - Some statistical tests are conducted
  • 3. Instance Space Analysis 1. to understand and visualise the strengths and weaknesses of different approaches 2. to help with the objective assessment of different approaches a. Scrutinising how approaches perform under different conditions, and stress testing them
  • 4. Motivation 1: Are the problem instances adequate?
  • 5. Problem 1: How were the problem instances selected? Common benchmark problems are important for fair comparison, but are they - demonstrably diverse - unbiased - representative of a range of real world context, - challenging - discriminating
  • 6. ICSE 2022 review criteria
  • 7. Motivation 2: Reporting averages/medians obscures important information A. Perera, A. Aleti, M. Böhme and B. Turhan, "Defect Prediction Guided Search-Based Software Testing," 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2020, pp. 448-460.
  • 8. Problem 2: Performance is often problem dependent (NFT) - What are the strengths and weaknesses of the approaches? - Which are the problem instances where an approach performs really well and why? - Which are the problem instances where an approach struggles and why? - How do features of the problem instances affect the performance of the approaches? - Which features give an algorithm competitive advantage? - Given a problem instance with particular features, which approach should I use? Which algorithm is suitable for future problems?
  • 9. Example Which approach is better? SF110 C. Oliveira, A. Aleti, L. Grunske and K. Smith-Miles, "Mapping the Effectiveness of Automated Test Suite Generation Techniques," in IEEE Transactions on Reliability, vol. 67, no. 3, pp. 771-785, Sept. 2018, doi: 10.1109/TR.2018.2832072.
  • 10.
  • 11. Open Questions ● What impacts the effectiveness of SBSE techniques? ○ How can features of problem instances help us infer the strengths and weaknesses of different SBSE approaches? ○ How can we objectively assess different SBSE techniques? ● How easy or hard are existing benchmarks? How diverse are they? Are they biased towards a particular technique? ● Can we select the most suitable SBSE technique given a problem with particular features?
  • 12. Empirical Review of Program Repair Tools: A Large-Scale Experiment on 2 141 Bugs and 23 551 Repair Attempts. T. Durieux, F. Madeiral, M. Martinez, R. Abreu. ESEC/FSE Foundations of Software Engineering (2019) doi: 10.1145/3338906.3338911.
  • 13. ISA K. Smith-Miles et al. / Computers & Operations Research 45 (2014) 12–24
  • 14. Steps of ISA 1. Create the metadata a. Features b. SBSE performances 2. Create instance space 3. Visualise footprints 4. Explain strengths/weaknesses
  • 15.
  • 16. Features (56) What makes the problem easy or hard?
  • 18. Performance measure ● Branch coverage. ● An approach is considered superior if its branch coverage is at least 1% higher than the other techniques; otherwise, we use the label “Equal.”
  • 19. Approaches ● Whole Test Suite with Archive (WSA) ● Many Objective Sorting Algorithm (MOSA) ● Random Testing (RT)
  • 20. Significant features ● coupling between object classes ○ the number of classes coupled to a given class (method calls, field accesses, inheritance, arguments, return types, and exceptions) ● response for a class ○ number of different methods that can be executed when a method is invoked for that object of a class
  • 23.
  • 24.
  • 25.
  • 26. E-APR
  • 30. (F1) MOA: Measure of Aggregation. (F2) CAM: Cohesion Among Methods. (F3) AMC: Average Method Complexity. (F4) PMC: Private Method Count. (F5) AECSL: Atomic Expression Comparison Same Left indicates the number of statements with a binary expression that have more than an atomic expression (e.g., variable access). (F6) SPTWNG: Similar Primitive Type With Normal Guard indicates the number of statements that contain a variable (local or global) that is also used in another statement contained inside a guard (i.e., an If condition). (F7) CVNI: Compatible Variable Not Included is the number of local primitive type variables within the scope of a statement that involves primitive variables that are not part of that statement. (F8) VCTC: Variable Compatible Type in Condition measures the number of variables within an If condition that are compatible with another variable in the scope. (F9) PUIA: Primitive Used In Assignment is the number of primitive variables in assignments.
  • 31.
  • 32. ● Little overlap between IntroClassJava/Defects4J and the other datasets ● Bugs.jar has the most diverse bugs
  • 34. For ISA to reveal useful insights ● Diverse features ● Diverse instances ● Diverse approaches ● A good performance measure
  • 35. So what? We have a responsibility to find the weaknesses of the approaches we develop. We need to make sure that the chosen problem instances are demonstrably diverse, unbiased, representative of a range of real-world contexts, challenging, and discriminating of approach performance. To understand which approach is suitable for future problems, we must understand which features impact its performance.
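
The steps of ISA listed on slide 14 can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the metadata values are invented, the fixed `PROJECTION` matrix stands in for the projection that ISA would derive from the data, and a bounding box is used as a crude substitute for a proper footprint.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]

# Step 1: metadata = features of each problem instance plus the approach
# that performed best on it. All numbers are made up for illustration.
METADATA = [
    {"features": [2.0, 10.0], "best": "MOSA"},
    {"features": [8.0, 40.0], "best": "WSA"},
    {"features": [3.0, 12.0], "best": "MOSA"},
    {"features": [9.0, 35.0], "best": "WSA"},
]

# Hypothetical fixed 2 x n_features projection (an assumption, not ISA's own).
PROJECTION = [[1.0, 0.0], [0.0, 0.5]]


def project(features: List[float]) -> Point:
    """Step 2: map a feature vector to a point in the 2D instance space."""
    x, y = (sum(w * f for w, f in zip(row, features)) for row in PROJECTION)
    return (x, y)


def footprints(metadata: List[dict]) -> Dict[str, Tuple[Point, Point]]:
    """Step 3: crude footprint = bounding box of the instances an approach wins."""
    points: Dict[str, List[Point]] = {}
    for row in metadata:
        points.setdefault(row["best"], []).append(project(row["features"]))
    boxes = {}
    for name, pts in points.items():
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        boxes[name] = ((min(xs), min(ys)), (max(xs), max(ys)))
    return boxes


if __name__ == "__main__":
    # Step 4 (explaining strengths/weaknesses) starts from inspecting these regions.
    for approach, box in footprints(METADATA).items():
        print(approach, box)
```

In this toy data the two boxes do not overlap, which is the kind of separation that would suggest the chosen features discriminate between the approaches.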
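
The labelling rule on slide 18 can be written as a small function. This is a minimal sketch under one assumption: "at least 1% higher" is read as one percentage point of branch coverage. The function name `label_instance` and the `margin` parameter are hypothetical.

```python
from typing import Dict


def label_instance(coverage: Dict[str, float], margin: float = 1.0) -> str:
    """Return the superior approach for one problem instance, or "Equal".

    coverage maps approach name -> branch coverage in percent (0-100).
    Assumption: the 1% threshold is measured in percentage points.
    """
    # Rank approaches from highest to lowest branch coverage.
    ranked = sorted(coverage.items(), key=lambda item: item[1], reverse=True)
    best_name, best_cov = ranked[0]
    runner_up_cov = ranked[1][1]
    # The best approach only wins if it beats every other one by the margin.
    if best_cov - runner_up_cov >= margin:
        return best_name
    return "Equal"


if __name__ == "__main__":
    print(label_instance({"WSA": 80.0, "MOSA": 82.5, "RT": 60.0}))
    print(label_instance({"WSA": 80.0, "MOSA": 80.5, "RT": 60.0}))
```

Comparing the best approach only against the runner-up is enough, since any approach ranked lower is at least as far behind.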
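
The two significant features on slide 20 can be computed on a toy dependency map. This is a sketch under stated assumptions: coupling is counted in both directions, the response set is taken as a class's own methods plus the methods they call directly, and the class and method names are invented.

```python
from typing import Dict, Set


def cbo(cls: str, uses: Dict[str, Set[str]]) -> int:
    """Coupling between object classes: number of other classes coupled to cls.

    uses maps each class to the set of classes it references.
    Assumption: a class counts as coupled if it uses cls or is used by cls.
    """
    outgoing = uses.get(cls, set())
    incoming = {other for other, targets in uses.items() if cls in targets}
    return len((outgoing | incoming) - {cls})


def rfc(own_methods: Set[str], calls: Dict[str, Set[str]]) -> int:
    """Response for a class: size of the set of methods that may execute.

    Assumption: own methods plus the methods they invoke directly.
    """
    response_set = set(own_methods)
    for method in own_methods:
        response_set |= calls.get(method, set())
    return len(response_set)


if __name__ == "__main__":
    uses = {"Order": {"Customer", "Item"}, "Invoice": {"Order"}, "Item": set()}
    calls = {
        "Order.total": {"Item.price"},
        "Order.add": {"List.append", "Item.price"},
    }
    print(cbo("Order", uses))
    print(rfc({"Order.total", "Order.add"}, calls))
```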