Pareto-Optimal Search-Based
Software Engineering (POSBSE):
A Literature Survey
Abdel Salam Sayyad
Hany Ammar
West Virginia University, USA
2nd International Workshop on Realizing Artificial
Intelligence Synergies in Software Engineering
(RAISE’13)
May 25th, 2013
Roadmap
① Search-Based Software Engineering
② Multi-Objective Optimization
③ Survey Criteria
④ Survey Results
– Algorithms
– Number of Objectives
– Frameworks
– Quality Indicators
⑤ Conclusion & Recommendations
Search-Based Software Engineering
“The application of metaheuristic search-based optimization techniques to find near-optimal solutions in software engineering problems.”
2001: 1st concept paper; term “SBSE” coined [Harman & Jones]
2004: 1st paper in Pareto-optimal SBSE [Khoshgoftaar et al.]
2004: Survey on Search-Based Test Data Generation [McMinn]
2009: Comprehensive review; several Pareto-optimal SBSE works cited [Harman et al.]
2010: Survey on Search-Based Software Design [Raiha]
2013: Survey on Pareto-Optimal Search-Based Software Engineering [Sayyad & Ammar]
Multi-objective Optimization
[Figure: multi-objective optimization yields the Pareto front; higher-level decision making then selects the chosen solution.]
Vilfredo Pareto (1848–1923)
• Pareto efficiency
• Pareto distribution
• Pareto principle (a.k.a. the 80-20 rule)
• Microeconomics
• Data-driven
(*) http://en.wikipedia.org/wiki/Vilfredo_Pareto
The Pareto Front
[Figure: objective space showing dominated points and the non-dominated set (the Pareto front). A weighted-sum approach, by contrast, combines N objectives into one with some weighting scheme.]
Boolean dominance: x dominates y if and only if:
– in no objective is x worse than y, and
– in at least one objective, x is better than y.
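The dominance test above translates directly into code. A minimal Python sketch for minimization problems (illustrative only, not tied to any particular framework):

```python
def dominates(x, y):
    """Boolean Pareto dominance (minimization): x dominates y iff
    x is no worse than y in every objective and strictly better
    in at least one."""
    return (all(a <= b for a, b in zip(x, y))
            and any(a < b for a, b in zip(x, y)))
```

For example, (1, 2) dominates (2, 3), while (1, 3) and (2, 2) are mutually non-dominated.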
Fitness Assignment (according to NSGA-II [Deb et al. 2002])
[Figure: NSGA-II assigns fitness by sorting the population into successive non-dominated fronts; an example of Pareto-based methods.]
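Pareto-based fitness assignment can be sketched by repeatedly peeling off non-dominated fronts, which is what NSGA-II's ranking does conceptually. This is a simple O(n²)-per-front illustration, not NSGA-II's fast sorting or its crowding-distance tie-breaker:

```python
def dominates(x, y):
    """Minimization: no worse everywhere, strictly better somewhere."""
    return (all(a <= b for a, b in zip(x, y))
            and any(a < b for a, b in zip(x, y)))

def nondominated_fronts(points):
    """Partition point indices into successive Pareto fronts:
    front 0 is the non-dominated set, front 1 is non-dominated
    once front 0 is removed, and so on."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(i for i in remaining
                       if not any(dominates(points[j], points[i])
                                  for j in remaining if j != i))
        fronts.append(front)
        remaining -= set(front)
    return fronts
```

Individuals in earlier fronts receive better (lower) ranks when fitness is assigned.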
Indicator-Based Evolutionary Algorithm (IBEA) [Zitzler and Kunzli 2004]
1) For {old generation + new generation}:
– Add up every individual’s amount of dominance (its indicator values) with respect to everyone else, giving its fitness F.
– Sort all individuals by F.
– Delete the worst, recalculate, delete the worst, recalculate, …
2) Then run a standard GA (cross-over, mutation) on the survivors → create a new generation → back to 1.
An example of indicator-based methods.
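The delete-worst-and-recalculate loop can be sketched with the additive-epsilon indicator that IBEA commonly uses. In this sketch, `kappa` is IBEA's scaling parameter, and the objective normalization a real implementation would apply is omitted:

```python
import math

def eps_indicator(a, b):
    """Additive epsilon indicator (minimization): the smallest shift
    by which a must be translated to weakly dominate b."""
    return max(fa - fb for fa, fb in zip(a, b))

def ibea_environmental_selection(population, mu, kappa=0.05):
    """Keep mu individuals: fitness F(x) sums -exp(-I(y, x)/kappa)
    over all other y; repeatedly delete the lowest-F individual and
    update the survivors' fitness values."""
    n = len(population)
    F = {i: sum(-math.exp(-eps_indicator(population[j], population[i]) / kappa)
                for j in range(n) if j != i)
         for i in range(n)}
    alive = set(range(n))
    while len(alive) > mu:
        worst = min(alive, key=lambda i: F[i])
        alive.remove(worst)
        for i in alive:  # undo the removed individual's contribution
            F[i] += math.exp(-eps_indicator(population[worst], population[i]) / kappa)
    return [population[i] for i in sorted(alive)]
```

Heavily dominated individuals accumulate large negative fitness and are deleted first, so the survivors approximate the non-dominated set.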
Got a High Number of Objectives?
• Using real-valued benchmark functions, Wagner et al. (EMO 2007) show that the performance of Pareto-based methods (e.g. NSGA-II and SPEA2) rapidly deteriorates with increasing dimension, whereas indicator-based algorithms (e.g. IBEA) cope very well with high-dimensional objective spaces.
• This motivated us to survey Pareto-optimal SBSE, focusing in particular on the choice of algorithms and the number of objectives.
• First use of IBEA in software engineering: Sayyad et al., ICSE’13.
Scope
• All published work that we could access.
• Pareto-optimal, not just multi-objective.
• Additional papers by the same authors are included if:
– Pareto-optimal algorithms are added/changed, or
– the number of objectives is changed, or
– quality indicators are added.
UCL CREST SBSE Repository [Zhang]
• http://crestweb.cs.ucl.ac.uk/resources/sbse_repository/
• Total works: 1101, as of April 2013.
Pareto-optimal: 51 surveyed papers (less than 5% of all SBSE work).
Algorithms
• A total of 26 distinct algorithms were used across the 51 surveyed papers.
Reason for choosing a single algorithm
Performance is often compared to that of single-objective
algorithms.
Number of Objectives
• 7 papers explored different formulations of their problems, with varying numbers of objectives.
Frameworks
• 34 papers (67%) coded their own implementations.
• 13 papers (26%) used jMetal [Durillo & Nebro 2011].
• 2 papers used Matlab.
• 1 paper used Frontier.
• 1 paper used Opt4J.
Quality Indicators
• Quality indicators are aggregate measures of certain qualities of the Pareto front, most useful when there are many objectives.
• Only 15 papers (30%) used quality indicators.
• Of those, 12 papers used quality indicators to compare algorithms against one another.
• Hypervolume was the most widely used indicator (12 papers).
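For two objectives, hypervolume (the size of the objective-space region dominated by the front and bounded by a reference point) reduces to a simple sweep. A sketch assuming minimization:

```python
def hypervolume_2d(front, ref):
    """Area dominated by a 2-objective minimization front, bounded
    by the reference point ref: sweep the points in ascending order
    of the first objective and sum the rectangular strips."""
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(set(front)):
        if f2 < prev_f2:              # points failing this test are dominated
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv
```

For example, the front {(1, 3), (2, 2), (3, 1)} with reference point (4, 4) covers an area of 6. Larger hypervolume indicates a front that is closer to optimal and better spread.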
Conclusion
• Shortcomings:
– Lack of clarity regarding why a given algorithm is chosen for a problem.
– Tendency to simplify problems by specifying fewer objectives to evaluate.
– Heavy reliance on personal implementations of widely-used algorithms.
– Lack of agreement on whether to use quality indicators, and which indicators to use.
Conclusion
• Promising directions:
– Increasing adoption of Pareto-optimal methods.
– Comparing algorithms against one another to discover better performance, and to reason about the suitability of algorithms to the problems at hand (15 of 51 papers).
– Exploring different formulations of problems, wherein the complexity is increased and more objectives are evaluated (7 of 51 papers).
– Increasing use of the open-source jMetal framework, with its rich set of algorithms and quality indicators.
Recommendations
• Single-valued fitness functions are a thing of the past. Software engineering problems are multi-objective by nature, and Pareto optimization is the best way to find all the possible trade-offs among the objectives so that the stakeholders can make enlightened decisions.
• More attention needs to be paid to the suitability of an algorithm to the type of problem at hand. It is true that the right optimizer for a specific problem remains an open question [Wolpert & Macready ’97], but there has to be a thought process about the structure of the problem and the suitability of the metaheuristic.
Recommendations
• More comparisons of the performance of various algorithms when applied to specific problems.
• Reformulating two- and three-objective problems to bring out objectives that might have been aggregated or ignored. In addition to being closer to the business reality of many competing objectives, this should put algorithms under increased stress and enable testing of reported results about certain MOEAs performing better than others at higher dimensions.
• Utilizing and contributing to the algorithms available in jMetal, as well as its quality-indicator offerings.
Recommendations
Suitability · Go Pareto! · Higher Objectives · More Comparisons · jMetal

Acknowledgment
This research work was funded by the Qatar National Research Fund (QNRF) under the National Priorities Research Program (NPRP), Grant No. 09-1205-2-470.