Abstract
Goal optimization has long been a topic of great interest in computer science. The literature contains many thousands of papers discussing methods for the search for optimal solutions to complex problems. In the case of multi-objective optimization, such a search yields iteratively improved approximations to the Pareto frontier, i.e. the set of best solutions along the trade-off curve of competing objectives.
To approximate the Pareto frontier, one ubiquitous method throughout the field of optimization is stochastic search. Stochastic search engines explore solution spaces by randomly mutating candidate guesses to generate new solutions. This mutation policy is employed by the most commonly used tools (e.g. NSGA-II, SPEA2), with the goals of a) avoiding local optima and b) expanding the diversity of the generated approximations. Such "blind" mutation policies explore many sub-optimal solutions that are discarded when better solutions are found. This approach has two problems. First, stochastic search can be unnecessarily computationally expensive because it evaluates an overwhelming number of candidates. Second, the generated approximations to the Pareto frontier are usually very large and can be difficult to understand.
To solve these two problems, a more directed, less stochastic approach than standard search tools is necessary. This thesis presents GALE (Genetic Active Learning). GALE is an active learner that finds approximations to the Pareto frontier by spectrally clustering candidates using a near-linear-time recursive descent algorithm that iteratively divides candidates into halves (called leaves at the bottom level). Active learning in GALE selects a minimal, most-informative subset of candidates by evaluating only the two most different candidates during each descending split; hence, GALE requires at most $2\log N$ evaluations per generation. These leaves are piecewise approximations to the Pareto frontier, and the candidates in each leaf are then mutated non-stochastically in the most promising direction along that piece.
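The recursive two-pole descent described above can be sketched in a few lines of Python. This is a minimal illustration of the idea rather than GALE's implementation: the Euclidean distance metric, the pruning rule (here we simply recurse into one half), and the leaf size are simplified assumptions, and the objective scoring of the two poles is only counted, not performed.

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two decision vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def descend(pop, min_size=4, n_evals=0):
    """Recursively bisect `pop`, 'evaluating' only the two most-different
    candidates (the poles) at each split, so a full descent costs about
    2*log2(N/min_size) evaluations. Returns (leaf, evaluation_count)."""
    if len(pop) <= min_size:
        return pop, n_evals
    pivot = random.choice(pop)
    east = max(pop, key=lambda c: dist(c, pivot))   # pole 1
    west = max(pop, key=lambda c: dist(c, east))    # pole 2
    n_evals += 2                                    # only these two are scored
    c = dist(east, west) or 1e-12
    def along(x):                                   # cosine-rule projection
        a, b = dist(x, west), dist(x, east)
        return (a * a + c * c - b * b) / (2 * c)
    pop = sorted(pop, key=along)
    # In GALE the pole evaluations decide which half is pruned as dominated;
    # here we simply recurse into one half to show the cost structure.
    return descend(pop[: len(pop) // 2], min_size, n_evals)

random.seed(0)
candidates = [[random.random() for _ in range(5)] for _ in range(128)]
leaf, used = descend(candidates)
print(len(leaf), used)  # leaf of 4 candidates reached with 10 evaluations
```

With 128 candidates and a leaf size of 4, the descent splits five times (128, 64, 32, 16, 8), so only ten candidates are ever scored.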
The experiments of this thesis lead to the following conclusion: a near-linear time recursive binary division of the decision space of candidates in a multi-objective optimization algorithm can find useful directions to mutate instances and find quality solutions much faster than traditional randomization approaches. Specifically, in comparative studies with standard methods (NSGA-II and SPEA2) applied to a variety of models, GALE required orders of magnitude fewer evaluations to find solutions. As a result, GALE can perform dramatically faster than the other methods, especially for realistic models.
Going Smart and Deep on Materials at ALCF, by Ian Foster
As we acquire large quantities of science data from experiment and simulation, it becomes possible to apply machine learning (ML) to those data to build predictive models and to guide future simulations and experiments. Leadership Computing Facilities need to make it easy to assemble such data collections and to develop, deploy, and run associated ML models.
We describe and demonstrate here how we are realizing such capabilities at the Argonne Leadership Computing Facility. In our demonstration, we use large quantities of time-dependent density functional theory (TDDFT) data on proton stopping power in various materials maintained in the Materials Data Facility (MDF) to build machine learning models, ranging from simple linear models to complex artificial neural networks, that are then employed to manage computations, improving their accuracy and reducing their cost. We highlight the use of new services being prototyped at Argonne to organize and assemble large data collections (MDF in this case), associate ML models with data collections, discover available data and models, work with these data and models in an interactive Jupyter environment, and launch new computations on ALCF resources.
Dr. Fariba Fahroo presents an overview of her program, Optimization and Discrete Mathematics, at the AFOSR 2013 Spring Review. At this review, Program Officers from AFOSR Technical Divisions will present briefings that highlight basic research programs beneficial to the Air Force.
The document provides an overview of the NTCIR-14 CENTRE Task, which aims to examine the replicability and reproducibility of results from past CLEF, NTCIR, and TREC evaluations. It describes the task specifications, including the replicability and reproducibility subtasks that asked participants to replicate or reproduce past run pairs. It also discusses the additional relevance assessments that were collected and the evaluation measures used, such as root mean squared error and effect ratio. The only participating team was able to mostly replicate the effects observed in the original NTCIR runs for the replicability subtask.
130321 zephyrin soh - on the effect of exploration strategies on maintenanc..., by Ptidej Team
This document presents an empirical study that investigates developers' program exploration strategies. The goal is to understand how developers navigate through a program's entities in order to help them more efficiently. The study analyzes developers' interaction histories to identify common exploration strategies and examines relationships between strategies and other factors like task type and expertise level. The results could help evaluate developer performance, improve comprehension models, and guide less experienced developers.
Implementing Generate-Test-and-Aggregate Algorithms on Hadoop, by Yu Liu
Generate-Test-and-Aggregate is a class of algorithms that can automatically derive efficient MapReduce programs.
MapReduce is a useful and popular programming model for large-scale parallel processing. However, for many complex problems it is not easy to develop efficient parallel algorithms that match the MapReduce paradigm well.
The generator-based parallelization approach was introduced to simplify parallel programming through automatic generation and optimization mechanisms. Efficient parallel algorithms can be generated from users' naive but correct programs by using generators that exploit optimization theorems from the field of skeletal parallel programming. The resulting efficient parallel algorithms are in a form well suited to implementation with MapReduce.
With this approach, a large class of generate-and-test-like computations can be programmed and executed efficiently over MapReduce. A novel programming interface and framework can thus be built on top of MapReduce, helping to resolve difficulties with programmability and efficiency. This paper introduces such a framework; with it, users can concentrate on writing naive but correct programs. We show that many generate-and-test-like computations can be implemented easily and efficiently with this framework over MapReduce.
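The generate-test-and-aggregate pattern can be illustrated with a small, self-contained sketch. The maximum-segment-sum problem and the Kadane-style fused combine below are illustrative choices, not the framework's API: the naive specification generates all segments and aggregates over them, while the fused equivalent uses an associative combine, which is exactly the property that lets such derived programs run as MapReduce jobs.

```python
from functools import reduce

# Naive GTA specification: generate all contiguous segments, then
# aggregate their sums with max. Quadratic in candidates; the GTA
# framework's job is to derive an efficient equivalent automatically.
def segments(xs):
    return [xs[i:j] for i in range(len(xs)) for j in range(i + 1, len(xs) + 1)]

def naive_max_segment_sum(xs):
    return max((sum(s) for s in segments(xs)), default=0)

# Fused, MapReduce-friendly equivalent: each element maps to a tuple
# (best, best_prefix, best_suffix, total), and `combine` is associative,
# so the reduce can be split across workers in any grouping.
def combine(l, r):
    lb, ls, lt, lsum = l
    rb, rs, rt, rsum = r
    return (max(lb, rb, lt + rs),   # best segment overall
            max(ls, lsum + rs),     # best prefix
            max(rt, lt + rsum),     # best suffix
            lsum + rsum)            # total sum

def wrap(x):
    return (max(x, 0), max(x, 0), max(x, 0), x)

def fused_max_segment_sum(xs):
    if not xs:
        return 0
    return reduce(combine, map(wrap, xs))[0]

data = [3, -4, 2, -1, 6, -3]
print(naive_max_segment_sum(data), fused_max_segment_sum(data))  # 7 7
```

The naive version is the "users' naive but correct program"; the fused version stands in for what the generator-based approach would produce.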
Evaluating Machine Learning Algorithms for Materials Science using the Matben..., by Anubhav Jain
1) The document discusses evaluating machine learning algorithms for materials science using the Matbench protocol.
2) Matbench provides standardized datasets, testing procedures, and an online leaderboard to benchmark and compare machine learning performance.
3) This allows different groups to evaluate algorithms independently and identify best practices for materials science predictions.
Overview of the NTCIR-15 We Want Web with CENTRE (WWW-3) Task
by Tetsuya Sakai, Sijie Tao, Zhaohao Zeng, Yukun Zheng, Jiaxin Mao, Zhumin Chu, Yiqun Liu, Maria Maistro, Zhicheng Dou, Nicola Ferro, Ian Soboroff
Accelerated Materials Discovery Using Theory, Optimization, and Natural Langu..., by Anubhav Jain
- The document describes a computational materials design pipeline that uses theory, optimization, and natural language processing (NLP) to accelerate materials discovery.
- Key components of the pipeline include optimization algorithms such as Rocketsled, which find the best materials solutions with fewer calculations, and NLP tools that extract and analyze knowledge from the literature to predict promising new materials and benchmarks.
- The pipeline has shown speedups of 15-30x over random searches and has successfully predicted new thermoelectric materials discoveries 1-2 years before their reporting in literature.
This document summarizes work on developing clear sky detection methods and photovoltaic data analytics tools. It describes collaborating with NREL and kWh Analytics to build a robust clear sky detection method for the RdTools software. The goal is to automatically learn the best parameters for the PVLib clear sky model by comparing its labels to known clear sky labels from satellite data. It also discusses developing open-source software to analyze string-level I-V curves collected by Sandia National Labs to detect mismatching and extract IV parameters. The work aims to help researchers by providing data management, analytics and predictive modeling through a DuraMat Data Hub.
The document discusses using artificial intelligence (AI) to accelerate materials innovation for clean energy applications. It outlines six elements needed for a Materials Acceleration Platform: 1) automated experimentation, 2) AI for materials discovery, 3) modular robotics for synthesis and characterization, 4) computational methods for inverse design, 5) bridging simulation length and time scales, and 6) data infrastructure. Examples of opportunities include using AI to bridge simulation scales, assist complex measurements, and enable automated materials design. The document argues that a cohesive infrastructure is needed to make effective use of AI, data, computation, and experiments for materials science.
TMS workshop on machine learning in materials science: Intro to deep learning..., by BrianDeCost
This presentation is intended as a high-level introduction to deep learning and its applications in materials science. The intended audience is materials scientists and engineers.
Disclaimers: the second half of this presentation is intended as a broad overview of deep learning applications in materials science; due to time limitations it is not intended to be comprehensive. As a review of the field, this necessarily includes work that is not my own. If my own name is not included explicitly in the reference at the bottom of a slide, I was not involved in that work.
Any mention of commercial products in this presentation is for information only; it does not imply recommendation or endorsement by NIST.
Graph Centric Analysis of Road Network Patterns for CBD’s of Metropolitan Cit..., by Punit Sharnagat
OSMnx is a Python package to retrieve, model, analyze, and visualize street networks from OpenStreetMap.
OpenStreetMap (OSM) is a collaborative mapping project that provides a free and publicly editable map of the world.
OpenStreetMap provides a valuable crowd-sourced database of raw geospatial data for constructing models of urban street networks for scientific analysis.
The Status of ML Algorithms for Structure-property Relationships Using Matb..., by Anubhav Jain
The document discusses the development of Matbench, a standardized benchmark for evaluating machine learning algorithms for materials property prediction. Matbench includes 13 standardized datasets covering a variety of materials prediction tasks. It employs a nested cross-validation procedure to evaluate algorithms and ranks submissions on an online leaderboard. This allows for reproducible evaluation and comparison of different algorithms. Matbench has provided insights into which algorithm types work best for certain prediction problems and has helped measure overall progress in the field. Future work aims to expand Matbench with more diverse datasets and evaluation procedures to better represent real-world materials design challenges.
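The nested cross-validation procedure at the heart of this protocol can be sketched generically. This is a stdlib-only illustration of the idea, not Matbench's code: the toy one-parameter model, the negative-MSE scorer, and the fold counts are all assumptions. The key property is that hyperparameters are chosen only on inner folds of the training data, so the outer test folds give an unbiased estimate.

```python
import random
import statistics

def k_folds(n, k):
    """Split indices 0..n-1 into k disjoint folds."""
    idx = list(range(n))
    random.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nested_cv(X, y, fit, score, params, k_outer=5, k_inner=3):
    """Outer folds estimate generalization error; inner folds select
    hyperparameters, so test data never influences model selection."""
    outer_scores = []
    for test in k_folds(len(X), k_outer):
        test_set = set(test)
        train = [i for i in range(len(X)) if i not in test_set]
        def inner_score(p):
            scores = []
            for val in k_folds(len(train), k_inner):
                val_set = set(val)
                tr = [train[i] for i in range(len(train)) if i not in val_set]
                va = [train[i] for i in val]
                model = fit([X[i] for i in tr], [y[i] for i in tr], p)
                scores.append(score(model, [X[i] for i in va], [y[i] for i in va]))
            return statistics.mean(scores)
        best = max(params, key=inner_score)  # chosen on training data only
        model = fit([X[i] for i in train], [y[i] for i in train], best)
        outer_scores.append(score(model, [X[i] for i in test], [y[i] for i in test]))
    return statistics.mean(outer_scores)

# Toy usage: a hypothetical one-parameter model y_hat = p * x, scored by
# negative mean squared error; the data satisfy y = 2x, so p = 2.0 wins.
random.seed(1)
X = [[random.random()] for _ in range(60)]
y = [x[0] * 2.0 for x in X]
fit = lambda Xtr, ytr, p: (lambda x: p * x[0])
score = lambda m, Xte, yte: -statistics.mean((m(x) - t) ** 2 for x, t in zip(Xte, yte))
err = nested_cv(X, y, fit, score, params=[0.5, 1.0, 2.0])
print(err)
```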
Automating materials science workflows with pymatgen, FireWorks, and atomate, by Anubhav Jain
FireWorks is a workflow management system that allows researchers to define and execute complex computational materials science workflows on local or remote computing resources in an automated manner. It provides features such as error detection and recovery, job scheduling, provenance tracking, and remote file access. The atomate library builds on FireWorks to provide a high-level interface for common materials simulation procedures like structure optimization, band structure calculation, and property prediction using popular codes like VASP. Together, these tools aim to make high-throughput computational materials discovery and design more accessible to researchers.
The document provides an overview of materials informatics and the Materials Genome Initiative. It discusses how materials informatics uses data-driven approaches and techniques from fields like signal processing, machine learning and statistics to generate structure-property-processing linkages from materials science data and improve understanding of materials behavior. This includes extracting features from materials microstructure, using statistical analysis and data mining to discover relationships and create predictive models, and evaluating how knowledge has improved.
This document compares different heuristic search methods for optimizing traffic signal timing, in terms of solution quality (optimality) and computation time (run time). It finds that simulated annealing and genetic algorithms achieved near-optimal solutions with similar effectiveness. Within genetic algorithms, tournament and roulette wheel selection methods performed similarly. Tabu search did not provide significant benefits over other methods. Weaker search methods like hill-climbing aborted optimization early along the optimality-versus-run-time trajectory. Parameters like mutation rates and annealing schedules affected search performance and should be carefully selected.
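A minimal simulated-annealing loop illustrates both the method and why the annealing schedule matters to results like those above. The one-dimensional "delay" function, the optimum at 0.6, and every parameter here are hypothetical stand-ins, not the study's traffic model.

```python
import math
import random

def simulated_annealing(cost, x0, neighbor, t0=1.0, cooling=0.995, steps=5000):
    """Accept worse moves with probability exp(-delta/T); T decays each
    step, so the search explores broadly early and settles late. The
    cooling rate is exactly the kind of parameter the study found must
    be selected carefully."""
    x, fx = x0, cost(x0)
    best, fbest = x, fx
    t = t0
    for _ in range(steps):
        y = neighbor(x)
        fy = cost(y)
        if fy < fx or random.random() < math.exp(-(fy - fx) / t):
            x, fx = y, fy
            if fx < fbest:
                best, fbest = x, fx
        t *= cooling
    return best, fbest

# Hypothetical 'signal timing' problem: choose a green split g in [0, 1];
# delay is a convex bowl with its minimum at g = 0.6.
random.seed(2)
cost = lambda g: (g - 0.6) ** 2
step = lambda g: min(1.0, max(0.0, g + random.gauss(0, 0.05)))
x, fx = simulated_annealing(cost, x0=random.random(), neighbor=step)
print(x, fx)
```

Raising `cooling` toward 1 keeps the search exploratory longer; lowering it makes the run behave more like hill-climbing, which the comparison above found aborts optimization early.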
Automated Machine Learning Applied to Diverse Materials Design Problems, by Anubhav Jain
Anubhav Jain presented on developing standardized benchmark datasets and algorithms for automated machine learning in materials science. Matbench provides a diverse set of materials design problems for evaluating ML algorithms, including classification and regression tasks of varying sizes from experiments and DFT. Automatminer is a "black box" ML algorithm that uses genetic algorithms to automatically generate features, select models, and tune hyperparameters on a given dataset, performing comparably to specialized literature methods on small datasets but less well on large datasets. Standardized evaluations can help accelerate progress in automated ML for materials design.
ICML 2018 included papers on generative models, music and audio applications, and AI security. On generative models, papers explored topics like learning many-to-many mappings between domains, joint distribution learning, and reducing amortization gaps in VAEs. In music, works examined hierarchical latent space models for music structure and style transfer for speech synthesis. Regarding security, studies analyzed adversarial attacks across domains, the threat of adversarial examples, and circumventing defenses through obfuscated gradients.
1. Materials Informatics uses Python tools like RDKit for analyzing molecular structures and properties.
2. ORGAN and MolGAN are two generative models that use GANs to generate novel molecular structures based on SMILES strings, with ORGAN incorporating reinforcement learning to optimize for desired properties.
3. Tools like RDKit enable analyzing molecular fingerprints and descriptors that can be used for machine learning applications in materials informatics.
PhD defense presentation of Dominik Kowald: Modeling Activation Processes in Human Memory to Improve Tag Recommendations. Presented at Know-Center / Graz University of Technology (Austria)
Materials discovery through theory, computation, and machine learning, by Anubhav Jain
The document discusses using theory, computation, and machine learning to discover new materials. It summarizes that density functional theory (DFT) can model material properties from first principles, and how DFT calculations have been automated and run on supercomputers to enable high-throughput screening of materials. Examples are given of computations predicting new materials that were later experimentally confirmed, like sidorenkite cathodes for sodium ion batteries. Related projects are outlined like the open-source Materials Project database of DFT data on over 85,000 materials and software libraries to support high-throughput computation and materials science. Text mining of scientific literature is also discussed to help predict new materials in advance.
The document proposes a hybrid approach to estimating biophysical parameters from remote sensing data that combines a theoretical forward model with available reference samples. It aims to improve both accuracy and robustness of estimates. The approach formulates the estimation problem and characterizes the deviation between model outputs and observations using reference samples. An experimental analysis applies the approach to soil moisture estimation using microwave data, demonstrating improved performance over solely using the theoretical model.
While much of the recent literature in spatial statistics has evolved around addressing the big data issue, practical implementations of these methods on high performance computing systems for truly large data are still rare. We discuss our explorations in this area at the National Center for Atmospheric Research for a range of applications, which can benefit from large scale computing infrastructure. These applications include extreme value analysis, approximate spatial methods, spatial localization methods and statistically-based data compression and are implemented in different programming languages. We will focus on timing results and practical considerations, such as speed vs. memory trade-offs, limits of scaling and ease of use.
HYBRID GENETIC ALGORITHM FOR BI-CRITERIA MULTIPROCESSOR TASK SCHEDULING WITH ..., by aciijournal
The present work considers the minimization of a bi-criteria objective combining a weighted sum of makespan and total completion time for a multiprocessor task scheduling problem. Genetic algorithms are an appealing choice for NP-hard problems such as multiprocessor task scheduling, but their performance depends on the quality of the initial solution: a good initial solution yields better results. Hybrid genetic algorithms (HGAs) based on different list scheduling heuristics have therefore been proposed and developed for the problem. A computational analysis using a defined performance index was conducted on standard task scheduling problems to evaluate the proposed HGAs. The analysis shows that ETF-GA is the most efficient of the heuristic-based hybrid genetic algorithms in terms of solution quality, especially for large and complex problems.
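The heuristic-seeding idea behind such HGAs can be sketched as follows. This is a generic illustration, not the paper's algorithm: the LPT (longest processing time) list heuristic stands in for heuristics like ETF, the objective is plain makespan rather than the bi-criteria function, and the instance, operators, and parameters are illustrative choices.

```python
import random

def makespan(assign, times, m):
    """Makespan of an assignment: max load over m machines."""
    load = [0.0] * m
    for task, machine in enumerate(assign):
        load[machine] += times[task]
    return max(load)

def lpt_seed(times, m):
    """Longest-Processing-Time list heuristic: place the longest remaining
    task on the least-loaded machine. Serves as the good initial solution
    a hybrid GA is seeded with."""
    assign = [0] * len(times)
    load = [0.0] * m
    for task in sorted(range(len(times)), key=lambda t: -times[t]):
        machine = min(range(m), key=lambda j: load[j])
        assign[task] = machine
        load[machine] += times[task]
    return assign

def hybrid_ga(times, m, pop_size=30, gens=100):
    random.seed(3)
    n = len(times)
    # Seed the population with the heuristic solution plus random ones.
    pop = [lpt_seed(times, m)] + \
          [[random.randrange(m) for _ in range(n)] for _ in range(pop_size - 1)]
    for _ in range(gens):
        pop.sort(key=lambda a: makespan(a, times, m))
        survivors = pop[: pop_size // 2]          # elitist selection
        children = []
        while len(survivors) + len(children) < pop_size:
            p, q = random.sample(survivors, 2)
            cut = random.randrange(1, n)
            child = p[:cut] + q[cut:]             # one-point crossover
            if random.random() < 0.2:             # mutation
                child[random.randrange(n)] = random.randrange(m)
            children.append(child)
        pop = survivors + children
    return min(makespan(a, times, m) for a in pop)

times = [7, 7, 6, 6, 5, 5, 4, 4, 4]  # total work 48 on 3 machines: bound 16
print(hybrid_ga(times, m=3))
```

On this small instance the LPT seed already reaches the load-balance lower bound, which is precisely the point: a strong heuristic start lets the GA spend its generations refining rather than recovering from random initial assignments.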
Open Source Tools for Materials Informatics, by Anubhav Jain
This document discusses open source tools for materials informatics, including Matminer and Matscholar. Matminer is a library of descriptors for materials science data that can generate features for machine learning models. It includes over 60 featurizer classes and supports scikit-learn. Matscholar applies natural language processing to over 2 million materials science abstracts to extract keywords and enable improved literature searching. The document argues that open datasets like Matbench and automated tools like Automatminer could help lower barriers for developing machine learning models in materials science by making it easier to obtain training data and evaluate model performance.
2D/3D Materials screening and genetic algorithm with ML model, by aimsnist
JARVIS-ML provides concise summaries of materials properties using machine learning models trained on the extensive data in the JARVIS repositories. It has developed regression and classification models that can predict formation energies, bandgaps, and other material properties in seconds, much faster than traditional DFT calculations. The models use gradient boosting decision trees and feature importance analysis to provide explanations. JARVIS-ML is available as a public web app and API for rapid screening and discovery of new materials.
IRJET - Object Detection using Deep Learning with OpenCV and Python, by IRJET Journal
This document summarizes research on object detection techniques using deep learning. It discusses using the YOLO algorithm to identify objects in images using a single neural network that predicts bounding boxes and class probabilities. The document reviews prior research on algorithms like R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN and RetinaNet. It then describes the YOLO loss function and methodology for finding bounding boxes of objects in an image. The document concludes that YOLO is well-suited for real-time object detection applications due to its advantages over other algorithms.
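One concrete building block shared by YOLO and the region-based detectors above is intersection-over-union (IoU), the overlap score used to match predicted bounding boxes to ground truth and to suppress duplicate detections. A small self-contained sketch (the corner-coordinate box format and the example boxes are illustrative):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as
    (x1, y1, x2, y2). Returns a value in [0, 1]; 0 means no overlap."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

# Two 2x2 boxes overlapping in a 1x1 square: intersection 1, union 7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ~= 0.1429
```

In non-maximum suppression, a predicted box is discarded when its IoU with a higher-confidence box of the same class exceeds a threshold.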
An application of genetic algorithms to time cost-quality trade-off in constr..., by Alexander Decker
This document summarizes a research paper that develops an optimization model using genetic algorithms to solve the time-cost-quality trade-off problem in construction projects. The model aims to find the minimum cost for a construction project to meet certain quality levels within a given time limit. It does this by considering different activity execution modes and using genetic algorithms to efficiently explore the large solution space. The document provides background on optimization problems and techniques, an overview of the time-cost-quality trade-off problem and prior related research, and describes the objectives and approach of the developed genetic algorithms model.
A literature survey of benchmark functions for global optimisation problems, by Xin-She Yang
The document summarizes a literature survey of 175 benchmark functions for validating global optimization algorithms. The functions have diverse properties like modality, separability, and landscape features to provide a robust test. This set of benchmark functions is the most comprehensive collection to date and can be used to thoroughly evaluate new optimization algorithms.
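As a concrete example of the kind of function such collections contain, the Ackley function is a standard multimodal benchmark: its many local minima punish greedy search, while its single global minimum at the origin makes success easy to verify. The two-dimensional test point below is illustrative.

```python
import math

def ackley(x):
    """Ackley benchmark function. Highly multimodal, with a single
    global minimum f(0, ..., 0) = 0."""
    n = len(x)
    s1 = sum(xi * xi for xi in x) / n                       # mean square
    s2 = sum(math.cos(2 * math.pi * xi) for xi in x) / n    # mean cosine
    return -20 * math.exp(-0.2 * math.sqrt(s1)) - math.exp(s2) + 20 + math.e

print(round(ackley([0.0, 0.0]), 12))  # 0.0 at the global optimum
print(ackley([1.5, -0.5]) > 0)        # True: any other point is worse
```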
Automated Machine Learning Applied to Diverse Materials Design Problems
Anubhav Jain presented on developing standardized benchmark datasets and algorithms for automated machine learning in materials science. Matbench provides a diverse set of materials design problems for evaluating ML algorithms, including classification and regression tasks of varying sizes from experiments and DFT. Automatminer is a "black box" ML algorithm that uses genetic algorithms to automatically generate features, select models, and tune hyperparameters on a given dataset, performing comparably to specialized literature methods on small datasets but less well on large datasets. Standardized evaluations can help accelerate progress in automated ML for materials design.
ICML 2018 included papers on generative models, music and audio applications, and AI security. On generative models, papers explored topics like learning many-to-many mappings between domains, joint distribution learning, and reducing amortization gaps in VAEs. In music, works examined hierarchical latent space models for music structure and style transfer for speech synthesis. Regarding security, studies analyzed adversarial attacks across domains, the threat of adversarial examples, and circumventing defenses through obfuscated gradients.
1. Materials Informatics uses Python tools like RDKit for analyzing molecular structures and properties.
2. ORGAN and MolGAN are two generative models that use GANs to generate novel molecular structures based on SMILES strings, with ORGAN incorporating reinforcement learning to optimize for desired properties.
3. Tools like RDKit enable analyzing molecular fingerprints and descriptors that can be used for machine learning applications in materials informatics.
PhD defense presentation of Dominik Kowald: Modeling Activation Processes in Human Memory to Improve Tag Recommendations. Presented at Know-Center / Graz University of Technology (Austria)
Materials discovery through theory, computation, and machine learningAnubhav Jain
The document discusses using theory, computation, and machine learning to discover new materials. It summarizes that density functional theory (DFT) can model material properties from first principles, and how DFT calculations have been automated and run on supercomputers to enable high-throughput screening of materials. Examples are given of computations predicting new materials that were later experimentally confirmed, like sidorenkite cathodes for sodium ion batteries. Related projects are outlined like the open-source Materials Project database of DFT data on over 85,000 materials and software libraries to support high-throughput computation and materials science. Text mining of scientific literature is also discussed to help predict new materials in advance.
The document proposes a hybrid approach to estimating biophysical parameters from remote sensing data that combines a theoretical forward model with available reference samples. It aims to improve both accuracy and robustness of estimates. The approach formulates the estimation problem and characterizes the deviation between model outputs and observations using reference samples. An experimental analysis applies the approach to soil moisture estimation using microwave data, demonstrating improved performance over solely using the theoretical model.
While much of the recent literature in spatial statistics has evolved around addressing the big data issue, practical implementations of these methods on high performance computing systems for truly large data are still rare. We discuss our explorations in this area at the National Center for Atmospheric Research for a range of applications, which can benefit from large scale computing infrastructure. These applications include extreme value analysis, approximate spatial methods, spatial localization methods and statistically-based data compression and are implemented in different programming languages. We will focus on timing results and practical considerations, such as speed vs. memory trade-offs, limits of scaling and ease of use.
HYBRID GENETIC ALGORITHM FOR BI-CRITERIA MULTIPROCESSOR TASK SCHEDULING WITH ...aciijournal
Present work considers the minimization of the bi-criteria function including weighted sum of makespan and total completion time for a Multiprocessor task scheduling problem.Genetic algorithm is the most
appealing choice for the different NP hard problems including multiprocessor task scheduling.
Performance of genetic algorithm depends on the quality of initial solution as good initial solution provides the better results. Different list scheduling heuristics based hybrid genetic algorithms (HGAs) have been
proposed and developedfor the problem. Computational analysis with the help of defined performance
index has been conducted on the standard task scheduling problems for evaluating the performance of the
proposed HGAs. The analysis shows that the ETF-GA is quite efficient and best among the other heuristic based hybrid genetic algorithms in terms of solution quality especially for large and complex problems.
Open Source Tools for Materials InformaticsAnubhav Jain
This document discusses open source tools for materials informatics, including Matminer and Matscholar. Matminer is a library of descriptors for materials science data that can generate features for machine learning models. It includes over 60 featurizer classes and supports scikit-learn. Matscholar applies natural language processing to over 2 million materials science abstracts to extract keywords and enable improved literature searching. The document argues that open datasets like Matbench and automated tools like Automatminer could help lower barriers for developing machine learning models in materials science by making it easier to obtain training data and evaluate model performance.
2D/3D Materials screening and genetic algorithm with ML modelaimsnist
JARVIS-ML provides concise summaries of materials properties using machine learning models trained on the extensive data in the JARVIS repositories. It has developed regression and classification models that can predict formation energies, bandgaps, and other material properties in seconds, much faster than traditional DFT calculations. The models use gradient boosting decision trees and feature importance analysis to provide explanations. JARVIS-ML is available as a public web app and API for rapid screening and discovery of new materials.
IRJET - Object Detection using Deep Learning with OpenCV and PythonIRJET Journal
This document summarizes research on object detection techniques using deep learning. It discusses using the YOLO algorithm to identify objects in images using a single neural network that predicts bounding boxes and class probabilities. The document reviews prior research on algorithms like R-CNN, Fast R-CNN, Faster R-CNN, Mask R-CNN and RetinaNet. It then describes the YOLO loss function and methodology for finding bounding boxes of objects in an image. The document concludes that YOLO is well-suited for real-time object detection applications due to its advantages over other algorithms.
An application of genetic algorithms to time cost-quality trade-off in constr...Alexander Decker
This document summarizes a research paper that develops an optimization model using genetic algorithms to solve the time-cost-quality trade-off problem in construction projects. The model aims to find the minimum cost for a construction project to meet certain quality levels within a given time limit. It does this by considering different activity execution modes and using genetic algorithms to efficiently explore the large solution space. The document provides background on optimization problems and techniques, an overview of the time-cost-quality trade-off problem and prior related research, and describes the objectives and approach of the developed genetic algorithms model.
A literature survey of benchmark functions for global optimisation problemsXin-She Yang
The document summarizes a literature survey of 175 benchmark functions for validating global optimization algorithms. The functions have diverse properties like modality, separability, and landscape features to provide a robust test. This set of benchmark functions is the most comprehensive collection to date and can be used to thoroughly evaluate new optimization algorithms.
This document describes a hybrid approach combining scatter search and simulated annealing to solve multi-objective optimization problems. The approach generates an initial population of solutions using a diversification method. It then uses simulated annealing as an improvement method to enhance solutions. Solutions are added to a reference set based on quality and diversity. A subset generation method operates on the reference set to produce combined solutions. The combination method then transforms subsets into new combined solutions. The approach was tested on benchmark problems and found to perform well.
Computational optimization, modelling and simulation: Recent advances and ove...Xin-She Yang
This document summarizes recent advances in computational optimization, modeling, and simulation. It discusses how optimization is important for engineering design and industrial applications to maximize profits and minimize costs. Metaheuristic algorithms and surrogate-based optimization techniques are becoming widely used for complex optimization problems. The workshop accepted papers that applied optimization, modeling, and simulation to diverse areas like production planning, mixed-integer programming, electromagnetics, and reliability analysis. Overall computational optimization and modeling have broad applications and continued research is needed in areas like metaheuristic convergence and surrogate modeling methods.
Computational Optimization, Modelling and Simulation: Recent Trends and Chall...Xin-She Yang
This document summarizes recent trends and challenges in computational optimization, modeling and simulation. It discusses how nature-inspired algorithms and surrogate modeling have become popular approaches. However, challenges remain around theoretical understanding of algorithms, solving large-scale problems, and constructing accurate yet efficient surrogate models. The document also reviews papers presented at a workshop on these topics, which demonstrate diverse applications in engineering. Open questions are identified regarding improving algorithm performance, developing more intelligent algorithms, and determining best practices for specific problems.
This curriculum vitae summarizes Maxim Sviridenko's professional experience and qualifications. He currently works as a Principal Research Scientist at Yahoo! Labs, and has previously held professor and research positions at various universities and IBM. His areas of expertise include algorithms, optimization, and machine learning. He has published numerous papers in journals and conferences, supervised several students and postdocs, and received multiple awards and grants for his research work.
This document outlines the history and principles of value engineering. It discusses how value engineering seeks to balance cost, reliability, and performance. It describes the typical 8-step job plan process for conducting a value engineering study, including orientation, information gathering, functional analysis, creativity, evaluation, development, presentation, and implementation. Finally, it provides a case study example of applying value engineering to optimize the design of a focus adjustment knob for a slit lamp microscope. The redesign focused on changing the material and production process, resulting in a 38.64% cost savings.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
LNCS 5050 - Bilevel Optimization and Machine Learningbutest
This document discusses using bilevel optimization and machine learning techniques to improve model selection in machine learning problems. It proposes framing machine learning model selection as a bilevel optimization problem, where the inner level problems involve optimizing models on training data and the outer level problem selects hyperparameters to minimize error on test data. This bilevel framing allows for systematic optimization of hyperparameters and enables novel machine learning approaches. The document illustrates the approach for support vector regression, formulating model selection as a Stackelberg game and solving the resulting mathematical program with equilibrium constraints.
This document summarizes Daniel Burg's ABET portfolio containing work from various engineering courses aimed at meeting objectives related to engineering skills. It includes analyses of a truss bridge, cabin beam, fluid mechanics lab, autonomous robot design for a competition, thermodynamics group project, ethics presentation, electronics labs, and a presentation about an international exchange. The portfolio demonstrates the application of math, science, and engineering knowledge; performance of engineering analysis; experimental design skills; system design subject to constraints; effective teamwork; and communication abilities.
Resource Allocation Using Metaheuristic Searchcsandit
This document discusses using metaheuristic search techniques to solve resource allocation and scheduling problems that are common in software development projects. It evaluates the performance of three algorithms - simulated annealing, tabu search, and genetic algorithms - on test problems representative of resource constrained project scheduling problems (RCPSP). The experimental results found that all three metaheuristics can solve such problems effectively, with genetic algorithms performing slightly better overall than the other two techniques.
This presentation is about Value Engineering and contains:
1.History of VE
2.Value Concept
3.What is Value Engineering?
4.Implementation of VE in our project
5.Principle and Purpose of VE
6.Case Study
7.Conclusion
This document provides a review of optimization algorithms that have been used to solve job shop scheduling problems (JSSP). It first discusses how JSSPs are NP-hard combinatorial optimization problems that are difficult to solve exactly. It then reviews both traditional and non-traditional algorithms that have been applied to JSSPs, including mathematical programming approaches, heuristic construction methods, evolutionary algorithms like genetic algorithms, and local search methods like simulated annealing and tabu search. The document also discusses metaheuristic algorithms and provides a classification of different metaheuristics. Overall, the document aims to assess the various techniques that have been used to approach solving JSSPs.
SCHEDULING AND INSPECTION PLANNING IN SOFTWARE DEVELOPMENT PROJECTS USING MUL...ijseajournal
This document presents a multi-objective hyper-heuristic evolutionary algorithm (MHypEA) for scheduling and inspection planning in software development projects. The MHypEA incorporates twelve low-level heuristics based on selection, crossover, and mutation operations of evolutionary algorithms. The algorithm selects heuristics based on reinforcement learning with adaptive weights. An experiment on randomly generated test problems found that MHypEA explores and exploits the search space thoroughly to find high quality solutions, achieving better results than other multi-objective evolutionary algorithms in half the time.
Lecture on “Aerodynamic design of Aircraft” in University of Tokyo 21st December, 2015. Optimization techniques, data-visualization and their applications are inclusive.
This document reviews applications of evolutionary multiobjective optimization (EMO) techniques in production research. It summarizes EMO applications in several areas of production research, including scheduling, production planning and control, cellular manufacturing, flexible manufacturing systems, and assembly-line optimization. The review finds that EMO techniques have been successfully applied to optimization problems in these areas and provide a number of non-dominated solutions. However, future research opportunities remain, such as improved integration of EMO with other metaheuristics and consideration of additional objectives.
A Machine learning approach to classify a pair of sentence as duplicate or not.Pankaj Chandan Mohapatra
The team presented their machine learning project on predicting question pairs on Quora. They used logistic regression, random forest, and XGBoost models with manually engineered features like word count and word match. XGBoost performed best with an AUC score of 0.936. Key lessons were the importance of preprocessing, using words as features requires dimension reduction, and feature hashing improves scalability over storing vocabularies. Future work could experiment with convolutional neural networks for sentence similarity as proposed by H. Hua et al.
Qualitative and Quantitative Research Plans By Malik Muhammad MehranMalik Mughal
This document provides an overview of qualitative and quantitative research plans. It defines key terms like research plan, discusses the purposes and significance of research plans, and outlines the main components and steps in developing a research plan, including defining the problem, reviewing literature, developing hypotheses and methods, collecting and analyzing data, and communicating results. The document emphasizes that a good research plan provides structure, facilitates evaluation, and guides conducting a successful study within budget and timeline.
The paper presents a new language called UDITA for describing tests. UDITA is a Java-based language that includes non-deterministic choice operators and an interface for generating linked data structures. This allows for more efficient and effective test generation compared to previous approaches. The language aims to make test specification easier while generating tests that are faster, of higher quality, and less complex than traditional manually written or randomly generated tests.
1. The document summarizes the PhD thesis of Fouad KHARROUBI on solving the routing and wavelength assignment problem in WDM networks using random search algorithms.
2. It proposes a new mathematical formulation of the maximum routing and wavelength assignment problem and investigates four random search algorithms (ROA, GA, TSA, EP) to solve the problem.
3. A novel efficient Backtracking algorithm is also proposed to generate more possible lightpaths and improve the performance of the random search algorithms.
You will learn how data owners and API providers can document, secure data products on top of Confluent brokers, including schema validation, topic routing and message filtering.
You will also see how data and API consumers can discover and subscribe to products in a developer portal, as well as how they can integrate with Confluent topics through protocols like REST, Websockets, Server-sent Events and Webhooks.
Whether you want to monetize your real-time data, enable new integrations with partners, or provide self-service access to topics through various protocols, this webinar is for you!
Everything You Need to Know About X-Sign: The eSign Functionality of XfilesPr...XfilesPro
Wondering how X-Sign gained popularity in a quick time span? This eSign functionality of XfilesPro DocuPrime has many advancements to offer for Salesforce users. Explore them now!
Faster Evolutionary Multi-Objective Optimization via GALE: the Geometric Active Learner
1. Joseph Krall
In partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer Science.
College of Engineering and Mineral Resources
Faster Evolutionary Multi-Objective Optimization via GALE, the Geometric Active Learner
a Ph.D. Final Defense Presentation for the Lane Department of Computer Science and Electrical Engineering
Special Thanks to the NASA Ames Research Center
April 21, 2014
Estimated Duration: 45 minutes
2. 4/21/2014 Faster Multi-Objective Optimization via GALE
Last Time (November, 2013): A Thesis Proposal
- “JMOO: Tools for Faster Multi-Objective Optimization”
Comments from Committee
- Lacking Rigor
- Generalizability of Proposal
- Lacking Details / Misunderstandings
- Some Missing Related Works
- Validity Concerns
- Needed More – Not Substantial Enough
SE or CS?
1. Introduction
3. Final Dissertation
- “Faster Multi-Objective Optimization via GALE”
This Time (April, 2014): Key Changes from Proposal
- Focus on Contributions of GALE
- Focus on Assessing and Validating GALE
- Very rigorous experimental methodology
- Addressing Comments from Proposal
- Expansive Related Works
- Formalizing the Field
- MANY more experimental results
Spring! …Sort of
1. Introduction
4. Search & Optimization of Goals
- the art of decision making
- e.g. shortest-time city navigation
- e.g. managing calorie intake for diets
Not always trivial
- Landing an airplane safely
- Maximizing software project profits
MOO = Multi-Objective Optimization
- Draft solutions to a problem (red)
- Find Pareto Frontiers (green)
- Report to a decision maker
This Thesis
[Figure: rejected solutions vs. areas on the Pareto frontier; “Who do I pick???”]
1. Introduction
5. The Field of MOO: Increasing Interest
- Agile Project Studies; Aircraft Studies
- Software Engineering (SE) and General MOO: 8000 papers since the 1950s
- In this thesis: SE and CS
* Data from:
(MOO) Coello: http://delta.cs.cinvestav.mx/˜ccoello/EMOO/EMOObib.html
(SE) CREST: http://crestweb.cs.ucl.ac.uk/resources/sbse_repository/repository.html
1. Introduction
6. Main Message
[Sayyad & Ammar 2013] Report:
- NSGA-II and SPEA2 are the most popular search tools today
Popular Search Tools Evaluate Too Much
- O(N^2) internal search: fast if solution evaluation is a cheap operation
- Need to count number of evaluations instead: O(2NG)
This Thesis Proposes GALE: O(2 log2(NG))
- GALE adds data mining to evaluate only the most-informative solutions
GALE: 597s vs. NSGA-II: 14,018s
N = population size; G = number of generations
1. Introduction
7. Applications of MOO
Aircraft Studies for Safety Assurance
- Complex Simulations at NASA [8 seconds per run]
Standard MOO Tools
- Many [300] weeks (50,400 hrs)
GALE
- Many [300] hours (1.8 wks)
* Asiana Flight Wreckage, Summer 2013
1. Introduction
8. Assessing GALE
GALE is a Meta-heuristic Search Tool
- Too difficult (maybe impossible) to “prove”
- Can only be assessed experimentally
-> Generalizability (External Validity) concerns
-> A MOO Critique to Improve Validity
Research Questions
- Evaluations
- Runtime
- Solution Quality
4 Experimental Areas:
- #1 Aircraft Safety (CDA)
- #2 Agile Projects (POM3)
- #3 Constrained Lab Problems
- #4 Unconstrained Lab Problems
SE or CS? SE, CS, CS, CS
1. Introduction
9. And The Results
GALE shown to be a strong rival to NSGA-II & SPEA2
- Two orders of magnitude fewer evaluations for all models
- Two orders of magnitude faster (seconds) for big models
- Better Solution Quality
- SPEA2 much slower
- GALE never worse; NSGA-II/SPEA2 never better
1. Introduction
10. Background (2)
In this chapter:
- Formalities
- Definitions
- Related Works
1. Introduction
2. Background
3. MOO Critique
4. GALE
5. Models
6. Experiments
7. Validity
8. Conclusion
10 Slides
11. Formalities
Mathematical Programming: [Dantzig]
- The aim is to find solutions that optimize objectives
- Transformation functions transform decisions (x) into objectives (y)
- Solutions are infeasible if they do not satisfy constraint functions
[Figure labels: objectives, constraint functions, optimality direction, transformation functions]
2. Background a. Defines
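In standard notation, the mathematical program sketched above can be written as follows (f are the transformation functions and g the constraint functions; the counts d, o, m of decisions, objectives, and constraints are added here only for clarity):

```latex
\begin{aligned}
\text{optimize} \quad & y = f(x) = \bigl(f_1(x), \ldots, f_o(x)\bigr) \\
\text{subject to} \quad & g_i(x) \le 0, \qquad i = 1, \ldots, m \\
& x = (x_1, \ldots, x_d) \in X
\end{aligned}
```

A solution x is infeasible if any constraint g_i is violated.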
12. Kinds of Models
Lab Problems
- Schaffer, Viennet, Tanaka, etc.
Real-world Problems
- Simulations
- Too complex for math
- Aircraft Safety
- Software Dev. Profit
The Schaffer Model
2. Background a. Defines
13. Numerical Optimization
Early methods assumed math models
- A bad assumption for real-world practicality
They also assume other aspects:
- Concave vs. Convex
- Differentiability
- Linear vs. Non-linear
- Single vs. Multi-objective
- Objective Functions vs. Simulation
2. Background b. Early Methods
14. Simplex Search
Exterior Search [Dantzig]
- For Linear problems ([Nelder & Mead 1965] made a non-linear version)
- Embed a simplex with solutions along the vertices
- Traverse along the nodes
- Good average complexity
- But bad O(N^3) worst case
Nelder, John A.; R. Mead (1965). "A simplex method for function minimization". Computer Journal 7: 308–313.
2. Background b. Early Methods
15. Interior Point Methods
Karmarkar’s Algorithm [Karmarkar 1984]
- Good for big data
- Fast convergence
- Polynomial complexity
- 50x faster than Simplex
- Single-Objective Only
- Requires Concavity
Narendra Karmarkar (1984). "A New Polynomial Time Algorithm for Linear Programming", Combinatorica, Vol 4, nr. 4, p. 373–395.
2. Background b. Early Methods
16. Heuristic-based Searches
Moving onward from Numerical Methods
- Improve a heuristic, not the actual objectives
- Hill Climbing: accept only improved steps
- Tabu Search: refuse only recently attempted steps
- Simulated Annealing: early bad steps okay, late bad steps refused
2. Background c. Recent Methods
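The three acceptance rules above differ only in when a step is taken. Simulated annealing's rule, for instance, can be sketched as follows (a minimal sketch: the linear cooling schedule and the neighbor function are illustrative assumptions, not the thesis's):

```python
import math
import random

def anneal(fitness, neighbor, x0, kmax=1000):
    """Simulated annealing: early on, worse steps may be accepted
    (to escape local optima); as the temperature cools, worse steps
    are refused. Returns the best solution seen."""
    x = best = x0
    for k in range(kmax):
        t = 1.0 - k / kmax          # temperature cools from 1 toward 0
        x2 = neighbor(x)
        delta = fitness(x2) - fitness(x)
        # Accept improvements always; accept a worse step with
        # probability exp(-delta / t), which shrinks as t drops.
        if delta < 0 or random.random() < math.exp(-delta / max(t, 1e-9)):
            x = x2
        if fitness(x) < fitness(best):
            best = x
    return best
```

For example, minimizing (x - 3)^2 from x = 0 with small random steps converges near 3.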
17. PSO & ACO
Particle Swarm Optimization [Kennedy 1995]
- Inspired by real-life swarms: flocks of birds, etc.
- Swarm towards good solutions
- Self best and Pack best
Ant Colony Optimization [Dorigo 1992]
- Ant Colony Path Searches
- Pheromone density = best path
Kennedy, J.; Eberhart, R. (1995). "Particle Swarm Optimization". Proceedings of IEEE International Conference on Neural Networks IV. pp. 1942–1948.
M. Dorigo, Optimization, Learning and Natural Algorithms, PhD thesis, Politecnico di Milano, Italy, 1992.
2. Background c. Recent Methods
18. Evolutionary Algorithms
Standard EA (Evolutionary Algorithm):
1) Build initial population
2) Repeat for max_generations:
a) crossover
b) mutation
c) select
3) Return final population
a+b) Build Offspring: perturb Population
c) Combine Offspring + Population; cull the worst solutions to retain Population Size
* Malin Åberg: http://physiol.gu.se/maberg/images.html
2. Background c. Recent Methods
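The standard EA loop above can be sketched in Python (a single-objective sketch for illustration; the fitness function, decision bounds, and mutation rate are illustrative assumptions):

```python
import random

def evolve(fitness, n_decisions, pop_size=100, max_generations=50,
           lo=0.0, hi=1.0, mutation_rate=0.1):
    """Standard EA: build, then repeat crossover + mutation + selection."""
    # 1) Build initial population of random candidates.
    pop = [[random.uniform(lo, hi) for _ in range(n_decisions)]
           for _ in range(pop_size)]
    for _ in range(max_generations):
        offspring = []
        for _ in range(pop_size):
            # 2a) Crossover: one-point crossover of two random parents.
            a, b = random.sample(pop, 2)
            cut = random.randrange(1, n_decisions) if n_decisions > 1 else 0
            child = a[:cut] + b[cut:]
            # 2b) Mutation: randomly perturb some decisions.
            child = [random.uniform(lo, hi) if random.random() < mutation_rate
                     else x for x in child]
            offspring.append(child)
        # 2c) Select: combine offspring + population, cull the worst.
        pop = sorted(pop + offspring, key=fitness)[:pop_size]
    # 3) Return final population (best first, since minimizing).
    return pop
```

Note the `2c` step is the "combine then cull" selection described on the slide; NSGA-II and SPEA2 replace the simple sort with dominance-based sorting.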
19. NSGA-II
NSGA-II [Deb 2002]
- Non-dominated Sorting Genetic Algorithm
- Standard select+crossover+mutation
- Sort by ‘bands’, or domination ‘depth’
- Break ties based on density (crowding distance)
Deb, K.; Pratap, A.; Agarwal, S.; Meyarivan, T. (2002). "A fast and elitist multiobjective genetic algorithm: NSGA-II". IEEE Transactions on Evolutionary Computation 6 (2): 182–197.
2. Background c. Recent Methods
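The domination relation that both NSGA-II and SPEA2 sort by can be sketched directly (minimization of every objective is assumed here; the sample points below are illustrative):

```python
def dominates(x, y):
    """x dominates y if x is no worse on every objective (minimizing)
    and strictly better on at least one."""
    return (all(a <= b for a, b in zip(x, y)) and
            any(a < b for a, b in zip(x, y)))

def nondominated(solutions):
    """Keep only solutions that no other solution dominates
    (i.e., the first 'band' of non-dominated sorting)."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]
```

For objective vectors [(1, 2), (2, 1), (2, 2), (3, 3)], the first band is [(1, 2), (2, 1)]: neither dominates the other, and each dominates the rest.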
20. SPEA2
SPEA2 [Zitzler 2001]
- Strength Pareto Evolutionary Algorithm
- Standard select+crossover+mutation
- Sort by ‘strength’: count of solutions someone dominates
- Truncate crowded solutions via nearest neighbor
E. Zitzler, M. Laumanns, and L. Thiele. SPEA2: Improving the strength pareto evolutionary algorithm for multiobjective optimization. Evolutionary Methods for Design Optimization and Control with Applications to Industrial Problems, 95–100, 2001.
2. Background c. Recent Methods
21. MOO Critique (3)
In this chapter:
- Survey
- Rigor
1. Introduction
2. Background
3. MOO Critique
4. GALE
5. Models
6. Experiments
7. Validity
8. Conclusion
4 Slides
22. Survey of MOO
Experimental Rigor
- Want to maximize validity
- Because there are reasons to doubt GALE:
- Does it still do well with few evaluations?
- Can it still run fast?
We looked at the literature for advice
- A search query targeted these questions
- Ended up selecting 21 papers
Statistical Methods?
- [Demsar 2006] recommends KS-Test + Friedman + Nemenyi
Population size?
- 20–100 is good. Over 200 is a waste.
Number of Repeats?
- [Harman 2012]: 30–50 is common. This Thesis: 20.
* J. Demsar, “Statistical comparisons of classifiers over multiple data sets,” J. Mach. Learn. Res., vol. 7, pp. 1–30, Dec. 2006.
* M. Harman et al., Search based software engineering: techniques, taxonomy, tutorial. In Empirical Software Engineering and Verification, Bertrand Meyer and Martin Nordio (Eds.). Springer-Verlag, Berlin, Heidelberg, 1–59.
3. MOO Critique
23. Principles 1 & 2
1. Use a variety of models
– Real World Models: Practicality.
– Standard Models: Reproducibility.
– Constrained and Unconstrained: Generalizability.
– Many papers used only lab models.
– In this thesis: 7 Constrained, 13 Unconstrained, 1 Privatized (CDA), 1 Public (POM3).
– Use models from all quadrants: Standard Models (Constrained Lab, Unconstrained Lab) and Real World Models (Public, Privatized).
2. How many Repeats
– Pragmatics: keep repeats low to save on computational cost
– Statistics: want high repeats for statistical stability
– The middle ground: for n in 20, 30, 40: no change. So 20 is good.
3. MOO Critique
24. Principles 3 & 4
3. Statistical Methods
– Based on Demsar’s Recommendations
– Begin with Kolmogorov-Smirnov (KS-Test) to test normality
• Data rarely conforms to normality assumptions
– For two-group testing, use Wilcoxon Rank Sum (WRS) Test
– For multi-group testing, use Friedman Test + Nemenyi
– Most papers failed to address number of groups
4. Runtimes
– Report runtimes to aid reproducibility arguments
– Report details of machine
– Half of the papers neglected to report runtimes
3. MOO Critique
25. Principles 5–7
5. Number of Evaluations
– Report number of evaluations
– Because they dominate runtime of real-world models
– Half of the papers neglected to report evaluations
6. Parameters
– Define all parameters carefully
– Reproducibility concerns: pop. size, #gens, stopping criteria
7. Discuss Threats to Validity
– Don’t make the reader do all the work
– Rigorous Experimental Methods = Stronger Conclusions
– Almost no one had a threats to validity section in their paper
3. MOO Critique
26. GALE (4)
In this chapter:
- Spectral Learning
- Active Learning
1. Introduction
2. Background
3. MOO Critique
4. GALE
5. Models
6. Experiments
7. Validity
8. Conclusion
5 Slides
27. Introducing GALE
GALE: Geometric Active Learning (Evolution)
- At most O(2 log2 N) evaluations per generation
- Exactly Θ(2N) evaluations for NSGA-II, SPEA2
Main Differences in GALE:
- cluster solutions
- evaluate some, not all
- directed vs. random mutation
- More on these later
Asymptotic notation: Big-O = worst case; Big-Theta = exact case
4. GALE
28. Components to GALE
Three key phrases:
1. Active Learning (evaluate some, not all)
- Minimize cost of evaluation
- Learn more from using less [Settles 2009]
2. Spectral Learning (WHERE; clustered spectrally)
- Reasoning with eigenvectors via covariance matrix
- “Spectral Clustering” via eigenvectors
- FastMap finds eigenvectors faster than PCA
3. Directed Search (directed mutation)
- Shove solutions along promising directions
4. GALE
29. GALE Pseudo-Code
Algorithm shown here and explained over the next several slides
- WHERE algorithm
- WHERE uses FastMap
- Directed Mutation
1. Build initial population, P0. Initialize generation: t = 0. Set Life = 3.
2. Repeat until stopping criteria is met (stop if life == 0):
a. Run WHERE (with pruning) to select Rt = dominant leaves from WHERE.
b. Perform Directed Mutation on members of Rt.
c. Copy Rt into Pt+1 and generate new random candidates until the new population is full.
d. Increment generation number: t = t + 1.
e. Collect stats and evaluate stopping criteria. Decrement life if no improvement to any objective.
3. Run WHERE (without pruning) to select Rt = dominant leaves from WHERE.
4. Rt contains approximations to the Pareto frontier.
GALE = Spectral Learning + Active Learning + Directed Search
4. GALE
30. Nystrom Method
Spectral clustering is O(n^3) [Kumar12]
- Common method: PCA
- The Nystrom Method reduces this to near-linear
- Low-rank approx. of covariance matrix
e.g.: FastMap is a Nystrom Algorithm [Platt05]
- 1) Pick an arbitrary point, z.
- 2) Let ‘east’ be the furthest point from z.
- 3) Let ‘west’ be the furthest point from ‘east’.
- 4) Project all points onto the line east-west.
- 5) east-west is the first principal component.
[Figure: a point x at distances a and b from east and west, projected onto the east-west line of length c]
Active Learning:
- Only evaluate East & West!
4. GALE
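The five FastMap steps above can be sketched as code (a minimal sketch using Euclidean distance over plain Python lists and the cosine rule for the projection):

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two decision vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def fastmap(points):
    """Project points onto an approximation of their first principal
    component: the line between the two most distant points."""
    # 1) Pick an arbitrary point z.
    z = random.choice(points)
    # 2) 'east' is the point furthest from z.
    east = max(points, key=lambda p: dist(z, p))
    # 3) 'west' is the point furthest from east.
    west = max(points, key=lambda p: dist(east, p))
    c = dist(east, west)
    # 4) Project each point x onto the east-west line via the cosine
    #    rule: projection = (a^2 + c^2 - b^2) / (2c), where a and b are
    #    x's distances to east and west.
    projections = []
    for x in points:
        a, b = dist(east, x), dist(west, x)
        projections.append((a * a + c * c - b * b) / (2 * c))
    # 5) The east-west line approximates the first principal component.
    return east, west, projections
```

The active-learning trick is that only east and west ever need evaluating on the objectives; every other candidate is positioned by distances alone.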
31. The WHERE Tool
WHERE = Spectral Learning in GALE
- Similar to Boley’s PDDP: find the first eigenvector and recursively split
- PDDP uses PCA. WHERE uses FastMap.
[Figure, left to right:]
- WHERE clusters the initial population = Spectral Learning
- Only evaluate the best (non-dominated) clusters = Active Learning
- Mutate along those clusters = Directed Search
- Refill the Population
At most 2 log2(NG) evaluations (N = population size, G = number of generations)
4. GALE
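The recursive descent WHERE performs can be sketched as follows (a self-contained sketch: the FastMap-style projection is repeated inline, the split is a median split on the projected axis, and the pruning of dominated halves is omitted):

```python
import math

def dist(a, b):
    """Euclidean distance between two decision vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def where(points, min_size=4):
    """Recursively split points into leaves by projecting each onto the
    east-west axis and dividing at the median projection."""
    if len(points) <= min_size:
        return [points]                       # a leaf cluster
    # FastMap-style poles: east is far from an arbitrary point,
    # west is far from east.
    east = max(points, key=lambda p: dist(points[0], p))
    west = max(points, key=lambda p: dist(east, p))
    c = dist(east, west) or 1.0               # guard against c == 0
    def proj(x):
        a, b = dist(east, x), dist(west, x)
        return (a * a + c * c - b * b) / (2 * c)
    ranked = sorted(points, key=proj)
    mid = len(ranked) // 2
    return where(ranked[:mid], min_size) + where(ranked[mid:], min_size)
```

Each level of the recursion evaluates only its two poles, which is how the logarithmic evaluation bound arises.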
32. Models (5)
In this chapter:
- CDA
- POM3
- Lab Models
1. Introduction
2. Background
3. MOO Critique
4. GALE
5. Models
6. Experiments
7. Validity
8. Conclusion
4 Slides
33. CDA Model
Continuous Descent Arrival
- NASA wants to know if CDA is doable
- Standard descents are less efficient than CDA
-> more {noise, time, fuel, $$$}
- CDA might unnecessarily strain air traffic control (ATC)
5. Models a. CDA
34. Building CDA
Lots of work
- 2 months at NASA Ames Research Center
- CDA was not pre-assembled
Inspiration from the 2013 Asiana Flight Crash
- Pilots had to do unusually many more tasks than normal
- Keeping airspeed nominal was a task they ‘forgot’
- Human Factors models give a pilot an ‘HTM’ = maximum human taskload
Goal of CDA: less forgetting, less time lost to delays and missed tasks
* based on Work Models that Compute by Pritchett, Kim and Feigh, 2011–2013
5. Models a. CDA
35. POM3
POM3
- Model of Agile Software Requirements Engineering
Agile Software Projects
- Programmers rush to complete tasks
- But which tasks get the most priority?
Requirements Prioritization Strategies
- Find good schemes that optimize objectives
Repeat 2 < N < 6 times:
1. Collect Tasks
2. Prioritize Tasks
3. Execute Tasks
4. Find New Tasks
5. Adjust Priorities
Objectives to Minimize
- Total Cost
- % Idle Rate of Teams
Objectives to Maximize
- % Completion of Tasks
* POM3 based on POM2, based on POM, by Portman, Owens, Menzies (2008, 2009)
5. Models b. POM3
36. Standard Lab Models
We explore all of these (e.g., the Constrex Model):
Unconstrained: Fonseca, Golinski, Kursawe, Poloni, Schaffer, Viennet2-3-4, ZDT1-2-3, ZDT4-6
Constrained: BNH, Constrex, Osyczka2, Srinivas, Tanaka, TwoBarTruss, Water
5. Models c. Lab
37. Experiments (6)
In this chapter:
- Results
- Analysis
1. Introduction
2. Background
3. MOO Critique
4. GALE
5. Models
6. Experiments
7. Validity
8. Conclusion
4 Slides
38. Experimental Methods
Research Questions:
- Number of Evaluations
- Runtime
- Quality of Solutions
4 Experiment Areas:
- #1 Aircraft Safety
- #2 Agile Software Development
- #3 Constrained Lab Models
- #4 Unconstrained Lab Models
Quality Score [Zitzler & Kunzli 2004]:
1. Run the Model 500 times
2. Collect an average-case baseline
3. Compute loss(x, baseline) for each solution x (o = number of objectives)
4. The median loss is the “Quality Score”
> 1.0: Loss in Quality from Baseline
= 1.0: No Change from Baseline
< 1.0: Improvement from Baseline
6. Experiments
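The loss function behind the quality score is a continuous-domination indicator in the style of [Zitzler & Kunzli 2004]. A sketch (the weight convention of -1 for minimized and +1 for maximized objectives is assumed here; the exact normalization used in the thesis may differ slightly):

```python
import math

def loss(x, y, weights):
    """Continuous-domination loss: how much is lost moving from
    objective vector x to objective vector y. weights[j] is -1 for a
    minimized objective, +1 for a maximized one. x 'dominates' y when
    loss(x, y) < loss(y, x). n here is 'o', the number of objectives."""
    n = len(x)
    return sum(-math.exp(w * (a - b) / n)
               for w, a, b in zip(weights, x, y)) / n
```

The quality score above is then the median of loss(x, baseline) over the returned frontier, normalized against the average-case baseline.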
39. RQ1: Number of Evaluations
Experiment | GALE | NSGA-II | SPEA2
#1 Aircraft Safety (CDA Model) | 50 +++ | 2800 = | 2450 =
#2 Agile Software (POM3 Models) | 36-46 +++ | 3000-3550 = | 3050-3300 =
#3 Constrained Lab Models | 28-88 +++ | 1050-3250 = | 950-3150 =
#4 Unconstrained Lab Models | 26-45 +++ | 1250-3550 = | 1250-3250 =
GALE needed two orders of magnitude fewer evaluations.
6. Experiments
40. RQ2: Runtime
Experiment | GALE | NSGA-II | SPEA2
#1 Aircraft Safety (CDA Model) | 6–20 mins +++ | 3–5 hrs = | 3–5 hrs =
#2 Agile Software (POM3 Models) | 1.5–9.5 s ++ | 4.0–108 s = | 12–109 s =
#3 Constrained Lab Models | 0.5–1.5 s = | 0.5–1.0 s = | 3–30 s –
#4 Unconstrained Lab Models | 0.5–2.5 s = | 0.5–1.0 s = | 3–30 s –
#5 16 Modes of the CDA Model | 83 hours | 6 months | 6 months
GALE needed two orders of magnitude less runtime.
GALE enabled an even larger study on CDA. NSGA-II and SPEA2 were not used in #5, so those values were extrapolated from #1.
6. Experiments
41. RQ3: Solution Quality
Cells are in ‘Wins-Losses-Ties’ format. KS-Test + Friedman + Nemenyi at the 99% level.
Experiment | GALE | NSGA-II | SPEA2
#1 Aircraft Safety (CDA Model) | 0-0-2 = | 0-0-2 = | 0-0-2 =
#2 Agile Software (POM3 Models) | 0-0-6 = | 0-1-5 = | 1-0-5 =
#3 Constrained Lab Models | 12-0-2 + | 0-6-8 = | 0-6-8 =
#4 Unconstrained Lab Models | 10-3-13 + | 1-5-20 = | 2-5-19 =
GALE never loses. GALE usually wins.
6. Experiments
42. Threats to Validity (7)
In this chapter:
- Validity
1. Introduction
2. Background
3. MOO Critique
4. GALE
5. Models
6. Experiments
7. Validity
8. Conclusion
1 Slide
43. Threats to Validity
Most threats were already addressed.
The others are too minor for this presentation.
7. Validity
44. Conclusion (8)
In this chapter:
- Summary
- Ending
1. Introduction
2. Background
3. MOO Critique
4. GALE
5. Models
6. Experiments
7. Validity
8. Conclusion
3 Slides
45. Summary
Popular MOO Tools Need O(2NG) Evaluations
- Very slow for large models
GALE: Geometric Active Learning (Evolution)
- Adds Data Mining to Search
- Evaluates only the most informative Solutions
- At most O(2 log2(NG)) Evaluations (usually fewer)
- Enables large studies with large models
- Finds good solutions for a wide variety of models
(N = population size; G = number of generations)
Active Learning: only evaluate East & West!
Models from all quadrants: Standard Models (Constrained Lab, Unconstrained Lab) and Real World Models (Public, Privatized)
8. Conclusion
46. Principles
Developed principles for rigorous experiments.
Employed those principles in our experiments.
8. Conclusion