2. Outline
1. Population-based methods
2. Bandits
3. Bandits with progressive widening
4. Evolution strategies with bandits
5. Evolution strategies with population control
6. Collaborative coevolution
7. Progressive collaborative coevolution
8. Others
3. Black-box optimization
We have a budget b and an objective function on some domain.
For i in {1,2,3,...,b}:
- X = optimization-method.ask()
- Y = objective-function(X)
- optimization-method.tell(X, Y)
X* = optimization-method.recommend()
Ask, tell and recommend define our optimization method.
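The ask/tell/recommend loop above can be sketched in a few lines of Python; the RandomSearch class and the noisy sphere objective here are illustrative stand-ins, not any particular library's API:

```python
import random

random.seed(0)  # for reproducibility of this sketch

class RandomSearch:
    """Hypothetical minimal optimizer exposing ask/tell/recommend:
    pure random search over [0, 1]^d, recommending the best point seen."""
    def __init__(self, dim):
        self.dim = dim
        self.best_x, self.best_y = None, float("inf")

    def ask(self):
        # propose a point to evaluate
        return [random.random() for _ in range(self.dim)]

    def tell(self, x, y):
        # record the (noisy) observed value
        if y < self.best_y:
            self.best_x, self.best_y = x, y

    def recommend(self):
        # final answer X*
        return self.best_x

def objective(x):
    # noisy sphere: expected value minimized at the origin
    return sum(v * v for v in x) + random.gauss(0, 0.1)

opt = RandomSearch(dim=2)
for _ in range(100):            # budget b = 100
    X = opt.ask()
    opt.tell(X, objective(X))
X_star = opt.recommend()
```

Any optimizer implementing these three methods can be dropped into the same loop.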
Noisy optimization: the objective function is corrupted by noise.
Regret = E(objective-function(X*)) − inf_x E(objective-function(x)) ⇐ we want a small regret.
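Since the objective is noisy, both expectations in the regret can be estimated by averaging many evaluations; a toy Monte-Carlo sketch (the 1-D objective is made up for illustration):

```python
import random

random.seed(1)

def f(x):
    # noisy 1-D objective: E[f(x)] = x**2, minimized at x = 0
    return x * x + random.gauss(0, 0.1)

def mean_value(x, n=10000):
    # Monte-Carlo estimate of E[f(x)]
    return sum(f(x) for _ in range(n)) / n

X_star = 0.3                        # some recommendation
regret = mean_value(X_star) - 0.0   # inf_x E[f(x)] = 0, attained at x = 0
# regret is close to 0.3**2 = 0.09
```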
4. Population-based methods for black-box optimization
There are plenty of population-based methods! (PSO, DE, ES…)
Just an example: EMNA (Estimation of Multivariate Normal Algorithm).
While budget not elapsed:
- Generate lambda points with the current probability distribution
- Select the mu best (e.g. mu = min(d, lambda / 4))
- Fit a Gaussian distribution on the mu best points
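The three steps above can be sketched as follows; this is a simplified variant fitting a diagonal Gaussian rather than the full multivariate normal, and the starting point and budget are arbitrary choices for the example:

```python
import random

random.seed(42)  # reproducibility of this sketch

def emna(objective, dim, budget, lam=40):
    """Sketch of EMNA, simplified to a diagonal Gaussian
    (the full method fits a complete multivariate normal)."""
    mu = max(1, min(dim, lam // 4))      # mu = min(d, lambda / 4)
    mean = [5.0] * dim                   # arbitrary starting point
    std = [1.0] * dim
    evals = 0
    while evals + lam <= budget:
        # generate lambda points from the current distribution
        pop = [[random.gauss(mean[j], std[j]) for j in range(dim)]
               for _ in range(lam)]
        pop.sort(key=objective)          # rank by (noisy) objective value
        evals += lam
        best = pop[:mu]                  # keep the mu best points
        # refit the Gaussian on the mu best points
        mean = [sum(x[j] for x in best) / mu for j in range(dim)]
        std = [max(1e-9, (sum((x[j] - mean[j]) ** 2 for x in best) / mu) ** 0.5)
               for j in range(dim)]
    return mean

sphere = lambda x: sum(v * v for v in x)
result = emna(sphere, dim=3, budget=4000)
```

Note that such a plain refit can shrink the standard deviations too quickly (premature convergence), which is one motivation for the population-control variants discussed later.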
6. Bandits
Randomly draw M points in the domain.
For each i in {1,2,3,..., budget}:
- For each of the M points, compute the average of the losses observed when evaluating that point.
- Also evaluate a confidence interval: mean ± std × sqrt(log(i) / number-of-evaluations).
- Evaluate the point with the best optimistic bound: mean − std × sqrt(log(i) / num-evals).
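A sketch of this UCB-style loop; the noisy objective and the final recommendation rule (best empirical mean) are illustrative choices:

```python
import math
import random

random.seed(3)

def noisy_loss(x):
    # hypothetical noisy objective on [0, 1], best near x = 0.5
    return (x - 0.5) ** 2 + random.gauss(0, 0.1)

M = 5
points = [random.random() for _ in range(M)]
losses = [[noisy_loss(p)] for p in points]   # one initial evaluation each

budget = 500
for i in range(2, budget + 1):
    bounds = []
    for k in range(M):
        n = len(losses[k])
        mean = sum(losses[k]) / n
        std = (sum((l - mean) ** 2 for l in losses[k]) / n) ** 0.5
        width = std * math.sqrt(math.log(i) / n)
        bounds.append((mean - width, k))     # optimistic (lower) bound
    _, k = min(bounds)                       # most promising arm
    losses[k].append(noisy_loss(points[k]))  # spend one evaluation on it

# recommend the arm with the best empirical mean
X_star = points[min(range(M), key=lambda k: sum(losses[k]) / len(losses[k]))]
```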
8. Bandits with progressive widening
Randomly draw M points in the domain.
For each i in {1,2,3,..., budget}:
- For each of the M points, compute the average of the losses observed when evaluating that point.
- Also evaluate a confidence interval: mean ± std × sqrt(log(i) / number-of-evaluations).
- Evaluate the point with the best optimistic bound: mean − std × sqrt(log(i) / num-evals).
- If M³ < i, add one more random point.
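Only the last step differs from the plain bandit slide; under the M³ < i rule the pool of arms grows roughly like i^(1/3). A standalone sketch of just the widening rule (starting from a single arm for simplicity; the bandit statistics are as above):

```python
import random

random.seed(7)

points = [random.random()]             # start from a single random arm
budget = 1000
for i in range(1, budget + 1):
    if len(points) ** 3 < i:           # progressive widening: M**3 < i
        points.append(random.random()) # one more random point in the domain
# the number of arms ends near budget ** (1/3)
```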
10. Bandits combined with evolution strategies
Randomly draw M points in the domain.
For each i in {1,2,3,..., budget}:
- For each of the M points, compute the average of the losses observed when evaluating that point.
- Also evaluate a confidence interval: mean ± std × sqrt(log(i) / number-of-evaluations).
- Evaluate the point with the best optimistic bound: mean − std × sqrt(log(i) / num-evals).
- If M³ < i, add one more point:
  - Choose the pessimistically best point.
  - Mutate it as you would with an evolution strategy, differential evolution, or any other method.
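The new ingredient is how the added point is created; a sketch with toy numbers, where a plain Gaussian mutation stands in for whichever evolutionary operator you prefer (sigma is a guess):

```python
import random

random.seed(5)

def pessimistic_best(means, widths):
    # index of the point with the best (lowest) pessimistic bound: mean + width
    return min(range(len(means)), key=lambda k: means[k] + widths[k])

def mutate(x, sigma=0.1):
    # Gaussian mutation, as in a simple evolution strategy (sigma is arbitrary)
    return [v + random.gauss(0, sigma) for v in x]

# when M**3 < i, the new point is a mutation of the pessimistically best one:
points = [[0.2, 0.8], [0.6, 0.4]]           # toy 2-D points
means, widths = [0.3, 0.1], [0.05, 0.02]    # toy bandit statistics
parent = points[pessimistic_best(means, widths)]
points.append(mutate(parent))
```

Using the pessimistic bound here is deliberate: for choosing a parent we want a point that is reliably good, not merely optimistically promising.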
14. Collaborative coevolution (13 groups…)
1. Split the variables into 13 groups.
2. One optimization algorithm per group.
3. Each optimization algorithm proposes (“ask”) an instantiation of its variables.
4. Concatenate those instantiations.
5. Evaluate the obtained individual with the objective function.
6. Feed the (same) fitness value to all algorithms.
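The six steps can be sketched as follows; the toy per-group optimizer is a stand-in for any real ask/tell algorithm, and the dimension is chosen so that 13 groups split it evenly:

```python
import random

random.seed(11)

class RandomOpt:
    """Toy per-group optimizer with ask/tell (stand-in for any real algorithm)."""
    def __init__(self, dim):
        self.dim = dim
        self.last, self.best_x, self.best_y = None, None, float("inf")

    def ask(self):
        self.last = [random.uniform(-1, 1) for _ in range(self.dim)]
        return self.last

    def tell(self, y):
        if y < self.best_y:
            self.best_x, self.best_y = self.last, y

def sphere(x):
    return sum(v * v for v in x)

dim, groups = 26, 13
opts = [RandomOpt(dim // groups) for _ in range(groups)]   # 2 variables each

for _ in range(200):                         # budget
    parts = [o.ask() for o in opts]          # each algorithm proposes its block
    x = [v for part in parts for v in part]  # concatenate the instantiations
    y = sphere(x)                            # evaluate the full individual
    for o in opts:
        o.tell(y)                            # same fitness to all algorithms
```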
16. Collaborative coevolution: progressive version
1. Split the variables into 13 groups.
2. One optimization algorithm per group.
3. Each optimization algorithm proposes (“ask”) an instantiation of its variables,
BUT optimization algorithm #i just returns the middle of its domain if
i / 13 > sqrt(2 × num-asks / budget).
4. Concatenate those instantiations.
5. Evaluate the obtained individual with the objective function.
6. Feed the (same) fitness value to all algorithms.
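The freezing rule can be sketched as a small predicate; note that the inequality direction is an assumed reading of the slide, chosen so that groups unfreeze one by one as the budget is consumed:

```python
import math

def frozen(i, num_asks, budget, groups=13):
    # group i (1-based) just returns the middle of its domain while frozen;
    # direction assumed: the threshold sqrt(2 * num_asks / budget) grows
    # over time, so low-index groups become active first
    return i / groups > math.sqrt(2 * num_asks / budget)

budget = 1000
active_early = sum(not frozen(i, 1, budget) for i in range(1, 14))
active_mid = sum(not frozen(i, budget // 4, budget) for i in range(1, 14))
active_late = sum(not frozen(i, budget, budget) for i in range(1, 14))
# active groups grow from 0 toward all 13 as num_asks approaches budget
```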
18. Others
SPSA
Fabian’s method
Known best rates:
- Regret scales as 1/n if the objective is sufficiently smooth.
- It is known that you should also sample far from the optimum, or convergence will be slow.
19. Running noisy optimization tests with Nevergrad
pip install nevergrad
or (better):
git clone git@github.com:facebookresearch/nevergrad.git .
python -m nevergrad.benchmark noise --seed=12 --repetitions=10 --num_workers=40 --plot ⇐ on a 40-core machine.
You can replace “noise” (an artificial benchmark) with “ng_gym” (OpenAI Gym
benchmark), “sequential_fastgames”, “yanoisybbob”, or “yahdnoisybbob”.
You might also love “nevergrad4sf”, used for optimizing the weights of
Stockfish (and Stockfish is very cool!).