Meta online learning: experiments on a unit commitment problem (ESANN2014)

Meta Online Learning: Experiments on a Unit Commitment Problem
Jialin Liu, Olivier Teytaud
liu@lri.fr, teytaud@lri.fr
Black-box Noisy Optimization
Objective function: $\mathrm{fitness} : \mathbb{R}^d \to \mathbb{R}$
Optimum: $\theta^* = \arg\min_{\theta \in \mathbb{R}^d} \mathrm{fitness}(\theta)$
Some NOAs
- RSAES: Self-Adaptive Evolution Strategy with resampling;
- Fabian's algorithm: a first-order method using gradients estimated by finite differences [?, ?];
- Noisy Newton's algorithm: a second-order method using a Hessian matrix also approximated by finite differences [?].
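Both Fabian's algorithm and the noisy Newton method rely on derivatives estimated by finite differences. The sketch below shows only that building block, not either algorithm: a central-difference gradient estimate with optional resampling. The function names, the `resamplings` parameter and the sphere test function are illustrative assumptions, not taken from the poster.

```python
def fd_gradient(noisy_f, theta, sigma, resamplings=1):
    """Central finite-difference gradient estimate of a (possibly noisy) function.

    Coordinate j uses the average of `resamplings` differences
    noisy_f(theta + sigma * e_j) - noisy_f(theta - sigma * e_j).
    """
    grad = []
    for j in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[j] += sigma
        minus[j] -= sigma
        diff = sum(noisy_f(plus) - noisy_f(minus)
                   for _ in range(resamplings)) / resamplings
        grad.append(diff / (2.0 * sigma))
    return grad


def sphere(t):
    # Noise-free test function (an assumption for this example): ||t||^2.
    return sum(x * x for x in t)


# On the noise-free sphere the estimate matches the true gradient 2 * theta.
g = fd_gradient(sphere, [1.0, -2.0], sigma=0.1)
print(g)
```

With a noisy objective, a larger `resamplings` value averages out more noise at the price of more evaluations per gradient.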
Compare Solvers Early
$k_n \le n$: lag
Why this lag?
(i) Comparing current recommendations
→ comparing good points
→ very close fitness values
→ very expensive comparison
(ii) The ranking of the algorithms is usually stable
→ save time by comparing older recommendations
Solvers and Notations
$\mu$: parent population size in ES
$\lambda$: population size in ES
d: search space dimension
n: generation index
$\sigma_n$: step size at generation n
$r_n$: resampling number at generation n
For all NOPAs:
$k_n = \lceil n^{0.1} \rceil$
$r_n = n^3$
$s_n = 15n$
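To get a feel for these schedules, the short sketch below tabulates them, reading the poster's settings as k_n = ceil(n^0.1), r_m = m^3 and s_m = 15m (the last one taken literally as printed). The function names are hypothetical.

```python
import math


def lag_index(n):
    # k_n = ceil(n^0.1): index of the older recommendation that gets compared.
    return math.ceil(n ** 0.1)


def comparison_time(m):
    # r_m = m^3: iteration at which the m-th comparison takes place.
    return m ** 3


def comparison_evals(m):
    # s_m = 15 m: evaluations spent per solver in the m-th comparison.
    return 15 * m


for m in (1, 2, 5, 10):
    n = comparison_time(m)
    print(f"m={m:2d}  n=r_m={n:5d}  k_n={lag_index(n)}  s_m={comparison_evals(m)}")
```

The lag index grows very slowly, so with these settings the compared recommendations come from the very first iterations even when n is large.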
Table 1: Solvers in experiments

| Notation | Algorithm and parametrization |
|---|---|
| RSAES | $\lambda = 10d$, $\mu = 5d$, $r_n = 10n^2$ |
| Fabian1 | $\sigma_n = 10/n^{0.49}$, $a = 100$ |
| Fabian2 | $\sigma_n = 10/n^{0.05}$, $a = 100$ |
| Newton1 | $\sigma_n = 10/n$, $r_n = n^2$ |
| Newton2 | $\sigma_n = 100/n^4$, $r_n = n^2$ |
| P.12345 | NOPA of the 5 solvers above |
| P.12345 + S. | P.12345 with information sharing |
| P.22 | NOPA of 2 (identical) Fabian1 |
| P.222 | NOPA of 3 (identical) Fabian1 |
Abstract
Online learning = real-time machine learning "on the fly".
Meta online learning = combining several online learning algorithms from a given set of algorithms (termed a portfolio) ≈ combining noisy optimization algorithms (NOPA = noisy optimization portfolio algorithm).
Goals: (i) mitigating the effect of a bad choice of online learning algorithm; (ii) parallelization; (iii) combining the strengths of different algorithms.
This paper:
- Portfolios are classical in combinatorial optimization; we test portfolios for noisy optimization.
- A methodology termed lag has recently been proposed for NOPA. We test the lag methodology experimentally on various problems.
Noisy Optimization Portfolio Algorithm (NOPA)
Iteration n of the portfolio $\{S_1, \dots, S_M\}$ containing M NOAs:
Initialization module: If n = 0, initialize all $S_i$, $i \in \{1, \dots, M\}$.
For $i \in \{1, \dots, M\}$:
– Update module: Apply an iteration of solver $S_i$ until it has received at least n data samples.
– Let $\tilde\theta_{i,n}$ be the current recommendation of solver $S_i$.
Comparison module: If $n = r_m$ for some m, then
– For $i \in \{1, \dots, M\}$, perform $s_m$ evaluations of the (stochastic) reward $R(\tilde\theta_{i,k_n})$ and define $y_i$ as the average reward.
– Define $i^* = \arg\min_{i \in \{1, \dots, M\}} y_i$.
Recommendation module: $\tilde\theta = \tilde\theta_{i^*,n}$
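The modules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `ContractingSolver` is a hypothetical stand-in that does not really optimize, the sphere-plus-Gaussian reward is an assumed test objective, and the schedules are read as k_n = ceil(n^0.1), r_m = m^3, s_m = 15m. Solvers are initialized before the loop, which starts at n = 1.

```python
import math
import random


class ContractingSolver:
    """Hypothetical stand-in for a NOA: shrinks its point by `rate` per iteration.

    It only mimics a solver whose recommendation improves at a known speed,
    and it counts one data sample per iteration.
    """

    def __init__(self, start, rate):
        self.point = list(start)
        self.rate = rate
        self.samples = 0

    def step(self):
        self.point = [self.rate * x for x in self.point]
        self.samples += 1

    def recommendation(self):
        return list(self.point)


def noisy_reward(theta, noise_sd=0.1):
    # Assumed test objective: sphere function plus additive Gaussian noise.
    return sum(x * x for x in theta) + random.gauss(0.0, noise_sd)


def nopa(solvers, reward, iterations,
         k=lambda n: math.ceil(n ** 0.1),  # lag index k_n
         r=lambda m: m ** 3,               # comparison times r_m
         s=lambda m: 15 * m):              # evaluations per comparison s_m
    # history[i][n - 1] holds the recommendation of solver i at iteration n.
    history = [[] for _ in solvers]
    best, m = 0, 1
    for n in range(1, iterations + 1):
        # Update module: run each solver until it has used at least n samples.
        for i, solver in enumerate(solvers):
            while solver.samples < n:
                solver.step()
            history[i].append(solver.recommendation())
        # Comparison module: compare the lagged recommendations at n = r_m.
        if n == r(m):
            lagged = k(n)
            averages = []
            for i in range(len(solvers)):
                old = history[i][lagged - 1]
                averages.append(sum(reward(old) for _ in range(s(m))) / s(m))
            best = min(range(len(solvers)), key=lambda i: averages[i])
            m += 1
    # Recommendation module: current recommendation of the selected solver.
    return history[best][-1], best


random.seed(0)
portfolio = [ContractingSolver([2.0, 2.0], 0.5),
             ContractingSolver([2.0, 2.0], 0.99)]
recommendation, chosen = nopa(portfolio, noisy_reward, iterations=64)
print(chosen, recommendation)
```

The comparison evaluates old recommendations (index k_n) but the portfolio returns the selected solver's current one, which is the point of the lag: the ranking is decided on points that are cheap to tell apart.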
Experiments
Table 2: Artificial problem $R(\theta) = \|\theta\|^2 + \|\theta\|^z \times \mathrm{Gaussian}$. n: evaluation number. z: rate at which the variance decreases around the optimum. Solvers are listed best first (≻: better than, ≈: comparable).

| z | Comparison of $\log(R(\tilde\theta_n))/\log(n)$ for d = 2 | Comparison of $\log(R(\tilde\theta_n))/\log(n)$ for d = 5 |
|---|---|---|
| 0 | Newton1 ≻ RSAES ≈ P.12345 ≻ ... | Newton1 ≻ RSAES ≈ P.12345 ≻ ... |
| 1 | P.12345 ≻ Fabian1 ≈ P.22 ≻ ... | Fabian1 ≻ P.22 ≻ P.222 ≻ ... |
| 2 | P.12345 ≻ Fabian1 ≈ P.22 ≻ ... | Fabian1 ≻ P.12345 ≻ P.22 ≻ ... |

Discussion: NOPAs are usually not far from the best of their NOAs. In small dimension, with noise variance decreasing quickly to 0 around the optimum (z = 2), the NOPA outperforms all its NOAs.
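The artificial problem of Table 2 can be sketched as a noisy reward function. The sketch assumes the Gaussian term is a standard normal variable multiplied by the z-th power of the norm, which is one reading of the caption; the function names are hypothetical.

```python
import math
import random


def noise_scale(theta, z):
    # Standard deviation of the noise at theta: ||theta||^z.
    norm = math.sqrt(sum(x * x for x in theta))
    return norm ** z


def artificial_reward(theta, z):
    # ||theta||^2 plus Gaussian noise whose scale shrinks like ||theta||^z.
    norm_sq = sum(x * x for x in theta)
    return norm_sq + noise_scale(theta, z) * random.gauss(0.0, 1.0)


random.seed(1)
for z in (0, 1, 2):
    print(z, noise_scale([0.1, 0.0], z), artificial_reward([0.1, 0.0], z))
```

For z = 0 the noise keeps a constant scale everywhere, while for z = 1 and z = 2 it vanishes at the optimum, faster for larger z.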
Table 3: Stochastic Unit Commitment problems, conformant planning. St: number of stocks. Columns after the problem size give the considered NOA or NOPA.

| Problem size (St, T, d) | P.22 | P.22 + S. | P.222 | P.222 + S. | Best NOA | Worst NOA |
|---|---|---|---|---|---|---|
| 3, 21, 63 | 0.61 ± 0.07 | 0.63 ± 0.03 | 0.63 ± 0.05 | 0.63 ± 0.07 | 0.49 ± 0.08 | 0.81 ± 0.05 |
| 4, 21, 84 | 0.75 ± 0.02 | 0.75 ± 0.03 | 0.79 ± 0.05 | 0.76 ± 0.03 | 0.69 ± 0.06 | 1.27 ± 0.06 |
| 5, 21, 105 | 0.53 ± 0.04 | 0.58 ± 0.08 | 0.58 ± 0.03 | 0.52 ± 0.05 | 0.58 ± 0.04 | 1.44 ± 0.16 |
| 6, 15, 90 | 0.40 ± 0.05 | 0.39 ± 0.06 | 0.37 ± 0.06 | 0.39 ± 0.06 | 0.38 ± 0.06 | 0.96 ± 0.13 |
| 6, 21, 126 | 0.53 ± 0.08 | 0.54 ± 0.08 | 0.55 ± 0.07 | 0.54 ± 0.07 | 0.54 ± 0.07 | 1.78 ± 0.37 |
| 8, 15, 120 | 0.53 ± 0.03 | 0.50 ± 0.05 | 0.53 ± 0.02 | 0.51 ± 0.05 | 0.51 ± 0.04 | 1.70 ± 0.10 |
| 8, 21, 168 | 0.69 ± 0.04 | 0.77 ± 0.09 | 0.73 ± 0.06 | 0.71 ± 0.04 | 0.71 ± 0.06 | 2.68 ± 0.02 |
| 7, 21, 147 | 0.70 ± 0.07 | 0.70 ± 0.05 | 0.70 ± 0.07 | 0.70 ± 0.07 | 0.69 ± 0.06 | 2.28 ± 0.08 |

Discussion: Given the same budget, a NOPA of identical solvers can outperform its NOAs. RSAES is usually the best NOA for small dimensions, and variants of Fabian for large dimensions.
Table 4: Approximate convergence rates $\log(R(\tilde\theta_n))/\log(n)$ for Cart-Pole, a multimodal problem, using a neural network (NN). n: evaluation number.

| Solver | 2 neurons, d = 9 | 4 neurons, d = 17 | 8 neurons, d = 33 |
|---|---|---|---|
| 1 (RSAES) | -0.458033 ± 0.045014 | -0.421535 ± 0.045643 | -0.351726 ± 0.051705 |
| 2 (Fabian1) | 0.002226 ± 5.29923e-05 | 0.002089 ± 1.57766e-04 | 0.00221 ± 8.14518e-05 |
| 3 (Fabian2) | 0.002318 ± 9.80792e-05 | 0.002238 ± 1.14289e-04 | 0.00236 ± 1.51244e-04 |
| 4 (Newton1) | 0.002229 ± 6.08973e-05 | -0.030731 ± 0.111294 | 0.002247 ± 1.19829e-04 |
| 5 (Newton2) | 0.00227 ± 5.2989e-05 | 0.002217 ± 7.80888e-05 | 0.002307 ± 9.96404e-05 |
| 6 (P.12345) | -0.408705 ± 0.068428 | -0.3917 ± 0.071791 | -0.320399 ± 0.050338 |
| 7 (P.12345 + S.) | -0.42743 ± 0.05709 | -0.403707 ± 0.056173 | -0.354043 ± 0.069576 |

Discussion: Fabian and Newton cannot solve this multimodal problem ⇒ one solver is much better than the others ⇒ easy for NOPA.
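The quantity reported in Tables 2 and 4, log(R)/log(n), can be computed as below. The rewards used here are hypothetical numbers chosen for illustration, not values from the experiments, and the function name is an assumption.

```python
import math


def convergence_rate(reward, n):
    """Return log(reward) / log(n); requires reward > 0 and n > 1."""
    if reward <= 0 or n <= 1:
        raise ValueError("need reward > 0 and n > 1")
    return math.log(reward) / math.log(n)


# Hypothetical rewards, for illustration only.
improving = convergence_rate(1000 ** -0.5, 1000)  # reward decays like n^(-1/2)
stalled = convergence_rate(1.03, 10 ** 6)         # reward stuck slightly above 1
print(improving, stalled)
```

A clearly negative rate means the reward shrinks polynomially with the number of evaluations, while a rate close to zero means the reward has barely moved on the log-log scale.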
Conclusion
Main conclusions:
Usual: portfolios of algorithms for combinatorial optimization;
New: portfolios of algorithms for noisy optimization.
"Sharing" is not that beneficial.
A NOPA is sometimes better than its NOAs, even if all the NOAs are identical!
We show mathematically [?] and empirically a log(M) shift when using M solvers, when working on the log-log scale (the usual scale in noisy optimization).
The portfolio is approximately as efficient as its best solver, except when one iteration of one algorithm monopolizes most of the budget, as RSAES does in the unit commitment problem.
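One way to read the log(M) shift is the following back-of-the-envelope calculation. It assumes that the budget n is split equally among the M solvers and that the best solver alone follows a power law $R(n) \approx C n^{-\alpha}$; both assumptions are made here for illustration, and $C$ and $\alpha$ are hypothetical constants.

```latex
\begin{align*}
\log R_{\text{solver}}(n) &\approx \log C - \alpha \log n, \\
\log R_{\text{portfolio}}(n) &\approx \log R_{\text{solver}}(n/M)
  = \log C - \alpha \left( \log n - \log M \right).
\end{align*}
```

Under these assumptions the portfolio's curve on the log-log scale is the best solver's curve translated by log(M) along the log(n) axis, so the slope (the convergence rate) is unchanged.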
