Noisy optimization
Astete-Morales, Cauwet,
Decock, Liu, Rolet, Teytaud

Thanks all !
● Runtime analysis
● Black-box complexity
● Noisy objective function
@inproceedings{dagstuhlNoisyOptimization,
  title={Noisy Optimization},
  booktitle={Proceedings of Dagstuhl seminar 13271 “Theory of Evolutionary Algorithms”},
  author={Sandra Astete-Morales and Marie-Liesse Cauwet and Jeremie Decock
          and Jialin Liu and Philippe Rolet and Olivier Teytaud},
  year={2013}}
This talk is about noisy optimization in continuous domains.
Has an impact on algorithms.

EA theory in discrete domains

EA theory in continuous domains

Yes, I wanted a little bit
of troll in my talk.
In case I still have friends in the
room, a second troll.
Has an impact on algorithms.
(But OK, when we really want it to work, we use mathematical programming.)

EA theory in continuous domains

EA theory in discrete domains
Noisy optimization: preliminaries (1)

● Points at which the fitness function is evaluated (might be bad)
● Obtained fitness values
● Points which are recommended as approximations of the optimum (should be good)
Noisy optimization: preliminaries (2)

À la Anne Auger: in noise-free cases, an ES can reach
log ||x_n|| ~ −C n
(log-linear convergence).
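This log-linear behaviour is easy to reproduce. Below is a minimal sketch of a (1+1)-ES with the 1/5th success rule on the noise-free sphere; the dimension, initial step-size and adaptation constants are illustrative choices, not prescriptions.

```python
import math
import random

def one_plus_one_es(dim=5, iters=2000, seed=0):
    """Minimal (1+1)-ES with the 1/5th success rule on the noise-free sphere."""
    rng = random.Random(seed)
    x = [1.0] * dim
    fx = sum(v * v for v in x)
    sigma = 0.3
    for _ in range(iters):
        y = [v + sigma * rng.gauss(0, 1) for v in x]
        fy = sum(v * v for v in y)
        if fy <= fx:                 # success: accept and enlarge the step
            x, fx = y, fy
            sigma *= 1.5
        else:                        # failure: shrink (targets ~1/5 successes)
            sigma *= 1.5 ** -0.25
    return math.sqrt(fx)             # ||x_n||
```

Plotting log ||x_n|| against n gives, up to fluctuations, a straight line of slope −C: exactly the log-linear regime above.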
Noisy optimization: preliminaries (3)

A noise model (among others):
f(x) = ||x||^p + ||x||^z × Noise
with “Noise” some i.i.d. noise.
● z = 0 ==> additive constant-variance noise
● z = 2 ==> noise quickly converging to 0 near the optimum
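As a quick numerical illustration of this model (a sketch: standard Gaussian noise, p = 2 and the evaluation point ||x|| = 0.01 are illustrative choices), the observed noise level near the optimum stays constant for z = 0 and vanishes for z = 2:

```python
import random
import statistics

def noisy_f(x_norm, p=2.0, z=0.0, rng=random):
    """One sample of f(x) = ||x||^p + ||x||^z * Noise, with Noise ~ N(0, 1)."""
    return x_norm ** p + (x_norm ** z) * rng.gauss(0, 1)

rng = random.Random(0)
near_opt = 0.01   # a point close to the optimum
std_z0 = statistics.pstdev(noisy_f(near_opt, z=0.0, rng=rng) for _ in range(20000))
std_z2 = statistics.pstdev(noisy_f(near_opt, z=2.0, rng=rng) for _ in range(20000))
# z = 0: observed std stays ~1 near the optimum; z = 2: it shrinks like ||x||^2
```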
Part 1: what about ES ?

Let us start with:
● an ES with nothing special
● just revaluate, and business as usual
Noisy optimization: log-log or log-linear convergence for ES ?

Set z = 0 (constant additive noise).
Define r(n) = number of revaluations at iteration n.
● r(n) = poly(n)
● r(n) = expo(n)
● r(n) = poly( 1 / step-size(n) )
Results:
● log-linear w.r.t. iterations
● log-log w.r.t. evaluations
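The mechanism behind these rates can be sketched numerically: averaging r revaluations divides the noise variance by r, so the probability of misranking two candidates shrinks. A sketch on the sphere with z = 0 Gaussian noise (the two test points and the trial count are illustrative choices):

```python
import random
import statistics

def averaged_eval(x_norm, r, rng, z=0.0):
    """Average of r noisy evaluations of f(x) = ||x||^2 + ||x||^z * N(0, 1)."""
    return statistics.fmean(x_norm ** 2 + (x_norm ** z) * rng.gauss(0, 1)
                            for _ in range(r))

def misrank_rate(worse, better, r, trials=2000, seed=1):
    """Frequency at which the worse point looks better after r revaluations each."""
    rng = random.Random(seed)
    return sum(averaged_eval(worse, r, rng) < averaged_eval(better, r, rng)
               for _ in range(trials)) / trials
```

With worse = 1.0 and better = 0.5, a single evaluation misranks the pair a substantial fraction of the time, while r = 100 revaluations essentially never do.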
Noisy optimization: log-log or loglinear convergence for ES ?
●

r(n)=expo(n) ==> log(||~xn||) ~ -C log(n)

●

r(n)=poly (1 / ss(n) ) ==> log(||~xn||) ~ -C log(n)
Noise' = Noise after averaging over the r(n) revaluations

Proof: enough revaluations so that there is no misranking.
● d(n) = sequence decreasing to zero
● p(n) = P( |E f(x_i) − E f(x_j)| <= d(n) )   (≈ same fitness)
● p'(n) = λ² P( |Noise'(x_i) − Noise'(x_j)| > d(n) )
● choose the coefficients so that, if there is linear convergence in the noise-free case, ∑ p(n) + p'(n) < δ
So with simple revaluation schemes:
● z = 0 ==> log-log (in #evaluations)
● z large (2?) ==> not finished checking; we believe it is log-linear

Now: what about “races” ?
Races = revaluations until statistical difference.
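A race in this sense can be sketched as follows, here with Bernoulli fitness values and a Hoeffding-style confidence radius (the particular bound and schedule are illustrative choices, not the exact procedure of the slides):

```python
import math
import random

def race(mean_a, mean_b, delta=0.05, max_evals=100000, seed=2):
    """Revaluate two points with Bernoulli(mean) fitness values until a
    Hoeffding-style bound separates their empirical means (minimization)."""
    rng = random.Random(seed)
    sum_a = sum_b = 0
    for n in range(1, max_evals + 1):
        sum_a += rng.random() < mean_a
        sum_b += rng.random() < mean_b
        # confidence radius valid simultaneously over all n (union bound)
        radius = math.sqrt(math.log(2 * n * (n + 1) / delta) / (2 * n))
        if abs(sum_a - sum_b) / n > 2 * radius:
            return ('a' if sum_a < sum_b else 'b'), n
    return None, max_evals   # no decision: the two points are too close
```

This also shows the failure mode noted later in the conclusions: with well-separated means the race stops quickly, but when the two means are very close it needs a huge number of revaluations before (if ever) it can stop.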
Part 2

● Bernoulli noise model (sorry, another model...)
● Combining bandits and ES for runtime analysis (see also work by V. Heidrich-Meisner and C. Igel)
● Complexity bounds by information theory (#bits of information)

Bernoulli objective function (convenient for applying information theory; what you get is bits of information).
ES + races

● The race stops when you know one of the good points
● Assume sphere (sorry)
● Sample five in a row, on one axis
● At some point two of them are statistically different (because of the sphere)
● Reduce the domain accordingly
● Now split another axis
Black-box complexity: the tools

● OK, we got a rate for both evaluated and recommended points.
● What about lower bounds ?
● UR = uniform rate = bound for the worst evaluated point (see also: cumulative regret)

Proof technique for precision e with Bernoulli noise (yes, it's a rigorous proof):
b = #bits of information
b > log2( number of possible solutions with precision e )
b < sum of the q(n), with q(n) = proba that fitness(x*, w) ≠ fitness(x**, w)
Rates on the Bernoulli model

log ||x(n)|| < C − α log(n)

Other algorithms reach α = 1/2. So, yes, sampling close to the optimum sometimes makes algorithms slower.
Conclusions of parts 1 and 2

● Part 1: with constant noise, log-log convergence; improved with noise decreasing to 0.
● Part 2:
  – When using races, sampling should be careful.
  – Sampling not too close to the optimum can improve convergence rates.
Part 3: Mathematical programming and surrogate models

● Fabian 1967: good convergence rate for recommended points (evaluated points: not good) with stochastic gradient.
  For z = 0 (constant noise):
  log ||~x_n|| ~ C − log(n) / 2   (rate = −1/2)
● Chen 1988: this is optimal.
● Rediscovered recently in ML conferences.
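Fabian's algorithm refines Kiefer–Wolfowitz stochastic approximation: follow a finite-difference gradient estimate with a shrinking gain a_n and difference step c_n. A minimal sketch on the sphere with z = 0 Gaussian noise (the schedules a_n = 0.5/n and c_n = n^(−1/4) are illustrative choices, not Fabian's exact ones):

```python
import math
import random

def kiefer_wolfowitz(dim=3, n_iters=5000, seed=3):
    """Finite-difference stochastic gradient on f(x) = ||x||^2 + N(0, 1)."""
    rng = random.Random(seed)
    f = lambda x: sum(v * v for v in x) + rng.gauss(0, 1)   # z = 0 noise
    x = [1.0] * dim
    for n in range(1, n_iters + 1):
        a_n = 0.5 / n             # gain sequence
        c_n = n ** -0.25          # finite-difference step
        g = []
        for i in range(dim):      # central differences, one axis at a time
            xp = [v + (c_n if j == i else 0.0) for j, v in enumerate(x)]
            xm = [v - (c_n if j == i else 0.0) for j, v in enumerate(x)]
            g.append((f(xp) - f(xm)) / (2 * c_n))
        x = [x[j] - a_n * g[j] for j in range(dim)]
    return math.sqrt(sum(v * v for v in x))   # ||~x_n||, the recommended point
```

Note that the evaluated points x ± c_n·e_i stay at distance c_n from the recommendation; this is why the evaluated points are "not good" while the recommended ones converge.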
Newton-style information

● Estimate the gradient & Hessian by finite differences
  (parameters: the step-size for the finite differences, the #revaluations per point)
● s(SR) = slope(simple regret)
  = log | E f(~x_n) − E f(x*) | / log(n)
● s(CR) = slope(cumulative regret)
  = log ( ∑_{i<=n} ( E f(x_i) − E f(x*) ) ) / log(n)
Newton-style information: same rate as Fabian for z = 0; better for z > 0.
(ES-friendly criterion: considers the worst points.)
Conclusions

● Compromise between SR and CR (ES better for CR)
● Information theory can help for proving complexity bounds in the noisy case
● Bandits + races require careful sampling (if two points are very close, huge #revaluations)
● Newton-style algorithms are fast … when z > 0
● No difference between evaluated points & recommended points ==> slow rates (similar to the simple regret vs cumulative regret debates in bandits)
I have no xenophobia against discrete domains

http://www.lri.fr/~teytaud/discretenoise.pdf
is a Dagstuhl-collaborative work on discrete optimization (thanks Youhei, Adam, Jonathan).

Enchancing adoption of Open Source Libraries. A case study on Albumentations.AI
 
Uni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdfUni Systems Copilot event_05062024_C.Vlachos.pdf
Uni Systems Copilot event_05062024_C.Vlachos.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Mind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AIMind map of terminologies used in context of Generative AI
Mind map of terminologies used in context of Generative AI
 
Artificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopmentArtificial Intelligence for XMLDevelopment
Artificial Intelligence for XMLDevelopment
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Introduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - CybersecurityIntroduction to CHERI technology - Cybersecurity
Introduction to CHERI technology - Cybersecurity
 
20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website20 Comprehensive Checklist of Designing and Developing a Website
20 Comprehensive Checklist of Designing and Developing a Website
 
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor...
 
RESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for studentsRESUME BUILDER APPLICATION Project for students
RESUME BUILDER APPLICATION Project for students
 

Noisy optimization --- (theory oriented) Survey

Noisy optimization: log-log or log-linear convergence for ES?
● r(n) = expo(n) ==> log(||~x_n||) ~ -C log(n)
● r(n) = poly(1/ss(n)) ==> log(||~x_n||) ~ -C log(n)

Noise' = Noise after averaging over r(n) revaluations.

Proof idea: enough revaluations so that no misranking occurs.
● d(n) = a sequence decreasing to zero
● p(n) = P( |E f(x_i) - E f(x_j)| <= d(n) )   (≈ same expected fitness)
● p'(n) = λ² P( |Noise'(x_i) - Noise'(x_j)| > d(n) )
● choose the coefficients so that, given linear convergence in the noise-free case, ∑ p(n) + p'(n) < δ
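A minimal sketch of the "just reevaluate" scheme, as a toy (1+1)-ES on the noisy sphere from preliminaries (3). The function names, the cap on r(n), and the 1/5th-rule-flavored step-size update are my own additions, not the exact algorithm analyzed here:

```python
import math, random

def noisy_fitness(x, z=0.0):
    """f(x) = ||x||^2 + ||x||^z * Gaussian noise (z=0: constant-variance noise)."""
    n2 = sum(xi * xi for xi in x)
    return n2 + (math.sqrt(n2) ** z) * random.gauss(0.0, 1.0)

def averaged_fitness(x, r, z=0.0):
    """Average over r revaluations: the noise std shrinks by 1/sqrt(r)."""
    return sum(noisy_fitness(x, z) for _ in range(r)) / r

def one_plus_one_es(dim=2, iters=100, z=0.0):
    """(1+1)-ES where r(n) = poly(1/step-size): more revaluations as we converge."""
    x = [1.0] * dim
    sigma = 1.0
    for n in range(iters):
        # r(n) = poly(1 / step-size(n)), capped here only to keep the toy cheap
        r = min(1000, max(1, int((1.0 / sigma) ** 2)))
        y = [xi + sigma * random.gauss(0.0, 1.0) for xi in x]
        if averaged_fitness(y, r, z) <= averaged_fitness(x, r, z):
            x, sigma = y, sigma * 1.5             # success: increase step-size
        else:
            sigma *= 1.5 ** (-0.25)               # failure: decrease step-size
    return x, sigma
```

With z=0 the evaluation budget blows up as the step-size shrinks (hence the log-log rate in #evaluations); with z=2 the noise vanishes near the optimum and far fewer revaluations are needed.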
So, with simple revaluation schemes:
● z = 0 ==> log-log convergence (w.r.t. #evaluations)
● z large (2?) ==> not finished checking; we believe it is log-linear

Now: what about “races”?
Races = revaluations until statistical difference.
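A hedged sketch of what "revaluations until statistical difference" can look like: a two-candidate race with Hoeffding-style confidence radii (the helper names and the particular union-bound schedule are my own; this is one standard construction, not the one used in the slides):

```python
import math, random

def race(sample_a, sample_b, delta=0.05, max_evals=100000):
    """Revaluate two candidates until their confidence intervals separate
    (a Hoeffding-style race), or the budget runs out. Samples must lie in [0,1]."""
    sa = sb = 0.0
    n = 0
    while n < max_evals:
        n += 1
        sa += sample_a()
        sb += sample_b()
        # Hoeffding radius for means of n samples in [0,1], with a union bound over n
        rad = math.sqrt(math.log(2.0 * n * (n + 1) / delta) / (2.0 * n))
        if abs(sa / n - sb / n) > 2.0 * rad:
            return ("a" if sa / n < sb / n else "b"), n  # lower mean wins (minimization)
    return None, n  # statistically indistinguishable within the budget
```

As noted in the conclusions below, when the two points have nearly equal expected fitness the race needs a huge number of revaluations, which is why sampling must be done carefully.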
Part 2
● Bernoulli noise model (sorry, another model...)
● Combining bandits and ES for runtime analysis (see also work by V. Heidrich-Meisner and C. Igel)
● Complexity bounds by information theory (#bits of information)
Bernoulli objective function
(convenient for applying information theory: what you get is bits of information)
ES + races
● The race stops when you know one of the good points
● Assume sphere (sorry)
● Sample five in a row, on one axis
● At some point two of them are statistically different (because of the sphere)
● Reduce the domain accordingly
● Now split another axis
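A crude 1-D sketch of this sample-and-shrink idea on a Bernoulli sphere. For brevity it uses a fixed batch of revaluations per point instead of a true race, and the optimum location, batch size, and helper names are illustrative assumptions:

```python
import random

def noisy_sphere_1d(x, x_star=0.3):
    """Bernoulli fitness: returns 1 with probability (x - x*)^2, so
    E[f(x)] = (x - x*)^2 on [0,1] (a 1-D sphere with Bernoulli noise)."""
    p = min(1.0, (x - x_star) ** 2)
    return 1.0 if random.random() < p else 0.0

def shrink_by_race(lo=0.0, hi=1.0, rounds=5, batch=200):
    """Sample five equally spaced points; keep the half of the domain around the
    empirically best one, then repeat (a crude stand-in for race-and-shrink)."""
    for _ in range(rounds):
        pts = [lo + (hi - lo) * i / 4.0 for i in range(5)]
        means = [sum(noisy_sphere_1d(x) for _ in range(batch)) / batch for x in pts]
        best = min(range(5), key=lambda i: means[i])
        width = (hi - lo) / 2.0
        lo = max(lo, pts[best] - width / 2.0)
        hi = min(hi, pts[best] + width / 2.0)
    return (lo + hi) / 2.0
```

In a real race the batch size would be adaptive (revaluate until two points are statistically different), which is exactly where the careful-sampling caveat bites.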
Black-box complexity: the tools
● OK, we got a rate for both evaluated and recommended points.
● What about lower bounds?
● UR = uniform rate = bound on the worst evaluated point (see also: cumulative regret)

Proof technique for precision e with Bernoulli noise, with b = #bits of information (yes, it's a rigorous proof):
● b > log2( number of possible solutions at precision e )
● b < ∑ q(n), with q(n) = probability that fitness(x*, ω) ≠ fitness(x**, ω)
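Writing the counting argument out (the d·log2(1/e) expansion of the solution count is a standard covering-number estimate that I am adding; it is not stated explicitly on the slide):

```latex
% Bits of information b gathered from noisy (Bernoulli) evaluations, versus
% bits needed to locate the optimum with precision e in dimension d:
\[
  b \;\ge\; \log_2\!\bigl(\#\{\text{solutions distinguishable at precision } e\}\bigr)
    \;=\; \Theta\!\bigl(d \log_2(1/e)\bigr),
\]
\[
  b \;\le\; \sum_{n} q(n), \qquad
  q(n) \;=\; \Pr\bigl(f(x^*,\omega) \neq f(x^{**},\omega)\bigr),
\]
% so any algorithm needs enough evaluations for \sum_n q(n)
% to reach d \log_2(1/e): a lower bound on the #evaluations.
```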
Rates on the Bernoulli model

log ||x(n)|| < C - α log(n)

Other algorithms reach 1/2. So, yes, sampling close to the optimum sometimes makes algorithms slower.
Conclusions of parts 1 and 2
● Part 1: with constant noise, log-log convergence; improved when the noise decreases to 0
● Part 2:
  – When using races, sampling should be careful
  – Sampling not too close to the optimum can improve convergence rates
Part 3: Mathematical programming and surrogate models
● Fabian 1967: good convergence rate for recommended points (evaluated points: not good) with stochastic gradient
● z = 0, constant noise: log ||~x_n|| ~ C - log(n)/2   (rate = -1/2)
● Chen 1988: this is optimal
● Rediscovered recently in ML conferences
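A minimal 1-D sketch of a stochastic-gradient scheme in this spirit: the gradient is estimated by finite differences on noisy evaluations, with decreasing step-size and decreasing finite-difference width (the gain sequences are textbook Kiefer-Wolfowitz-style choices of mine, not Fabian's exact tuning):

```python
import math, random

def noisy_f(x, z=0.0):
    """1-D sphere with noise: f(x) = x^2 + |x|^z * Gaussian noise."""
    return x * x + (abs(x) ** z) * random.gauss(0.0, 1.0)

def stochastic_gradient(iters=2000, z=0.0):
    """Gradient by central finite differences on noisy evaluations,
    step a_n = a/n and finite-difference width c_n = c / n^{1/4}."""
    x, a, c = 1.0, 0.5, 0.5
    for n in range(1, iters + 1):
        cn = c / n ** 0.25
        g = (noisy_f(x + cn, z) - noisy_f(x - cn, z)) / (2.0 * cn)
        x -= (a / n) * g
    return x
```

With z=0 the recommended point converges slowly (the -1/2 simple-regret rate), while the evaluated points x ± c_n are kept at a distance from the optimum, which is exactly the evaluated-vs-recommended distinction above.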
Newton-style information
● Estimate the gradient & Hessian by finite differences
● s(SR) = slope(simple regret) = log |E f(~x_n) - E f(x*)| / log(n)
● s(CR) = slope(cumulative regret) = log ( ∑_{i<=n} (E f(x_i) - E f(x*)) ) / log(n)

Two parameters: the step-size for finite differences, and the #revaluations per point.

(Same rate as Fabian for z=0; better for z>0.)
(ES-friendly criterion: considers the worst points.)
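A hedged 1-D sketch of the two ingredients named above: averaged finite differences for the gradient and Hessian, with the two parameters (finite-difference step-size h and #revaluations per point) exposed explicitly. Names and defaults are illustrative:

```python
import random

def estimate_grad_hess_1d(f, x, h=0.1, revals=100):
    """Estimate f'(x) and f''(x) by finite differences, averaging each
    evaluation point over `revals` revaluations to damp the noise."""
    def avg(y):
        return sum(f(y) for _ in range(revals)) / revals
    fm, f0, fp = avg(x - h), avg(x), avg(x + h)
    grad = (fp - fm) / (2.0 * h)            # central difference
    hess = (fp - 2.0 * f0 + fm) / (h * h)   # second difference
    return grad, hess

def newton_step(f, x, h=0.1, revals=100):
    """One Newton step x - f'(x)/f''(x) on the noisy estimates
    (skipped if the Hessian estimate is not positive)."""
    g, H = estimate_grad_hess_1d(f, x, h, revals)
    return x - g / H if H > 0 else x
```

Note that the Hessian estimate divides by h², so it is far noisier than the gradient estimate; this is why the choice of h and of the #revaluations per point drives the achievable regret slopes.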
Conclusions
● Compromise between SR and CR (ES better for CR)
● Information theory can help for proving complexity bounds in the noisy case
● Bandits + races require careful sampling (if two points are very close, a huge #revaluations is needed)
● Newton-style algorithms are fast... when z > 0
● No difference between evaluated points & recommended points ==> slow rates (similar to the simple-regret vs cumulative-regret debates in bandits)
I have no xenophobia against discrete domains: http://www.lri.fr/~teytaud/discretenoise.pdf is a Dagstuhl-collaborative work on discrete optimization (thanks Youhei, Adam, Jonathan).