Neural Architecture Search: A Probabilistic Approach
Author: Volodymyr LUT
Supervisors: Yuriy KHOMA and Vasilii GANISHEV
What is AutoML?
AutoML is a general name for the automation of routine work done by ML engineers. It covers different areas including, but not limited to, data preparation, feature engineering, feature extraction, neural architecture search, and hyperparameter selection.
Why AutoML?
- Democratization of technology
- Extension of the ML engineering toolkit
- Effective resource management
- Leverage to move the industry forward
Timeline of accuracy advances in ImageNet image recognition (Google)
Neural Architecture Search Strategies
- Reinforcement learning
- Evolutionary algorithms
- Bayesian optimization
- Grid search
- Other
Reinforcement Learning Strategy
- The architecture of the student CNN defines the action space of the MDP.
- The maximum accuracy obtained after evaluating the student CNN becomes the immediate reward for the controller.
- This stochastic process has the Markov property, meaning that the future state depends only on the present state.
- The reinforcement learning agent maximizes the cumulative reward.
DQN in terms of the NAS problem
The controller (CNN) samples an architecture A with probability p, trains the student network with architecture A to obtain accuracy R, and updates the weights of the controller based on R.
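The loop above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: the function names (`sample_architecture`, `train_student`, `nas_loop`) are hypothetical placeholders, and the student training is stubbed with a random reward.

```python
import random

# The per-layer choices used later in the experiment.
FILTERS = [2, 4, 8, 16, 32, 64]
KERNELS = [1, 3, 6, 9, 12, 24]

def sample_architecture():
    """Sample an architecture A: a (kernel size, filter count) pair
    for each of the 4 convolutional layers."""
    return [(random.choice(KERNELS), random.choice(FILTERS)) for _ in range(4)]

def train_student(architecture):
    """Train the student CNN briefly and return its accuracy R.
    Stubbed here with a random value in [0, 1]."""
    return random.random()

def nas_loop(steps):
    """One pass of the controller loop: sample A, evaluate it to get R,
    record (A, R); a real controller would update its weights on R."""
    history = []
    for _ in range(steps):
        arch = sample_architecture()   # sample architecture A with probability p
        reward = train_student(arch)   # accuracy R is the immediate reward
        history.append((arch, reward))
    return history
```

In the real system, the reward step is the expensive part: each sampled architecture requires training a full student network before the controller can learn anything from it.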
Probabilistic approximation of Q
We are interested in the full predictive distribution, not just a single best fit of the Q(state, action) value received from the controller CNN. The CNN may memorize specific input examples and their associated outputs. To prevent this, we use the mean and standard deviation of the target variable to model it as a Gaussian distribution, and treat those parameters as a measure of uncertainty in the target-variable prediction.
Gaussian Layer and loss function
The output of the last layer of a regression model relates to a well-known probability distribution, the Gaussian. We use the maximum likelihood estimate (MLE) of the variance as a measure of uncertainty, and the Gaussian log-likelihood as the loss function of the controller CNN.
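As a sketch of the loss described above, the negative Gaussian log-likelihood of a single target y under predicted parameters (mu, sigma) can be written as follows; this is the standard formula, not code taken from the project:

```python
import math

def gaussian_nll(y, mu, sigma):
    """Negative log-likelihood of y under N(mu, sigma^2).
    Minimizing this trains the network to predict both the mean (fit)
    and the standard deviation (uncertainty) of the target."""
    return 0.5 * math.log(2 * math.pi * sigma ** 2) + (y - mu) ** 2 / (2 * sigma ** 2)
```

The first term penalizes claiming large uncertainty everywhere; the second penalizes prediction error, scaled down when the predicted uncertainty is high. This trade-off is what lets sigma act as a calibrated uncertainty estimate.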
Details of the experiment
The following limitations were set:
- 80 training epochs per dataset (CIFAR-10/CIFAR-100) for the controller CNN
- 8 training epochs per architecture A of the student CNN
- Action space limited to the selection of kernel size and number of output filters for 4 convolutional layers
Action space limitation

Action type         Values available in experiment
Number of filters   2, 4, 8, 16, 32, 64
Kernel size         1, 3, 6, 9, 12, 24
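The table above implies 6 x 6 = 36 choices per layer and, across 4 convolutional layers, 36^4 = 1,679,616 distinct architectures. A quick enumeration of that search space:

```python
from itertools import product

FILTERS = [2, 4, 8, 16, 32, 64]
KERNELS = [1, 3, 6, 9, 12, 24]

# One action fixes a (kernel size, filter count) pair for a single layer.
layer_choices = list(product(KERNELS, FILTERS))   # 36 options per layer

# Four independent layers give 36^4 possible student architectures.
total_architectures = len(layer_choices) ** 4
```

Even this deliberately small action space contains over 1.6 million architectures, which is why each candidate is trained for only 8 epochs.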
Demo
Results

Algorithm                 CIFAR-10 mean accuracy    CIFAR-100 mean accuracy
                          (last 10 epochs)          (last 10 epochs)
Gaussian epsilon-greedy   0.3807                    0.1588
Classic epsilon-greedy    0.3308                    0.0846
Gaussian UCB              0.3875                    0.1078
Classic UCB               0.3737                    0.0799
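The Gaussian variants of the two exploration strategies can be sketched as follows. This is an illustrative reading of how a predicted standard deviation feeds an upper-confidence-bound rule; the function names and the exploration constant `c` are assumptions, not the project's code:

```python
def gaussian_ucb(mu, sigma, c=1.0):
    """Score each action by predicted mean plus c times predicted
    standard deviation: exploit high means, explore high uncertainty."""
    return [m + c * s for m, s in zip(mu, sigma)]

def select_action(mu, sigma, c=1.0):
    """Pick the action with the highest upper confidence bound."""
    scores = gaussian_ucb(mu, sigma, c)
    return scores.index(max(scores))
```

The classic counterparts use only point estimates (and, for UCB, visit counts); the Gaussian versions replace that with the uncertainty predicted by the controller itself.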
Conclusions
- The Gaussian modification of the controller CNN was able to yield better architectures on a previously unknown dataset.
- Reinforcement learning is a good tool for the NAS problem; however, the NAS problem is not the best environment for such research because it is computationally expensive.
- Even though the Gaussian modification yields better results, it has not prevented the algorithm from overfitting in a small action space, and other tools should also be considered.
Log-likelihood
Because the logarithm is a strictly increasing function, maximizing the likelihood is equivalent to maximizing the log-likelihood. Since the Gaussian density is log-concave, it is convenient to use the log-likelihood in this case.
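Concretely, for a target y under a Gaussian with predicted mean and variance, the log-likelihood being maximized is the standard expression:

```latex
\log p(y \mid \mu, \sigma^2)
  = -\tfrac{1}{2}\log\!\left(2\pi\sigma^{2}\right)
    - \frac{(y - \mu)^{2}}{2\sigma^{2}}
```

Maximizing this over the network outputs mu and sigma is equivalent to minimizing its negation, which is the loss function used for the controller CNN.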
Detailed results

             CIFAR-10                  CIFAR-100
Algorithm    max. reward   max. acc.   max. reward   max. acc.
Gaussian     0.42795802    0.6592      0.7739771     0.3239
Classic      1.38204733    0.6437      1.2746971     0.1929

The complete CSV log with the training history of the experiments can be found at
https://github.com/volodymyrlut/masters-project

Master defence 2020 - Volodymyr Lut - Neural Architecture Search: A Probabilistic Approach
