2018. 10. 24.
Medical Image Analysis and Processing
- AutoML -
이 동 헌
https://www.slideshare.net/KihoSuh/neural-architecture-search-with-reinforcement-learning-76883153
v Overall structure: fixed
• Number of Normal / Reduction Cells
• B = 5
https://www.slideshare.net/KihoSuh/neural-architecture-search-with-reinforcement-learning-76883153
v Details: searched (inside the Normal / Reduction Cells)
The 13 available operations
Softmax / Skip Connection
v Datasets
• CIFAR-10
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
• Penn Treebank
The Penn Treebank (PTB) dataset is widely used in machine learning research on NLP (Natural Language Processing).
Keras
TensorFlow
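Since Keras/TensorFlow are the frameworks listed here, the CIFAR-10 split described on the datasets slide can be pulled directly from Keras. This is only a quick check, assuming a TensorFlow 2.x installation (not part of the original slides):

```python
# Load CIFAR-10 via Keras and confirm the 50,000 / 10,000 split of
# 32x32 colour images described above (downloads ~170 MB on first run).
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
print(x_train.shape, x_test.shape)  # (50000, 32, 32, 3) (10000, 32, 32, 3)
```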
NAS is computationally expensive and time consuming, e.g. Zoph et al. (2018) use 450 GPUs for 3-4 days
(i.e. 32,400-43,200 GPU hours)
• We observe that the computational bottleneck of NAS is the training of each child model to
convergence, only to measure its accuracy whilst throwing away all the trained weights.
• The main contribution of this work is to improve the efficiency of NAS by forcing all child models
to share weights to eschew training each child model from scratch to convergence.
Importantly, in all of our experiments, for which we use a single Nvidia GTX 1080Ti GPU, the search for
architectures takes less than 16 hours. Compared to NAS, this is a reduction of GPU-hours by more than 1000x.
u Directed Acyclic Graph (DAG)
§ Node
• Local Computation
• Own parameters (used when the node is activated)
§ Edge
• Flow of information
• Determined by a controller (red)
Input
Output
→ Parameters are shared across all child models in the search space
→ For the RNN cell, both ① the node connections and ② the operations are learned jointly (more flexible)
(↔ NAS: the user fixes the node topology in advance, and only each node's operation is learned)
https://jayhey.github.io/deep%20learning/2018/03/15/ENAS/
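To make the shared-parameter idea concrete, here is a minimal, hypothetical NumPy sketch (not the ENAS authors' code; the node count, toy operations, and dimensions are made up): every (node, operation) pair owns one weight matrix in a shared pool, and any child model the controller samples reuses exactly those weights instead of being trained from scratch.

```python
# Minimal sketch of ENAS-style weight sharing over a DAG.
import numpy as np

rng = np.random.default_rng(0)
NUM_NODES = 4                        # nodes in the toy DAG (node 0 = input)
OPS = ["tanh", "relu", "identity"]   # toy stand-ins for the searchable ops
DIM = 8

# Shared parameter pool: one weight matrix per (node, op) pair.
shared = {(n, op): rng.normal(scale=0.1, size=(DIM, DIM))
          for n in range(1, NUM_NODES) for op in OPS}

def apply_op(op, x):
    if op == "tanh":
        return np.tanh(x)
    if op == "relu":
        return np.maximum(x, 0.0)
    return x  # identity

def forward(arch, x):
    """arch[i] = (previous_node_index, op_name) chosen by the controller."""
    outputs = [x]                      # node 0 is the input
    for node, (prev, op) in enumerate(arch, start=1):
        w = shared[(node, op)]         # reuse weights from the shared pool
        outputs.append(apply_op(op, outputs[prev] @ w))
    return outputs[-1]

# Two different child models sampled by a controller reuse the same pool,
# so neither one is trained from scratch to convergence.
x = rng.normal(size=(DIM,))
child_a = [(0, "tanh"), (1, "relu"), (2, "identity")]
child_b = [(0, "tanh"), (0, "relu"), (1, "tanh")]
print(forward(child_a, x).shape, forward(child_b, x).shape)
```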
(1) Macro Search: searches the entire network structure (7 hours)
The 6 available operations (fewer than NAS's 13)
• Convolution with kernel size 3 × 3 and 5 × 5.
• Depthwise-Separable Convolution with kernel size 3 × 3 and 5 × 5.
• Average Pooling / Max pooling with kernel size 3 × 3.
6^L × 2^(L(L−1)/2) candidates
(L = 12: ≈ 1.6 × 10^29 candidates)
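The candidate count above can be sanity-checked in a couple of lines, assuming 6 operation choices per layer and an independent binary skip-connection decision for each of the L(L−1)/2 layer pairs:

```python
# Macro search space: 6 operation choices per layer times
# 2^(L*(L-1)/2) possible skip-connection patterns across L layers.
L = 12
macro = 6 ** L * 2 ** (L * (L - 1) // 2)
print(f"{macro:.1e}")  # ~1.6e+29 candidate networks
```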
(2) Micro Search: searches at the cell level, then stacks the cells (11.5 hours)
The 5 available operations (fewer than NAS's 13)
• Identity
• Separable Convolution with kernel size 3 × 3 and 5 × 5.
• Average Pooling / Max pooling with kernel size 3 × 3.
(5 × (B−2)!)^4 candidates
(B = 7: ≈ 1.3 × 10^11 candidates)
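Likewise, the micro search-space size follows from the (5 × (B−2)!)^4 formula above:

```python
# Micro search space: (5 * (B-2)!)^4 cell configurations for B = 7 nodes
# (two inputs and two of 5 operations chosen per node, for both the
# normal and the reduction cell).
from math import factorial

B = 7
micro = (5 * factorial(B - 2)) ** 4
print(f"{micro:.1e}")  # ~1.3e+11 candidate cell pairs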
https://github.com/melodyguan/enas
Normal Cell
Reduction Cell