From Pixel RNN
to Pixel CNN++
2020. 01. 16 (Thu)
Dongheon Lee
Contents
Taxonomy of Generative Models
(1) Pixel RNN (Google DeepMind, arXiv, 2016)
(2) Pixel CNN (Google DeepMind, arXiv, 2016)
(3) Gated Pixel CNN (Google DeepMind, NIPS, 2016)
(4) Pixel CNN++ (OpenAI, ICLR, 2017)
Taxonomy of Generative Models
Generative models can be summarized as models trained on the basis of
maximum likelihood. Various strategies exist depending on how the likelihood
is handled (for example, whether it is approximated or represented exactly).
Taxonomy of Generative Models
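Explicit density: a density (= prior distribution, model) is defined
(+) Relatively easy to handle, and the model's behavior is predictable to some extent
(-) Limited: it cannot produce results beyond what we already know
Implicit density: sampling without defining a density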
Taxonomy of Generative Models
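Generator-based: samples are drawn from the distribution produced by the generator
(unlike a Markov chain, a sample is generated without an input)
Markov chain-based: if samples x′ are drawn repeatedly, x′ eventually
converges to a sample from p_model(x)
(+) Performs reasonably well when the variance between samples is not high
(-) Performance degrades in high dimensions and computation is slow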
Taxonomy of Generative Models
During training, the density can be computed
mathematically (via calculus)
Neural autoregressive →
: a model that predicts its current value
from its own previous values
Taxonomy of Generative Models
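• Encoder: maps the data x to a latent code z
• Decoder: from a latent code z, produces a reconstructed sample
that should be close to the data x used to obtain the latent code
$p_\theta(x) = \int p_\theta(x, z)\, dz$ ← The VAE expresses the likelihood as an integral over the joint distribution.
Because this integral cannot be evaluated directly,
it is estimated with variational inference.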
(1) Pixel RNN
• The key to an autoregressive model is fixing the order of the dependencies among the data.
• One effective approach to tractably model a joint distribution of the pixels in the
image is to cast it as a product of conditional distributions.
→ Pixels are processed in order, from 1 to n²
Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
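The product-of-conditionals idea above can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_conditional` is a hypothetical stand-in for the network, and the image is binary to keep the example short.

```python
import numpy as np

def toy_conditional(context):
    """Hypothetical stand-in for the network: P(x_i = 1 | x_1, ..., x_{i-1}).

    It only looks at pixels that come earlier in raster order.
    """
    if context.size == 0:
        return 0.5
    return 0.25 + 0.5 * context.mean()

def log_likelihood(image):
    """Chain rule: log p(x) = sum_i log p(x_i | x_<i) over all n*n pixels."""
    flat = image.reshape(-1)  # raster-scan order: row by row, left to right
    total = 0.0
    for i, x_i in enumerate(flat):
        p_one = toy_conditional(flat[:i])
        total += np.log(p_one if x_i == 1 else 1.0 - p_one)
    return total

def sample(n, rng):
    """Generate pixels one at a time, each conditioned on the previous ones."""
    flat = np.zeros(n * n, dtype=np.int64)
    for i in range(n * n):
        flat[i] = rng.random() < toy_conditional(flat[:i])
    return flat.reshape(n, n)

rng = np.random.default_rng(0)
img = sample(4, rng)
print(img)
print(log_likelihood(img))
```

Because every conditional is a valid distribution, the probabilities of all possible images sum to one, which is what makes the likelihood tractable.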
(1) Pixel RNN
Architecture
(1) Pixel RNN
• Channels are processed in the order R, G, B
MASK
: First layer (mask A): each of the RGB channels is connected to the previous
channels and to the context, but is not connected to itself.
: Subsequent layers (mask B): the channels are also connected to themselves.
Multiple residual blocks (the number varies by model)
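The masking rule for the first and subsequent layers can be sketched as a binary mask applied to the convolution weights. This is a minimal sketch, not the authors' code: `make_mask` is a hypothetical helper, and it assumes the channels are split into three equal groups ordered R, G, B.

```python
import numpy as np

def make_mask(kernel_size, in_channels, out_channels, mask_type):
    """Build an (out, in, k, k) binary mask for a masked convolution.

    mask_type "A": first layer, a colour may not see itself at the centre.
    mask_type "B": later layers, a colour may also see itself at the centre.
    """
    assert mask_type in ("A", "B")
    k = kernel_size
    c = k // 2
    mask = np.zeros((out_channels, in_channels, k, k), dtype=np.float32)
    mask[:, :, :c, :] = 1.0   # all rows above the centre (the context)
    mask[:, :, c, :c] = 1.0   # pixels to the left in the centre row

    # Centre tap: connectivity depends on the colour group of each channel.
    out_group = np.arange(out_channels) * 3 // out_channels  # 0=R, 1=G, 2=B
    in_group = np.arange(in_channels) * 3 // in_channels
    if mask_type == "A":
        centre = out_group[:, None] > in_group[None, :]
    else:
        centre = out_group[:, None] >= in_group[None, :]
    mask[:, :, c, c] = centre
    return mask

mask_a = make_mask(3, in_channels=3, out_channels=3, mask_type="A")
mask_b = make_mask(3, in_channels=3, out_channels=3, mask_type="B")
print(mask_a[:, :, 1, 1])  # centre tap, rows = output colour, columns = input colour
print(mask_b[:, :, 1, 1])
```

Multiplying the weights by such a mask before every convolution keeps each output from depending on pixels or channels that come later in the ordering.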
(1) Pixel RNN
Row LSTM
• Input-to-state and state-to-state components
[Figure: Input, Hidden State]
• Multiplication → convolution: the LSTM's matrix multiplications are replaced with convolutions
https://www.slideshare.net/thinkingfactory/pr12-pixelrnn-jaejun-yoo?from_action=save
(1) Pixel RNN
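Diagonal BiLSTM
• Input-to-state and state-to-state components; the state-to-state convolution uses a 2×1 kernel
[Figure: Input, Hidden State]
• Diagonal convolutions are awkward to compute, so the feature maps are skewed
→ the computation can then be parallelized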
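The skewing trick can be illustrated with a small array. In this sketch (the helper names `skew` and `unskew` are hypothetical), row i is shifted right by i positions, so each diagonal of the original map lines up in a single column and can be processed in one step.

```python
import numpy as np

def skew(x):
    """Shift row i right by i positions: (H, W) -> (H, W + H - 1)."""
    h, w = x.shape
    out = np.zeros((h, w + h - 1), dtype=x.dtype)
    for i in range(h):
        out[i, i:i + w] = x[i]
    return out

def unskew(y, w):
    """Undo skew(): recover the original (H, W) map."""
    h = y.shape[0]
    return np.stack([y[i, i:i + w] for i in range(h)])

x = np.arange(9).reshape(3, 3)
s = skew(x)
print(s)
print(s[:, 2])       # one column of the skewed map = one diagonal of x
print(unskew(s, 3))  # identical to x
```

After skewing, a convolution that moves column by column visits the image diagonal by diagonal, and all entries of a column can be computed at once.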
(2) Pixel CNN
• Input-to-state component
[Figure: Input, Hidden State]
Experiments
• Discrete Softmax Distribution
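A discrete softmax output treats each colour value as one of 256 classes. The sketch below assumes a hypothetical array of logits standing in for the network output at a single pixel (one row per colour channel) and shows how a value is sampled from it.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 256))  # hypothetical logits for R, G, B of one pixel
probs = softmax(logits)             # one 256-way distribution per channel
sampled = np.array([rng.choice(256, p=p) for p in probs])
print(probs.shape, sampled)
```

Since the output is a full categorical distribution, it can be multimodal, and no assumption about the shape of the distribution is needed.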
Experiments
• Negative log-likelihood (NLL)
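For image models the NLL is commonly reported in bits per dimension, that is, the average negative log-probability per colour value, measured in base 2. A minimal sketch, with `bits_per_dim` as a hypothetical helper name:

```python
import numpy as np

def bits_per_dim(probs, targets):
    """Mean negative log-likelihood in bits per dimension.

    probs:   (N, 256) predicted distribution for each of N colour values.
    targets: (N,) true integer values in 0..255.
    """
    picked = probs[np.arange(len(targets)), targets]
    nll_nats = -np.log(picked)
    return nll_nats.mean() / np.log(2.0)  # convert nats to bits

# A model that predicts the uniform distribution needs log2(256) = 8 bits.
uniform = np.full((12, 256), 1.0 / 256)
targets = np.arange(12)
print(bits_per_dim(uniform, targets))
```

Lower is better: any model that beats the uniform baseline scores below 8 bits per dimension on 8-bit data.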
(3) Gated Pixel CNN
• Improving Pixel CNN performance
1) ReLU → Gated Activation Unit → Conditional PixelCNN
<A single layer in the Gated PixelCNN architecture>
Condition
(V_{k,g} ∗ s is an unmasked 1 × 1 convolution, h = s)
Van den Oord, Aaron, et al. "Conditional image generation with pixelcnn decoders." Advances in neural information processing systems. 2016.
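The gated activation unit with conditioning, y = tanh(W_f ∗ x + V_f^T h) ⊙ σ(W_g ∗ x + V_g^T h), can be sketched as follows. This is an illustration under assumptions: the masked convolution is taken as already computed (`conv_out`), and all array names and shapes are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gated_activation(conv_out, h, v_f, v_g):
    """Gated activation with a global conditioning vector.

    conv_out: (2C, H, W) output of a masked convolution; the first C maps
              feed the tanh "filter", the last C maps feed the sigmoid "gate".
    h:        (D,) conditioning vector.
    v_f, v_g: (D, C) projections of h for the filter and the gate.
    """
    c = conv_out.shape[0] // 2
    f, g = conv_out[:c], conv_out[c:]
    f = f + (h @ v_f)[:, None, None]   # same bias at every spatial location
    g = g + (h @ v_g)[:, None, None]
    return np.tanh(f) * sigmoid(g)

rng = np.random.default_rng(0)
conv_out = rng.normal(size=(8, 5, 5))  # 2C = 8 feature maps
h = rng.normal(size=3)                 # e.g. a class embedding
v_f = rng.normal(size=(3, 4))
v_g = rng.normal(size=(3, 4))
y = gated_activation(conv_out, h, v_f, v_g)
print(y.shape)  # (4, 5, 5)
```

For location-dependent conditioning, the projected vector would be replaced by a 1 × 1 convolution over a spatial map s, which adds a different bias at each position.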
(3) Gated Pixel CNN
2) Stacks: removing the blind spot
[Figure: receptive fields of PixelCNN and Gated PixelCNN around the current pixel]
1. Horizontal stack: conditions only on the current row and takes as input the output of the previous layer as
well as the output of the vertical stack.
2. Vertical stack: conditions on all the rows above the current pixel and needs no masking. Its output
is fed into the horizontal stack, and its receptive field grows in a rectangular fashion.
https://towardsdatascience.com/auto-regressive-generative-models-pixelrnn-pixelcnn-32d192911173
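The blind spot of the original PixelCNN can be reproduced by tracking which pixel offsets a stack of 3×3 masked convolutions can reach. The sketch below is an illustration with a hypothetical helper, `masked_receptive_field`, and assumes every layer uses a 3×3 kernel.

```python
import numpy as np

def masked_receptive_field(num_layers, size=11):
    """Mark the input offsets reachable through stacked 3x3 masked convolutions.

    The centre of the returned (size, size) boolean grid is the current pixel.
    Each layer sees the three pixels in the row above, the pixel to the left,
    and (for mask B) the centre itself.
    """
    c = size // 2
    steps = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0)]
    reach = np.zeros((size, size), dtype=bool)
    reach[c, c] = True
    for _ in range(num_layers):
        new = np.zeros_like(reach)
        for y, x in zip(*np.nonzero(reach)):
            for dy, dx in steps:
                ny, nx = y + dy, x + dx
                if 0 <= ny < size and 0 <= nx < size:
                    new[ny, nx] = True
        reach = new
    reach[c, c] = False  # mask A in the first layer hides the current pixel
    return reach

size = 11
c = size // 2
for n in range(1, 6):
    rf = masked_receptive_field(n, size)
    # One row up: one column to the right is seen, two columns to the right never is.
    print(n, rf[c - 1, c + 1], rf[c - 1, c + 2])
```

Each step up can move at most one column to the right, so pixels that lie above the current pixel but far to the right stay invisible however many layers are stacked. The vertical stack avoids this because it covers all rows above in a rectangular region.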
(4) Pixel CNN++
1) Discretized logistic mixture likelihood
A softmax layer that computes the conditional distribution of each pixel value is very costly in terms of
memory. It also makes gradients sparse early in training.
→ To counter this, we assume a latent color intensity with a continuous distribution, akin to that used in variational autoencoders.
The intensity is rounded to its nearest 8-bit representation to give the pixel value. The intensity follows a mixture of logistic distributions, so
the probability of each discrete pixel value is easy to compute.
→ This method is memory efficient: the output has far fewer dimensions, which yields denser gradients and thus addresses both problems.
Salimans, Tim, et al. "Pixelcnn++: Improving the pixelcnn with discretized logistic mixture likelihood and other modifications." arXiv preprint arXiv:1701.05517 (2017).
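The discretized logistic mixture can be sketched directly from this description: the probability of an 8-bit value is the mass of the continuous distribution that rounds to it, and the logistic CDF is a sigmoid. The function name and the example parameters below are hypothetical.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def discretized_logistic_mixture(pi, mu, s):
    """Probability of each 8-bit value 0..255 under a mixture of logistics.

    pi: (K,) mixture weights summing to 1; mu: (K,) means; s: (K,) scales > 0.
    The continuous intensity is rounded to the nearest integer; all mass below
    0.5 goes to value 0 and all mass above 254.5 goes to value 255.
    """
    x = np.arange(256, dtype=np.float64)[:, None]  # (256, 1), broadcasts against (K,)
    upper = sigmoid((x + 0.5 - mu) / s)            # CDF at the right edge of each bin
    lower = sigmoid((x - 0.5 - mu) / s)            # CDF at the left edge of each bin
    upper[-1] = 1.0                                # edge case: value 255
    lower[0] = 0.0                                 # edge case: value 0
    return ((upper - lower) * pi).sum(axis=1)      # (256,)

pi = np.array([0.3, 0.7])
mu = np.array([60.0, 180.0])
s = np.array([5.0, 10.0])
probs = discretized_logistic_mixture(pi, mu, s)
print(probs.sum(), probs.argmax())
```

With K mixture components the network only has to output 3K numbers per colour value instead of 256 logits, which is where the memory saving comes from.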
(4) Pixel CNN++
2) Other Modifications
• Conditioning on whole pixels: PixelCNN factorizes the model over the 3 sub-pixels according to color (RGB), which
complicates the model. The dependency between the color channels of a pixel is relatively simple and does not
require a deep model to capture.
→ Therefore, it is better to condition on whole pixels instead of separate colors and then output a joint distribution over
all 3 channels of the predicted pixel.
• Downsampling: PixelCNN struggles to capture long-range dependencies. This is one of the reasons
it cannot match the performance of PixelRNN. To overcome this, the layers are downsampled using
convolutions of stride 2. Downsampling reduces the input size and thus enlarges the relative size of the receptive field. This
causes some loss of information, which can be compensated for by adding extra short-cut connections.
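The effect of stride-2 convolutions on the receptive field can be checked with standard receptive-field arithmetic. The sketch below is generic (it ignores masking), and the two layer configurations are hypothetical examples rather than the PixelCNN++ architecture.

```python
def receptive_field_size(layers):
    """Receptive field width in input pixels for a stack of convolutions.

    layers: list of (kernel_size, stride) tuples, from first to last layer.
    """
    rf, jump = 1, 1
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= stride                  # a stride multiplies the spacing of later taps
    return rf

plain = [(3, 1)] * 6
strided = [(3, 1), (3, 2), (3, 1), (3, 2), (3, 1), (3, 1)]
print(receptive_field_size(plain))    # 13
print(receptive_field_size(strided))  # 29
```

With the same number of layers, two stride-2 convolutions more than double the receptive field, at the cost of working on coarser feature maps.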
(4) Pixel CNN++
2) Other Modifications
• Short-cut connections: The model follows the encoder-decoder structure of U-Net. Layers 2 and 3 are downsampled and then
layers 5 and 6 are upsampled. Residual connections from the encoder to the decoder provide the localized
information.
• Dropout: Since both PixelCNN and PixelCNN++ are very powerful models, they are likely to overfit the data if not
regularized. So, dropout is applied on the residual path after the first convolution.
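Where the dropout sits in a residual block can be sketched as follows. This is a simplified illustration, not the PixelCNN++ layer: dense matrices stand in for the convolutions, ReLU stands in for the actual nonlinearity, and all names are hypothetical.

```python
import numpy as np

def residual_block(x, w1, w2, drop_prob, rng, training=True):
    """Residual block with dropout after the first layer of the residual path.

    x: (N, C) features; w1, w2: (C, C) weights standing in for convolutions.
    """
    h = np.maximum(x @ w1, 0.0)                # first "convolution" + nonlinearity
    if training and drop_prob > 0.0:
        keep = rng.random(h.shape) >= drop_prob
        h = h * keep / (1.0 - drop_prob)       # inverted dropout keeps the expected value
    return x + h @ w2                          # residual path added back to the input

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
w1 = rng.normal(size=(8, 8)) * 0.1
w2 = rng.normal(size=(8, 8)) * 0.1
y_train = residual_block(x, w1, w2, drop_prob=0.5, rng=rng)
y_eval = residual_block(x, w1, w2, drop_prob=0.5, rng=rng, training=False)
print(y_train.shape, y_eval.shape)
```

Dropout only perturbs the residual path, so the identity connection still carries the unmodified input through the block.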
Thank you

Pixel RNN to Pixel CNN++

  • 1. From Pixel RNN to Pixel CNN++. January 16, 2020 (Thu). 이동헌
  • 2. Contents: Taxonomy of Generative Models; (1) Pixel RNN (Google DeepMind, arXiv, 2016); (2) Pixel CNN (Google DeepMind, arXiv, 2016); (3) Gated Pixel CNN (Google DeepMind, NIPS, 2016); (4) Pixel CNN++ (OpenAI, ICLR, 2017)
  • 3. Taxonomy of Generative Models. Generative models can be summarized as models trained on the basis of maximum likelihood. Various strategies exist depending on how the likelihood is handled (for example, whether it is approximated or represented exactly).
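  • 4. Taxonomy of Generative Models. Define a density (= prior distribution, model): (+) relatively easy to work with, and the model's behavior is predictable to some extent; (-) limited in that it cannot produce results beyond what we already know. Alternatively, sample without defining a density.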
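  • 5. Taxonomy of Generative Models. Samples are drawn from the distribution produced by the generator (unlike a Markov chain, samples are generated without an input). Repeatedly drawing a sample x′ eventually makes x′ converge to a sample from p_model(x): (+) decent performance when the variance among samples is not high; (-) performance degrades in high dimensions and computation is slow.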
  • 6. Taxonomy of Generative Models. During training, the density can be computed mathematically (by calculus). Neural Autoregressive → a model that predicts its current value from its own previous values.
  • 7. Taxonomy of Generative Models • Encoder: [equation garbled in source] • Decoder: from a latent code z, produce a reconstructed sample that is close to x, the data used to obtain the latent code. [equation garbled in source] ← The VAE expresses the density as an integral over the joint distribution and, because this integral cannot be computed 'directly', 'estimates' it with variational inference.
  • 8. (1) Pixel RNN • The key to an autoregressive model is fixing the order of dependencies among the data! • One effective approach to tractably model a joint distribution of the pixels in the image is to cast it as a product of conditional distributions. → Proceed in pixel order (1 to n²). Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
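The product of conditionals mentioned on this slide can be written out as follows, where x_i denotes the i-th pixel of an n × n image in raster-scan order (row by row, left to right), as in the cited paper:

```latex
p(\mathbf{x}) = \prod_{i=1}^{n^2} p\left(x_i \mid x_1, \ldots, x_{i-1}\right)
```

Each factor is the distribution of one pixel given all pixels that precede it in the chosen order, which is why fixing the order is the central design decision.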
  • 9. (1) Pixel RNN Architecture Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
  • 10. (1) Pixel RNN • Proceed in R, G, B order. MASK. Mask A (first layer): each of the RGB channels is connected to previous channels and to the context, but is not connected to itself. Mask B (subsequent layers): the channels are also connected to themselves. Multiple residual blocks (the number varies by model). Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
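The two mask types described on this slide can be sketched as a 0/1 array that is multiplied into a convolution kernel. This is a minimal illustration, not the authors' implementation: the function name `pixelcnn_mask` and the rule that assigns feature channels to colours (channel index modulo 3) are assumptions made for the example.

```python
import numpy as np

def pixelcnn_mask(kernel_size, in_channels, out_channels, mask_type, n_colors=3):
    """Return a 0/1 mask of shape (out_channels, in_channels, k, k).

    mask_type "A": the centre tap only sees colours that come earlier in
                   R, G, B order (a channel is not connected to itself).
    mask_type "B": the centre tap also sees the channel's own colour.
    """
    assert mask_type in ("A", "B")
    k = kernel_size
    c = k // 2
    mask = np.zeros((out_channels, in_channels, k, k), dtype=np.float32)
    mask[:, :, :c, :] = 1.0   # all rows strictly above the current pixel
    mask[:, :, c, :c] = 1.0   # pixels to the left in the current row
    # Assumption: channel j belongs to colour j % n_colors (0=R, 1=G, 2=B).
    out_color = np.arange(out_channels) % n_colors
    in_color = np.arange(in_channels) % n_colors
    if mask_type == "A":
        center = out_color[:, None] > in_color[None, :]
    else:
        center = out_color[:, None] >= in_color[None, :]
    mask[:, :, c, c] = center.astype(np.float32)
    return mask

mask_a = pixelcnn_mask(3, 3, 3, "A")
mask_b = pixelcnn_mask(3, 3, 3, "B")
```

With three input and three output channels, `mask_a[1, 0, 1, 1]` is 1 (G may look at R of the same pixel) while `mask_a[0, 0, 1, 1]` is 0 (R may not look at itself); in `mask_b` the latter becomes 1.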
  • 11. (1) Pixel RNN: Row LSTM. Input, Hidden State; input-to-state & state-to-state. Multiplication → Convolution. https://www.slideshare.net/thinkingfactory/pr12-pixelrnn-jaejun-yoo?from_action=save
  • 12. (1) Pixel RNN: Diagonal BiLSTM. Input, Hidden State; input-to-state & state-to-state. 2x1 Conv • Since a diagonal convolution is difficult to implement, skew the feature maps → it can be parallelized. https://www.slideshare.net/thinkingfactory/pr12-pixelrnn-jaejun-yoo?from_action=save
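The skewing trick can be sketched in a few lines: each row is shifted one position to the right relative to the row above, so that every column of the skewed map holds one diagonal of the original map and a whole diagonal can be processed in one step. The helper names `skew` and `unskew` are hypothetical, and the example works on a single 2-D feature map for simplicity.

```python
import numpy as np

def skew(x):
    """Shift row i of an (H, W) map right by i, giving an (H, W + H - 1) map."""
    h, w = x.shape
    out = np.zeros((h, w + h - 1), dtype=x.dtype)
    for i in range(h):
        out[i, i:i + w] = x[i]
    return out

def unskew(y, w):
    """Invert skew(): recover the original (H, w) map."""
    h = y.shape[0]
    return np.stack([y[i, i:i + w] for i in range(h)])

x = np.arange(9).reshape(3, 3)
s = skew(x)
diagonal = s[:, 2]          # one column of the skewed map
restored = unskew(s, 3)
```

Here `s[i, j]` equals `x[i, j - i]`, so column 2 of `s` collects `x[0, 2]`, `x[1, 1]`, `x[2, 0]`, which is one diagonal of the original map.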
  • 13. (2) Pixel CNN input-to-state Input Hidden State Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
  • 14.
  • 15. Experiments • Discrete Softmax Distribution Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
  • 16. Experiments • Negative log-likelihood (NLL) Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
  • 17. Experiments Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
  • 18. Experiments Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
  • 19. (3) Gated Pixel CNN • Improving Pixel CNN performance: 1) ReLU → Gated Activation Unit → Conditional PixelCNN <A single layer in the Gated PixelCNN architecture> Condition (V_{k,g} ∗ s is an unmasked 1 × 1 convolution, h = s) Van den Oord, Aaron, et al. "Conditional image generation with pixelcnn decoders." Advances in neural information processing systems. 2016.
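In the cited paper the gated activation unit is y = tanh(W_{k,f} ∗ x) ⊙ σ(W_{k,g} ∗ x), and the conditional variant adds a term such as V_{k,f} ∗ s and V_{k,g} ∗ s inside the tanh and the sigmoid. The sketch below applies only the nonlinearity to an already computed pre-activation. The function name `gated_activation`, the convention that the first half of the channels is the filter and the second half is the gate, and passing the conditioning term as a ready-made array are assumptions made for illustration.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gated_activation(pre, cond=None):
    """Apply tanh(f) * sigmoid(g) to a (2p, H, W) pre-activation.

    pre  : output of the masked convolution, with 2p channels.
    cond : optional conditioning term broadcastable to pre
           (stands in for the V * s contribution).
    """
    if cond is not None:
        pre = pre + cond
    # Assumption: first p channels are the filter, last p channels the gate.
    f, g = np.split(pre, 2, axis=0)
    return np.tanh(f) * sigmoid(g)

out_zero = gated_activation(np.zeros((4, 2, 2)))
out_one = gated_activation(np.ones((2, 1, 1)))
out_cond = gated_activation(np.zeros((2, 1, 1)), cond=np.ones((2, 1, 1)))
```

The gate output lies in (0, 1) and scales the tanh feature, so the unit halves the channel count from 2p to p.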
  • 20. (3) Gated Pixel CNN 2) Stacks: removing the blind spot. PixelCNN vs. Gated PixelCNN (current pixel marked). 1. Horizontal stack: it conditions only on the current row and takes as input the output of the previous layer as well as the output of the vertical stack. 2. Vertical stack: it conditions on all the rows above the current pixel. It doesn't have any masking. Its output is fed into the horizontal stack, and the receptive field grows in a rectangular fashion. https://towardsdatascience.com/auto-regressive-generative-models-pixelrnn-pixelcnn-32d192911173
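The blind spot can be made concrete by tracking which input offsets can reach the current pixel after several stacked layers. The sketch below is a simplified model under stated assumptions: kernels are 3 × 3, the masked kernel keeps the row above plus the left neighbour and the centre, and the vertical stack is modelled as an unmasked kernel over the current and previous row whose output is shifted down by one row before it feeds the horizontal stack. All names are hypothetical.

```python
def receptive_field(offsets, n_layers):
    """Set of (dy, dx) input offsets that can influence the output at (0, 0)."""
    reach = {(0, 0)}
    for _ in range(n_layers):
        reach = {(y + dy, x + dx) for (y, x) in reach for (dy, dx) in offsets}
    return reach

N = 5  # number of stacked layers

# Original PixelCNN: 3x3 kernel with the masked taps removed.
MASKED_3X3 = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0)]
rf = receptive_field(MASKED_3X3, N)

# Pixels in the rows above that are valid context but never reached.
blind = [(dy, dx) for dy in range(-N, 0) for dx in range(-N, N + 1)
         if (dy, dx) not in rf]

# Simplified vertical stack: unmasked kernel over the current and previous row,
# then shifted down one row so it only exposes rows strictly above.
VERTICAL = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 0), (0, 1)]
vertical_rf = {(dy - 1, dx) for (dy, dx) in receptive_field(VERTICAL, N)}
covered = all(b in vertical_rf for b in blind)
```

In this model the masked stack can only reach offsets with dx no larger than the number of rows moved up, so a pixel such as (-1, 2) (one row up, two columns right) is never seen, whereas the rectangular receptive field of the vertical stack covers it.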
  • 21. (4) Pixel CNN++ 1) Discretized logistic mixture likelihood. The softmax layer used to compute the conditional distribution of a pixel is very costly in terms of memory. It also makes gradients sparse early on during training. → To counter this, we assume a latent color intensity with a continuous distribution, akin to that used in variational autoencoders. It is rounded to its nearest 8-bit representation to give the pixel value. The intensity is modeled with a mixture of logistic distributions, whose CDF is a sigmoid, so the probability of each pixel value can be computed easily. → This method is memory efficient, and the output has lower dimension, which provides denser gradients, thus solving both problems. Salimans, Tim, et al. "Pixelcnn++: Improving the pixelcnn with discretized logistic mixture likelihood and other modifications." arXiv preprint arXiv:1701.05517 (2017).
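A minimal sketch of the discretized logistic likelihood: the probability of an 8-bit value x is the logistic CDF evaluated at x + 0.5 minus the CDF at x - 0.5, with the two edge bins (0 and 255) absorbing the remaining tail mass. The function names and the choice to work directly on the 0..255 scale (rather than a rescaled range) are assumptions made for the example.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def discretized_logistic_pmf(mu, s):
    """PMF over pixel values 0..255 for a logistic(mu, s) latent intensity."""
    x = np.arange(256, dtype=np.float64)
    upper = sigmoid((x + 0.5 - mu) / s)   # CDF at the upper bin edge
    lower = sigmoid((x - 0.5 - mu) / s)   # CDF at the lower bin edge
    upper[-1] = 1.0   # bin 255 takes all mass above 254.5
    lower[0] = 0.0    # bin 0 takes all mass below 0.5
    return upper - lower

def mixture_pmf(weights, mus, scales):
    """Mixture of discretized logistics; weights are assumed to sum to 1."""
    return sum(w * discretized_logistic_pmf(m, s)
               for w, m, s in zip(weights, mus, scales))

single = discretized_logistic_pmf(mu=100.0, s=10.0)
mixed = mixture_pmf([0.3, 0.7], [50.0, 200.0], [8.0, 12.0])
```

Each mixture component needs only a weight, a mean and a scale, compared with 256 logits per sub-pixel for a softmax output.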
  • 22. (4) Pixel CNN++ 2) Other Modifications • Conditioning on whole pixels: PixelCNN factorizes the model over the 3 sub-pixels according to color (RGB), which complicates the model. The dependency between the color channels of a pixel is relatively simple and doesn't require a deep model. → Therefore, it is better to condition on whole pixels instead of separate colors and then output a joint distribution over all 3 channels of the predicted pixel. • Downsampling: PixelCNN has difficulty capturing long-range dependencies, which is one of the reasons it cannot match the performance of PixelRNN. To overcome this, the layers are downsampled using convolutions of stride 2. Downsampling reduces the input size and thus increases the relative size of the receptive field. This leads to some loss of information, which can be compensated for by adding extra short-cut connections. https://towardsdatascience.com/auto-regressive-generative-models-pixelrnn-pixelcnn-32d192911173
  • 23. (4) Pixel CNN++ 2) Other Modifications • Short-cut connections: the model follows the encoder-decoder structure of U-Net. Layers 2 and 3 are downsampled and then layers 5 and 6 are upsampled. Residual connections from the encoder to the decoder provide the localized information. • Dropout: since both PixelCNN and PixelCNN++ are very powerful models, they are likely to overfit the data if not regularized. So, dropout is applied on the residual path after the first convolution. https://towardsdatascience.com/auto-regressive-generative-models-pixelrnn-pixelcnn-32d192911173
  • 24. Experiments Salimans, Tim, et al. "Pixelcnn++: Improving the pixelcnn with discretized logistic mixture likelihood and other modifications." arXiv preprint arXiv:1701.05517 (2017).