SlideShare a Scribd company logo
1 of 10
Download to read offline
Neural Discrete Representation Learning
https://arxiv.org/abs/1711.00937
Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu
박수철
Goal
Neural NetworkMusic Data Generated Data
Wavenet
WavenetMusic Data Generated Data
Wavenet의 한계
Long-range structure를 반영하지 못한다.

receptive field를 늘려도 sample단위의 미래를 예측할 뿐, 

의미를 만들어내는 단어(speech), 프레이즈(music)를 만들어내지 못함.
https://deepmind.com/blog/wavenet-generative-model-raw-audio
WavenetMusic Data Generated Data
Latent를 만들자
z1 z2 zM
Encoder
Decoder
Sampling 가능한 Latent를 만들자 (VAE같은걸 끼얹나?)
z1 z2 zM
Decoder
Encoder
VAE의 한계 : Posterior Collapse
Encoder
Decoder
Gaussian Noise
p(x) =
T
∏
t=1
p(xt |x<t, zt)
Variational posterior gaussian prior
VQ-VAE : 샘플링 가능한 비 노이즈적 벡터로 posterior를 근사
Encoder
Decoder
p(x) =
T
∏
t=1
p(xt |x<t, zt)
p(z) = Categorical
VQ-VAE : 샘플링 가능한 비 노이즈적 벡터로 posterior를 근사
Encoder Decoderze(x)
zq(x)
codebook
reconstruction codebook commitment
e1
e2
e3
e4
e5
e6
e7
VQ-VAE : 샘플링 가능한 비 노이즈적 벡터로 posterior를 근사

More Related Content

What's hot

Lecture_16_Self-supervised_Learning.pptx
Lecture_16_Self-supervised_Learning.pptxLecture_16_Self-supervised_Learning.pptx
Lecture_16_Self-supervised_Learning.pptxKarimdabbabi
 
DNNの曖昧性に関する研究動向
DNNの曖昧性に関する研究動向DNNの曖昧性に関する研究動向
DNNの曖昧性に関する研究動向Naoki Matsunaga
 
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...Jason Tsai
 
Generating Diverse High-Fidelity Images with VQ-VAE-2
Generating Diverse High-Fidelity Images with VQ-VAE-2Generating Diverse High-Fidelity Images with VQ-VAE-2
Generating Diverse High-Fidelity Images with VQ-VAE-2harmonylab
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기NAVER Engineering
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
[CVPR読み会]BING:Binarized normed gradients for objectness estimation at 300fps
[CVPR読み会]BING:Binarized normed gradients for objectness estimation at 300fps[CVPR読み会]BING:Binarized normed gradients for objectness estimation at 300fps
[CVPR読み会]BING:Binarized normed gradients for objectness estimation at 300fpsTakuya Minagawa
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelYanbin Kong
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsJinwon Lee
 
Physics-Informed Machine Learning
Physics-Informed Machine LearningPhysics-Informed Machine Learning
Physics-Informed Machine LearningOmarYounis21
 
A Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual RepresentationsA Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual RepresentationsSeunghyun Hwang
 
Deep Belief Networks
Deep Belief NetworksDeep Belief Networks
Deep Belief NetworksHasan H Topcu
 
Toward Disentanglement through Understand ELBO
Toward Disentanglement through Understand ELBOToward Disentanglement through Understand ELBO
Toward Disentanglement through Understand ELBOKai-Wen Zhao
 
[DL輪読会]High-Quality Self-Supervised Deep Image Denoising
[DL輪読会]High-Quality Self-Supervised Deep Image Denoising[DL輪読会]High-Quality Self-Supervised Deep Image Denoising
[DL輪読会]High-Quality Self-Supervised Deep Image DenoisingDeep Learning JP
 
PR-214: FlowNet: Learning Optical Flow with Convolutional Networks
PR-214: FlowNet: Learning Optical Flow with Convolutional NetworksPR-214: FlowNet: Learning Optical Flow with Convolutional Networks
PR-214: FlowNet: Learning Optical Flow with Convolutional NetworksHyeongmin Lee
 
PRML学習者から入る深層生成モデル入門
PRML学習者から入る深層生成モデル入門PRML学習者から入る深層生成モデル入門
PRML学習者から入る深層生成モデル入門tmtm otm
 
Res netと派生研究の紹介
Res netと派生研究の紹介Res netと派生研究の紹介
Res netと派生研究の紹介masataka nishimori
 
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2Preferred Networks
 
Backpropagation in Convolutional Neural Network
Backpropagation in Convolutional Neural NetworkBackpropagation in Convolutional Neural Network
Backpropagation in Convolutional Neural NetworkHiroshi Kuwajima
 

What's hot (20)

Lecture_16_Self-supervised_Learning.pptx
Lecture_16_Self-supervised_Learning.pptxLecture_16_Self-supervised_Learning.pptx
Lecture_16_Self-supervised_Learning.pptx
 
DNNの曖昧性に関する研究動向
DNNの曖昧性に関する研究動向DNNの曖昧性に関する研究動向
DNNの曖昧性に関する研究動向
 
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
Introduction to Spiking Neural Networks: From a Computational Neuroscience pe...
 
Generating Diverse High-Fidelity Images with VQ-VAE-2
Generating Diverse High-Fidelity Images with VQ-VAE-2Generating Diverse High-Fidelity Images with VQ-VAE-2
Generating Diverse High-Fidelity Images with VQ-VAE-2
 
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
1시간만에 GAN(Generative Adversarial Network) 완전 정복하기
 
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
Variational Autoencoders VAE - Santiago Pascual - UPC Barcelona 2018
 
[CVPR読み会]BING:Binarized normed gradients for objectness estimation at 300fps
[CVPR読み会]BING:Binarized normed gradients for objectness estimation at 300fps[CVPR読み会]BING:Binarized normed gradients for objectness estimation at 300fps
[CVPR読み会]BING:Binarized normed gradients for objectness estimation at 300fps
 
Cs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative ModelCs231n 2017 lecture13 Generative Model
Cs231n 2017 lecture13 Generative Model
 
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual RepresentationsPR-231: A Simple Framework for Contrastive Learning of Visual Representations
PR-231: A Simple Framework for Contrastive Learning of Visual Representations
 
Physics-Informed Machine Learning
Physics-Informed Machine LearningPhysics-Informed Machine Learning
Physics-Informed Machine Learning
 
A Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual RepresentationsA Simple Framework for Contrastive Learning of Visual Representations
A Simple Framework for Contrastive Learning of Visual Representations
 
Deep Belief Networks
Deep Belief NetworksDeep Belief Networks
Deep Belief Networks
 
Toward Disentanglement through Understand ELBO
Toward Disentanglement through Understand ELBOToward Disentanglement through Understand ELBO
Toward Disentanglement through Understand ELBO
 
[DL輪読会]High-Quality Self-Supervised Deep Image Denoising
[DL輪読会]High-Quality Self-Supervised Deep Image Denoising[DL輪読会]High-Quality Self-Supervised Deep Image Denoising
[DL輪読会]High-Quality Self-Supervised Deep Image Denoising
 
Restricted boltzmann machine
Restricted boltzmann machineRestricted boltzmann machine
Restricted boltzmann machine
 
PR-214: FlowNet: Learning Optical Flow with Convolutional Networks
PR-214: FlowNet: Learning Optical Flow with Convolutional NetworksPR-214: FlowNet: Learning Optical Flow with Convolutional Networks
PR-214: FlowNet: Learning Optical Flow with Convolutional Networks
 
PRML学習者から入る深層生成モデル入門
PRML学習者から入る深層生成モデル入門PRML学習者から入る深層生成モデル入門
PRML学習者から入る深層生成モデル入門
 
Res netと派生研究の紹介
Res netと派生研究の紹介Res netと派生研究の紹介
Res netと派生研究の紹介
 
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
Optuna Dashboardの紹介と設計解説 - 2022/12/10 Optuna Meetup #2
 
Backpropagation in Convolutional Neural Network
Backpropagation in Convolutional Neural NetworkBackpropagation in Convolutional Neural Network
Backpropagation in Convolutional Neural Network
 

More from 수철 박

diffusion 모델부터 DALLE2까지.pdf
diffusion 모델부터 DALLE2까지.pdfdiffusion 모델부터 DALLE2까지.pdf
diffusion 모델부터 DALLE2까지.pdf수철 박
 
Flow based generative models
Flow based generative modelsFlow based generative models
Flow based generative models수철 박
 
A universal music translation network
A universal music translation networkA universal music translation network
A universal music translation network수철 박
 

More from 수철 박 (6)

diffusion 모델부터 DALLE2까지.pdf
diffusion 모델부터 DALLE2까지.pdfdiffusion 모델부터 DALLE2까지.pdf
diffusion 모델부터 DALLE2까지.pdf
 
Flow based generative models
Flow based generative modelsFlow based generative models
Flow based generative models
 
Gmm to vgmm
Gmm to vgmmGmm to vgmm
Gmm to vgmm
 
A universal music translation network
A universal music translation networkA universal music translation network
A universal music translation network
 
Kernel Method
Kernel MethodKernel Method
Kernel Method
 
R.T.Bach
R.T.BachR.T.Bach
R.T.Bach
 

VQ-VAE