VQ-VAE


수철 박

Presentation slides for a paper review of Neural Discrete Representation Learning


Neural Discrete Representation Learning
https://arxiv.org/abs/1711.00937
Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu
박수철

Goal
[Figure: Music Data → Neural Network → Generated Data]

Wavenet
[Figure: Music Data → Wavenet → Generated Data]

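Limitations of Wavenet
It fails to capture long-range structure.
Even with a larger receptive field, it only predicts the future one sample at a time;
it does not produce the units that carry meaning, such as words (speech) or phrases (music).
https://deepmind.com/blog/wavenet-generative-model-raw-audio
[Figure: Music Data → Wavenet → Generated Data]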
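To get a feel for how short a sample-level receptive field is, here is a rough calculation. It is only a sketch: the dilation pattern (1, 2, 4, ..., 512), the kernel size of 2, the three repeated stacks, and the 16 kHz sample rate are all assumptions chosen for illustration, not values stated on the slide.

```python
def receptive_field(dilations, kernel_size=2):
    """Number of input samples seen by a stack of dilated causal convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Assumed configuration (hypothetical, for illustration only).
ONE_STACK = [2 ** i for i in range(10)]   # dilations 1, 2, 4, ..., 512
DILATIONS = ONE_STACK * 3                 # the same stack repeated three times
SAMPLE_RATE_HZ = 16_000                   # assumed audio sample rate

samples = receptive_field(DILATIONS)
seconds = samples / SAMPLE_RATE_HZ
print(f"{samples} samples = {seconds:.3f} s of audio")
```

Under these assumptions the model sees well under a second of audio, which is shorter than most musical phrases.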

Let's build a latent we can sample from (throw in something like a VAE?)
[Figure: Encoder → z_1, z_2, ..., z_M → Decoder]

Limitation of VAE: Posterior Collapse
[Figure: Encoder → Gaussian Noise → Decoder]
$p(x) = \prod_{t=1}^{T} p(x_t \mid x_{<t}, z_t)$
Figure labels: variational posterior, Gaussian prior
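Why a strong autoregressive decoder invites posterior collapse can be read off the standard VAE objective. The slide does not write it out, so the notation below ($q_\phi$ for the variational posterior, $p_\theta$ for the decoder) is assumed.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
\;-\; D_{\mathrm{KL}}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
```

If the decoder can already model $x_t$ well from $x_{<t}$ alone, the first term barely depends on $z$. The cheapest way to raise the bound is then to set $q_\phi(z \mid x) \approx p(z)$, which drives the KL term to zero and leaves $z$ carrying no information about $x$.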

VQ-VAE: approximate the posterior with sampleable, noise-free vectors
[Figure: Encoder → Decoder]
$p(x) = \prod_{t=1}^{T} p(x_t \mid x_{<t}, z_t)$
$p(z) = \mathrm{Categorical}$
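One way to see why a categorical latent sidesteps collapse is to write the posterior as a one-hot distribution over codebook entries, as the paper does. The codebook size $K$ is a symbol assumed here, and the prior is taken to be uniform during training.

```latex
q(z = k \mid x) =
\begin{cases}
1 & \text{if } k = \arg\min_j \lVert z_e(x) - e_j \rVert_2 \\
0 & \text{otherwise}
\end{cases}
\qquad
D_{\mathrm{KL}}\big(q(z \mid x) \,\|\, p(z)\big) = \log K
\quad \text{for } p(z = k) = \tfrac{1}{K}
```

Because the KL term is a constant, there is no incentive for the encoder to shrink toward the prior, and the latent is deterministic rather than noisy.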

VQ-VAE: approximate the posterior with sampleable, noise-free vectors
[Figure: Encoder → z_e(x) → codebook (e_1, ..., e_7) → z_q(x) → Decoder]
Loss terms: reconstruction, codebook, commitment
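The codebook lookup and the three loss terms named in the figure can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the authors' implementation: the function names, the 2-D four-entry codebook, and squared error as a stand-in for the reconstruction log-likelihood are all hypothetical, and beta = 0.25 is the value reported in the paper. Plain Python has no stop-gradient, so the code only computes the forward values of each term.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def quantize(z_e, codebook):
    """Return (index, z_q): the codebook entry nearest to the encoder output z_e."""
    index = min(range(len(codebook)),
                key=lambda k: squared_distance(z_e, codebook[k]))
    return index, codebook[index]


def vq_loss_terms(x, x_recon, z_e, z_q, beta=0.25):
    """Forward values of the three VQ-VAE loss terms.

    Stop-gradients only change the backward pass, so the codebook and
    commitment terms differ here only by the factor beta.
    """
    reconstruction = squared_distance(x, x_recon)    # stand-in for -log p(x | z_q)
    codebook = squared_distance(z_e, z_q)            # ||sg[z_e(x)] - e||^2, moves e
    commitment = beta * squared_distance(z_e, z_q)   # beta * ||z_e(x) - sg[e]||^2, moves encoder
    return reconstruction, codebook, commitment


# Toy codebook with four 2-D entries (hypothetical values).
CODEBOOK = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
z_e = [0.9, 0.2]                       # pretend encoder output
index, z_q = quantize(z_e, CODEBOOK)   # nearest entry is [1.0, 0.0]

x, x_recon = [1.0, 2.0], [1.0, 1.5]    # pretend input and reconstruction
rec, cb, cm = vq_loss_terms(x, x_recon, z_e, z_q)
print(index, z_q, rec, cb, cm)
```

In an autodiff framework the decoder gradient would be copied straight from z_q(x) to z_e(x), since the nearest-neighbour lookup itself has no gradient.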

VQ-VAE: approximate the posterior with sampleable, noise-free vectors
