Speaker: Taesung Park (Ph.D. student, UC Berkeley)
Date: June 2017
Taesung Park is a Ph.D. student at UC Berkeley in AI and computer vision, advised by Prof. Alexei Efros.
His research lies at the intersection of computer vision and computational photography, on problems such as generating realistic images and enhancing photo quality. He received a B.S. in mathematics and an M.S. in computer science from Stanford University.
Abstract:
Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs.
However, for many tasks, paired training data will not be available.
We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples.
Our goal is to learn a mapping G: X → Y such that the distribution of images from G(X) is indistinguishable from the distribution Y using an adversarial loss.
Because this mapping is highly under-constrained, we couple it with an inverse mapping F: Y → X and introduce a cycle consistency loss to push F(G(X)) ≈ X (and vice versa).
Qualitative results are presented on several tasks where paired training data does not exist, including collection style transfer, object transfiguration, season transfer, photo enhancement, etc.
Quantitative comparisons against several prior methods demonstrate the superiority of our approach.
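The cycle-consistency objective described above has a simple numerical core: translate, translate back, and penalize the reconstruction error. The sketch below illustrates it with toy stand-in generators (real CycleGAN generators are convolutional networks; the functions `G` and `F` here are assumptions for illustration only):

```python
import numpy as np

def cycle_consistency_loss(G, F, x):
    """L1 penalty pushing F(G(x)) back toward x (one direction of the cycle)."""
    return np.mean(np.abs(F(G(x)) - x))

def full_cycle_loss(G, F, x, y):
    """Both directions of the cycle: F(G(x)) ~ x and G(F(y)) ~ y."""
    return cycle_consistency_loss(G, F, x) + cycle_consistency_loss(F, G, y)

# Toy "generators": G doubles pixel values, F halves them, so they are exact inverses.
G = lambda img: img * 2.0
F = lambda img: img / 2.0

x = np.random.rand(4, 3, 8, 8)   # batch of source-domain images
y = np.random.rand(4, 3, 8, 8)   # batch of target-domain images

print(full_cycle_loss(G, F, x, y))   # 0.0: the toy cycle reconstructs perfectly
```

In the full model this term is added to the adversarial losses of both mappings; here only the reconstruction part is shown.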
This presentation discusses multimodal deep learning and unsupervised feature learning from audio and video speech data. It introduces the McGurk effect, in which conflicting audio and visual speech cues are perceptually integrated. A bimodal deep autoencoder learns shared representations from audio and video input that outperform single-modality features on lip-reading tasks: the cross-modality features achieved a classification accuracy of 64.4% on the AVLetters dataset and 68.7% on the CUAVE dataset.
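The shared-representation idea can be sketched as a forward pass through a minimal bimodal encoder; the layer sizes and weights below are arbitrary assumptions, not the architecture from the talk:

```python
import numpy as np

def shared_representation(audio, video, W_a, W_v):
    """Encode each modality, then fuse into one shared code (a minimal stand-in
    for a bimodal autoencoder's shared hidden layer, fed to both decoders)."""
    h_audio = np.tanh(audio @ W_a)
    h_video = np.tanh(video @ W_v)
    return np.concatenate([h_audio, h_video], axis=1)

rng = np.random.default_rng(0)
audio = rng.standard_normal((2, 10))   # 2 samples, 10 audio features
video = rng.standard_normal((2, 20))   # 2 samples, 20 video (lip) features
W_a = rng.standard_normal((10, 4))     # hypothetical encoder weights
W_v = rng.standard_normal((20, 4))
print(shared_representation(audio, video, W_a, W_v).shape)   # (2, 8)
```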
ICASSP 2018 Tutorial: Generative Adversarial Network and its Applications to ... (Hung-yi Lee)
The document provides an overview of generative adversarial networks (GANs) and their applications to signal processing and natural language processing. It begins with a general introduction to GANs, including how they work, common issues, and potential solutions. Conditional GANs and unsupervised conditional GANs are also discussed. The document then outlines applications of GANs to signal processing and natural language processing.
This document summarizes and compares two popular Python libraries for graph neural networks - Spektral and PyTorch Geometric. It begins by providing an overview of the basic functionality and architecture of each library. It then discusses how each library handles data loading and mini-batching of graph data. The document reviews several common message passing layer types implemented in both libraries. It provides an example comparison of using each library for a node classification task on the Cora dataset. Finally, it discusses a graph classification comparison in PyTorch Geometric using different message passing and pooling layers on the IMDB-binary dataset.
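The mini-batching strategy both libraries use treats a batch as one big disjoint-union graph: node features are stacked, edge indices are offset, and a batch vector records which graph each node came from. Below is a minimal NumPy sketch of that idea (the data layout mimics PyTorch Geometric's `Data`/`Batch` convention, but this is not the library's actual code):

```python
import numpy as np

def batch_graphs(graphs):
    """Merge graphs into one disjoint union.

    Each graph is (node_features [n_i, d], edge_index [2, e_i]).
    Returns stacked features, offset edge indices, and a batch vector
    mapping every node to its graph id.
    """
    feats, edges, batch = [], [], []
    offset = 0
    for gid, (x, ei) in enumerate(graphs):
        feats.append(x)
        edges.append(ei + offset)               # shift node ids past earlier graphs
        batch.append(np.full(x.shape[0], gid))  # graph id of every node
        offset += x.shape[0]
    return np.vstack(feats), np.hstack(edges), np.concatenate(batch)

g1 = (np.ones((3, 4)), np.array([[0, 1], [1, 2]]))  # 3 nodes, 2 edges
g2 = (np.ones((2, 4)), np.array([[0], [1]]))        # 2 nodes, 1 edge
x, edge_index, batch = batch_graphs([g1, g2])
print(x.shape, edge_index.tolist(), batch.tolist())
# (5, 4) [[0, 1, 3], [1, 2, 4]] [0, 0, 0, 1, 1]
```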
Recent Progress on Single-Image Super-Resolution (Hiroto Honda)
This document summarizes recent progress in single image super resolution (SISR) techniques using deep convolutional neural networks. It discusses early networks like SRCNN and VDSR, as well as more advanced models such as SRResNet, SRGAN, and EDSR that utilize residual blocks and perceptual loss functions. The document notes that while SISR accuracy has improved significantly in recent years, achieving both high PSNR and natural perceptual quality remains challenging due to a distortion-perception tradeoff. It concludes that the application determines whether more accurate or plausible output is preferred.
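PSNR, the distortion metric referred to above, is just a log-scaled mean squared error; a minimal sketch:

```python
import numpy as np

def psnr(reference, reconstructed, max_val=255.0):
    """Peak signal-to-noise ratio in dB, the standard distortion metric in SISR."""
    mse = np.mean((reference.astype(np.float64) - reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                     # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

ref = np.full((16, 16), 128.0)
noisy = ref + 10.0                              # uniform error of 10 gray levels
print(psnr(ref, noisy))                         # 10*log10(255^2/100) ~ 28.13 dB
```

Perceptual quality, by contrast, has no such closed form, which is exactly why the distortion-perception tradeoff exists.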
Hyperparameter Optimization with the CMA-ES Sampler, at Optuna Meetup #1 (Masashi Shibata)
Graph neural networks are a type of neural network that operates on graph structured data. They work by passing messages between nodes in a graph and aggregating information from neighboring nodes. Common graph neural network models include graph convolutional networks (GCNs) which use convolutional filters on graphs, and graph attention networks (GATs) which use attention mechanisms. GraphSAGE is another model that learns node representations by sampling and aggregating features from a node's local neighborhood. Graph neural networks have applications in tasks like node classification, link prediction, and graph classification and can be used to model many real-world problems that can be represented as graphs.
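The message-passing step of a GCN layer described above can be written in a few lines; this is the symmetric-normalization formulation from Kipf and Welling, sketched in NumPy on a two-node toy graph:

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution step: H' = D^-1/2 (A + I) D^-1/2 H W."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))      # symmetric degree normalization
    return D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W

A = np.array([[0., 1.], [1., 0.]])  # two connected nodes
H = np.array([[1., 0.], [0., 1.]])  # one-hot node features
W = np.eye(2)                       # identity weights, so the mixing is visible
print(gcn_layer(A, H, W))           # each node averages itself and its neighbor
```

A nonlinearity (e.g. ReLU) would normally follow; GAT and GraphSAGE replace the fixed normalization with learned attention weights or sampled-neighborhood aggregation.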
This document discusses Wasserstein GAN (WGAN) and how it improves on traditional GANs. WGAN replaces the Jensen-Shannon divergence of the original GAN objective with the Wasserstein distance as its loss, which stabilizes training and reduces mode collapse. Unlike the JS divergence, the Wasserstein distance varies smoothly even when the real and generated distributions have disjoint supports, so gradients remain informative throughout training. Because the distance is computationally intractable to evaluate exactly, WGAN estimates it with a critic network and uses weight clipping to keep the critic Lipschitz continuous. Overall, WGAN yields more meaningful learning curves, and its hyperparameters are easier to tune than those of traditional GANs.
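The two mechanisms named above, the critic's Wasserstein estimate and weight clipping, are small enough to sketch directly; the linear critic below is a toy stand-in, not a trained network:

```python
import numpy as np

def clip_weights(weights, c=0.01):
    """WGAN's crude Lipschitz constraint: clamp every critic weight to [-c, c]."""
    return [np.clip(w, -c, c) for w in weights]

def critic_loss(critic, real, fake):
    """Negative of the critic's Wasserstein estimate: it maximizes
    mean f(real) - mean f(fake), so we minimize the negation."""
    return -(np.mean(critic(real)) - np.mean(critic(fake)))

critic = lambda x: x.sum(axis=1)          # toy stand-in linear critic
real = np.ones((8, 2))
fake = np.zeros((8, 2))
print(critic_loss(critic, real, fake))    # -2.0: the critic separates the samples

w = clip_weights([np.array([0.5, -0.3])])
print(w[0])                               # [ 0.01 -0.01]
```

Clipping is applied after every critic update; later work (WGAN-GP) replaces it with a gradient penalty.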
This document provides an overview of VAE-type deep generative models, especially RNNs combined with VAEs. It begins with notations and abbreviations used. The agenda then covers the mathematical formulation of generative models, the Variational Autoencoder (VAE), variants of VAE that combine it with RNNs (VRAE, VRNN, DRAW), a Chainer implementation of Convolutional DRAW, other related models (Inverse DRAW, VAE+GAN), and concludes with challenges of VAE-like generative models.
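Two building blocks common to every VAE variant listed above are the reparameterization trick and the closed-form Gaussian KL term; a minimal sketch:

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), keeping the sampling
    path differentiable with respect to mu and log_var."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(q(z|x) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var))

rng = np.random.default_rng(0)
mu = np.zeros(4)
log_var = np.zeros(4)                       # sigma = 1
z = reparameterize(mu, log_var, rng)
print(z.shape)                              # (4,)
print(kl_to_standard_normal(mu, log_var))   # 0.0: posterior already equals the prior
```

The recurrent variants (VRAE, VRNN, DRAW) reuse exactly these two pieces inside an RNN loop, one latent draw per timestep.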
Super Tickets in Pre-trained Language Models (HyunKyu Jeon)
This document discusses finding "super tickets" in pre-trained language models by pruning attention heads and feed-forward layers. It shows a phase-transition phenomenon: lightly pruning BERT models improves generalization without degrading accuracy, while heavier pruning hurts it. The authors also propose "ticket sharing", a pruning approach for multi-task fine-tuning in which pruned weights are shared across tasks. Experiments on the GLUE benchmark show that the super-ticket and ticket-sharing methods consistently outperform unpruned baselines, with larger gains on smaller tasks. Analysis indicates that pruning reduces model variance and that some tasks share more task-specific knowledge than others.
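The core selection step, keeping only the most important attention heads, can be sketched as a masking operation; the importance scores below are hypothetical (the paper derives them from a sensitivity analysis, not shown here):

```python
import numpy as np

def prune_heads(head_scores, keep_ratio=0.5):
    """Return a binary mask keeping the highest-scoring attention heads.

    Toy stand-in for importance-based head selection: heads with the lowest
    scores are masked out (set to 0) and skipped at inference time.
    """
    n_keep = int(np.ceil(len(head_scores) * keep_ratio))
    keep = np.argsort(head_scores)[-n_keep:]   # indices of the top-scoring heads
    mask = np.zeros(len(head_scores), dtype=int)
    mask[keep] = 1
    return mask

scores = np.array([0.9, 0.1, 0.6, 0.3])        # hypothetical per-head importance
print(prune_heads(scores, keep_ratio=0.5))     # [1 0 1 0]: heads 0 and 2 survive
```

In ticket sharing, one such mask would be shared across all tasks' fine-tuning runs rather than computed per task.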
Synthesizer: Rethinking Self-Attention for Transformer Models (HyunKyu Jeon)
The closing slides simply thank the audience for listening and contain no further technical content.
This document summarizes Meta Back-Translation, a method for improving back-translation by training the backward model to directly optimize the performance of the forward model during training. The key points are:
1. Back-translation typically relies on a fixed backward model, which can lead the forward model to overfit to its outputs. Meta back-translation instead continually trains the backward model to generate pseudo-parallel data that improves the forward model.
2. Experiments show that Meta Back-Translation produces fewer pathological outputs, such as translations that differ greatly in length from their references. By flexibly controlling the diversity of the pseudo-parallel data, it also avoids both overfitting and underfitting of the forward model.
3. Related work likewise leverages monolingual data.
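For contrast with the meta-learned version, plain back-translation with a fixed backward model can be sketched in a few lines; the dictionary-based backward model here is a toy assumption:

```python
def back_translate(backward_model, target_sentences):
    """Standard back-translation: a target->source model turns monolingual
    target-side text into (pseudo-source, target) training pairs for the
    forward model. In Meta Back-Translation this backward model would itself
    keep training on the forward model's performance instead of staying fixed."""
    return [(backward_model(t), t) for t in target_sentences]

# Hypothetical toy backward model: word-by-word "translation" via a dictionary.
lexicon = {"bonjour": "hello", "monde": "world"}
backward = lambda s: " ".join(lexicon.get(w, w) for w in s.split())

pairs = back_translate(backward, ["bonjour monde"])
print(pairs)   # [('hello world', 'bonjour monde')]
```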
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning (HyunKyu Jeon)
This document summarizes the Maxmin Q-learning paper published at ICLR 2020. Maxmin Q-learning addresses the overestimation bias of Q-learning and the underestimation bias of Double Q-learning by maintaining N Q-functions and using their elementwise minimum to build the learning target. For both action selection and the target, it first takes the minimum Q-value across the N estimates for each action, then maximizes over actions. At each step, the algorithm selects a random subset of the Q-functions and updates it toward the maxmin target. This approach reduces the biases seen in prior methods.
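The maxmin target construction can be sketched directly; the Q-tables below are hypothetical values for one state and three actions:

```python
import numpy as np

def maxmin_target(q_tables, reward, next_state, gamma=0.9):
    """Maxmin Q-learning target: r + gamma * max_a min_i Q_i(s', a)."""
    # Elementwise minimum over the N Q-estimates, then maximize over actions.
    q_min = np.min([q[next_state] for q in q_tables], axis=0)
    return reward + gamma * np.max(q_min)

# Two hypothetical Q-tables over 1 state and 3 actions.
q1 = np.array([[1.0, 5.0, 3.0]])
q2 = np.array([[2.0, 0.0, 4.0]])
print(maxmin_target([q1, q2], reward=1.0, next_state=0))
# min over tables = [1, 0, 3]; max over actions = 3; target = 1 + 0.9*3 = 3.7
```

With N = 1 this reduces to ordinary Q-learning; larger N pushes the estimate lower, trading overestimation for underestimation.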