
Tutorial of GANs in Gifu Univ



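  1. Tutorial of GANs. 2018/1/15. Kato Laboratory, Gifu University. Shunsuke Nakatsuka
  2. About me
     - Name: Shunsuke Nakatsuka
     - Affiliation: Kato Laboratory, Graduate School of Gifu University
     - Research: visual inspection using deep learning
       - Normal-model generation and anomaly scoring when only a few defective samples are available
       - Visual inspection using a regression CNN
       - Visual inspection by restoring missing regions with DAEGAN
     - My Mail Address: nakatsuka@cv.info.gifu-u.ac.jp
     - My github: https://github.com/salty-vanilla
     - My Qiita: https://qiita.com/salty-vanilla
     - Figure: Input → Encoder → Decoder → Output, with a Discriminator that asks "From distribution?" or "From AutoEncoder?"; the inputs are images of non-defective products
  3. Agenda
     - Introduction
     - Standard GAN
     - Wasserstein GAN
     - Applications of GAN
  4. Introduction of GAN
  5. CNN: Convolutional Neural Networks
     - Feature extraction layers: Convolutional Layer, Pooling Layer
     - Classification part: Fully Connected Layer
     - Figure: a CNN classifying an image as cat, leopard, dog, human, or bird
  6. Convolutional Layer
     - A layer that convolves filters over its input
     - The filter coefficients are learned
     - Figure: learned filters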
  7. What is Convolution?
     Input Image (6 x 6):
       120 100 135  50  60 150
       125  95 150  40  30 100
        55  60  60  25  50 150
        40  75  30 150 250 230
        60 100  50  60  75 245
        40  75  30 150 250 230
     Filter (3 x 3):
       0.8 0.0 0.4
       0.9 0.5 0.2
       0.0 0.1 0.1
  8. What is Convolution?
     - The filter is laid over the top-left 3 x 3 window of the Input Image
     - Each pixel is multiplied by the coefficient at the same position (× 0.8, × 0.0, × 0.4, × 0.9, × 0.5, × 0.2, × 0.0, × 0.1, × 0.1) and the products are summed
     - First pixel of the Output Image: 352.0
  9. What is Convolution?
     - The window slides one pixel to the right and the same multiply-and-sum is repeated
     - Output Image so far: 352.0, 277
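The multiply-and-sum on these slides can be sketched in a few lines of NumPy. This is a minimal illustration that reuses the input image and filter printed above; the function name `conv2d_valid` is made up for this example, and it assumes stride 1 with no padding and no kernel flip (the usual CNN convention).

```python
import numpy as np

# Input image and filter copied from the slides.
image = np.array([
    [120, 100, 135,  50,  60, 150],
    [125,  95, 150,  40,  30, 100],
    [ 55,  60,  60,  25,  50, 150],
    [ 40,  75,  30, 150, 250, 230],
    [ 60, 100,  50,  60,  75, 245],
    [ 40,  75,  30, 150, 250, 230],
], dtype=float)

kernel = np.array([
    [0.8, 0.0, 0.4],
    [0.9, 0.5, 0.2],
    [0.0, 0.1, 0.1],
])


def conv2d_valid(img, k):
    """Slide k over img (stride 1, no padding) and sum the elementwise products."""
    kh, kw = k.shape
    out_h = img.shape[0] - kh + 1
    out_w = img.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out


feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)    # size of the output image
print(feature_map[0, :2])   # first two output pixels
```

A 6 x 6 input and a 3 x 3 filter give a 4 x 4 output, because the window fits in four positions along each axis.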
  10. Convolutional Layer
     - Gains position invariance
     - Extracts a variety of features
     - Figure: Input Image, Filters of Conv1, Output of Conv1
  11. Pooling Layer
     - A layer placed directly after a Conv layer
     - Lowers the resolution
     - Makes the network less sensitive to small positional shifts and small changes in feature values
     - Reduces the amount of computation
     - Example (2 x 2 max pooling):
       Input:
          60  25  50 150
          30 150 250 230
          50  60  75 245
          30 150 250 230
       Output:
         150 250
         150 250
  12. Pooling Layer (same points as slide 11)
     - Figure: Input Image, Output of Conv1, Output of Pool1
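The pooling example can be reproduced with a short NumPy sketch. It assumes non-overlapping 2 x 2 max pooling, which is what the numbers on the slide correspond to; the helper name `max_pool_2x2` is made up for this example.

```python
import numpy as np


def max_pool_2x2(x):
    """Non-overlapping 2 x 2 max pooling; odd trailing rows/columns are dropped."""
    h, w = x.shape
    trimmed = x[:h - h % 2, :w - w % 2]
    # Split each axis into (blocks, 2) and take the maximum inside every block.
    return trimmed.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))


pool_input = np.array([
    [60,  25,  50, 150],
    [30, 150, 250, 230],
    [50,  60,  75, 245],
    [30, 150, 250, 230],
])

pooled = max_pool_2x2(pool_input)
print(pooled)
```

Each output value is the largest of four inputs, so shifting a strong response by one pixel inside a block leaves the output unchanged.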
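  13. Fully Connected Layer
     - Takes the last feature map, flattened into a vector, as input
     - Outputs the probability of each class
     - The number of output units equals the number of classes
     - $y = Wx + b$
     - Figure: output classes cat, leopard, dog, human, bird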
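As a rough sketch of this layer, the snippet below flattens a feature map, applies y = Wx + b, and turns the result into class probabilities. The softmax step, the feature-map size, and the random weights are assumptions for illustration; the slide only gives the affine formula and the five class names.

```python
import numpy as np

classes = ["cat", "leopard", "dog", "human", "bird"]
rng = np.random.default_rng(0)

# Hypothetical final feature map: 8 channels of 2 x 2 activations.
feature_map_fc = rng.normal(size=(8, 2, 2))
x = feature_map_fc.reshape(-1)              # flatten to a vector of length 32

# Randomly initialised parameters stand in for learned ones.
W = rng.normal(scale=0.1, size=(len(classes), x.size))
b = np.zeros(len(classes))


def softmax(logits):
    """Numerically stable softmax (assumed here as the output activation)."""
    shifted = logits - np.max(logits)
    exp = np.exp(shifted)
    return exp / exp.sum()


y = W @ x + b                               # y = Wx + b
probs = softmax(y)
for name, p in zip(classes, probs):
    print(f"{name}: {p:.3f}")
```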
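  14. CNN: Convolutional Neural Networks
     - Feature extraction layers: Convolutional Layer, Pooling Layer
     - Classification part: Fully Connected Layer
     - Figure: a CNN classifying an image as cat, leopard, dog, human, or bird
  15. What is GAN? Generative Adversarial Networks
     - There are two network models, a Generator and a Discriminator
     - Generator: takes random noise z from the latent space and generates realistic-looking (fake) data
     - Discriminator: judges whether its input is real or fake
     - Figure: Z → Generator → Pg: Generated Images; Pg and Pdata: Real Images → Discriminator → 0.0 – 1.0
  16. Intuition for adversarial training
     - Generator = counterfeiter: forges banknotes well enough to fool the appraiser
     - Discriminator = banknote appraiser: sees through the forgeries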
  17. A Variety of GAN: The GAN Zoo, https://deephunt.in/the-gan-zoo-79597dc8c347
     - 3D-GAN - Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling (github)
     - 3D-IWGAN - Improved Adversarial Systems for 3D Object Generation and Reconstruction (github)
     - 3D-RecGAN - 3D Object Reconstruction from a Single Depth View with Adversarial Learning (github)
     - ABC-GAN - ABC-GAN: Adaptive Blur and Control for improved training stability of Generative Adversarial Networks (github)
     - AC-GAN - Conditional Image Synthesis With Auxiliary Classifier GANs
     - acGAN - Face Aging With Conditional Generative Adversarial Networks
     - AdaGAN - AdaGAN: Boosting Generative Models
     - AE-GAN - AE-GAN: adversarial eliminating with GAN
     - AEGAN - Learning Inverse Mapping by Autoencoder based Generative Adversarial Nets
     - AffGAN - Amortised MAP Inference for Image Super-resolution
     - AL-CGAN - Learning to Generate Images of Outdoor Scenes from Attributes and Semantic Layouts
     - ALI - Adversarially Learned Inference
     - AlignGAN - AlignGAN: Learning to Align Cross-Domain Images with Conditional Generative Adversarial Networks
     - AM-GAN - Activation Maximization Generative Adversarial Nets
     - AnoGAN - Unsupervised Anomaly Detection with Generative Adversarial Networks to Guide Marker Discovery
     - ARAE - Adversarially Regularized Autoencoders for Generating Discrete Structures (github)
     - ARDA - Adversarial Representation Learning for Domain Adaptation
     - ARIGAN - ARIGAN: Synthetic Arabidopsis Plants using Generative Adversarial Network
     - ArtGAN - ArtGAN: Artwork Synthesis with Conditional Categorial GANs
     - b-GAN - Generative Adversarial Nets from a Density Ratio Estimation Perspective
     - Bayesian GAN - Deep and Hierarchical Implicit Models
     - Bayesian GAN - Bayesian GAN
     - BCGAN - Bayesian Conditional Generative Adverserial Networks
     - BEGAN - BEGAN: Boundary Equilibrium Generative Adversarial Networks
     - BGAN - Binary Generative Adversarial Networks for Image Retrieval (github)
     - BiGAN - Adversarial Feature Learning
     - BS-GAN - Boundary-Seeking Generative Adversarial Networks
     - C-RNN-GAN - C-RNN-GAN: Continuous recurrent neural networks with adversarial training (github)
     - CaloGAN - CaloGAN: Simulating 3D High Energy Particle Showers in Multi-Layer Electromagnetic Calorimeters with Generative Adversarial Networks (github)
     - CAN - CAN: Creative Adversarial Networks, Generating Art by Learning About Styles and Deviating from Style Norms
     - CatGAN - Unsupervised and Semi-supervised Learning with Categorical Generative Adversarial Networks
     - CausalGAN - CausalGAN: Learning Causal Implicit Generative Models with Adversarial Training
     - CC-GAN - Semi-Supervised Learning with Context-Conditional Generative Adversarial Networks (github)
     - CDcGAN - Simultaneously Color-Depth Super-Resolution with Conditional Generative Adversarial Network
     - CGAN - Conditional Generative Adversarial Nets
     - CGAN - Controllable Generative Adversarial Network
     - Chekhov GAN - An Online Learning Approach to Generative Adversarial Networks
     - CoGAN - Coupled Generative Adversarial Networks
     - Conditional cycleGAN - Conditional CycleGAN for Attribute Guided Face Image Generation
     - contrast-GAN - Generative Semantic Manipulation with Contrasting GAN
     - Context-RNN-GAN - Contextual RNN-GANs for Abstract Reasoning Diagram Generation
     - Coulomb GAN - Coulomb GANs: Provably Optimal Nash Equilibria via Potential Fields
     - Cramèr GAN - The Cramer Distance as a Solution to Biased Wasserstein Gradients
     - crVAE-GAN - Channel-Recurrent Variational Autoencoders
     - CS-GAN - Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets
     - CVAE-GAN - CVAE-GAN: Fine-Grained Image Generation through Asymmetric Training
     - CycleGAN - Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks (github)
     - D2GAN - Dual Discriminator Generative Adversarial Nets
     - DAN - Distributional Adversarial Networks
     - DCGAN - Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks (github)
     - DeliGAN - DeLiGAN: Generative Adversarial Networks for Diverse and Limited Data (github)
     - DiscoGAN - Learning to Discover Cross-Domain Relations with Generative Adversarial Networks
     - DistanceGAN - One-Sided Unsupervised Domain Mapping
     - DM-GAN - Dual Motion GAN for Future-Flow Embedded Video Prediction
     - DR-GAN - Representation Learning by Rotating Your Faces
     - DRAGAN - How to Train Your DRAGAN (github)
     - DSP-GAN - Depth Structure Preserving Scene Image Generation
     - DTN - Unsupervised Cross-Domain Image Generation
     - DualGAN - DualGAN: Unsupervised Dual Learning for Image-to-Image Translation
     - Dualing GAN - Dualing GANs
     - EBGAN - Energy-based Generative Adversarial Network
     - ED//GAN - Stabilizing Training of Generative Adversarial Networks through Regularization
     - EGAN - Enhanced Experience Replay Generation for Efficient Reinforcement Learning
     - ExprGAN - ExprGAN: Facial Expression Editing with Controllable Expression Intensity
     - f-GAN - f-GAN: Training Generative Neural Samplers using Variational Divergence Minimization
     - FF-GAN - Towards Large-Pose Face Frontalization in the Wild
     - Fila-GAN - Synthesizing Filamentary Structured Images with GANs
     - Fisher GAN - Fisher GAN
     - Flow-GAN - Flow-GAN: Bridging implicit and prescribed learning in generative models
     - GAMN - Generative Adversarial Mapping Networks
     - GAN - Generative Adversarial Networks (github)
     - GAN-CLS - Generative Adversarial Text to Image Synthesis (github)
     - GAN-sep - GANs for Biological Image Synthesis (github)
     - GAN-VFS - Generative Adversarial Network-based Synthesis of Visible Faces from Polarimetric Thermal Faces
     - GANCS - Deep Generative Adversarial Networks for Compressed Sensing Automates MRI
     - GAWWN - Learning What and Where to Draw (github)
     - GeneGAN - GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data (github)
     - Geometric GAN - Geometric GAN
     - GMAN - Generative Multi-Adversarial Networks
     - GMM-GAN - Towards Understanding the Dynamics of Generative Adversarial Networks
     - GoGAN - Gang of GANs: Generative Adversarial Networks with Maximum Margin Ranking
     - GP-GAN - GP-GAN: Towards Realistic High-Resolution Image Blending (github)
     - GRAN - Generating images with recurrent adversarial networks (github)
     - IAN - Neural Photo Editing with Introspective Adversarial Networks (github)
     - IcGAN - Invertible Conditional GANs for image editing (github)
     - ID-CGAN - Image De-raining Using a Conditional Generative Adversarial Network
     - iGAN - Generative Visual Manipulation on the Natural Image Manifold (github)
     - Improved GAN - Improved Techniques for Training GANs (github)
     - InfoGAN - InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets (github)
     - IRGAN - IRGAN: A Minimax Game for Unifying Generative and Discriminative Information Retrieval models
     - IWGAN - On Unifying Deep Generative Models
     - l-GAN - Representation Learning and Adversarial Generation of 3D Point Clouds
     - LAGAN - Learning Particle Physics by Example: Location-Aware Generative Adversarial Networks for Physics Synthesis
     - LAPGAN - Deep Generative Image Models using a Laplacian Pyramid of Adversarial Networks (github)
     - LD-GAN - Linear Discriminant Generative Adversarial Networks
     - LDAN - Label Denoising Adversarial Network (LDAN) for Inverse Lighting of Face Images
     - LeakGAN - Long Text Generation via Adversarial Training with Leaked Information
     - LeGAN - Likelihood Estimation for Generative Adversarial Networks
     - LR-GAN - LR-GAN: Layered Recursive Generative Adversarial Networks for Image Generation
     - LS-GAN - Loss-Sensitive Generative Adversarial Networks on Lipschitz Densities
     - LSGAN - Least Squares Generative Adversarial Networks
     - MAD-GAN - Multi-Agent Diverse Generative Adversarial Networks
     - MAGAN - MAGAN: Margin Adaptation for Generative Adversarial Networks
     - MalGAN - Generating Adversarial Malware Examples for Black-Box Attacks Based on GAN
     - MaliGAN - Maximum-Likelihood Augmented Discrete Generative Adversarial Networks
     - MARTA-GAN - Deep Unsupervised Representation Learning for Remote Sensing Images
     - McGAN - McGan: Mean and Covariance Feature Matching GAN
     - MD-GAN - Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
     - MDGAN - Mode Regularized Generative Adversarial Networks
     - MedGAN - Generating Multi-label Discrete Electronic Health Records using Generative Adversarial Networks
     - MGAN - Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks (github)
     - MGGAN - Multi-Generator Generative Adversarial Nets
     - MIX+GAN - Generalization and Equilibrium in Generative Adversarial Nets (GANs)
     - MMD-GAN - MMD GAN: Towards Deeper Understanding of Moment Matching Network (github)
     - MMGAN - MMGAN: Manifold Matching Generative Adversarial Network for Generating Images
     - MoCoGAN - MoCoGAN: Decomposing Motion and Content for Video Generation (github)
     - MPM-GAN - Message Passing Multi-Agent GANs
     - MuseGAN - MuseGAN: Symbolic-domain Music Generation and Accompaniment with Multi-track Sequential Generative Adversarial Networks
     - MV-BiGAN - Multi-view Generative Adversarial Networks
     - OptionGAN - OptionGAN: Learning Joint Reward-Policy Options using Generative Adversarial Inverse Reinforcement Learning
     - ORGAN - Objective-Reinforced Generative Adversarial Networks (ORGAN) for Sequence Generation Models
     - PAN - Perceptual Adversarial Networks for Image-to-Image Transformation
     - PassGAN - PassGAN: A Deep Learning Approach for Password Guessing
     - Perceptual GAN - Perceptual Generative Adversarial Networks for Small Object Detection
     - PGAN - Probabilistic Generative Adversarial Networks
     - pix2pix - Image-to-Image Translation with Conditional Adversarial Networks (github)
     - PixelGAN - PixelGAN Autoencoders
     - Pose-GAN - The Pose Knows: Video Forecasting by Generating Pose Futures
     - PPGN - Plug & Play Generative Networks: Conditional Iterative Generation of Images in Latent Space
     - PrGAN - 3D Shape Induction from 2D Views of Multiple Objects
     - PSGAN - Learning Texture Manifolds with the Periodic Spatial GAN
     - RankGAN - Adversarial Ranking for Language Generation
     - RCGAN - Real-valued (Medical) Time Series Generation with Recurrent Conditional GANs
     - RefineGAN - Compressed Sensing MRI Reconstruction with Cyclic Loss in Generative Adversarial Networks
     - RenderGAN - RenderGAN: Generating Realistic Labeled Data
     - ResGAN - Generative Adversarial Network based on Resnet for Conditional Image Restoration
     - RNN-WGAN - Language Generation with Recurrent Generative Adversarial Networks without Pre-training (github)
     - RPGAN - Stabilizing GAN Training with Multiple Random Projections (github)
     - RTT-GAN - Recurrent Topic-Transition GAN for Visual Paragraph Generation
     - RWGAN - Relaxed Wasserstein with Applications to GANs
     - SAD-GAN - SAD-GAN: Synthetic Autonomous Driving using Generative Adversarial Networks
     - SalGAN - SalGAN: Visual Saliency Prediction with Generative Adversarial Networks (github)
     - SBADA-GAN - From source to target and back: symmetric bi-directional adaptive GAN
     - SD-GAN - Semantically Decomposing the Latent Spaces of Generative Adversarial Networks
     - SEGAN - SEGAN: Speech Enhancement Generative Adversarial Network
     - SeGAN - SeGAN: Segmenting and Generating the Invisible
     - SegAN - SegAN: Adversarial Network with Multi-scale L1 Loss for Medical Image Segmentation
     - SeqGAN - SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient (github)
     - SGAN - Texture Synthesis with Spatial Generative Adversarial Networks
     - SGAN - Stacked Generative Adversarial Networks (github)
     - SGAN - Steganographic Generative Adversarial Networks
     - SimGAN - Learning from Simulated and Unsupervised Images through Adversarial Training
     - SketchGAN - Adversarial Training For Sketch Retrieval
     - SL-GAN - Semi-Latent GAN: Learning to generate and modify facial images from attributes
     - SN-GAN - Spectral Normalization for Generative Adversarial Networks (github)
     - Softmax-GAN - Softmax GAN
     - Splitting GAN - Class-Splitting Generative Adversarial Networks
     - SRGAN - Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
     - SS-GAN - Semi-supervised Conditional GANs
     - ss-InfoGAN - Guiding InfoGAN with Semi-Supervision
     - SSGAN - SSGAN: Secure Steganography Based on Generative Adversarial Networks
     - SSL-GAN - Semi-Supervised Learning with Context-Conditional Generative Adversarial Networks
     - ST-GAN - Style Transfer Generative Adversarial Networks: Learning to Play Chess Differently
     - StackGAN - StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
     - SteinGAN - Learning Deep Energy Models: Contrastive Divergence vs. Amortized MLE
     - S²GAN - Generative Image Modeling using Style and Structure Adversarial Networks
     - TAC-GAN - TAC-GAN - Text Conditioned Auxiliary Classifier Generative Adversarial Network (github)
     - TAN - Outline Colorization through Tandem Adversarial Networks
     - TextureGAN - TextureGAN: Controlling Deep Image Synthesis with Texture Patches
     - TGAN - Temporal Generative Adversarial Nets
     - TP-GAN - Beyond Face Rotation: Global and Local Perception GAN for Photorealistic and Identity Preserving Frontal View Synthesis
     - Triple-GAN - Triple Generative Adversarial Nets
     - Unrolled GAN - Unrolled Generative Adversarial Networks (github)
     - VAE-GAN - Autoencoding beyond pixels using a learned similarity metric
     - VariGAN - Multi-View Image Generation from a Single-View
     - VAW-GAN - Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks
     - VEEGAN - VEEGAN: Reducing Mode Collapse in GANs using Implicit Variational Learning (github)
     - VGAN - Generating Videos with Scene Dynamics (github)
     - VGAN - Generative Adversarial Networks as Variational Training of Energy Based Models (github)
     - ViGAN - Image Generation and Editing with Variational Info Generative Adversarial Networks
     - VIGAN - VIGAN: Missing View Imputation with Generative Adversarial Networks
     - VRAL - Variance Regularizing Adversarial Learning
     - WaterGAN - WaterGAN: Unsupervised Generative Network to Enable Real-time Color Correction of Monocular Underwater Images
     - WGAN - Wasserstein GAN (github)
     - WGAN-GP - Improved Training of Wasserstein GANs (github)
     - WS-GAN - Weakly Supervised Generative Adversarial Networks for 3D Reconstruction
     - α-GAN - Variational Approaches for Auto-Encoding Generative Adversarial Networks (github)
     - Δ-GAN - Triangle Generative Adversarial Networks
  18. A Variety of GAN
     - Tricks
       - Label Smoothing
       - Add noise to inputs, decay o
       - Use Dropouts in G in both train and test phase
     - Surrogate or auxiliary objective
       - Unrolled GAN
       - WGAN-GP
       - DRAGAN
     - New Objectives
       - EBGAN
       - LSGAN
       - WGAN
       - BEGAN
     - Network Architecture
       - LAPGAN
       - Stacked GAN
  19. GAN is Hot Topic
     - NIPS 2016 Tutorial: Generative Adversarial Networks
       https://arxiv.org/abs/1701.00160
       https://www.youtube.com/watch?v=AJVyzd0rqdc
     - Theory and Application of Generative Adversarial Network (CVPR 2017)
       https://www.slideshare.net/mlreview/tutorial-on-theory-and-application-of-generative-adversarial-networks
     - ICCV 2017 Tutorial on GANs
       https://sites.google.com/view/iccv-2017-gans
     - In Yann LeCun's words: "This, and the variations that are now being proposed is the most interesting idea in the last 10 years in ML, in my opinion."
  20. Generative Adversarial Networks
  21. What is GAN? Generative Adversarial Networks
     - There are two network models, a Generator and a Discriminator
     - Generator: takes random noise z from the latent space and generates realistic-looking (fake) data
     - Discriminator: judges whether its input is real or fake
     - Figure: Z → Generator → Pg: Generated Images; Pg and Pdata: Real Images → Discriminator → 0.0 – 1.0
  22. Generator
     - Generates an image from multi-dimensional noise
     - Outputs realistic images by moving the generated distribution $p_g$ closer to the real distribution $p_r$
     - Linearly interpolating the noise z yields images that change continuously to the eye
     - Figure: Z → Generator → Pg: Generated Images
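The latent interpolation mentioned on this slide can be sketched as follows. Only the noise side is shown; the latent dimension of 100 and the standard normal prior are assumptions, and feeding each row to a trained Generator (not included here) would produce the gradually changing images.

```python
import numpy as np


def interpolate_latents(z_start, z_end, steps):
    """Return `steps` latent vectors evenly spaced on the line from z_start to z_end."""
    t = np.linspace(0.0, 1.0, steps)[:, None]   # interpolation weights in [0, 1]
    return (1.0 - t) * z_start + t * z_end


rng = np.random.default_rng(0)
z_a = rng.normal(size=100)   # assumed latent dimension
z_b = rng.normal(size=100)

z_path = interpolate_latents(z_a, z_b, steps=8)
print(z_path.shape)          # one latent vector per row
```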
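  23. Discriminator
     - Judges whether the input image was produced by the Generator or is a real image
     - The output unit gives the probability that the input is real
     - In effect, it estimates the distance between the generated distribution $p_g$ and the real distribution $p_r$
     - Figure: Pg: Generated Images and Pdata: Real Images → Discriminator → 0.0 – 1.0
  24. Distance, Divergence
     - The GAN objective changes with how the distance between distributions is defined
     - JSD (Jensen-Shannon Divergence) → Standard GAN
       $\mathrm{KLD}(p_r \,\|\, p_g) = \mathbb{E}_{x \sim p_r}\left[\log \frac{p_r(x)}{p_g(x)}\right]$
       $m = \frac{p_r(x) + p_g(x)}{2}$
       $\mathrm{JSD}(p_r \,\|\, p_g) = \frac{1}{2}\mathrm{KLD}(p_r \,\|\, m) + \frac{1}{2}\mathrm{KLD}(p_g \,\|\, m)$
     - EMD (Earth Mover's Distance) → WGAN
       $W(p_r, p_g) = \inf_{\gamma \in \Pi(p_r, p_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[\|x - y\|\right]$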
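To make the two distances concrete, here is a small sketch for discrete distributions on an evenly spaced 1-D grid. The helper names are made up for this example, and the EMD routine relies on the 1-D special case in which the distance equals the summed absolute difference of the cumulative distributions; it is not a general optimal-transport solver.

```python
import numpy as np


def kld(p, q):
    """KL divergence for discrete distributions; terms with p(x) = 0 contribute 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def jsd(p, q):
    """Jensen-Shannon divergence via the mixture m = (p + q) / 2."""
    m = 0.5 * (p + q)
    return 0.5 * kld(p, m) + 0.5 * kld(q, m)


def emd_1d(p, q, spacing=1.0):
    """Earth Mover's Distance on an evenly spaced 1-D grid (CDF difference)."""
    return float(np.sum(np.abs(np.cumsum(p - q))) * spacing)


p_r = np.array([1.0, 0.0, 0.0, 0.0])
near = np.array([0.0, 1.0, 0.0, 0.0])   # all mass one bin away from p_r
far = np.array([0.0, 0.0, 0.0, 1.0])    # all mass three bins away from p_r

print(jsd(p_r, near), jsd(p_r, far))         # JSD cannot tell the two apart
print(emd_1d(p_r, near), emd_1d(p_r, far))   # EMD grows with the distance moved
```

When the supports do not overlap, JSD is stuck at log 2 no matter how far apart the distributions are, while EMD still reports how far the mass has to travel.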
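  25. Objectives
     - A min-max game between the Discriminator and the Generator
       $\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_r}[\log D(x)] + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]$
     - $x \sim p_r$: data sampled from the real distribution
     - $D(\cdot)$: real / fake judgment of the input data (Discriminator)
     - $z \sim p_z$: noise vector sampled from the noise distribution
     - $G(\cdot)$: generates data from the input noise (Generator)
  26. Objectives
     - Figure legend: Discrimination Distribution, Real Distribution, Generative Distribution
     - a. D is partially accurate
     - b. D is trained so that it converges to $D^*(x) = \frac{p_r(x)}{p_r(x) + p_g(x)}$
     - c. $p_g$ moves closer to $p_r$
     - d. $p_r$ and $p_g$ coincide; at that point $D(x) = \frac{1}{2}$
     - G becomes accurate only after D has become accurate
     - However, D must not become too accurate either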
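The optimal discriminator formula can be checked numerically for discrete distributions. The sketch below uses two made-up distributions with full support (an assumption, so that no logarithm of zero appears) and verifies the standard identity that the value function at D* equals -log 4 + 2 JSD.

```python
import numpy as np


def optimal_discriminator(p_r, p_g):
    """D*(x) = p_r(x) / (p_r(x) + p_g(x)), evaluated pointwise."""
    return p_r / (p_r + p_g)


def value_function(d, p_r, p_g):
    """V(D, G) = E_{x~p_r}[log D(x)] + E_{x~p_g}[log(1 - D(x))] for discrete x."""
    return float(np.sum(p_r * np.log(d) + p_g * np.log(1.0 - d)))


def jsd(p, q):
    """Jensen-Shannon divergence for strictly positive discrete distributions."""
    m = 0.5 * (p + q)
    return float(0.5 * np.sum(p * np.log(p / m)) + 0.5 * np.sum(q * np.log(q / m)))


# Illustrative distributions over four outcomes (not taken from the slides).
p_real = np.array([0.1, 0.2, 0.3, 0.4])
p_gen = np.array([0.4, 0.3, 0.2, 0.1])

d_star = optimal_discriminator(p_real, p_gen)
v_star = value_function(d_star, p_real, p_gen)
print(d_star)
print(v_star, -np.log(4) + 2 * jsd(p_real, p_gen))

# When the generator matches the data, D* is 1/2 everywhere and V = -log 4.
d_equal = optimal_discriminator(p_real, p_real)
v_equal = value_function(d_equal, p_real, p_real)
```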
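  27. Algorithm
     - Figure: the training algorithm; the Discriminator step maximizes the objective and the Generator step minimizes it
     - Early in training, G is weak and D is strong
     - D confidently rejects generated samples, so $D(G(z))$ stays near 0 and $\log(1 - D(G(z)))$ saturates
     - The gradient for G vanishes
     - In actual implementations the Generator step is replaced by a maximization problem: maximize $\log D(G(z))$
  28. Objectives (implementation)
     - Discriminator: this is a maximization problem, so flip the sign and minimize
       $-\left[\frac{1}{m}\sum_{i=1}^{m}\log D(x_i) + \frac{1}{m}\sum_{i=1}^{m}\log(1 - D(G(z_i)))\right]$
     - Generator: turn the substituted maximization problem into a minimization problem
       $-\frac{1}{m}\sum_{i=1}^{m}\log D(G(z_i))$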
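The two minibatch losses can be written directly from the Discriminator's outputs. This is a minimal NumPy sketch: `d_real` and `d_fake` stand for D(x_i) and D(G(z_i)) as probabilities in (0, 1), and the small `eps` added for numerical safety is an assumption, not part of the slide.

```python
import numpy as np


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """-(mean log D(x) + mean log(1 - D(G(z)))): sign-flipped so it can be minimised."""
    return float(-(np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps))))


def generator_loss(d_fake, eps=1e-12):
    """-mean log D(G(z)): the non-saturating generator loss."""
    return float(-np.mean(np.log(d_fake + eps)))


# Hypothetical discriminator outputs for a minibatch of m = 4.
d_real = np.array([0.9, 0.8, 0.95, 0.7])
d_fake = np.array([0.1, 0.3, 0.05, 0.2])

print(discriminator_loss(d_real, d_fake))
print(generator_loss(d_fake))
```

With an undecided Discriminator (all outputs 0.5) the Discriminator loss is log 4 and the Generator loss is log 2, which is a handy sanity check when monitoring training.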
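  29. Generated results (figure)
  30. Drawbacks of Standard GAN
     - Unless the Discriminator is accurate enough, the Generator cannot minimize the distance between the distributions
     - So why not train D heavily first and then train G? Note: this does not work
     - The balance between D and G has to be tuned
     - Mode collapse occurs
     - The maximum of the Discriminator objective should be the value below, yet in practice the error goes to 0
       $-\log 4 + 2\,\mathrm{JSD}(p_r(x) \,\|\, p_g(x))$
     - $p_r$ and $p_g$ are discontinuous on low-dimensional manifolds
     - $p_r$ and $p_g$ have disjoint supports
  31. Drawbacks of Standard GAN
     - $p_r$ and $p_g$ are discontinuous on low-dimensional manifolds
       - Images are high-dimensional, but their distribution should lie in a much lower-dimensional space (call its dimension $d_{real}$)
       - If $d_{real} > d_z$, then $p_g$ is discontinuous
     - $p_r$ and $p_g$ have disjoint supports
       - The two sets share no common element
     - In either case (discontinuous, disjoint support) an optimal Discriminator exists, and moreover
       - The two distributions can be separated perfectly
       - The gradient is 0
       $\mathrm{JSD} = \log 2$
       $\nabla_x D^*(x) = 0$
  32. Drawbacks of Standard GAN
     - Even if the optimal Discriminator is avoided: as the Discriminator comes to approximate the JSD correctly, the norm of the gradient shrinks
       $\left\|\nabla_\theta \mathbb{E}_{z \sim p_z}[\log(1 - D(g_\theta(z)))]\right\|_2 < M\frac{\epsilon}{1 - \epsilon}$, where $\|D - D^*\| < \epsilon$
     - In short:
       - Discriminator undertrained → the Generator minimizes an incorrect JSD
       - Discriminator sufficiently trained → the gradient for updating the Generator is small (vanishing gradient)
     - For details, see Arjovsky, Martin, and Léon Bottou. "Towards principled methods for training generative adversarial networks." arXiv preprint arXiv:1701.04862 (2017).
  33. Wasserstein GAN
  34. Advantages of WGAN
     - No need to balance the Discriminator against the Generator
     - Mode collapse is less likely
     - The loss can be evaluated during training
     - Figure: $p_r$ compared with samples from Standard GAN and WGAN
  35. Objectives
     - Uses the Earth Mover's Distance as the distance measure between distributions
       $W(p_r, p_g) = \inf_{\gamma \in \Pi(p_r, p_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[\|x - y\|\right]$
     - $\Pi(p_r, p_g)$ is the set of joint distributions whose marginals are $p_r$ and $p_g$
     - Intuitively, it is the optimal transport cost: "How much mass has to be moved from point x to point y in order to turn $p_g$ into $p_r$?"
  36. JSD vs EMD
     - Setup: $z \sim U(-1, 1)$; $p_g$ is shifted horizontally by $\theta$
     - Intuitively:
       - The closer $p_g$ is to the vertical axis, the smaller the distance
       - The farther $p_g$ is from the vertical axis, the larger the distance
  37. JSD vs EMD
     - In reality:
       - JSD is discontinuous and its gradient vanishes
       - EMD is continuous and has a gradient
       $\mathrm{JSD}(p_r \,\|\, p_g) = \begin{cases} \log 2 & \text{if } \theta \neq 0 \\ 0 & \text{otherwise} \end{cases}$
       $\mathrm{EMD}(p_r \,\|\, p_g) = |\theta|$
     - Figure: plots of JSD and EMD against $\theta$, with $z \sim U(-1, 1)$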
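The difference in gradient behaviour can be seen by differentiating the two closed forms from this example numerically. The sketch below simply encodes JSD = log 2 (for θ ≠ 0) and EMD = |θ| as functions of θ; the function names and the finite-difference step are made up for illustration.

```python
import math


def jsd_parallel(theta):
    """Closed-form JSD for the shifted-distribution example: log 2 unless theta is 0."""
    return 0.0 if theta == 0 else math.log(2)


def emd_parallel(theta):
    """Closed-form EMD for the same example: the size of the shift."""
    return abs(theta)


def finite_difference(f, theta, h=1e-3):
    """Central-difference estimate of df/dtheta."""
    return (f(theta + h) - f(theta - h)) / (2 * h)


for theta in (-1.0, -0.5, 0.5, 1.0):
    print(theta,
          finite_difference(jsd_parallel, theta),
          finite_difference(emd_parallel, theta))
```

Away from θ = 0 the JSD gradient is exactly zero, so a Generator trained on it gets no signal about which way to move, whereas the EMD gradient always points toward θ = 0.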
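  38. JSD vs EMD
     - Standard GAN
       - The generated $p_g$ was discontinuous because the JSD itself is discontinuous
       - The vanishing gradient problem is also caused by the JSD
     - WGAN
       - If G is continuous in $\theta$, the EMD is continuous as well
       - If D stays K-Lipschitz (a form of uniform continuity in which the slope of the function is bounded by a constant K), the EMD is continuous everywhere and differentiable almost everywhere
  39. Objectives
     - The EMD sounds good, but how is it computed?
       $W(p_r, p_g) = \inf_{\gamma \in \Pi(p_r, p_g)} \mathbb{E}_{(x, y) \sim \gamma}\left[\|x - y\|\right]$
     - Using the Kantorovich-Rubinstein duality:
       $W(p_r, p_g) = \sup_{\|f\|_L \le 1} \mathbb{E}_{x \sim p_r}[f(x)] - \mathbb{E}_{x \sim p_g}[f(x)]$
       $= \max_{w \in W} \mathbb{E}_{x \sim p_r}[f_w(x)] - \mathbb{E}_{z \sim p_z}[f_w(G(z))]$
       $= \max_{w \in W} \mathbb{E}_{x \sim p_r}[D_w(x)] - \mathbb{E}_{z \sim p_z}[D_w(G(z))]$
     - $D_w$ is a Discriminator that stays 1-Lipschitz
  40. Algorithm
     - Figure: the WGAN training algorithm; the Discriminator step maximizes and the Generator step minimizes
     - The procedure is the same as for GAN
     - Only the objective and the weight clipping differ
  41. Objectives (implementation)
     - Discriminator (Critic): this is a maximization problem, so flip the sign and minimize
       $-\left[\frac{1}{m}\sum_{i=1}^{m} D_w(x_i) - \frac{1}{m}\sum_{i=1}^{m} D_w(G(z_i))\right]$
     - Generator: minimize the W-distance
       $-\frac{1}{m}\sum_{i=1}^{m} D_w(G(z_i))$
  42. 1-Lipschitz: Weight Clipping
     - To keep 1-Lipschitz continuity, a constraint is placed on the weights
     - After every update, the Discriminator's weights are clipped to $(-c, c)$ (c = 0.01 in the paper)
     - If c is large, the constraint is weak and it takes a long time to reach 1-Lipschitz continuity, so the Discriminator does not converge properly
     - If c is small, vanishing gradients become more likely
     - "Weight clipping is a clearly terrible way to enforce a Lipschitz constraint. [...] However, we do leave the topic of enforcing Lipschitz constraints in a neural network setting for further investigation, and we actively encourage interested researchers to improve on this method."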
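A minimal NumPy sketch of the WGAN losses and the clipping step is shown below. The critic outputs are unbounded scores rather than probabilities; the function names, the example scores, and the weight shapes are assumptions for illustration, and `np.clip` uses the closed interval [-c, c].

```python
import numpy as np


def critic_loss(d_real, d_fake):
    """-(mean D_w(x) - mean D_w(G(z))): sign-flipped estimate of the W-distance."""
    return float(-(np.mean(d_real) - np.mean(d_fake)))


def wgan_generator_loss(d_fake):
    """-mean D_w(G(z)): the Generator raises the critic's score on fakes."""
    return float(-np.mean(d_fake))


def clip_weights(weights, c=0.01):
    """Clip every weight array to [-c, c] after a critic update."""
    return [np.clip(w, -c, c) for w in weights]


# Hypothetical critic scores for a minibatch.
scores_real = np.array([1.2, 0.8, 1.0])
scores_fake = np.array([-0.5, 0.1, -0.2])
print(critic_loss(scores_real, scores_fake), wgan_generator_loss(scores_fake))

# Hypothetical critic weights before and after clipping.
rng = np.random.default_rng(0)
critic_weights = [rng.normal(scale=0.05, size=(4, 3)), rng.normal(scale=0.05, size=3)]
clipped = clip_weights(critic_weights, c=0.01)
print(max(np.abs(w).max() for w in clipped))
```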
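  43. Generated results
     - Images improve as the W-distance decreases
     - This makes it possible to evaluate the loss
  44. The pitfalls of Weight Clipping
     - Layer weights end up concentrated at extreme values such as -0.01 or 0.01
     - Weight decay was also considered, but it leads to vanishing gradients
  45. Gradient Penalty
     - A GP term is added to the WGAN Discriminator loss
       $\min_w \underbrace{\frac{1}{m}\sum_{i=1}^{m} D_w(G(z_i)) - \frac{1}{m}\sum_{i=1}^{m} D_w(x_i)}_{\text{Discrimination Loss}} + \underbrace{\lambda\,\mathbb{E}_{\hat{x} \sim p_{\hat{x}}}\left[\left(\|\nabla_{\hat{x}} D_w(\hat{x})\|_2 - 1\right)^2\right]}_{\text{Gradient Penalty}}$
       $\hat{x} = \epsilon x + (1 - \epsilon) G(z)$
     - $\lambda$: parameter that sets the strength of the GP term (10 in the paper)
     - $\epsilon$: value sampled from $U(0, 1)$
     - Figure: $\hat{x}$ lies on the segment between $G(z)$ and $x$, dividing it in the ratio given by $\epsilon$ and $1 - \epsilon$
  46. Gradient Penalty
     - Everywhere on the straight line $\hat{x} = \epsilon x + (1 - \epsilon) G(z)$, the optimal critic $D^*$ has the gradient
       $\nabla_{\hat{x}} D^*(\hat{x}) = \frac{x - \hat{x}}{\|x - \hat{x}\|}$
     - The vector is normalized, so the norm of the gradient is 1
     - The GP term $\left(\|\nabla_{\hat{x}} D_w(\hat{x})\|_2 - 1\right)^2$ therefore measures how far the gradient norm is from 1
     - This is how 1-Lipschitz continuity is enforced
  47. Algorithm
     - The procedure is the same as for WGAN
     - Weight clipping is removed
     - A GP term is added to the Discriminator loss
  48. Objectives (implementation)
     - Discriminator (Critic): add the GP term to the WGAN loss
       $\frac{1}{m}\sum_{i=1}^{m} D_w(G(z_i)) - \frac{1}{m}\sum_{i=1}^{m} D_w(x_i) + \lambda\,\mathbb{E}_{\hat{x} \sim p_{\hat{x}}}\left[\left(\|\nabla_{\hat{x}} D_w(\hat{x})\|_2 - 1\right)^2\right]$
     - Generator: minimize the W-distance
       $-\frac{1}{m}\sum_{i=1}^{m} D_w(G(z_i))$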
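The gradient penalty needs the gradient of the critic with respect to its input, which real implementations obtain by automatic differentiation. To stay self-contained, the sketch below assumes a purely linear critic D(x) = w·x + b, whose input gradient is w at every point; this is an illustrative simplification, not how a neural critic is handled, and all names and values are hypothetical.

```python
import numpy as np


def wgan_gp_critic_loss(w, b, x_real, x_fake, rng, lam=10.0):
    """WGAN-GP critic loss for an assumed linear critic D(x) = x @ w + b."""
    d_real = x_real @ w + b
    d_fake = x_fake @ w + b

    # Sample x_hat on the segment between each real/fake pair.
    eps = rng.uniform(0.0, 1.0, size=(x_real.shape[0], 1))
    x_hat = eps * x_real + (1.0 - eps) * x_fake

    # For a linear critic the gradient w.r.t. x_hat is w, independent of x_hat.
    grads = np.broadcast_to(w, x_hat.shape)
    grad_norms = np.linalg.norm(grads, axis=1)
    penalty = lam * np.mean((grad_norms - 1.0) ** 2)

    return float(np.mean(d_fake) - np.mean(d_real) + penalty), float(penalty)


rng = np.random.default_rng(0)
x_real = rng.normal(loc=1.0, size=(8, 2))
x_fake = rng.normal(loc=-1.0, size=(8, 2))

# A unit-norm weight vector satisfies the norm-1 target, so the penalty is zero.
_, gp_unit = wgan_gp_critic_loss(np.array([0.6, 0.8]), 0.0, x_real, x_fake, rng)
# A weight vector of norm 5 is penalised by lam * (5 - 1)^2.
_, gp_large = wgan_gp_critic_loss(np.array([3.0, 4.0]), 0.0, x_real, x_fake, rng)
print(gp_unit, gp_large)
```

The penalty replaces weight clipping: instead of forcing the weights into a box, it pulls the critic's input-gradient norm toward 1 at points between real and generated samples.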
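  49. Generated results (figure)
  50. The future of GANs
     - WGAN overcame several drawbacks of DCGAN
       - Training stability
       - A meaningful loss
       - Mode collapse
     - However, WGAN has problems of its own
       - Images are blurry
       - Convergence is slow
       - Expressive power is low
     - New GANs aimed at these problems
       - DRAGAN
       - Cramér GAN
  51. Applications of GAN
  52. Applications of GAN
     - Auxiliary Classifier GAN: generates the image you are aiming for
     - Super Resolution GAN: super-resolution, upsampling an image to a higher resolution cleanly
     - pix2pix: general-purpose image translation (e.g. map ↔ aerial photo, sketch ↔ photo, grayscale ↔ color)
     - Cycle GAN: image translation with unpaired data
  53. ACGAN (Auxiliary Classifier GAN)
     - The input is the random noise z concatenated with a condition vector c
     - The Discriminator outputs the usual GAN output plus the condition c predicted from the input
     - e.g. MNIST: a 10-dimensional one-hot vector
     - Figure: z (random noise) and c (condition) → Generator → G(z, c); G(z, c) or x → Discriminator → fake or real, and c (condition)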
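Building the conditioned Generator input amounts to concatenating the noise with a one-hot label. The sketch below uses the 10 MNIST classes mentioned on the slide; the noise dimension of 16 and the helper names are assumptions for illustration.

```python
import numpy as np


def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot condition vectors."""
    return np.eye(num_classes)[labels]


def acgan_generator_input(z, labels, num_classes=10):
    """Concatenate noise z with the one-hot condition c along the feature axis."""
    return np.concatenate([z, one_hot(labels, num_classes)], axis=1)


rng = np.random.default_rng(0)
z_batch = rng.normal(size=(4, 16))      # assumed noise dimension of 16
digits = np.array([3, 1, 4, 1])         # desired digit classes

g_input = acgan_generator_input(z_batch, digits)
print(g_input.shape)                    # noise dimension plus 10 condition entries
```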
  54. ACGAN (Auxiliary Classifier GAN) (figure)
  55. MakeGirls
     - Presented at C92
     - Anime face generation using a condition plus DRAGAN
     - http://make.girls.moe/#/
     - https://github.com/makegirlsmoe/makegirlsmoe.githu
  56. Super Resolution GAN
     - Published in 2016 by Twitter (Ledig et al.)
     - The input is a low-resolution image rather than noise
  57. Super Resolution GAN
     - When creating a high-resolution image from a low-resolution one, there is no single correct answer
     - Plain MSE produces an averaged image, which looks blurry
     - Adding the GAN structure pushes the output toward images that could plausibly occur in real data
  58. pix2pix
     - A network that learns the relationship between paired images
     - Not specialized for one task: general-purpose image translation
     - Map ↔ aerial photo, sketch ↔ photo, grayscale ↔ color
     - Paired images are required, so building a dataset is costly
     - https://affinelayer.com/pixsrv/
     - https://github.com/phillipi/pix2p
  59. pix2pix
     - The input is an image
     - The Discriminator receives a pair of images
     - The loss combines the adversarial loss with a task-specific loss
     - e.g. MSE, VGG Loss, CCE
     - Figure: x → Generator → G(x); fake pair (x, G(x)) or real pair (x, y) → Discriminator → fake or real
  60. PaintsChainer
     - A network that colorizes line drawings
     - Probably based on pix2pix
     - http://paintschainer.preferred.tech/index_ja.html
     - https://qiita.com/taizan/items/cf77fd37ec3a0bef5d9d
  61. ABNORMAL EVENT DETECTION
     - Abnormal scene detection using pix2pix and optical flow
     - Best Paper / Student Paper Award Finalist, IEEE International Conference on Image Processing (ICIP), 2017
     - https://arxiv.org/abs/1708.09644
  62. Cycle GAN
     - Image translation that does not require pairs
     - Learns the translations in both directions at once
     - e.g. horse ⇔ zebra, apple ⇔ orange, photo ⇔ painting
  63. Cycle GAN
     - Figure: $G_{xy}$ and $G_{yx}$ translate between the two domains; $D_y$ and $D_x$ each judge fake or real
       $L_{G_{xy}} = L_{GAN}(D_y, G_{xy}, x) + \lambda L_{cycle}(G_{xy}, G_{yx}, x, y)$
       $= -\log(D_y(G_{xy}(x))) + \lambda\left(\|x - G_{yx}(G_{xy}(x))\|_1 + \|y - G_{xy}(G_{yx}(y))\|_1\right)$
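The cycle-consistency term can be illustrated with toy mappings in place of trained generators. In the sketch below, `g_xy` and `g_yx` are hypothetical elementwise functions, and the L1 norm is averaged over elements (a common implementation choice, assumed here); the loss is zero exactly when each mapping undoes the other.

```python
import numpy as np


def cycle_consistency_loss(g_xy, g_yx, x, y):
    """Mean absolute error of x -> y -> x plus that of y -> x -> y."""
    forward_cycle = np.mean(np.abs(g_yx(g_xy(x)) - x))
    backward_cycle = np.mean(np.abs(g_xy(g_yx(y)) - y))
    return float(forward_cycle + backward_cycle)


rng = np.random.default_rng(0)
x_batch = rng.uniform(size=(2, 8, 8))    # toy "images" from domain X
y_batch = rng.uniform(size=(2, 8, 8))    # toy "images" from domain Y


def to_y(x):
    """Toy stand-in for G_xy."""
    return 2.0 * x + 1.0


def to_x_exact(y):
    """Exact inverse of to_y, so the cycle closes perfectly."""
    return (y - 1.0) / 2.0


def to_x_biased(y):
    """Imperfect inverse that leaves a constant offset of 0.1 after a round trip in X."""
    return (y - 1.0) / 2.0 + 0.1


loss_exact = cycle_consistency_loss(to_y, to_x_exact, x_batch, y_batch)
loss_biased = cycle_consistency_loss(to_y, to_x_biased, x_batch, y_batch)
print(loss_exact, loss_biased)
```

Because no paired examples are available, this term is what ties the two generators together: a translation is only accepted if translating back recovers the original image.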
  64. References
     - Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems. 2014.
     - Arjovsky, Martin, and Léon Bottou. "Towards principled methods for training generative adversarial networks." arXiv preprint arXiv:1701.04862 (2017).
     - Arjovsky, Martin, Soumith Chintala, and Léon Bottou. "Wasserstein GAN." arXiv preprint arXiv:1701.07875 (2017).
     - Gulrajani, Ishaan, et al. "Improved training of Wasserstein GANs." arXiv preprint arXiv:1704.00028 (2017).
     - https://makegirlsmoe.github.io/assets/pdf/technical_report.pdf
  65. Questions?
     - My Mail Address: nakatsuka@cv.info.gifu-u.ac.jp
     - My implementations: https://github.com/salty-vanilla
     - My Qiita: https://qiita.com/salty-vanilla
