
[Series Event] Understanding Generative Adversarial Networks in One Day (一日搞懂生成式對抗網路)

Generative Adversarial Networks (GANs) are clearly the next hot topic in deep learning. Yann LeCun has called them the most interesting idea in the last 10 years in ML, and the coolest thing since sliced bread. What problem do GANs solve? In machine learning, regression and classification are by now familiar tasks, but getting a machine to go one step further and create structured, complex objects (e.g., images or sentences) remains a major challenge. With GANs, machines can already draw faces realistic enough to pass for real ones, draw a picture matching a given text description, and even generate anime-style character portraits (the anime character portraits on the left were generated by the machine itself). This course aims to introduce GANs, one of the frontier techniques of deep learning.



  1. 1. Introduction of Generative Adversarial Network 李宏毅 Hung-yi Lee
  2. 2. Outline Lecture 1: Brief Review of Deep Learning Lecture 2: Introduction of Generative Adversarial Network Lecture 3: Conditional Generation Lecture 4: Making Decision and Control
  3. 3. Generative Adversarial Network (GAN) — proposed by Ian Goodfellow; slide audio read by “Google 小姐” (the Google text-to-speech voice)
  4. 4. Yann LeCun’s comment https://www.quora.com/What-are-some-recent-and- potentially-upcoming-breakthroughs-in-unsupervised-learning
  5. 5. Yann LeCun’s comment https://www.quora.com/What-are-some-recent-and-potentially-upcoming-breakthroughs- in-deep-learning ……
  6. 6. Is GAN popular?
  7. 7. Resources • GAN Zoo • https://github.com/hindupuravinash/the-gan-zoo • Tricks: https://github.com/soumith/ganhacks • Tutorial of Ian Goodfellow: https://arxiv.org/abs/1701.00160
  8. 8. Creation — Drawing? Writing Poems? http://www.rb139.com/index.php?s=/Lot/44547
  9. 9. Anime Face Generation — Drawing? 何之源’s Zhihu article: https://zhuanlan.zhihu.com/p/24767059 DCGAN: https://github.com/carpedm20/DCGAN-tensorflow
  10. 10. Basic Idea of GAN — Generator: a neural network (NN), or a function, that maps an input vector to an image (a matrix). Each dimension of the input vector represents some characteristic: changing one dimension gives longer hair, blue hair, or an open mouth. Powered by: http://mattya.github.io/chainer-DCGAN/
  11. 11. Basic Idea of GAN — Discriminator: a neural network (NN), or a function, that maps an image to a scalar. A larger value means real; a smaller value means fake (e.g., 1.0 for real faces, 0.1 for a generated one).
  12. 12. Basic Idea of GAN Brown veins Butterflies are not brown Butterflies do not have veins …….. Generator Discriminator
  13. 13. Basic Idea of GAN NN Generator v1 Discri- minator v1 Real images: NN Generator v2 Discri- minator v2 NN Generator v3 Discri- minator v3 This is where the term “adversarial” comes from. You can explain the process in different ways…….
  14. 14. Generator v3 Generator v2 Basic Idea of GAN Generator (student) Discriminator (teacher) Generator v1 Discriminator v1 Discriminator v2 No eyes No mouth
  15. 15. So many questions …… Q1: Why can’t the generator learn by itself? Q2: Why doesn’t the discriminator generate objects itself? Q3: How do the discriminator and generator interact?
  16. 16. Anime Face Generation 100 updates
  17. 17. Anime Face Generation 1000 updates
  18. 18. Anime Face Generation 2000 updates
  19. 19. Anime Face Generation 5000 updates
  20. 20. Anime Face Generation 10,000 updates
  21. 21. Anime Face Generation 20,000 updates
  22. 22. Anime Face Generation 50,000 updates
  23. 23. Interpolating the input code from (0.0, 0.0) to (0.9, 0.9) in steps of 0.1 and feeding each code to G — thanks to 陳柏文 for the experimental results.
  24. 24. What is this for? Generating more data? Augmented/virtual reality? This will be clear in the next lecture.
  25. 25. Lecture I: Basic Idea of Deep Learning
  26. 26. Machine Learning ≈ Looking for a Function • Binary Classification (yes/no questions): Function f: input → Yes/No • Multi-class Classification (multiple choice): Function f: input → Class 1, Class 2, … Class N
  27. 27. Machine Learning ≈ Looking for a Function • Structured input/output: Speech Recognition — “大家好,歡迎大家來修機器學習” (speech) → f → (transcription); Image Captioning — image → f → “girl with red hair and red eyes”; Summarization — “近半選民挺脫歐公投 ……” → f → (title, summary)
  28. 28. Framework — Model: a set of functions $f_1, f_2, \ldots$. Image Recognition: $f(\text{image}) = \text{“cat”}$; e.g., $f_1$ maps the images to “cat” and “dog” (good), while $f_2$ maps them to “monkey” and “snake” (bad).
  29. 29. Framework — Model: a set of functions $f_1, f_2, \ldots$. Image Recognition: $f(\text{image}) = \text{“cat”}$. Supervised Learning: the training data are function inputs and outputs, e.g., images labelled “monkey”, “cat”, “dog”; they define the goodness of a function f (which candidate is better).
  30. 30. Framework — Training: use the training data to pick the “best” function $f^*$ from the model. Testing: use $f^*$, e.g., $f^*(\text{image}) = \text{“cat”}$.
  31. 31. Example Application — Input: a 16 × 16 image, i.e., 256 values $x_1, x_2, \ldots, x_{256}$ (ink → 1, no ink → 0). Output: $y_1, y_2, \ldots, y_{10}$, where each dimension represents the confidence of a digit (is 1, is 2, …, is 0); e.g., output (0.1, 0.7, …, 0.2) → the image is “2”.
  32. 32. Example Application • Handwriting Digit Recognition — a neural network as the function: input a 256-dim vector ($x_1, \ldots, x_{256}$), output a 10-dim vector ($y_1, \ldots, y_{10}$: is 1, is 2, …, is 0), e.g., “2”.
  33. 33. Training Data • Preparing training data: images and their labels The learning target is defined on the training data. “5” “0” “4” “1” “3”“1”“2”“9”
  34. 34. Neuron — a simple function: $z = a_1 w_1 + \cdots + a_k w_k + \cdots + a_K w_K + b$, output $a = \sigma(z)$, with weights $w_1, \ldots, w_K$, bias $b$, and activation function $\sigma$.
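As an illustration (not from the slides), a single neuron is a few lines of numpy; the weights and inputs below are made-up values:

```python
import numpy as np

def neuron(a, w, b):
    """A single neuron: z = sum_k a_k * w_k + b, output = sigmoid(z)."""
    z = np.dot(w, a) + b               # weighted sum plus bias
    return 1.0 / (1.0 + np.exp(-z))    # sigmoid activation

a = np.array([1.0, -2.0, 0.5])  # inputs from the previous layer
w = np.array([0.3, 0.1, -0.4])  # weights (hypothetical values)
b = 0.1                         # bias
print(neuron(a, w, b))
```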
  35. 35. Fully Connect Feedforward Network — Input Layer ($x_1, x_2, \ldots, x_N$) → Hidden Layers (Layer 1, Layer 2, …, Layer L, built from neurons) → Output Layer ($y_1, y_2, \ldots, y_M$: is 1, is 2, …, is 0).
  36. 36. Fully Connect Feedforward Network — a composition of simple functions: $a^1 = f^1(x)$, $a^2 = f^2(a^1)$, …, $y = f^L(a^{L-1})$, i.e., a very complex function. How can we know the weights and biases of the neurons?
  37. 37. Learning/Training: find a set of weights and biases such that an input “1” makes $y_1$ the maximum value, an input “2” makes $y_2$ the maximum value, and so on (input: ink → 1, no ink → 0). Gradient descent is the algorithm that finds weights and biases achieving this target. Gradient descent has lots of problems ……
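A minimal sketch (my own toy example, not from the course) of gradient descent on a single neuron with a squared-error loss:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: one input vector and a target output (hypothetical values).
x, target = np.array([0.5, -1.0]), 1.0
w, b, lr = np.zeros(2), 0.0, 0.1   # parameters and learning rate

for step in range(100):
    y = sigmoid(np.dot(w, x) + b)       # forward pass
    # Backward pass: d(loss)/dz for loss = (y - target)^2
    dz = 2 * (y - target) * y * (1 - y)
    w -= lr * dz * x                    # gradient step on weights
    b -= lr * dz                        # gradient step on bias

print(sigmoid(np.dot(w, x) + b))  # close to the target after training
```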
  38. 38. Convolutional Layer — Sparse Connectivity: each neuron only connects to part of the output of the previous layer (its Receptive Field). Different neurons have different, but overlapping, receptive fields.
  39. 39. Convolutional Layer — Sparse Connectivity: each neuron only connects to part of the output of the previous layer. Parameter Sharing: neurons with different receptive fields can use the same set of parameters — fewer parameters than a fully connected layer.
  40. 40. Convolutional Layer — consider neurons 1 and 3 as “filter 1” (kernel 1) and neurons 2 and 4 as “filter 2” (kernel 2); the filter (kernel) size is the size of a neuron’s receptive field; here stride = 2. Kernel size, number of filters, and stride are all designed by the developer.
  41. 41. Pooling Layer Group the neurons corresponding to the same filter with nearby receptive fields …… 1 1 2 2 3 4 5 3 4 Convolutional Layer Pooling Layer 1 2 Subsampling
  42. 42. Recurrent Neural Network • Recurrent Structure: usually used when the input is a sequence. Sentiment Analysis (e.g., movie-review classes 超好雷 / 好雷 / 普雷 / 負雷 / 超負雷): “看了這部電影覺得很高興 …….” → Positive (正雷); “這部電影太糟了 …….” → Negative (負雷); “這部電影很棒 …….” → Positive (正雷). The function reads the input character by character, e.g., 我 覺 得 太 糟 了.
  43. 43. Recurrent Neural Network • Recurrent Structure: $h_t = f(h_{t-1}, x_t)$, e.g., $h_1 = f(h_0, x_1)$, $h_2 = f(h_1, x_2)$, $h_3 = f(h_2, x_3)$, output $y = g(h_3)$. No matter how long the input sequence is, we only need one function f.
  44. 44. LSTM and GRU are common choices of the recurrent function $f(h_{t-1}, x_t) = h_t$.
  45. 45. Auto-encoder — input (e.g., 28 × 28 = 784 dims) → NN Encoder → code (a compact, low-dimensional representation $c$ of the input object) → NN Decoder (can reconstruct the original object). Encoder and decoder are trainable and learned together.
  46. 46. Deep Auto-encoder • NN encoder + NN decoder = a deep network: Input Layer → Layer → … → bottleneck (Code) → … → Layer → Output Layer, with input $x$ and reconstruction $\hat{x}$ as close as possible. Reference: Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507
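A minimal deep auto-encoder sketch in PyTorch (layer sizes are illustrative, not the ones from the paper):

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, code_dim=32):
        super().__init__()
        # Encoder: input -> bottleneck code
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256), nn.ReLU(),
            nn.Linear(256, code_dim))
        # Decoder: code -> reconstruction
        self.decoder = nn.Sequential(
            nn.Linear(code_dim, 256), nn.ReLU(),
            nn.Linear(256, input_dim), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder()
x = torch.rand(64, 784)                     # a batch of flattened images
loss = nn.functional.mse_loss(model(x), x)  # reconstruction error
loss.backward()
```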
  47. 47. Deep Auto-encoder - Example — visualizing raw pixels with t-SNE vs. the code $c$ from the NN Encoder (PCA reduced to 32-dim). Unsupervised learning, without annotation.
  48. 48. Sequence-to-sequence Auto-encoder • The machine also represents each audio segment by a vector (word-level), learned from lots of audio without supervision [Chung, Wu, Lee, Lee, Interspeech 16]. Used in the following spoken language understanding applications.
  49. 49. Sequence-to-sequence Auto-encoder — audio segment → acoustic features $x_1, x_2, x_3, x_4$ → RNN Encoder → vector (the vector we want). We use a sequence-to-sequence auto-encoder here; the training is unsupervised.
  50. 50. Sequence-to-sequence Auto-encoder RNN Generator x1 x2 x3 x4 y1 y2 y3 y4 x1 x2 x3 x4 RNN Encoder audio segment acoustic features The RNN encoder and generator are jointly trained. Input acoustic features
  51. 51. What does machine learn? • Typical word to vector: $V(\text{Rome}) - V(\text{Italy}) + V(\text{Germany}) \approx V(\text{Berlin})$, $V(\text{king}) - V(\text{queen}) + V(\text{aunt}) \approx V(\text{uncle})$ • Audio word to vector (phonetic information): the same kind of relations, V( ) − V( ) + V( ) = V( ), hold between vectors of spoken words.
  52. 52. Deep Learning in One Slide (Review) — Many kinds of network structures: ➢ Fully connected feedforward network ➢ Convolutional neural network (CNN) ➢ Recurrent neural network (RNN). A network is a function from $x$ to $y$, where input/output can be a vector, matrix, or sequence (speech, video, sentence); different networks take input/output in different domains. How to find the function? Given example inputs/outputs as training data: {(x1,y1),(x2,y2), ……, (x1000,y1000)}.
  53. 53. Lecture II: Introduction of GAN To learn more theory: https://www.youtube.com/watch?v=0CKeqXl5IY0&lc=z13zuxbgl pvsgbgpo04cg1bxuoraejdpapo0k https://www.youtube.com/watch?v=KSN4QYgAtao&lc=z13kz1nq vuqsipqfn23phthasre4evrdo
  54. 54. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory
  55. 55. Structured Learning — machine learning is to find a function $f: X \rightarrow Y$. Regression: output a scalar. Classification: output a “class” (one-hot vector: Class 1 = [1 0 0], Class 2 = [0 1 0], Class 3 = [0 0 1]). Structured Learning/Prediction: output a sequence, a matrix, a graph, a tree …… — the output is composed of components with dependency.
  56. 56. Regression, Classification Structured Learning
  57. 57. Output Sequence — $f: X \rightarrow Y$. Machine Translation: X: “機器學習及其深層與結構化” (sentence of language 1) → Y: “Machine learning and having it deep and structured” (sentence of language 2). Speech Recognition: X: (speech) → Y: “感謝大家來上課” (transcription). Chat-bot: X: “How are you?” (what a user says) → Y: “I’m fine.” (response of machine).
  58. 58. Output Matrix — $f: X \rightarrow Y$. Text to Image: X: “this white and yellow flower have thin white petals and a round yellow stamen” → Y: image (ref: https://arxiv.org/pdf/1605.05396.pdf). Image to Image, e.g., Colorization: X: sketch → Y: colored image (ref: https://arxiv.org/pdf/1611.07004v1.pdf).
  59. 59. Decision Making and Control — Action: “left”, Action: “right”, Action: “fire”. Playing GO is the same.
  60. 60. Why Structured Learning Interesting? • One-shot/Zero-shot Learning: • In classification, each class has some examples. • In structured learning, • If you consider each possible output as a “class” …… • Since the output space is huge, most “classes” do not have any training data. • Machine has to create new stuff during testing. • Need more intelligence
  61. 61. Why Structured Learning Interesting? • Machine has to learn to plan • Machine can generate objects component-by-component, but it should have a big picture in its mind. • Because the output components have dependency, they should be considered globally. Image Generation, Sentence Generation: 這個婆娘不是人 九天玄女下凡塵 (“this woman is not human — a goddess descended to earth”: the second clause reverses the meaning of the first, so the components must be considered together).
  62. 62. Structured Learning Approach Bottom Up Top Down Generative Adversarial Network (GAN) Generator Discriminator Learn to generate the object at the component level Evaluating the whole object, and find the best one
  63. 63. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory
  64. 64. Generation — NN Generator: input a vector (in a specific range) → Image Generation: output an image; Sentence Generation: output a sentence, e.g., “Good morning.”, “How are you?”, “Good afternoon.” We will control what to generate later. → Conditional Generation
  65. 65. So many questions …… Q1: Why can’t the generator learn by itself? Q2: Why doesn’t the discriminator generate objects itself? Q3: How do the discriminator and generator interact?
  66. 66. Generator — image and its code, e.g., code (0.1, 0.9) → image (where do the codes come from?). NN Generator: code vectors in, images out, trained so the output is as close as possible to the target image; c.f. an NN Classifier, whose output $(y_1, y_2, \ldots)$ is trained to be as close as possible to the one-hot target $(1, 0, \ldots)$.
  67. 67. Generator — the encoder in an auto-encoder provides the code ☺: image → NN Encoder → code $c$; then code vectors → NN Generator → images (this answers where the codes come from).
  68. 68. Remember Auto-encoder? — NN Encoder → code → NN Decoder (= Generator), trained so the output is as close as possible to the input. Then: randomly generate a vector as code → NN Decoder (= Generator) → image?
  69. 69. Remember Auto-encoder? — take a 2D code and vary it from −1.5 to 1.5: NN Decoder(−1.5, 0) and NN Decoder(1.5, 0) produce different images (real examples shown for comparison).
  70. 70. Remember Auto-encoder? — images decoded over the 2D code space from −1.5 to 1.5 (real examples shown).
  71. 71. Generator — code a → NN Generator → image; code b → NN Generator → image; what does 0.5a + 0.5b generate? (where do the codes come from?)
  72. 72. Auto-encoder → Variational Auto-encoder (VAE) — the NN Encoder outputs $m_1, m_2, m_3$ and $\sigma_1, \sigma_2, \sigma_3$; sample $e_1, e_2, e_3$ from a normal distribution; code $c_i = \exp(\sigma_i) \times e_i + m_i$ → NN Decoder → output. Minimize the reconstruction error plus $\sum_{i=1}^{3} \left( \exp(\sigma_i) - (1 + \sigma_i) + (m_i)^2 \right)$. The NN Decoder is the generator.
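A sketch of the reparameterization and regularizer exactly as written on the slide (network sizes are my own choice):

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, code_dim=3):
        super().__init__()
        self.enc = nn.Linear(input_dim, 128)
        self.to_m = nn.Linear(128, code_dim)      # means m_i
        self.to_sigma = nn.Linear(128, code_dim)  # sigma_i
        self.dec = nn.Sequential(
            nn.Linear(code_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim), nn.Sigmoid())

    def forward(self, x):
        h = torch.relu(self.enc(x))
        m, sigma = self.to_m(h), self.to_sigma(h)
        e = torch.randn_like(sigma)          # e_i ~ normal distribution
        c = torch.exp(sigma) * e + m         # c_i = exp(sigma_i) * e_i + m_i
        recon_err = nn.functional.mse_loss(self.dec(c), x, reduction='sum')
        # Regularizer from the slide: sum_i exp(sigma_i) - (1 + sigma_i) + m_i^2
        reg = (torch.exp(sigma) - (1 + sigma) + m ** 2).sum()
        return recon_err + reg

loss = VAE()(torch.rand(64, 784))
loss.backward()
```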
  73. 73. Why VAE? — with a plain auto-encoder, a code between two encoded points may not decode to anything sensible; VAE fills in the code space (figure: encode / decode over the code space).
  74. 74. Ref: http://www.wired.co.uk/article/google-artificial-intelligence-poetry Samuel R. Bowman, Luke Vilnis, Oriol Vinyals, Andrew M. Dai, Rafal Jozefowicz, Samy Bengio, Generating Sentences from a Continuous Space, arXiv prepring, 2015 VAE - Writing Poetry RNN Encoder sentence RNN Decoder sentence code Code Space i went to the store to buy some groceries. "come with me," she said. i store to buy some groceries. i were to buy any groceries. "talk to me," she said. "don’t worry about it," she said. ……
  75. 75. What do we miss? — G(code) should be as close as possible to the target image. It will be fine if the generator can truly copy the target image; but what if the generator makes some mistakes ……? Some mistakes are serious, while some are fine.
  76. 76. What do we miss? — Target; 1 pixel error → 我覺得不行 (“not acceptable”); 1 pixel error → 我覺得不行 (“not acceptable”); 6 pixel errors → 我覺得其實 OK (“actually OK”); 6 pixel errors → 我覺得其實 OK (“actually OK”).
  77. 77. What do we miss? — 我覺得不行 (“not acceptable”) vs. 我覺得其實 OK (“actually OK”): the relation between the components is critical. The last layer (Layer L) generates each component independently — each neuron in the output layer corresponds to a pixel (ink / empty), and a pixel cannot coordinate with its neighbors (旁邊的,我們產生一樣的顏色 “neighbor, let’s produce the same color” — 誰管你 “who cares about you”). We need a deep structure to catch the relation between components.
  78. 78. (Variational) Auto-encoder — $(z_1, z_2) \rightarrow G \rightarrow (x_1, x_2)$; thanks to 黃淞楓 for the results.
  79. 79. So many questions …… Q1: Why can’t the generator learn by itself? Q2: Why doesn’t the discriminator generate objects itself? Q3: How do the discriminator and generator interact?
  80. 80. Discriminator • The discriminator is a function $D: X \rightarrow \mathbb{R}$ (a network, can be deep) • Input x: an object (e.g., an image) • Output D(x): a scalar representing how “good” the object x is, e.g., D( ) = 1.0, D( ) = 0.1. Also called an evaluation function, potential function, energy function … Can we use the discriminator to generate objects? Yes.
  81. 81. Discriminator • It is easier to catch the relation between the components by top-down evaluation: 我覺得不行 (“not acceptable”) vs. 我覺得其實 OK (“actually OK”) — this CNN filter is good enough.
  82. 82. Discriminator • Suppose we already have a good discriminator D(x) … then generate by enumerating all possible x !!! Is that feasible ??? And how do we learn the discriminator?
  83. 83. Discriminator - Training • I have some real images Discriminator training needs some negative examples. D scalar 1 (real) D scalar 1 (real) D scalar 1 (real) D scalar 1 (real) Discriminator only learns to output “1” (real).
  84. 84. Discriminator - Training — Discriminator D: object x → scalar D(x). D(x) should be large on the real examples; but in practice, you cannot decrease D(x) for all the x other than the real examples.
  85. 85. Discriminator - Training • Negative examples are critical. D scalar 1 (real) D scalar 0 (fake) D 0.9 Pretty real D scalar 1 (real) D scalar 0 (fake) How to generate realistic negative examples?
  86. 86. Discriminator - Training • General Algorithm • Given a set of positive examples, randomly generate a set of negative examples. • In each iteration • Learn a discriminator D that can discriminate positive and negative examples. • Generate new negative examples with the discriminator: $\tilde{x} = \arg\max_{x \in X} D(x)$
  87. 87. Discriminator - Training — in the end, D(x) is high exactly on the real examples and low on the generated ones (figure: D(x) evolving over iterations, real vs. generated).
  88. 88. Graphical Model Bayesian Network (Directed Graph) Markov Random Field (Undirected Graph) Markov Logic Network Boltzmann Machine Restricted Boltzmann Machine Structured Learning ➢Structured Perceptron ➢Structured SVM ➢Gibbs sampling ➢Hidden information ➢Application: sequence labelling, summarization Conditional Random Field Segmental CRF (Only list some of the approaches) Energy-based Model: http://www.cs.nyu .edu/~yann/resear ch/ebm/
  89. 89. Generator v.s. Discriminator • Generator • Pros: • Easy to generate even with deep model • Cons: • Imitate the appearance • Hard to learn the correlation between components • Discriminator • Pros: • Considering the big picture • Cons: • Generation is not always feasible • Especially when your model is deep • How to do negative sampling?
  90. 90. So many questions …… Q1: Why can’t the generator learn by itself? Q2: Why doesn’t the discriminator generate objects itself? Q3: How do the discriminator and generator interact?
  91. 91. Discriminator - Training • General Algorithm • Given a set of positive examples, randomly generate a set of negative examples. • In each iteration • Learn a discriminator D that can discriminate positive and negative examples. • Generate negative examples with the discriminator: $\tilde{x} = \arg\max_{x \in X} D(x)$ — realized by a generator G, $\tilde{x} = G(z)$.
  92. 92. Generating Negative Examples — $\tilde{x} = \arg\max_{x \in X} D(x)$ is realized by a generator: code → NN Generator → image → Discriminator → score (e.g., 0.13). Fix the discriminator (including its hidden layers), and update the generator by gradient ascent so the discriminator’s output on the generated image increases (0.13 → 0.9).
  93. 93. Algorithm • Initialize the generator G and discriminator D • In each training iteration: sample some real objects (label 1), generate some fake objects with G from sampled codes (label 0), and update D; then feed codes through G into the fixed D and update G so that D outputs 1 on the generated images.
  94. 94. Algorithm • Initialize the generator G and discriminator D • In each training iteration: Learning D — sample some real objects (label 1), generate some fake objects with G (label 0), update D; Learning G — fix D, and update G so that the discriminator assigns 1 to the generated images.
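A compact sketch of this alternating loop in PyTorch (all sizes, data, and network shapes below are illustrative stand-ins):

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()

for step in range(10000):
    real = torch.rand(64, 784) * 2 - 1   # stand-in batch of real images
    z = torch.randn(64, 100)
    # --- Learning D: real -> 1, fake -> 0 (generator fixed) ---
    d_loss = bce(D(real), torch.ones(64, 1)) + \
             bce(D(G(z).detach()), torch.zeros(64, 1))
    opt_D.zero_grad(); d_loss.backward(); opt_D.step()
    # --- Learning G: make D output 1 on generated images (D fixed) ---
    g_loss = bce(D(G(torch.randn(64, 100))), torch.ones(64, 1))
    opt_G.zero_grad(); g_loss.backward(); opt_G.step()
```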
  95. 95. Benefit of GAN • From the discriminator’s point of view: using the generator to generate negative samples ($\tilde{x} = G(z)$) is an efficient replacement for $\tilde{x} = \arg\max_{x \in X} D(x)$ • From the generator’s point of view: it still generates the object component-by-component, but it is learned from the discriminator, which has a global view.
  96. 96. GAN — VAE vs. GAN samples (https://arxiv.org/abs/1512.09300); thanks to 段逸林 for the results.
  97. 97. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory
  98. 98. Binary Classifier as Discriminator — a typical binary classifier uses a sigmoid function at the output layer: real → 1 (the largest output), generated → 0 (the smallest). You cannot train your classifier to be too good …… otherwise the generated examples don’t move.
  99. 99. Binary Classifier as Discriminator • Don’t let the discriminator perfectly separate real and generated data • Weaken your discriminator? real generated strong weak
  100. 100. Binary Classifier as Discriminator • Don’t let the discriminator perfectly separate real and generated data • Add noise to input or label? real generated
  101. 101. Least Square GAN (LSGAN) • Replace the sigmoid with a linear output (replace classification with regression): binary classifier → scalar, with targets 1 (real) and 0 (fake). With a saturated sigmoid the generated examples don’t move; the linear output keeps a useful gradient.
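As a sketch, the least-squares objectives with the 1/0 targets from the slide (here `D` is assumed to have a linear output):

```python
import torch

def lsgan_d_loss(d_real, d_fake):
    # Regression targets: real -> 1, fake -> 0
    return ((d_real - 1) ** 2).mean() + (d_fake ** 2).mean()

def lsgan_g_loss(d_fake):
    # Generator pushes fake scores toward the "real" target 1
    return ((d_fake - 1) ** 2).mean()

print(lsgan_d_loss(torch.randn(8, 1), torch.randn(8, 1)))
```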
  102. 102. WGAN • We want the scores of the real examples to be as large as possible and those of the generated examples as small as possible. Any problem? Without any constraint, the discriminator just keeps moving the scores towards +∞ (real) and −∞ (generated).
  103. 103. WGAN — the discriminator should be a 1-Lipschitz function, i.e., smooth: $\|D(x_1) - D(x_2)\| \le K \|x_1 - x_2\|$ (output change ≤ K × input change), with K = 1 for “1-Lipschitz”. It should not change fast. How to realize this?
  104. 104. WGAN • Original WGAN → Weight Clipping: force the parameters w to lie between −c and c (after a parameter update, if w > c, w = c; if w < −c, w = −c). This does not truly maximize (minimize) the real (generated) examples; it just keeps D smooth enough. • Improved WGAN → Gradient Penalty: keep the gradient of D close to 1, so samples can easily move along it.
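A sketch of the gradient penalty from the Improved WGAN paper (the coefficient 10 is the paper's default; shapes are illustrative):

```python
import torch

def gradient_penalty(D, real, fake, lam=10.0):
    # Interpolate between real and generated samples
    eps = torch.rand(real.size(0), 1)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grads, = torch.autograd.grad(D(x_hat).sum(), x_hat, create_graph=True)
    # Penalize gradient norms away from 1
    return lam * ((grads.norm(2, dim=1) - 1) ** 2).mean()

# Usage sketch: d_loss = fake_scores.mean() - real_scores.mean()
#               d_loss = d_loss + gradient_penalty(D, real, fake)
```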
  105. 105. DCGAN G: CNN, D: CNN G: CNN (no normalization), D: CNN (no normalization) LSGAN Original WGAN Improved WGAN G: CNN (tanh), D: CNN(tanh)
  106. 106. DCGAN LSGAN Original WGAN Improved WGAN G: MLP, D: CNN G: CNN (bad structure), D: CNN G: 101 layer, D: 101 layer
  107. 107. Sentence Generation • “good bye.” in 1-of-N encoding: each character (g, o, o, d, <space>, …) is a one-hot vector; consider the resulting matrix as an “image”. I will talk about RNNs later.
  108. 108. Sentence Generation • A real sentence is a matrix of one-hot rows (0s and a single 1); a generated one has soft rows such as (0.9, 0.1, 0, 0, 0) and can never be exactly 1-of-N. A binary classifier can immediately find the difference; with no overlap between the distributions, WGAN is helpful. I will talk about RNNs later.
  109. 109. Improved WGAN successfully generating sentences: https://arxiv.org/abs/1704.00028
  110. 110. W-GAN – Generating Tang Poetry (唐詩鍊成) — the output is 32 characters including punctuation; thanks to 李仲翊 for the experimental results. • 升雲白遲丹齋取,此酒新巷市入頭。黃道故海歸中後,不驚入得韻子門。 • 據口容章蕃翎翎,邦貸無遊隔將毬。外蕭曾臺遶出畧,此計推上呂天夢。 • 新來寳伎泉,手雪泓臺蓑。曾子花路魏,不謀散薦船。 • 功持牧度機邈爭,不躚官嬉牧涼散。不迎白旅今掩冬,盡蘸金祇可停。 • 玉十洪沄爭春風,溪子風佛挺橫鞋。盤盤稅焰先花齋,誰過飄鶴一丞幢。 • 海人依野庇,為阻例沉迴。座花不佐樹,弟闌十名儂。 • 入維當興日世瀕,不評皺。頭醉空其杯,駸園凋送頭。 • 鉢笙動春枝,寶叅潔長知。官爲宻爛去,絆粒薛一靜。 • 吾涼腕不楚,縱先待旅知。楚人縱酒待,一蔓飄聖猜。 • 折幕故癘應韻子,徑頭霜瓊老徑徑。尚錯春鏘熊悽梅,去吹依能九將香。 • 通可矯目鷃須浄,丹迤挈花一抵嫖。外子當目中前醒,迎日幽筆鈎弧前。 • 庭愛四樹人庭好,無衣服仍繡秋州。更怯風流欲鴂雲,帛陽舊據畆婷儻。
  111. 111. Loss-sensitive GAN (LSGAN) — instead of pushing scores to ±∞ as in WGAN, LSGAN asks the discriminator to separate a real sample x from generated samples x′, x′′ by a margin Δ(x, x′) that grows with how different they are (figure: D(x) curves for WGAN vs. LSGAN).
  112. 112. Energy-based GAN (EBGAN) • Use an autoencoder as the discriminator: an image is good if it can be reconstructed by the autoencoder. D(x) = −(reconstruction error of the autoencoder EN → DE), so 0 is the score of the best images (e.g., −0.1 vs. −1). The generator is the same as before.
  113. 113. EBGAN — for real examples, push the score toward 0 (the best possible); for generated examples, the score does not have to be very negative: stop at a margin m (it is hard to reconstruct well, but easy to destroy a reconstruction). An autoencoder-based discriminator gives large values only in a limited region.
  114. 114. Mode Collapse
  115. 115. Missing Mode ? ? Mode collapse is easy to detect.
  116. 116. Solution? • Feature matching, Minibatch discrimination • https://arxiv.org/abs/1606.03498 • Unroll GAN • https://arxiv.org/abs/1611.02163 • InfoGAN • Next lecture • Ensemble GAN • Next lecture
  117. 117. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory Lucas Theis, Aäron van den Oord, Matthias Bethge, “A note on the evaluation of generative models”, arXiv preprint, 2015
  118. 118. Likelihood — $x^i$: real data (not observed during training); the generator $P_G$ (fed from a prior distribution) produces the generated data. Log likelihood: $L = \frac{1}{N} \sum_i \log P_G(x^i)$. But we cannot compute $P_G(x^i)$; we can only sample from $P_G$.
  119. 119. Likelihood - Kernel Density Estimation • Estimate the distribution of PG(x) from sampling Generator PG https://stats.stackexchange.com/questions/244012/can- you-explain-parzen-window-kernel-density-estimation-in- laymans-terms/244023 Now we have an approximation of 𝑃𝐺, so we can compute 𝑃𝐺 𝑥 𝑖 for each real data 𝑥 𝑖 Then we can compute the likelihood. Each sample is the mean of a Gaussian with the same covariance.
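A sketch of this Parzen-window estimate with scikit-learn (the bandwidth is a free choice, not specified on the slide; the data are random stand-ins):

```python
import numpy as np
from sklearn.neighbors import KernelDensity

# Samples drawn from the generator P_G (stand-in random data here)
gen_samples = np.random.randn(1000, 2)
# Each sample becomes the mean of a Gaussian with shared covariance
kde = KernelDensity(kernel='gaussian', bandwidth=0.2).fit(gen_samples)

real_data = np.random.randn(100, 2)  # the evaluation set x^i
log_likelihood = kde.score_samples(real_data).mean()  # (1/N) sum_i log P_G(x^i)
print(log_likelihood)
```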
  120. 120. Likelihood v.s. Quality • Low likelihood, high quality? A model generating good images with small variance may assign $P_G(x^i) = 0$ to the evaluation data $x^i$. • High likelihood, low quality? Take a good generator G1 and a generator G2 that outputs G1’s samples only 1% of the time (× 0.01) and noise otherwise (× 0.99): $L = \frac{1}{N}\sum_i \log\left(0.01\,P_G(x^i)\right) = -\log 100 + \frac{1}{N}\sum_i \log P_G(x^i)$, and $\log 100 \approx 4.6$, so the likelihood barely drops.
  121. 121. Evaluate by Other Networks — feed each generated $x$ into a well-trained CNN to get $P(y|x)$: lower entropy of $P(y|x)$ means higher visual quality; higher entropy of the averaged distribution $\frac{1}{N}\sum_n P(y^n|x^n)$ means higher variety. Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, Xi Chen, “Improved Techniques for Training GANs”, arXiv preprint, 2016
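A numpy sketch of these two quantities (the class probabilities are random stand-ins for a real classifier's outputs):

```python
import numpy as np

def entropy(p, axis=-1, eps=1e-12):
    return -(p * np.log(p + eps)).sum(axis=axis)

# P(y|x) for N generated images over 10 classes (stand-in for CNN outputs)
probs = np.random.dirichlet(np.ones(10), size=1000)

quality = entropy(probs).mean()        # low per-image entropy = high quality
variety = entropy(probs.mean(axis=0))  # high entropy of the mean = high variety
print(quality, variety)
```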
  122. 122. Evaluate by Other Networks - Inception Score • Improved W-GAN
  123. 123. Outline When can I use GAN? Generation by GAN Improving GAN Evaluation A little bit theory Can skip this part
  124. 124. Theory behind GAN • The data we want to generate has a distribution $P_{data}(x)$ over the image space: high probability in some regions, low in others.
  125. 125. Theory behind GAN • A generator G is a network, and the network defines a probability distribution: $z \sim$ Normal Distribution, $x = G(z) \sim P_G(x)$; we want $P_G(x)$ to be as close as possible to $P_{data}(x)$. It is difficult to compute $P_G(x)$ — we do not know what the distribution looks like. https://blog.openai.com/generative-models/
  126. 126. generator G𝑧 𝑥 = 𝐺 𝑧 Normal Distribution 𝑃𝐺 𝑥 𝑃𝑑𝑎𝑡𝑎 𝑥 As close as possible https://blog.openai.com/generative-models/ Discri- minator When the discriminator is well trained, its loss represents a specific divergence measure between PG and Pdata. Update discriminator is to minimize the divergence measure Binary classifier evaluates the JS divergence You can design the discriminator to evaluate other divergence.
  127. 127. Why GAN is hard to train? http://www.guokr.com/post/773890/ Better
  128. 128. Why GAN is hard to train? — when $P_{G_0}(x)$ and $P_{data}(x)$ do not overlap, $JS(P_{G_0}\|P_{data}) = \log 2$; after training, $P_{G_{50}}(x)$ is closer but still disjoint, so $JS(P_{G_{50}}\|P_{data}) = \log 2$ — not really better ……; only at overlap does $JS(P_{G_{100}}\|P_{data}) = 0$.
  129. 129. Evaluating JS divergence • The JS divergence estimated by the discriminator tells little information: it looks the same for a weak generator and a strong generator. https://arxiv.org/abs/1701.07875
  130. 130. Earth Mover’s Distance • Considering one distribution P as a pile of earth, and another distribution Q as the target • The average distance the earth mover has to move the earth. 𝑃 𝑄 d 𝑊 𝑃, 𝑄 = 𝑑
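For 1-D samples the earth mover's distance has a simple closed form (sort both sides and average the gaps), and scipy ships an implementation:

```python
import numpy as np
from scipy.stats import wasserstein_distance

p = np.random.randn(1000)        # samples from distribution P
q = np.random.randn(1000) + 5.0  # samples from Q, shifted by d = 5

# Closed form for 1-D: average distance between sorted samples
d_manual = np.abs(np.sort(p) - np.sort(q)).mean()
print(d_manual, wasserstein_distance(p, q))  # both close to 5.0
```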
  131. 131. Why Earth Mover’s Distance? — with JS: $JS(P_{G_0}, P_{data}) = \log 2$, $JS(P_{G_{50}}, P_{data}) = \log 2$, $JS(P_{G_{100}}, P_{data}) = 0$; with the earth mover’s distance: $W(P_{G_0}, P_{data}) = d_0$, $W(P_{G_{50}}, P_{data}) = d_{50}$, $W(P_{G_{100}}, P_{data}) = 0$. Unlike the f-divergence $D_f(P_{data}\|P_G)$, $W(P_{data}, P_G)$ keeps decreasing as the generator improves.
  132. 132. https://arxiv.org/abs/1701.07875
  133. 133. When can I use GAN? • I consider GAN as a new approach for structured learning. Generation by GAN • Generator: generate locally • Discriminator: evaluate globally • Generator + Discriminator → GAN Improving GAN Evaluation A Little Bit Theory Concluding Remarks
  134. 134. Question? https://arxiv.org/pdf/1406.2661.pdf Yann LeCun’s talk — is the discriminator flat eventually?
  135. 135. Discriminator is flat in the end. • The discriminator leads the generator (figure: D(x) flattens as the generated distribution approaches the real one).
  136. 136. Discriminator is flat in the end. Source: https://www.youtube.com/watch?v=ebMei6bYeWw (credit: Benjamin Striner)
  137. 137. Discriminator is flat in the end. http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/
  138. 138. Lecture III: Conditional Generation and its application on Anime Face Generation
  139. 139. Conditional Generation Generation Conditional Generation NN Generator “Girl with red hair and red eyes” “Girl with yellow ribbon” NN Generator 0.1 −0.1 ⋮ 0.7 −0.3 0.1 ⋮ 0.9 0.3 −0.1 ⋮ −0.7 In a specific range
  140. 140. Conditional Generation • We don’t want to simply generate some random stuff. • Generate objects based on conditions: Given condition: Caption Generation Chat-bot Given condition: “Hello” “A young girl is dancing.” “Hello. Nice to see you.”
  141. 141. Conditional Generation — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  142. 142. Modifying Input Code — Generator(input vector) → image, where each dimension of the input vector represents some characteristic (longer hair, blue hair, open mouth). ➢ The input code determines the generator output. ➢ Understand the meaning of each dimension to control the output.
  143. 143. Conditional Generation — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  144. 144. InfoGAN — in a regular GAN, modifying a specific dimension has no clear meaning: what we expect vs. what actually happens (the colors represent the characteristics).
  145. 145. What is InfoGAN? Discrimi nator Classifier scalarGenerator=z z’ c x c Predict the code c that generates x “Auto- encoder” Parameter sharing (only the last layer is different) encoder decoder
  146. 146. What is InfoGAN? Classifier Generator=z z’ c x c Predict the code c that generates x “Auto- encoder” encoder decoder c must have clear influence on x The classifier can recover c from x.
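A sketch of the InfoGAN auxiliary loss: a classifier Q (sharing a body with the discriminator, as on the slide) must recover the code c from the generated x. All shapes are illustrative:

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64 + 10, 128), nn.ReLU(), nn.Linear(128, 784))
body = nn.Sequential(nn.Linear(784, 128), nn.ReLU())  # shared by D and Q
d_head = nn.Linear(128, 1)    # real/fake score (trained as usual)
q_head = nn.Linear(128, 10)   # predicts the 10-way categorical code c

z = torch.randn(32, 64)                          # unstructured noise z
c = torch.eye(10)[torch.randint(0, 10, (32,))]   # categorical code c
x = G(torch.cat([z, c], dim=1))

# "Auto-encoder" over the code: c -> x -> c
logits = q_head(body(x))
info_loss = nn.functional.cross_entropy(logits, c.argmax(dim=1))
info_loss.backward()  # trains G, body and q_head so c is recoverable from x
```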
  147. 147. https://arxiv.org/abs/1606.03657
  148. 148. https://arxiv.org/abs/1606.03657
  149. 149. Conditional Generation — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  150. 150. Connecting Code and Attribute CelebA
  151. 151. GAN+Autoencoder • We have a generator (input z, output x) • However, given x, how can we find z? • Learn an encoder (input x, output z) Generator (Decoder) Encoder z xx as close as possible fixed Discriminator init Autoencoder different structures?
  152. 152. Generator (Decoder) Encoder z xx as close as possible
  153. 153. Attribute Representation — $z_{long} = \frac{1}{N_1}\sum_{x \in \text{long}} En(x) - \frac{1}{N_2}\sum_{x' \notin \text{long}} En(x')$ (long hair vs. short hair); then $En(x) + z_{long} = z'$ and generate $Gen(z')$.
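In code, the attribute vector is just a difference of mean codes; a sketch with stand-in encoder outputs:

```python
import numpy as np

# Stand-ins for En(x): codes of images with and without the attribute
codes_long = np.random.randn(500, 64)    # x in "long hair"
codes_other = np.random.randn(800, 64)   # x' not in "long hair"

# z_long = mean code of attribute images minus mean code of the rest
z_long = codes_long.mean(axis=0) - codes_other.mean(axis=0)

code_x = np.random.randn(64)  # En(x) for the image we want to edit
z_edit = code_x + z_long      # z' = En(x) + z_long; decode with Gen(z')
```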
  154. 154. https://www.youtube.com/watch?v=kPEIJJsQr7U Photo Editing
  155. 155. Conditional Generation — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  156. 156. Conditional Generation — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  157. 157. Conditional GAN • Generating images based on text description NN Encoder NN Generator sentence code code image “a dog is sleeping” ?
  158. 158. Conditional GAN • Generating images based on text description: sentence (“a bird is flying”, “a dog is running”) → NN Encoder → code → NN Generator → image, with training pairs such as $c^1$: a dog is running, $\hat{x}^1$; $c^2$: a bird is flying, $\hat{x}^2$.
  159. 159. Conditional GAN • Text to image by traditional supervised learning: NN(Text: “train”) → image, trained to be as close as possible to the target $\hat{x}^1$ (pairs: $c^1$: a dog is running, $\hat{x}^1$; $c^2$: a bird is flying, $\hat{x}^2$). Because many images fit one text, the target of the NN output is effectively their average — a blurry image!
  160. 160. Conditional GAN — G(c, z), with c: “train” and z sampled from a prior distribution: x = G(c, z). Instead of approximating a single blurry target, the generator approximates the distribution of images matching the text — it is a distribution.
  161. 161. Conditional GAN — D (type 1): input x → scalar: is x realistic or not? D (type 2): input (c, x) → scalar: is x realistic AND do c and x match? Positive example: (train, image of a train); negative examples: (train, generated image) and (cat, image of a train). Generator: $z \sim$ prior distribution, x = G(c, z) with c: train.
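A sketch of the type-2 conditional discriminator update with the three kinds of pairs from the slide (shapes and the text encoder are placeholders):

```python
import torch
import torch.nn as nn

D = nn.Sequential(nn.Linear(784 + 32, 128), nn.ReLU(), nn.Linear(128, 1), nn.Sigmoid())
bce = nn.BCELoss()

c = torch.randn(16, 32)        # text code for "train" (stand-in encoder output)
x_real = torch.rand(16, 784)   # real train images
x_fake = torch.rand(16, 784)   # generated images G(c, z)
c_wrong = torch.randn(16, 32)  # mismatched text, e.g. "cat"

def score(cond, img):
    return D(torch.cat([cond, img], dim=1))

ones, zeros = torch.ones(16, 1), torch.zeros(16, 1)
d_loss = bce(score(c, x_real), ones)          # matched real pair -> 1
d_loss += bce(score(c, x_fake), zeros)        # (train, generated image) -> 0
d_loss += bce(score(c_wrong, x_real), zeros)  # (cat, train image) -> 0
d_loss.backward()
```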
  162. 162. Text to Image - Results "red flower with black center"
  163. 163. Text to Image - Results
  164. 164. Image-to-image https://arxiv.org/pdf/1611.07004 G 𝑧 x = G(c,z) 𝑐
  165. 165. Image-to-image • Traditional supervised approach: NN(input) → image, trained to be as close as possible to the target. Testing: the output is blurry because it is the average of several images.
  166. 166. Image-to-image • Experimental results — Testing: input; “close” (traditional supervised); GAN (G(c, z) with a discriminator); GAN + close.
  167. 167. Image super resolution • Christian Ledig, Lucas Theis, Ferenc Huszar, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, Wenzhe Shi, “Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network”, CVPR, 2016
  168. 168. Video Generation — the Generator produces the next frame; the Discriminator checks whether the last frame is real or generated. Train G so that the discriminator thinks the frame is real, while minimizing the distance to the target.
  169. 169. https://github.com/dyelax/Adversarial_Video_Generation
  170. 170. Speech Enhancement • Typical deep learning approach Noisy Clean G Output Using CNN
  171. 171. Speech Enhancement • Conditional GAN G D scalar noisy output clean noisy clean output noisy training data (fake pair or not)
  172. 172. Speech Enhancement — Noisy Speech vs. Enhanced Speech: which enhanced speech is better? Thanks to 廖峴峰 for the experimental results (co-advised with Dr. 曹昱, Academia Sinica).
  173. 173. More about Speech Processing • Speech synthesis • Takuhiro Kaneko, Hirokazu Kameoka, Nobukatsu Hojo, Yusuke Ijima, Kaoru Hiramatsu, Kunio Kashino, “Generative Adversarial Network-based Postfiltering for Statistical Parametric Speech Synthesis”, ICASSP 2017 • Yuki Saito, Shinnosuke Takamichi, and Hiroshi Saruwatari, "Training algorithm to deceive anti-spoofing verification for DNN-based speech synthesis, ”, ICASSP 2017 • Voice Conversion • Chin-Cheng Hsu, Hsin-Te Hwang, Yi-Chiao Wu, Yu Tsao, Hsin-Min Wang, Voice Conversion from Unaligned Corpora using Variational Autoencoding Wasserstein Generative Adversarial Networks, Interspeech 2017 • Speech Enhancement • Santiago Pascual, Antonio Bonafonte, Joan Serrà, SEGAN: Speech Enhancement Generative Adversarial Network, Interspeech 2017
  174. 174. Conditional Generation — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  175. 175. Cycle GAN, Disco GAN — transform an object from one domain to another without paired data, e.g., Domain X: photos, Domain Y: van Gogh paintings; no paired data between them.
  176. 176. Cycle GAN — $G_{X \rightarrow Y}$ maps Domain X to Domain Y; $D_Y$ outputs a scalar: does the input image belong to domain Y or not? The output becomes similar to domain Y, but the generator may ignore the input — not what we want. https://arxiv.org/abs/1703.10593 https://junyanz.github.io/CycleGAN/
  177. 177. Cycle GAN Domain X Domain Y 𝐺 𝑋→𝑌 𝐷 𝑌 Domain Y scalar Input image belongs to domain Y or not 𝐺Y→X as close as possible Lack of information for reconstruction
  178. 178. Cycle GAN Domain X Domain Y 𝐺 𝑋→𝑌 𝐺Y→X as close as possible 𝐺Y→X 𝐺 𝑋→𝑌 as close as possible 𝐷 𝑌𝐷 𝑋 scalar: belongs to domain Y or not scalar: belongs to domain X or not c.f. Dual Learning
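A sketch of the cycle-consistency term that sits on top of the two adversarial losses (module shapes are placeholders):

```python
import torch
import torch.nn as nn

G_XY = nn.Sequential(nn.Linear(784, 784))  # X -> Y (stand-in generators)
G_YX = nn.Sequential(nn.Linear(784, 784))  # Y -> X

x = torch.rand(8, 784)  # images from domain X
y = torch.rand(8, 784)  # images from domain Y

# Cycle consistency: X -> Y -> X and Y -> X -> Y should reconstruct the input
cycle_loss = nn.functional.l1_loss(G_YX(G_XY(x)), x) + \
             nn.functional.l1_loss(G_XY(G_YX(y)), y)
# The full generator loss adds the adversarial terms from D_X and D_Y
cycle_loss.backward()
```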
  179. 179. An Anime-fied World (動畫化的世界) • Using the code: https://github.com/Hi-king/kawaii_creator • It is not Cycle GAN or Disco GAN. input → output → domain
  180. 180. Conditional Generation — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  181. 181. https://www.youtube.com/watch?v=9c4z6YsBGQ0 Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman and Alexei A. Efros. "Generative Visual Manipulation on the Natural Image Manifold", ECCV, 2016.
  182. 182. Andrew Brock, Theodore Lim, J.M. Ritchie, Nick Weston, Neural Photo Editing with Introspective Adversarial Networks, arXiv preprint, 2017
  183. 183. Basic Idea space of code z Why move on the code space? Fulfill the constraint
  184. 184. Back to z — (Generator/Decoder with encoder: z → x, as close as possible) • Method 1: $z^* = \arg\min_z L(G(z), x^T)$ for a target image $x^T$, where the difference L between G(z) and $x^T$ is ➢ pixel-wise or ➢ measured by another network; solve by gradient descent. • Method 2: learn an encoder that maps x back to z (as in the auto-encoder). • Method 3: use the result of method 2 as the initialization of method 1.
  185. 185. Editing Photos • $z_0$ is the code of the input image: $z^* = \arg\min_z U(G(z)) + \lambda_1 \|z - z_0\|^2 - \lambda_2 D(G(z))$ — U: does the image fulfill the editing constraint? $\|z - z_0\|^2$: not too far away from the original image; D: use the discriminator to check that the image is realistic.
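A sketch of this latent-space optimization with autograd (U, D, G, and the λ values below are placeholder stand-ins):

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(100, 784), nn.Sigmoid())  # generator (stand-in)
D = nn.Sequential(nn.Linear(784, 1))                  # discriminator score
U = lambda img: ((img[:, :10] - 1.0) ** 2).sum()      # toy editing constraint

z0 = torch.randn(1, 100)              # code of the input image
z = z0.clone().requires_grad_(True)   # variable we optimize over
opt = torch.optim.Adam([z], lr=0.01)
lam1, lam2 = 0.1, 0.01

for step in range(200):
    img = G(z)
    loss = U(img) + lam1 * ((z - z0) ** 2).sum() - lam2 * D(img).sum()
    opt.zero_grad(); loss.backward(); opt.step()
```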
  186. 186. Conditional Generation — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  187. 187. Domain Independent Features • Training and testing data are in different domains Training data: Testing data: Generator Generator The same distribution feature feature
  188. 188. Domain Independent Features
  189. 189. Domain Independent Features — train a feature extractor against a domain classifier; but fooling the domain classifier alone is too easy for the feature extractor ……
  190. 190. Domain-adversarial training — not only cheat the domain classifier, but also satisfy the label classifier at the same time. This is a big network, but different parts have different goals: label predictor — maximize label classification accuracy; domain classifier — maximize domain classification accuracy; feature extractor — maximize label classification accuracy AND minimize domain classification accuracy.
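These opposing goals are commonly implemented with a gradient reversal layer; a minimal PyTorch sketch of the idea (my own, with stand-in data):

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)   # identity in the forward pass
    @staticmethod
    def backward(ctx, grad):
        return -grad          # flip the gradient flowing into the extractor

feat = nn.Sequential(nn.Linear(784, 128), nn.ReLU())  # feature extractor
label_clf = nn.Linear(128, 10)    # label predictor
domain_clf = nn.Linear(128, 2)    # domain classifier

x = torch.rand(32, 784)
y, d = torch.randint(0, 10, (32,)), torch.randint(0, 2, (32,))
h = feat(x)
loss = nn.functional.cross_entropy(label_clf(h), y) \
     + nn.functional.cross_entropy(domain_clf(GradReverse.apply(h)), d)
loss.backward()  # extractor receives reversed domain gradients
```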
  191. 191. Domain-adversarial training Yaroslav Ganin, Victor Lempitsky, Unsupervised Domain Adaptation by Backpropagation, ICML, 2015 Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Domain-Adversarial Training of Neural Networks, JMLR, 2016 Domain classifier fails in the end It should struggle ……
  192. 192. Domain-adversarial training Yaroslav Ganin, Victor Lempitsky, Unsupervised Domain Adaptation by Backpropagation, ICML, 2015 Hana Ajakan, Pascal Germain, Hugo Larochelle, François Laviolette, Mario Marchand, Domain-Adversarial Training of Neural Networks, JMLR, 2016 Train: Test:
  193. 193. Conditional Generation — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  194. 194. VAE-GAN Discrimi nator Generator (Decoder) Encoder z x scalarx as close as possible VAE GAN ➢ Minimize reconstruction error ➢ z close to normal ➢ Minimize reconstruction error ➢ Cheat discriminator ➢ Discriminate real, generated and reconstructed images Discriminator provides the similarity measure Anders Boesen, Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther, “Autoencoding beyond pixels using a learned similarity metric”, ICML. 2016
  195. 195. BiGAN Encoder Decoder Discriminator Image x code z Image x code z Image x code z from encoder or decoder? (real) (generated) (from prior distribution) Jeff Donahue, Philipp Krähenbühl, Trevor Darrell, “Adversarial Feature Learning”, ICLR, 2017 Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, Aaron Courville, “Adversarially Learned Inference”, ICLR, 2017
  196. 196. BiGAN • Initialize encoder En, decoder De, discriminator Dis • In each iteration: • Sample M images $x^1, x^2, \ldots, x^M$ from the database • Generate M codes $\tilde{z}^1, \ldots, \tilde{z}^M$ from the encoder: $\tilde{z}^i = En(x^i)$ • Sample M codes $z^1, \ldots, z^M$ from the prior P(z) • Generate M images $\tilde{x}^1, \ldots, \tilde{x}^M$ from the decoder: $\tilde{x}^i = De(z^i)$ • Update Dis to increase $Dis(x^i, \tilde{z}^i)$ and decrease $Dis(\tilde{x}^i, z^i)$ • Update En and De to decrease $Dis(x^i, \tilde{z}^i)$ and increase $Dis(\tilde{x}^i, z^i)$
  197. 197. BiGAN — the discriminator sees (image x, code z) pairs and decides whether they come from the encoder (real image, distribution $P(x, z)$) or from the decoder (code from the prior distribution, distribution $Q(x, z)$); it evaluates the difference between P and Q, and training makes P and Q the same. Optimal encoder and decoder: En(x’) = z’ ⇔ De(z’) = x’ for all x’, and De(z’’) = x’’ ⇔ En(x’’) = z’’ for all z’’.
  198. 198. BiGAN — the optimum satisfies En(x’) = z’ ⇔ De(z’) = x’ for all x’, and De(z’’) = x’’ ⇔ En(x’’) = z’’ for all z’’. How does this compare with simply stacking them as auto-encoders (x → En → z → De → $\tilde{x}$, and z → De → x → En → $\tilde{z}$)?
  199. 199. Concluding Remarks — Modifying input code • Making the code have influence (InfoGAN) • Connecting the code space with attributes; Controlling by input objects • Paired data • Unpaired data • Unsupervised; Feature extraction • Domain Independent Features • Improving Auto-encoder (VAE-GAN, BiGAN)
  200. 200. Anime Face Generation
  201. 201. Anime Face Generation • My course at NTU: “Machine Learning and having it Deep and Structured” • This spring I opened this course for the third time • I want to update the course, including HWs for sequence-to-sequence learning, reinforcement learning, and GAN • What is the HW for GAN? “girl with red hair and red eyes” → machine
  202. 202. Anime Face Generation • https://github.com/mattya/chainer-DCGAN • https://zhuanlan.zhihu.com/p/24767059 • https://github.com/jayleicn/animeGAN
  203. 203. Anime Face Generation - Girl Friend Factory • http://qiita.com/Hiroshiba/items/d5749d8896613e6f0b48
  204. 204. Data Collection http://konachan.net/post/show/239400/aikatsu-clouds-flowers-hikami_sumire-hiten_goane_r — thanks to TAs 曾柏翔 and 樊恩宇 for collecting the data.
  205. 205. Released Training Data • Data download link: https://drive.google.com/open?id=0BwJmB7alR- AvMHEtczZZN0EtdzQ • Anime Dataset: • training data: 33.4k (image, tags) pair • Training tags file format • img_id <comma> tag1 <colon> #_post <tab> tag2 <colon> … blue eyes red hair short hair tags.csv 96 x 96
  206. 206. Data • The students have to submit their code. • The code generates five 64x64 images based on a text description. • The input testing text only includes ‘color hair’ and ‘color eyes’. • Testing text file format: testing_text_id <comma> testing_text • Extra bonus if you can generate images beyond the above description. sample testing_text.txt
  207. 207. Evaluation • Lots of evaluation methods in the literature • We will put all the generated images on the grading platform • Giving 2 scores for each image: • How well does the image fit the text? • How real does the image look? • Scores should be integers from 1 to 5 • 1 to 5 corresponding to (super bad, bad, average, good, super good) • It is double blind.
  208. 208. Implementation Details • Updates between Generator and Discriminator • 1 : 1 or 2 : 1 • ADAM with lr = 0.0002, momentum = 0.5 • gaussian noise dim = 100 • batch size = 64, epoch = 300 Paper: https://arxiv.org/pdf/1605.05396.pdf Code: https://github.com/paarthneekhara/text-to-image
  209. 209. Results of TA • input text: black hair blue eyes • input text: green hair green eyes • input text: blue hair green eyes
  210. 210. Advanced Techniques • WGAN and Improved WGAN • StackGAN https://arxiv.org/abs/1612.03242
  211. 211. Sentence Representation • Using an RNN to encode the input text • But you already know the testing input …… so one-hot encoding works: why not a 15*15 one-hot encoding? (Technique provided by 江東峻, 陳翰浩, and 鄭嘉文)
  212. 212. Data Augmentation • Total training data: 33.4K • Only 14.8K is useful • Not sufficient for image generation? • Data Augmentation: horizontal flipping + rotation (Technique provided by 江東峻, 陳翰浩, and 鄭嘉文)
  213. 213. Sampled Noise v.s. Image Quality — samples at std = 0.950, 0.666, 0.432 (Technique provided by 柯達方)
  214. 214. Sampled Noise v.s. Image Quality green hair green eyes noise with small std noise with large std green hair green eyes Generator Generator
  215. 215. Ensemble • E.g. BEGAN on CelebA (thanks to 陳柏文 for the experimental results)
  216. 216. Ensemble Generator 1 Generator 2 Generator 3 Generator 4 Generator 5 green hair green eyes noise with small std
  217. 217. Bonus • hat, twin tails, ribbon, horns, headdress, headphones, glasses …… • Whole images • Deep Cleavage GAN (DCGAN) • Successful …… but I will not show this …… (thanks to 孫凡耕, 羅啟心, 許晉嘉, and 郭子生 for the results)
  218. 218. Lecture 4: Decision Making and Control and its relation with GAN
  219. 219. Decision Making and Control • Machine observe some inputs, takes an action, and finally achieve the target. • E.g. Go playing observation action Target: win the game State: summarization of observation
  220. 220. Decision Making and Control • Machine plays video games • Widely studies: • Gym: https://gym.openai.com/ • Universe: https://openai.com/blog/universe/ Machine learns to play video games as human players ➢ Machine learns to take proper action itself ➢ What machine observes are pixels
  221. 221. Decision Making and Control • E.g. self-driving car: observation → Object Recognition (紅燈 “red light”) → Decision Making → Action: Stop! — or a giant network?
  222. 222. Decision Making and Control • E.g. dialogue system: “我想訂11月5日到台北的機票” (“I want to book a flight to Taipei on November 5”) → Slot Filling (抵達時間: 11月5日 / arrival: Nov. 5; 目的地: 台北 / destination: Taipei) → Decision Making (問出發地 / ask for the departure city) → Natural Language Generation: “請問您要從哪裡出發呢?” (“Where would you like to depart from?”) — or a giant network?
  223. 223. How to solve this problem? • Network as a function, learned as in typical supervised tasks: Network(board) → Target: 3-3; Network(red light) → Target: Stop!; Network(“我想訂11月5日到台北的機票”) → Target: “請問您要從哪裡出發呢?” (“Where would you like to depart from?”)
  224. 224. Behavior Cloning https://www.youtube.com/watch?v=j2FSB3bseek The machine does not know that some behaviors must be copied while others can be ignored.
  225. 225. Properties of Decision Making and Control • Agent’s actions affect the subsequent data it receives • Reward delay • In space invader, only “fire” obtains reward • Although the moving before “fire” is important • In Go playing, sacrificing immediate reward to gain more long-term reward Machine does not know the influence of each action. What do we miss?
  226. 226. Better Way …… — Network (19 x 19 positions) → next move. Treating each move as classification misses that playing is a sequence of decisions.
  227. 227. Two Learning Scenarios • Scenario 1: Reinforcement Learning • Machine interacts with the environment. • Machine obtains the reward from the environment, so it knows its performance is good or bad. • Scenario 2: Learning by demonstration • Also known as imitation learning, apprenticeship learning • An expert demonstrates how to solve the task, and machine learns from the demonstration.
  228. 228. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
  229. 229. RL — Policy-based: Learning an Actor; Value-based: Learning a Critic; Actor + Critic. Model-based vs. model-free approaches. Alpha Go: policy-based + value-based + model-based.
  230. 230. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
  231. 231. Basic Idea — the Actor interacts with the Env; the Reward Function defines the reward, and you cannot control it. Video Game: 殺一隻怪得 20 分 (“kill a monster → 20 points”) …; Go: 圍棋規則 (the rules of Go).
  232. 232. Actor, Environment, Reward — Trajectory $\tau = \{s_1, a_1, s_2, a_2, \ldots, s_T, a_T\}$: Env gives $s_1$ → Actor takes $a_1$ → Env gives $s_2$ and reward $r_1$ → Actor takes $a_2$ → Env gives $s_3$ and reward $r_2$ → …… Total reward: $R(\tau) = \sum_{t=1}^{T} r_t$.
  233. 233. Start with observation 𝑠1 Observation 𝑠2 Observation 𝑠3 Example: Playing Video Game Action 𝑎1: “right” Obtain reward 𝑟1 = 0 Action 𝑎2 : “fire” (kill an alien) Obtain reward 𝑟2 = 5 Usually there is some randomness in the environment
  234. 234. Example: Playing Video Game — after many turns: take action $a_T$, obtain reward $r_T$, Game Over (spaceship destroyed). This is an episode. We want the total reward $R = \sum_{t=1}^{T} r_t$ to be maximized.
  235. 235. Neural network as Actor • Input of the neural network: the observation of the machine, represented as a vector or a matrix (pixels) • Output of the neural network: each action corresponds to a neuron in the output layer, giving a score per action (e.g., fire 0.7, right 0.2, left 0.1) — take the action with the highest score. What is the benefit of using a network instead of a lookup table? Generalization.
236. 236. Actor, Environment, Reward The actors are networks (with an argmax over the action scores); the environment and the reward function are not networks. $R(\tau) = \sum_{t=1}^{T} r_t$ "If you find something non-differentiable, just use policy gradient and train it anyway." Ref: https://www.youtube.com/watch?v=W8XF3ME8G2I
  237. 237. Warning of Math Policy Gradient
  238. 238. Step 1: define a set of function Step 2: goodness of function Step 3: pick the best function Three Steps for Deep Learning Deep Learning is so simple …… Neural Network as Actor
  239. 239. Step 1: define a set of function Step 2: goodness of function Step 3: pick the best function Three Steps for Deep Learning Deep Learning is so simple …… Neural Network as Actor
240. 240. Goodness of Actor • Given an actor $\pi_\theta(s)$ with network parameters $\theta$ • Use the actor $\pi_\theta(s)$ to play the video game • Start with observation $s_1$ • Machine decides to take $a_1$ • Machine obtains reward $r_1$ • Machine sees observation $s_2$ • Machine decides to take $a_2$ • Machine obtains reward $r_2$ • Machine sees observation $s_3$ • …… • Machine decides to take $a_T$ • Machine obtains reward $r_T$ (END of the episode) Total reward: $R_\theta = \sum_{t=1}^{T} r_t$ Even with the same actor, $R_\theta$ is different each time, because of randomness in the actor and the game. We define $\bar{R}_\theta$ as the expected value of $R_\theta$; $\bar{R}_\theta$ evaluates the goodness of the actor $\pi_\theta(s)$
241. 241. Goodness of Actor • $\tau = \{s_1, a_1, r_1, s_2, a_2, r_2, \ldots, s_T, a_T, r_T\}$ $P(\tau|\theta) = p(s_1)\, p(a_1|s_1, \theta)\, p(r_1, s_2|s_1, a_1)\, p(a_2|s_2, \theta)\, p(r_2, s_3|s_2, a_2) \cdots = p(s_1) \prod_{t=1}^{T} p(a_t|s_t, \theta)\, p(r_t, s_{t+1}|s_t, a_t)$ The factors $p(a_t|s_t, \theta)$ are controlled by your actor $\pi_\theta$; the other factors are not related to your actor. E.g. if the actor $\pi_\theta$ outputs left 0.1, right 0.2, fire 0.7 given $s_t$, then $p(a_t = \text{"fire"}|s_t, \theta) = 0.7$. We define $\bar{R}_\theta$ as the expected value of $R_\theta$
242. 242. Goodness of Actor • An episode is considered as a trajectory $\tau$ • $\tau = \{s_1, a_1, r_1, s_2, a_2, r_2, \ldots, s_T, a_T, r_T\}$ • $R(\tau) = \sum_{t=1}^{T} r_t$ • If you use an actor to play the game, each $\tau$ has a probability of being sampled • The probability depends on the actor parameters $\theta$: $P(\tau|\theta)$ $\bar{R}_\theta = \sum_\tau R(\tau) P(\tau|\theta) \approx \frac{1}{N} \sum_{n=1}^{N} R(\tau^n)$ — the sum is over all possible trajectories; sampling $\tau$ from $P(\tau|\theta)$ $N$ times means using $\pi_\theta$ to play the game $N$ times, obtaining $\tau^1, \tau^2, \ldots, \tau^N$
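The approximation at the bottom of this slide is an ordinary Monte Carlo average; a minimal sketch, where play_one_episode is a hypothetical stand-in for running $\pi_\theta$ through one game and returning its total reward:

    def estimate_expected_reward(play_one_episode, N):
        # R_bar ≈ (1/N) * sum_n R(tau^n): average the total reward
        # over N episodes sampled by playing the game with pi_theta
        return sum(play_one_episode() for _ in range(N)) / N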
  243. 243. Step 1: define a set of function Step 2: goodness of function Step 3: pick the best function Three Steps for Deep Learning Deep Learning is so simple …… Neural Network as Actor
244. 244. Gradient Ascent • Problem statement: $\theta^\ast = \arg\max_\theta \bar{R}_\theta$ • Gradient ascent • Start with $\theta^0$ • $\theta^1 \leftarrow \theta^0 + \eta \nabla \bar{R}_{\theta^0}$ • $\theta^2 \leftarrow \theta^1 + \eta \nabla \bar{R}_{\theta^1}$ • …… where $\theta = \{w_1, w_2, \ldots, b_1, \ldots\}$ and $\nabla \bar{R}_\theta = \left[ \partial \bar{R}_\theta / \partial w_1,\ \partial \bar{R}_\theta / \partial w_2,\ \ldots,\ \partial \bar{R}_\theta / \partial b_1,\ \ldots \right]^\top$
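The update rule above in code form, as a generic sketch; grad_R_bar is a hypothetical estimator of $\nabla \bar{R}_\theta$, e.g. the policy-gradient estimator derived on the next slides:

    def gradient_ascent(theta, grad_R_bar, eta=0.01, steps=100):
        # theta: parameter vector; grad_R_bar(theta): estimate of the gradient
        for _ in range(steps):
            theta = theta + eta * grad_R_bar(theta)  # ascend on R_bar, not descend
        return theta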
245. 245. Policy Gradient $\bar{R}_\theta = \sum_\tau R(\tau) P(\tau|\theta)$, so $\nabla \bar{R}_\theta = \sum_\tau R(\tau) \nabla P(\tau|\theta) = \sum_\tau R(\tau) P(\tau|\theta) \frac{\nabla P(\tau|\theta)}{P(\tau|\theta)} = \sum_\tau R(\tau) P(\tau|\theta) \nabla \log P(\tau|\theta) \approx \frac{1}{N} \sum_{n=1}^{N} R(\tau^n) \nabla \log P(\tau^n|\theta)$, using $\frac{d \log f(x)}{dx} = \frac{1}{f(x)} \frac{d f(x)}{dx}$ and using $\pi_\theta$ to play the game $N$ times to obtain $\tau^1, \tau^2, \ldots, \tau^N$. $R(\tau)$ does not have to be differentiable; it can even be a black box.
246. 246. Policy Gradient • $\tau = \{s_1, a_1, r_1, s_2, a_2, r_2, \ldots, s_T, a_T, r_T\}$ $P(\tau|\theta) = p(s_1) \prod_{t=1}^{T} p(a_t|s_t, \theta)\, p(r_t, s_{t+1}|s_t, a_t)$ $\nabla \log P(\tau|\theta) = ?$ $\log P(\tau|\theta) = \log p(s_1) + \sum_{t=1}^{T} \left[ \log p(a_t|s_t, \theta) + \log p(r_t, s_{t+1}|s_t, a_t) \right]$ Ignoring the terms not related to $\theta$: $\nabla \log P(\tau|\theta) = \sum_{t=1}^{T} \nabla \log p(a_t|s_t, \theta)$
247. 247. Policy Gradient $\nabla \bar{R}_\theta \approx \frac{1}{N} \sum_{n=1}^{N} R(\tau^n) \nabla \log P(\tau^n|\theta) = \frac{1}{N} \sum_{n=1}^{N} R(\tau^n) \sum_{t=1}^{T_n} \nabla \log p(a_t^n|s_t^n, \theta) = \frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n} R(\tau^n) \nabla \log p(a_t^n|s_t^n, \theta)$, with update $\theta^{new} \leftarrow \theta^{old} + \eta \nabla \bar{R}_{\theta^{old}}$. If in $\tau^n$ the machine takes $a_t^n$ when seeing $s_t^n$: when $R(\tau^n)$ is positive, tune $\theta$ to increase $p(a_t^n|s_t^n)$; when $R(\tau^n)$ is negative, tune $\theta$ to decrease $p(a_t^n|s_t^n)$. It is very important to weight by the cumulative reward $R(\tau^n)$ of the whole trajectory $\tau^n$ instead of the immediate reward $r_t^n$. What if we replaced $R(\tau^n)$ with $r_t^n$? ……
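The final estimator above in code, for a deliberately simple linear-softmax actor; this toy parameterization is an assumption for the sketch (the slides use a deep network), but the gradient formula is exactly $R(\tau^n) \nabla \log p(a_t^n|s_t^n, \theta)$:

    import numpy as np

    def grad_log_policy(W, s, a):
        # linear softmax policy p(a|s, theta) ∝ exp(W[a] · s);
        # grad_W log p(a|s, theta) = (onehot(a) - p) s^T
        logits = W @ s
        p = np.exp(logits - logits.max())
        p /= p.sum()
        g = -np.outer(p, s)
        g[a] += s
        return g

    def policy_gradient_update(W, trajectories, eta=0.01):
        # trajectories: list of (steps, R) pairs, where steps is the list of
        # (s, a) pairs from one game and R is the total reward R(tau^n)
        grad = np.zeros_like(W)
        for steps, R in trajectories:
            for s, a in steps:
                grad += R * grad_log_policy(W, s, a)  # R(tau^n) grad log p(a|s)
        return W + eta * grad / len(trajectories)     # theta <- theta + eta * grad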
248. 248. Add a Baseline $\theta^{new} \leftarrow \theta^{old} + \eta \nabla \bar{R}_{\theta^{old}}$, $\nabla \bar{R}_\theta \approx \frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n} \left( R(\tau^n) - b \right) \nabla \log p(a_t^n|s_t^n, \theta)$ It is possible that $R(\tau^n)$ is always positive. In the ideal case this is fine, since the probabilities of all actions still move in proportion to their rewards; but because we rely on sampling, actions that happen not to be sampled have their probability decreased even when they are good. Subtracting a baseline $b$ from the reward fixes this.
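The slide leaves the choice of $b$ open; one common concrete choice (an assumption here, not specified on the slide) is the batch average of the total rewards:

    def subtract_baseline(returns):
        # assumed choice: b = average total reward of the sampled batch, so
        # better-than-average trajectories get positive weight and
        # worse-than-average ones get negative weight
        b = sum(returns) / len(returns)
        return [R - b for R in returns]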
249. 249. Policy Gradient Data collection: given the actor parameters $\theta$, collect $\tau^1: (s_1^1, a_1^1), (s_2^1, a_2^1), \ldots$ with total reward $R(\tau^1)$; $\tau^2: (s_1^2, a_1^2), (s_2^2, a_2^2), \ldots$ with total reward $R(\tau^2)$; …… Model update: $\theta \leftarrow \theta + \eta \nabla \bar{R}_\theta$, $\nabla \bar{R}_\theta = \frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n} R(\tau^n) \nabla \log p(a_t^n|s_t^n, \theta)$; then collect data again with the updated actor.
250. 250. Policy Gradient — Considered as a Classification Problem Given state $s$, the network outputs a distribution $y_i$ over left/right/fire, and the action actually taken is the label $\hat{y}_i$ (e.g. "left" = (1, 0, 0)). Minimizing the cross-entropy $-\sum_{i=1}^{3} \hat{y}_i \log y_i$ is maximizing $\log P(\text{"left"}|s)$, with update $\theta \leftarrow \theta + \eta \nabla \log P(\text{"left"}|s)$
251. 251. Policy Gradient $\theta \leftarrow \theta + \eta \nabla \bar{R}_\theta$, $\nabla \bar{R}_\theta = \frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n} R(\tau^n) \nabla \log p(a_t^n|s_t^n, \theta)$ Given the actor parameters $\theta$, each sampled state-action pair becomes one classification example: $s_1^1$ → NN → target $a_1^1 = \text{left}$, i.e. (1, 0, 0); $s_1^2$ → NN → target $a_1^2 = \text{fire}$, i.e. (0, 0, 1); ……
252. 252. Policy Gradient $\theta \leftarrow \theta + \eta \nabla \bar{R}_\theta$, $\nabla \bar{R}_\theta = \frac{1}{N} \sum_{n=1}^{N} \sum_{t=1}^{T_n} R(\tau^n) \nabla \log p(a_t^n|s_t^n, \theta)$ Each training example is weighted by $R(\tau^n)$: e.g. with $R(\tau^1) = 2$, the example $(s_1^1, a_1^1 = \text{left})$ effectively appears twice, while with $R(\tau^2) = 1$, the example $(s_1^2, a_1^2 = \text{fire})$ appears once.
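In a framework such as Keras, this "weighted classification" view is literally a cross-entropy fit with per-example weights; a minimal sketch with random stand-in data (the shapes, the 3-action game, and the data are all placeholders, not from the slides):

    import numpy as np
    from tensorflow import keras

    # stand-ins for collected data: 100 observed states (flattened pixels),
    # the one-hot actions actually taken, and R(tau^n) for each example
    states = np.random.rand(100, 784).astype("float32")
    actions = keras.utils.to_categorical(np.random.randint(3, size=100), 3)
    returns = np.random.randn(100).astype("float32")

    model = keras.Sequential([
        keras.layers.Dense(128, activation="tanh", input_shape=(784,)),
        keras.layers.Dense(3, activation="softmax"),  # left / right / fire
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy")
    # weighting each (s, a) example by R(tau^n) makes minimizing this
    # weighted cross-entropy one policy-gradient ascent step
    model.fit(states, actions, sample_weight=returns, epochs=1, verbose=0)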
  253. 253. End of Warning
  254. 254. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
255. 255. RNN Generation • Training: given <BOS> 春 眠 不 as inputs, the RNN learns to output the next characters 春 眠 不 覺 (each output is the following character of the training sentence). Unsupervised learning, without annotation
256. 256. RNN Generation • Testing: feed <BOS> and sample ($\sim$) $c_1$ from $P(c_1)$ (e.g. 床); feed 床 back in and sample $c_2$ from $P(c_2|c_1)$ (前); then $c_3$ from $P(c_3|c_1, c_2)$ (明) and $c_4$ from $P(c_4|c_1, c_2, c_3)$ (月); …… Unsupervised learning, without annotation
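The testing loop on this slide is ancestral sampling; a framework-free sketch, where the rnn_step interface is a hypothetical stand-in for the trained network:

    import numpy as np

    def generate(rnn_step, vocab, bos_id, max_len, seed=None):
        # rnn_step(prev_id, h) -> (distribution over vocab, new hidden state)
        # is an assumed interface standing in for the trained RNN
        rng = np.random.default_rng(seed)
        h, prev, chars = None, bos_id, []
        for _ in range(max_len):
            p, h = rnn_step(prev, h)                 # P(c_t | c_1, ..., c_{t-1})
            prev = int(rng.choice(len(vocab), p=p))  # sample (~), not argmax
            chars.append(vocab[prev])
        return "".join(chars)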
257. 257. RNN Generation • Images are composed of pixels • Generating a pixel at each time step by RNN: consider an image as a sentence, e.g. blue red yellow gray ……, where each pixel is a character. Feed <BOS> and sample red from $P(c_1)$, then blue from $P(c_2|c_1)$, pink from $P(c_3|c_1, c_2)$, ……
258. 258. Pokémon Creation (寶可夢鍊成) • Small images of 792 Pokémon • Can machine learn to create new Pokémon? • Source of images: http://bulbapedia.bulbagarden.net/wiki/List_of_Pok%C3%A9mon_by_base_stats_(Generation_VI) Don't catch them! Create them! Original images are 40 x 40, downsampled to 20 x 20
259. 259. Real Pokémon vs. machine completions when 50% and 75% of the image is covered. It is difficult to evaluate generation. The covered parts were never seen by the machine!
260. 260. Pokémon Creation (寶可夢鍊成): drawing from scratch
  261. 261. PixelRNN Real World Ref: Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu, Pixel Recurrent Neural Networks, arXiv preprint, 2016
262. 262. More than images …… Audio: Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu, WaveNet: A Generative Model for Raw Audio, arXiv preprint, 2016 Video: Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu, Video Pixel Networks, arXiv preprint, 2016
263. 263. Sentence Generation Generator → sentence x; Discriminator: sentence x → real or fake. As in the original GAN, the code z is sampled from a prior distribution.
264. 264. Sentence Generation • Initialize generator G and discriminator D • In each iteration: • Sample real sentences $x$ from the database • Generate sentences $\tilde{x}$ by G • Update D to increase $D(x)$ and decrease $D(\tilde{x})$ • Update G such that its generated sentences obtain higher scores from D
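The discriminator update in this loop is plain binary classification; a minimal Keras-style sketch (the architecture, vocabulary size, padded length, and the random stand-in batches are all assumptions):

    import numpy as np
    from tensorflow import keras

    V, T = 1000, 20   # assumed vocabulary size and (padded) sentence length
    D = keras.Sequential([
        keras.layers.Embedding(V, 32, input_length=T),
        keras.layers.LSTM(32),
        keras.layers.Dense(1, activation="sigmoid"),  # D(x): P(x is real)
    ])
    D.compile(optimizer="adam", loss="binary_crossentropy")

    real = np.random.randint(V, size=(64, T))  # stand-in for sampled x
    fake = np.random.randint(V, size=(64, T))  # stand-in for generated x~
    # one discriminator step: increase D(x) on real, decrease D(x~) on fake
    D.train_on_batch(np.concatenate([real, fake]),
                     np.concatenate([np.ones(64), np.zeros(64)]))

The generator step is the hard part, as the next slide explains.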
265. 265. Sentence Generation A A A B A B A A B B B <BOS> Can we do backpropagation through the discriminator into the generator? NO! Tuning the generator a little bit will not change its sampled output. In the paper of improved WGAN … (ignoring the sampling process)
266. 266. Sentence Generation Viewed as RL: the generated prefix is the observation, the vocabulary {A, B} is the action set, the next generated token is the action taken, and the discriminator (outputting a scalar) is the reward function.
267. 267. Sentence Generation - SeqGAN • Using reinforcement learning • Consider the discriminator as the reward function • Consider the output of the discriminator as the total reward • Updating the generator to increase the discriminator output = getting the maximum total reward Ref: Lantao Yu, Weinan Zhang, Jun Wang, Yong Yu, “SeqGAN: Sequence Generative Adversarial Nets with Policy Gradient”, AAAI, 2017 Generator ↔ Discriminator, updated using policy gradient
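To make "discriminator output as total reward" concrete, a self-contained toy: the tiny position-wise generator and the hard-coded stand-in discriminator are illustrative assumptions (SeqGAN proper uses an RNN generator and a learned discriminator), but the update is exactly the policy gradient of the earlier slides with $R(\tau) = D(\tilde{x})$:

    import numpy as np

    rng = np.random.default_rng(0)
    V, T = 5, 4                  # toy vocabulary size and sentence length
    theta = np.zeros((T, V))     # per-position logits: a deliberately tiny
                                 # "generator" with no recurrence

    def D(sentence):
        # stand-in discriminator used as the reward: pretends sentences
        # full of token 2 look "real"
        return float(np.mean(sentence == 2))

    def sample_sentence():
        probs = np.exp(theta)
        probs /= probs.sum(axis=1, keepdims=True)
        x = np.array([rng.choice(V, p=probs[t]) for t in range(T)])
        return x, probs

    for _ in range(200):         # SeqGAN-style generator updates
        grad = np.zeros_like(theta)
        for _ in range(16):
            x, probs = sample_sentence()
            r = D(x)             # discriminator output = total reward
            for t in range(T):   # r * grad log p(x_t), as in policy gradient
                g = -probs[t].copy()
                g[x[t]] += 1.0
                grad[t] += r * g
        theta += 0.1 * grad / 16  # ascend on the expected reward E[D(x~)]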
268. 268. Chat-bot with GAN Discriminator: given the input sentence/history h and a response sentence x, output real or fake; trained on human dialogues. Chatbot: an encoder-decoder model mapping the input sentence/history h to a response sentence x. This is a conditional GAN. Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter, Dan Jurafsky, “Adversarial Learning for Neural Dialogue Generation”, arXiv preprint, 2017
269. 269. Example Results Thanks to 段逸林 for providing the experimental results
  270. 270. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
271. 271. Critic • A critic does not determine the action. • Given an actor π, it evaluates how good the actor is http://combiboilersleeds.com/picaso/critics/critics-4.html An actor can be found from a critic, e.g. by Q-learning (not suitable for a continuous action a)
272. 272. Actor-Critic π interacts with the environment → learn the critic (by TD or MC) → update the actor from π to π′ based on the critic → set π = π′ and repeat. Actor = Generator? Critic = Discriminator? https://arxiv.org/abs/1610.01945
  273. 273. Visual Doom AI Competition @ CIG 2016 https://www.youtube.com/watch?v=94EPSjQH38Y
  274. 274. Playing On-line Game 劉廷緯、溫明浩 https://www.youtube.com/watch?v=8iRD1w73fDo&feature=youtu.be
  275. 275. Outline Reinforcement Learning •Training an actor •RNN generation with GAN •Actor + Critic Inverse Reinforcement Learning
276. 276. Imitation Learning We have demonstrations of the expert: $\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_N$, where each $\hat{\tau}$ is a trajectory of the expert. The actor still interacts with the environment, but the reward function is not available. Self-driving: record human drivers. Robot: grab the arm of the robot.
277. 277. Motivation • It is hard to define reward in some tasks. • Hand-crafted rewards can lead to uncontrolled behavior. The Three Laws of Robotics: 1. A robot may not injure a human being or, through inaction, allow a human being to come to harm. 2. A robot must obey the orders given by human beings, except where such orders would conflict with the First Law. 3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law. "Therefore, to protect humanity as a whole, restricting human freedom is necessary." (What flawless logic!)
278. 278. Inverse Reinforcement Learning Reinforcement learning: Reward Function + Environment → Optimal Actor. Inverse reinforcement learning: expert demonstrations $\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_N$ + Environment → Reward Function ➢ Then use the learned reward function to find the optimal actor. ➢ Modeling the reward can be easier: a simple reward function can lead to a complex policy.
279. 279. Inverse Reinforcement Learning • Principle: the teacher is always the best. • Basic idea: • Initialize an actor • In each iteration • The actor interacts with the environment to obtain some trajectories • Define a reward function which makes the trajectories of the teacher better than those of the actor • The actor learns to maximize the reward based on the new reward function. • Output the reward function and the actor learned from it
280. 280. Framework of IRL The expert $\hat{\pi}$ provides $\hat{\tau}_1, \hat{\tau}_2, \ldots, \hat{\tau}_N$; the actor $\pi$ generates $\tau_1, \tau_2, \ldots, \tau_N$. Obtain a reward function R such that $\sum_{n=1}^{N} R(\hat{\tau}^n) > \sum_{n=1}^{N} R(\tau^n)$, then find an actor based on reward function R by reinforcement learning, and repeat. Reward function = Discriminator; Actor = Generator
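The loop on this slide, written out schematically; every helper here (init_actor, init_reward, rollout, fit_reward, rl_step) is a hypothetical name for this sketch, mirroring the generator/discriminator alternation:

    def irl(expert_trajs, env, iters):
        # schematic only: the helpers below are hypothetical stand-ins
        actor, reward = init_actor(), init_reward()
        for _ in range(iters):
            actor_trajs = [rollout(actor, env) for _ in range(20)]
            # "discriminator" step: learn R so that the expert (teacher)
            # trajectories score higher than the actor's trajectories
            reward = fit_reward(reward, better=expert_trajs, worse=actor_trajs)
            # "generator" step: improve the actor against R by RL
            actor = rl_step(actor, reward, env)
        return reward, actor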
281. 281. GAN ↔ IRL In GAN, D gives a high score to real examples $\hat{\tau}_1, \ldots, \hat{\tau}_N$ and a low score to generated ones $\tau_1, \ldots, \tau_N$, and we find a G whose output obtains a large score from D. In IRL, the reward function gives larger reward to the expert's $\hat{\tau}^n$ and lower reward to the actor's $\tau$, and we find an actor that obtains large reward.
  282. 282. Teaching Robot • In the past …… https://www.youtube.com/watch?v=DEGbtjTOIB0
283. 283. Robot http://rll.berkeley.edu/gcl/ Chelsea Finn, Sergey Levine, Pieter Abbeel, “Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization”, ICML, 2016
  284. 284. Parking a Car https://pdfs.semanticscholar.org/27c3/6fd8e6aeadd50ec1e180399df70c46dede00.pdf
285. 285. Path Planning http://martin.zinkevich.org/publications/maximummarginplanning.pdf
286. 286. Third Person Imitation Learning • Ref: Bradly C. Stadie, Pieter Abbeel, Ilya Sutskever, “Third-Person Imitation Learning”, arXiv preprint, 2017 http://lasa.epfl.ch/research_new/ML/index.php https://kknews.cc/sports/q5kbb8.html http://sc.chinaz.com/Files/pic/icons/1913/%E6%9C%BA%E5%99%A8%E4%BA%BA%E5%9B%BE%E6%A0%87%E4%B8%8B%E8%BD%BD34.png Third Person / First Person
  287. 287. Third Person Imitation Learning
  288. 288. Concluding Remarks Lecture 1: Brief Review of Deep Learning Lecture 2: Introduction of Generative Adversarial Network Lecture 3: Conditional Generation Lecture 4: Making Decision and Control
289. 289. To Learn More … • Machine Learning • Slides: http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML16.html • Video: https://www.youtube.com/watch?v=fegAeph9UaA&list=PLJV_el3uVTsPy9oCRY30oBPNLCo89yu49 • Machine Learning and Having it Deep and Structured • Slides: http://speech.ee.ntu.edu.tw/~tlkagk/courses_MLDS17.html • Video: https://www.youtube.com/watch?v=IzHoNwlCGnE&list=PLJV_el3uVTsPMxPbjeX7PicgWbY7F8wW9
290. 290. Commercial break: the「科技大擂台 與AI對話」("Grand Challenge: Talking with AI") warm-up preliminary round
291. 291. 「科技大擂台 與AI對話」 ➢ Eligibility: students at or below graduate school, including senior and vocational high school students (teams of one to four) ➢ There are large cash prizes ➢ which have nothing to do with these slides, though ➢ First prize: NT$100,000 ➢ The warm-up round lets everyone warm up before going for the big prizes ➢ The warm-up is open to everyone
292. 292. Warm-up Preliminary Round — Questions • After listening to a segment of human dialogue, the machine predicts the human's next utterance • Question download • Sample questions (development set): 50 sample questions and their answers can be downloaded from the Ministry of Science and Technology website after 8/02 • Test questions (testing set): released on 8/07 Example (transcripts are provided directly in the preliminary round) — A: "Thank you, 柏翰." B: "You're welcome." A: "Sorry to take up your break time between classes." Options: (1) B: "I really like break time." (2) B: "We know each other too well for such politeness." …… This question format applies only to the warm-up round.
293. 293. Hands-on Tutorial The project office will hold a pre-contest training session, with Prof. Hung-yi Lee's team from NTU invited as instructors. Time: Wednesday, August 16, 2017, 2-5 pm. Location: Innovation and Entrepreneurship Office meeting room, 1F, No. 106, Sec. 2, Heping E. Rd., Taipei 106 (Technology Building). Open only to teams that have completed registration (registration closes August 15). Schedule: 13:50-14:00 Registration; 14:00-14:50 Word Embedding (hands-on); 15:00-15:50 Recurrent Neural Network (RNN) (hands-on); 16:00-16:50 Sequence-to-sequence Model (hands-on); 17:00- Q&A
