
DRAW: Deep Recurrent Attentive Writer


slides for http://www.meetup.com/Taiwan-R/events/232661381/?rv=co1



  1. DRAW. Presented by Mark Chang
  2. Original Paper • Title: DRAW: A Recurrent Neural Network For Image Generation • Authors: Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende and Daan Wierstra • Organization: Google DeepMind • URL: https://arxiv.org/pdf/1502.04623v2.pdf
  3. Outline • Image Generation: Discriminative Model, Generative Model, What is DRAW? • Background Knowledge: Neural Networks, Autoencoder, Variational Autoencoder, Recurrent Neural Networks, Long Short-Term Memory • DRAW: Network Architecture, Selective Attention Model, Training DRAW, Generating New Images, Experiments
  4. Image Generation • Discriminative Model • Generative Model • What is DRAW?
  5. Discriminative Model: the model is trained on labeled examples ("They are 5.", "They are 7."); after training, given testing data and the question "Is this 5 or 7?", it answers "This is 7."
  6. Discriminative Model: feature extraction maps examples from the high-dimensional input space into a low-dimensional space, where the discriminative model separates them from other examples.
  7. Generative Model: the model is trained on the same labeled examples ("They are 5.", "They are 7."); after training it can be asked "Draw a 7, please." and produce a new image.
  8. Generative Model: a generative model maps points from the low-dimensional space back into the high-dimensional space, so it can generate new examples beyond the training data.
  9. What is DRAW? • Generative models for image generation include: Deep Convolutional Generative Adversarial Networks (DCGAN), Pixel Recurrent Neural Networks (PixelRNN), and the Deep Recurrent Attentive Writer (DRAW).
  10. What is DRAW? You can't capture all the details at once: feeding the whole image into the model and reconstructing it in a single pass loses detail.
  11. What is DRAW? The Deep Recurrent Attentive Writer reconstructs the image "step by step": at each step, attention selects a region of the image, and the reconstruction is refined until the final result.
  12. Background Knowledge • Neural Networks • Autoencoder • Variational Autoencoder • Recurrent Neural Networks • Long Short-Term Memory
  13. Background Knowledge roadmap: Neural Networks and the Autoencoder cover the mapping between the high-dimensional and low-dimensional spaces, the Variational Autoencoder covers generating new examples, and Recurrent Neural Networks with Long Short-Term Memory cover reconstructing the image "step by step".
  14. Neural Networks: a neuron computes the weighted sum $n_{in} = w_1 x_1 + w_2 x_2 + w_b$ and applies the sigmoid activation function $n_{out} = \frac{1}{1 + e^{-n_{in}}}$, so the output is $y = \frac{1}{1 + e^{-(w_1 x_1 + w_2 x_2 + w_b)}}$.
  15. Neuron: the input signal $(x_1, x_2)$ is mapped to an output signal between 0 and 1; $n_{out} = 0.5$ on the decision boundary $w_1 x_1 + w_2 x_2 + w_b = 0$, $n_{out} \to 1$ where $w_1 x_1 + w_2 x_2 + w_b > 0$, and $n_{out} \to 0$ where $w_1 x_1 + w_2 x_2 + w_b < 0$ (a sketch of such a neuron follows below).
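A minimal NumPy sketch of this single sigmoid neuron; the weight values below are arbitrary placeholders chosen for illustration, not values from the slides.

```python
import numpy as np

def sigmoid(n_in):
    """Activation function: squashes n_in into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-n_in))

def neuron(x1, x2, w1, w2, wb):
    """Single neuron: weighted sum plus bias, then sigmoid."""
    n_in = w1 * x1 + w2 * x2 + wb
    return sigmoid(n_in)

# Points on either side of the line w1*x1 + w2*x2 + wb = 0 map toward 1 or 0;
# points exactly on the line map to 0.5.
print(neuron(1.0, 1.0, w1=2.0, w2=2.0, wb=-2.0))  # > 0.5
print(neuron(0.0, 0.0, w1=2.0, w2=2.0, wb=-2.0))  # < 0.5
print(neuron(0.5, 0.5, w1=2.0, w2=2.0, wb=-2.0))  # exactly 0.5
```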
  16. AND Gate: truth table $(x_1, x_2) \to y$: $(0,0)\to0$, $(0,1)\to0$, $(1,0)\to0$, $(1,1)\to1$; a single neuron with weights 20, 20 and bias $-30$ implements it: $y = \frac{1}{1 + e^{-(20x_1 + 20x_2 - 30)}}$, with decision boundary $20x_1 + 20x_2 - 30 = 0$.
  17. XOR Gate: truth table $(0,0)\to0$, $(0,1)\to1$, $(1,0)\to1$, $(1,1)\to0$; the two classes cannot be separated by a single line, so one neuron is not enough.
  18. XOR Gate: use two layers. Hidden neuron $n_1$ (weights 20, 20, bias $-30$) fires only when both inputs are on, $n_2$ (weights 20, 20, bias $-10$) fires when at least one input is on, and the output neuron (weights $-20$, 20, bias $-10$) fires when $n_2$ is on but $n_1$ is off. Truth table $(x_1, x_2, n_1, n_2, y)$: $(0,0,0,0,0)$, $(0,1,0,1,1)$, $(1,0,0,1,1)$, $(1,1,1,1,0)$ (a sketch follows below).
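A small sketch of this two-layer XOR network with the weights from the slide; only the wiring into Python is new here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xor_net(x1, x2):
    """Two-layer network from the slide: n1 detects AND, n2 detects OR, y = n2 AND (NOT n1)."""
    n1 = sigmoid(20 * x1 + 20 * x2 - 30)   # close to 1 only when both inputs are 1
    n2 = sigmoid(20 * x1 + 20 * x2 - 10)   # close to 1 when at least one input is 1
    y = sigmoid(-20 * n1 + 20 * n2 - 10)   # fires when n2 is on but n1 is off
    return y

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(xor_net(x1, x2)))  # prints 0, 1, 1, 0
```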
  19. Neural Networks: neurons are stacked into layers; the input layer $(x, y)$ feeds a hidden layer ($n_{11}$, $n_{12}$), which feeds an output layer ($n_{21}$, $n_{22}$) producing $z_1$ and $z_2$; every connection has its own weight $w$ and every neuron its own bias $b$.
  20. Training Neural Networks: forward propagation turns the input $x$ into $p(y|x)$ (e.g. 60% for "7", 40% for "5"); the golden label is "7" with 100%; the loss function is $J = -\log(p(y|x))$, and backward propagation updates the weights: $w \leftarrow w - \eta \frac{\partial(-\log p(y|x))}{\partial w}$.
  21. Training Neural Networks: minimizing $J = -\log(p(y|x))$ with respect to the weights $w$ drives $p(y|x) \approx 1$ for the golden label; when $p(y|x) \approx 0$ the loss is large.
  22. Training Neural Networks: gradient descent, $w \leftarrow w - \eta \frac{\partial(-\log p(y|x))}{\partial w}$, where $\eta$ is the learning rate and $\frac{\partial(-\log p(y|x))}{\partial w}$ is the gradient.
  23. Training Neural Networks: when the golden label is 1 the update pushes $p(y|x)$ toward 1, and when the golden label is 0 it pushes $p(y|x)$ toward 0.
  24. Training Neural Networks
  25. Backward Propagation: with cost function $J$, the chain rule gives $\frac{\partial J}{\partial w_{21}} = \frac{\partial J}{\partial n_{2(out)}} \frac{\partial n_{2(out)}}{\partial n_{2(in)}} \frac{\partial n_{2(in)}}{\partial w_{21}}$, so the update $w_{21} \leftarrow w_{21} - \eta \frac{\partial J}{\partial w_{21}}$ becomes $w_{21} \leftarrow w_{21} - \eta \frac{\partial J}{\partial n_{2(out)}} \frac{\partial n_{2(out)}}{\partial n_{2(in)}} \frac{\partial n_{2(in)}}{\partial w_{21}}$ (a sketch follows below).
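A hedged sketch of gradient descent for one sigmoid output with the cross-entropy loss $J = -\log p(y|x)$, showing the chain rule collapsing to $(p - \text{golden}) \cdot x$; the toy data and learning rate are invented for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy data: one input vector and its golden label (illustrative only).
x = np.array([1.0, 0.5])
golden = 1.0
w = np.zeros(2)
wb = 0.0
eta = 0.5  # learning rate

for step in range(100):
    # Forward propagation.
    n_in = w @ x + wb
    p = sigmoid(n_in)                                         # p(y|x)
    J = -(golden * np.log(p) + (1 - golden) * np.log(1 - p))  # loss
    # Backward propagation via the chain rule:
    # dJ/dw = dJ/dp * dp/dn_in * dn_in/dw, which simplifies to (p - golden) * x.
    dJ_dnin = p - golden
    w -= eta * dJ_dnin * x
    wb -= eta * dJ_dnin

print(sigmoid(w @ x + wb))  # close to the golden label 1.0 after training
```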
  26. Autoencoder: the encoder maps the input $x$ to a smaller hidden layer $h(x)$; the decoder maps it back to an output of the same size as the input, $\hat{x} = g(h(x))$.
  27. Autoencoder: encoding takes the high-dimensional input $x$ down to the low-dimensional code $h(x)$; decoding takes the code back up to the high-dimensional output $\hat{x} = g(h(x))$ (a sketch follows below).
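A minimal sketch of an autoencoder forward pass with an encoder to a smaller hidden layer and a decoder back to the input size; the layer sizes and random weights are assumptions for illustration, and no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Encoder: 784-dimensional input -> 32-dimensional code (sizes are illustrative).
W_enc = rng.normal(scale=0.01, size=(32, 784))
b_enc = np.zeros(32)
# Decoder: 32-dimensional code -> 784-dimensional reconstruction.
W_dec = rng.normal(scale=0.01, size=(784, 32))
b_dec = np.zeros(784)

def encode(x):
    return sigmoid(W_enc @ x + b_enc)   # h(x): low-dimensional code

def decode(h):
    return sigmoid(W_dec @ h + b_dec)   # x_hat = g(h(x)): same size as the input

x = rng.random(784)
x_hat = decode(encode(x))
# Training would minimize a reconstruction loss such as ||x - x_hat||^2.
print(np.mean((x - x_hat) ** 2))
```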
  28. Variational Inference: with observable data $x$ and latent variable $z$, the posterior $p(z|x) = \frac{p(z, x)}{p(x)} = \frac{p(x|z)p(z)}{\int p(x|z)p(z)\,dz}$ is intractable to compute, even though $p(x|z)$ can be easily computed.
  29. Variational Autoencoder: variational inference 1. approximates $p(z|x)$ by $q(z)$ and 2. minimizes the KL divergence $D_{KL}[q(z)\|p(z|x)] = \int q(z)\log\frac{q(z)}{p(z|x)}\,dz$. See https://www.youtube.com/playlist?list=PLeeHDpwX2Kj55He_jfPojKrZf22HVjAZY
  30. Variational Autoencoder: the encoder network $g_\phi(\mu, \sigma|x)$ maps the input $x$ to the parameters of a normal distribution $N(\mu, \sigma)$; sampling uses $z = \mu + \sigma\epsilon$ with $\epsilon \sim N(0, I)$; the decoder network $p_\theta(x|z)$ maps $z$ to the output $\hat{x}$.
  31. Variational Autoencoder: encoding maps the high-dimensional input $x$ to $\mu, \sigma$ in the low-dimensional space; $z = \mu + \sigma\epsilon$ with $\epsilon \sim N(0, I)$ is a sample from the normal distribution $N(\mu, \sigma)$; decoding maps $z$ back to the high-dimensional output $\hat{x}$ (a sketch of the sampling step follows below).
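A short sketch of the VAE sampling step (the reparameterization $z = \mu + \sigma\epsilon$) as written on the slide; the encoder and decoder are stubbed with placeholder functions, since only the sampling step is the point here.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, input_dim = 8, 784  # illustrative sizes

def encoder(x):
    # Placeholder: a real encoder network g(mu, sigma | x) would compute these from x.
    mu = np.zeros(latent_dim)
    sigma = np.ones(latent_dim)
    return mu, sigma

def decoder(z):
    # Placeholder: a real decoder network p(x | z) would be learned.
    return 1.0 / (1.0 + np.exp(-np.tile(z, input_dim // latent_dim)))

x = rng.random(input_dim)
mu, sigma = encoder(x)
eps = rng.standard_normal(latent_dim)  # eps ~ N(0, I)
z = mu + sigma * eps                   # z ~ N(mu, sigma), differentiable w.r.t. mu and sigma
x_hat = decoder(z)
print(x_hat.shape)                     # (784,)
```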
  32. Recurrent Neural Network: a model with memory reads a sentence word by word ("This", "is", "a", ...) and uses what it has seen so far to predict the rest: "This is a cat."
  33. Recurrent Neural Network: $n_{in,t} = w_c x_t + w_p n_{out,t-1} + w_b$, $n_{out,t} = \frac{1}{1 + e^{-n_{in,t}}}$; the output is fed back into the input.
  34. Recurrent Neural Network: unrolled over time, the same neuron is applied at every step: $n(\text{This})$, then $n(n(\text{This}), \text{is})$, then $n(n(n(\text{This}), \text{is}), \text{a})$ (a sketch follows below).
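A sketch of the single-neuron recurrence from the slide, unrolled over a short sequence; the weights and the numeric encoding of the words are placeholders.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rnn_step(x_t, n_out_prev, w_c, w_p, w_b):
    """One recurrence step: the previous output is fed back into the input."""
    n_in_t = w_c * x_t + w_p * n_out_prev + w_b
    return sigmoid(n_in_t)

sequence = [0.2, 0.7, 0.1]  # placeholder numeric encoding of "This", "is", "a"
n_out = 0.0                 # initial memory
for x_t in sequence:
    n_out = rnn_step(x_t, n_out, w_c=1.0, w_p=0.5, w_b=-0.3)
    print(n_out)            # n(This), n(n(This), is), n(n(n(This), is), a)
```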
  35. "The black dog is chasing the white cat." Question: what is the color of the dog chasing the cat? A plain recurrent network's memory is limited, so it can't remember everything.
  36. Long Short-Term Memory • Input gate: $C_{in}$ • Read gate: $C_{read}$ • Forget gate: $C_{forget}$ • Write gate: $C_{write}$ • Output gate: $C_{out}$
  37. Long Short-Term Memory, write gate $C_{write}$: $C_{write} = \text{sigmoid}(w_{cw,x} x_t + w_{cw,y} y_{t-1} + w_{cw,b})$, $k_{out} = \tanh(w_{k,x} x_t + w_{k,b})$, $m_{in,t} = k_{out} C_{write}$.
  38. Long Short-Term Memory, forget gate $C_{forget}$: $C_{forget} = \text{sigmoid}(w_{cf,x} x_t + w_{cf,y} y_{t-1} + w_{cf,b})$, $m_{out,t} = m_{in,t} + C_{forget}\, m_{out,t-1}$.
  39. Long Short-Term Memory, read gate $C_{read}$: $C_{read} = \text{sigmoid}(w_{cr,x} x_t + w_{cr,y} y_{t-1} + w_{cr,b})$, $n_{out} = \tanh(m_{out,t})$, $C_{out} = C_{read}\, n_{out}$ (a sketch of the three gates follows below).
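A sketch that follows the simplified gate equations on slides 37 to 39 (write, forget, read) for scalar values; the weights are arbitrary placeholders, and this is not a full LSTM implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_like_step(x_t, y_prev, m_prev, w):
    """One step of the gated memory cell described in the slides."""
    c_write = sigmoid(w['cw_x'] * x_t + w['cw_y'] * y_prev + w['cw_b'])
    k_out = np.tanh(w['k_x'] * x_t + w['k_b'])
    m_in = k_out * c_write                          # gated write into memory
    c_forget = sigmoid(w['cf_x'] * x_t + w['cf_y'] * y_prev + w['cf_b'])
    m_out = m_in + c_forget * m_prev                # keep part of the old memory
    c_read = sigmoid(w['cr_x'] * x_t + w['cr_y'] * y_prev + w['cr_b'])
    y_t = c_read * np.tanh(m_out)                   # gated read from memory
    return y_t, m_out

w = {k: 0.5 for k in ['cw_x', 'cw_y', 'cw_b', 'k_x', 'k_b',
                      'cf_x', 'cf_y', 'cf_b', 'cr_x', 'cr_y', 'cr_b']}
y, m = 0.0, 0.0
for x_t in [0.1, 0.9, 0.4]:  # placeholder input sequence
    y, m = lstm_like_step(x_t, y, m, w)
    print(y, m)
```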
  40. "The black dog is chasing the white cat." Question: what is the color of the dog chasing the cat? The LSTM keeps the most important information in its memory: black, dog, chasing, cat.
  41. DRAW • Network Architecture • Selective Attention Model • Training DRAW • Generating New Images • Experiments
  42. Network Architecture: at each step $t$, given the input image $x$ and the canvas $c_{t-1}$, DRAW computes the error image $\hat{x}_t = x - \text{sigmoid}(c_{t-1})$, reads a glimpse $r_t = \text{read}(x, \hat{x}_t, h^{dec}_{t-1})$, updates the encoder $h^{enc}_t = \text{RNN}^{enc}(h^{enc}_{t-1}, [r_t, h^{dec}_{t-1}])$, samples $z_t \sim Q(Z_t|h^{enc}_t)$, updates the decoder $h^{dec}_t = \text{RNN}^{dec}(h^{dec}_{t-1}, z_t)$, and writes onto the canvas $c_t = c_{t-1} + \text{write}(h^{dec}_t)$.
  43. Network Architecture: unrolled over time, the encoder and decoder RNNs run for $t = 1, \dots, T$; at each step the encoder reads from the input image and the decoder writes onto the canvas, and the final canvas $c_T$ is passed through a sigmoid to give the output image distribution $P(x|z_{1:T})$ (a schematic sketch of the loop follows below).
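A schematic sketch of the DRAW loop on slides 42 and 43; the RNNs, the read and write operators, and the posterior $Q$ are stubbed with placeholder functions, so this only shows how the pieces are wired together, not a trainable model.

```python
import numpy as np

rng = np.random.default_rng(0)
A = B = 28                 # image size (MNIST-like, illustrative)
enc_dim = dec_dim = 16
z_dim = 8
T = 10                     # number of glimpses

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Placeholder building blocks; in the paper these are LSTMs, the attention
# read/write operators, and a learned Gaussian posterior Q(Z_t | h_enc_t).
def rnn_enc(h_prev, inp):   return np.tanh(h_prev + 0.01 * np.sum(inp))
def rnn_dec(h_prev, z):     return np.tanh(h_prev + 0.01 * np.sum(z))
def read(x, x_err, h_dec):  return np.concatenate([x.ravel(), x_err.ravel()])
def write(h_dec):           return 0.01 * np.ones((A, B)) * np.sum(h_dec)
def sample_q(h_enc):        return rng.standard_normal(z_dim)

x = rng.random((A, B))     # input image
c = np.zeros((A, B))       # canvas
h_enc = np.zeros(enc_dim)
h_dec = np.zeros(dec_dim)

for t in range(T):
    x_err = x - sigmoid(c)                              # error image: what is still missing
    r = read(x, x_err, h_dec)                           # attend to part of the input
    h_enc = rnn_enc(h_enc, np.concatenate([r, h_dec]))
    z = sample_q(h_enc)                                 # z_t ~ Q(Z_t | h_enc_t)
    h_dec = rnn_dec(h_dec, z)
    c = c + write(h_dec)                                # accumulate onto the canvas

p = sigmoid(c)                                          # P(x | z_{1:T})
```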
  44. Selective Attention Model: an $N \times N$ grid of Gaussian filters is placed over the $A \times B$ image; $(g_X, g_Y)$ is the centre of the grid, $(\mu^i_X, \mu^j_Y)$ the centre of filter $(i, j)$, and $\delta$ the distance between the filters: $\mu^i_X = g_X + (i - N/2 - 0.5)\delta$, $\mu^j_Y = g_Y + (j - N/2 - 0.5)\delta$. For example, with $N = 3$: $\mu^1_X = g_X + (1 - 1.5 - 0.5)\delta = g_X - \delta$ and $\mu^2_Y = g_Y + (2 - 1.5 - 0.5)\delta = g_Y$.
  45. Selective Attention Model: the attention parameters are determined by $h^{dec}$: $(\tilde{g}_X, \tilde{g}_Y, \log\sigma^2, \log\tilde{\delta}, \log\gamma) = W(h^{dec})$, with $g_X = \frac{A+1}{2}(\tilde{g}_X + 1)$, $g_Y = \frac{B+1}{2}(\tilde{g}_Y + 1)$, $\delta = \frac{\max(A, B) - 1}{N - 1}\tilde{\delta}$; $\sigma^2$ is the variance of the Gaussian filters.
  46. Selective Attention Model: the horizontal and vertical filterbank matrices are $F_X[i, a] = \frac{1}{Z_X}\exp\!\big(-\frac{(a - \mu^i_X)^2}{2\sigma^2}\big)$ and $F_Y[j, b] = \frac{1}{Z_Y}\exp\!\big(-\frac{(b - \mu^j_Y)^2}{2\sigma^2}\big)$; each row of $F_X$ and of $F_Y$ is one Gaussian filter along the corresponding axis.
  47. Selective Attention Model, read: $\text{read}(x, \hat{x}_t, h^{dec}_{t-1}) = \gamma\,[F_Y x F_X^T,\ F_Y \hat{x}_t F_X^T]$; the vertical and horizontal filterbank matrices reduce the $A \times B$ image to an $N \times N$ patch, scaled by the intensity $\gamma$.
  48. Selective Attention Model, write: $(\tilde{g}'_X, \tilde{g}'_Y, \log\sigma'^2, \log\tilde{\delta}', \log\gamma', w_t) = W(h^{dec}_t)$ and $\text{write}(h^{dec}_t) = \frac{1}{\hat{\gamma}} F'^T_Y w_t F'_X$, which places the $N \times N$ writing patch $w_t$ onto the $A \times B$ canvas (a sketch of the filterbanks and the read operation follows below).
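A sketch of the Gaussian filterbanks and the read operation from slides 44 to 47; here the attention parameters ($g_X$, $g_Y$, $\delta$, $\sigma^2$, $\gamma$) are set by hand rather than predicted from $h^{dec}$.

```python
import numpy as np

def filterbank(g, delta, sigma2, N, size):
    """N x size matrix of Gaussian filters along one axis (F_X or F_Y)."""
    i = np.arange(1, N + 1)
    mu = g + (i - N / 2 - 0.5) * delta            # filter centres
    a = np.arange(size)
    F = np.exp(-((a[None, :] - mu[:, None]) ** 2) / (2 * sigma2))
    F /= F.sum(axis=1, keepdims=True) + 1e-8      # normalisation constant Z
    return F

A = B = 28
N = 3
# Hand-picked attention parameters for illustration; DRAW predicts them from h_dec.
gX, gY, delta, sigma2, gamma = 14.0, 14.0, 5.0, 2.0, 1.0

F_X = filterbank(gX, delta, sigma2, N, A)         # horizontal filterbank, N x A
F_Y = filterbank(gY, delta, sigma2, N, B)         # vertical filterbank,   N x B

x = np.random.default_rng(0).random((B, A))       # input image
patch = gamma * F_Y @ x @ F_X.T                   # read: N x N glimpse of the image
print(patch.shape)                                # (3, 3)
```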
  49. Training DRAW, reconstruction loss: $D(x|c_T)$ is a Bernoulli distribution with mean $P(x|z_{1:T}) = \text{sigmoid}(c_T)$, i.e. $D(x|c_T) = P(x|z_{1:T})^x (1 - P(x|z_{1:T}))^{1-x}$, and the loss is the negative log likelihood $L^x = -\log D(x|c_T) = -\big(x\log P(x|z_{1:T}) + (1 - x)\log(1 - P(x|z_{1:T}))\big)$. At a pixel $(i, j)$, if $P(x|z_{1:T})_{i,j} = x_{i,j}$ then $-\log D(x|c_T)_{i,j} = 0$; the further $P(x|z_{1:T})_{i,j}$ is from $x_{i,j}$, the larger the loss.
  50. Training DRAW, latent loss (regularization): $L^z = \sum_{t=1}^{T} KL\big(Q(Z_t|h^{enc}_t)\,\|\,P(Z_t)\big)$, where $Q(Z_t|h^{enc}_t) = N(Z_t|\mu_t, \sigma_t)$ with $(\mu_t, \sigma_t) = W(h^{enc}_t)$ and the prior is $P(Z_t) = N(Z_t|0, 1)$; this gives $L^z = \frac{1}{2}\big(\sum_{t=1}^{T} \mu_t^2 + \sigma_t^2 - \log\sigma_t^2\big) - \frac{T}{2}$. $Q$ should be as simple as possible.
  51. Training DRAW, total loss: $L = L^x + L^z$ (total loss = reconstruction loss + latent loss), minimized by gradient descent $w \leftarrow w - \eta\frac{\partial L}{\partial w}$; the reconstruction and $x$ should be as similar as possible, and $z$ should be as simple as possible (a sketch of the two terms follows below).
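A sketch of the two loss terms from slides 49 to 51, computed from a final canvas $c_T$ and per-step Gaussian parameters; the target image, canvas, and $\mu_t, \sigma_t$ values are random placeholders, and a single latent dimension per step is assumed to match the slide's formula.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = (rng.random((28, 28)) > 0.5).astype(float)   # binary target image (placeholder)
c_T = rng.normal(size=(28, 28))                  # final canvas (placeholder)
T = 10
mu = rng.normal(size=T)                          # mu_t from Q(Z_t | h_enc_t), one dim per step
sigma = np.exp(0.1 * rng.normal(size=T))         # sigma_t

# Reconstruction loss: negative log likelihood of x under the Bernoulli D(x | c_T).
p = sigmoid(c_T)                                 # P(x | z_{1:T})
Lx = -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

# Latent loss: sum over t of KL(Q(Z_t | h_enc_t) || N(0, 1)).
Lz = 0.5 * np.sum(mu ** 2 + sigma ** 2 - np.log(sigma ** 2)) - T / 2

L = Lx + Lz   # total loss, minimized by gradient descent on the network weights
print(Lx, Lz, L)
```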
  52. Generating New Images: $\tilde{z}_t \sim P(Z_t)$, $\tilde{h}^{dec}_t = \text{RNN}^{dec}(\tilde{h}^{dec}_{t-1}, \tilde{z}_t)$, $\tilde{c}_t = \tilde{c}_{t-1} + \text{write}(\tilde{h}^{dec}_t)$, and finally $\tilde{x} \sim D(X|\tilde{c}_T)$; at generation time only the decoder and the canvas are used (a sketch follows below).
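A sketch of the generation loop on slide 52: $z_t$ is sampled from the prior instead of the encoder, and the decoder paints onto the canvas; the decoder RNN and write operator are the same placeholders as in the earlier architecture sketch, so this shows the control flow only.

```python
import numpy as np

rng = np.random.default_rng(1)
A = B = 28
dec_dim, z_dim, T = 16, 8, 10

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Placeholder decoder RNN and write operator (a trained model would supply these).
def rnn_dec(h_prev, z):  return np.tanh(h_prev + 0.01 * np.sum(z))
def write(h_dec):        return 0.01 * np.ones((A, B)) * np.sum(h_dec)

c = np.zeros((A, B))
h_dec = np.zeros(dec_dim)

for t in range(T):
    z = rng.standard_normal(z_dim)   # z_t ~ P(Z_t) = N(0, I): no encoder needed
    h_dec = rnn_dec(h_dec, z)
    c = c + write(h_dec)             # paint onto the canvas

p = sigmoid(c)                                    # mean of the Bernoulli D(X | c_T)
x_new = (rng.random((A, B)) < p).astype(float)    # sample the generated image
print(x_new.shape)
```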
  53. Experiments, MNIST: generated data compared with training data. https://www.youtube.com/watch?v=Zt-7MI9eKEo
  54. Experiments, SVHN: generated data compared with training data. https://www.youtube.com/watch?v=Zt-7MI9eKEo
  55. Experiments, CIFAR: generated data compared with training data. https://www.youtube.com/watch?v=Zt-7MI9eKEo
  56. Further Reading • Neural Network Back Propagation: http://cpmarkchang.logdown.com/posts/277349-neural-network-backward-propagation • Variational Autoencoder: https://arxiv.org/abs/1312.6114 and https://www.youtube.com/playlist?list=PLeeHDpwX2Kj55He_jfPojKrZf22HVjAZY • DRAW: https://arxiv.org/pdf/1502.04623v2.pdf • DCGAN: https://arxiv.org/abs/1511.06434
  57. Source Code • https://github.com/ericjang/draw
  58. About the Speaker: Mark Chang • Email: ckmarkoh at gmail dot com • Blog: http://cpmarkchang.logdown.com • Github: https://github.com/ckmarkoh • Facebook: https://www.facebook.com/ckmarkoh.chang • Slideshare: http://www.slideshare.net/ckmarkohchang • Linkedin: https://www.linkedin.com/pub/mark-chang/85/25b/847 • Youtube: https://www.youtube.com/channel/UCckNPGDL21aznRhl3EijRQw
