Ⅰ. Neural Network
Ⅱ. Generative Adversarial Nets
Ⅲ. Image-to-Image Translation
1. How does a Neural Network learn?
2. What do we have to decide?
3. Why is it hard to choose a loss function?
Neural Network
Ⅰ
What is a Neural Network?
How does a Neural Network learn?
Preparing input and target pairs.
[Slide diagram: input images are paired with target vectors. Class labels are first mapped to indices (Lion → 0, Cat → 1, Dog → 2), and each index is then one-hot encoded, e.g. Dog → (0, 0, 1).]
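The mapping above can be sketched in a few lines of Python; this is my own minimal illustration of one-hot encoding, not code from the slides:

```python
# Class labels are mapped to indices, and each index becomes a vector
# with a single 1 at that position (one-hot encoding).

labels = {"Lion": 0, "Cat": 1, "Dog": 2}

def one_hot(index, num_classes):
    vec = [0] * num_classes
    vec[index] = 1
    return vec

targets = {name: one_hot(idx, len(labels)) for name, idx in labels.items()}
print(targets["Dog"])  # [0, 0, 1]
```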
How does a Neural Network learn?
The weights of the network are arbitrarily set.
[Slide diagram: network edges labeled with example initial weights 0.6, 0.2, 0.3, 0.9, 0.1.]
How does a Neural Network learn?
Feed Forward
How does a Neural Network learn?
Feed Forward
[Slide diagram: four inputs 0.2, 0.1, 0.6, 0.3 feed into neuron N21 through weights 0.2, 0.7, 0.3, 0.1.]
sum: 0.2 × 0.2 + 0.1 × 0.7 + 0.6 × 0.3 + 0.3 × 0.1 = 0.32
Output of N21 = f(0.32), where f is the activation function of N21.
Output of N21 = f(0.32) = 0.1024, if f(x) = x².
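The feed-forward step on this slide can be sketched directly; the toy activation f(x) = x² follows the slide (real networks use functions such as sigmoid or ReLU):

```python
# A single neuron computes the weighted sum of its inputs and applies
# an activation function. Values are taken from the slide's example.

inputs = [0.2, 0.1, 0.6, 0.3]
weights = [0.2, 0.7, 0.3, 0.1]

def activation(x):
    return x ** 2  # toy activation from the slide

weighted_sum = sum(i * w for i, w in zip(inputs, weights))
output = activation(weighted_sum)

print(round(weighted_sum, 2))  # 0.32
print(round(output, 4))        # 0.1024
```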
How does a Neural Network learn?
Calculate error
Sum of squares loss
Softmax loss
Cross entropy loss
Hinge loss
How does a Neural Network learn?
Sum of squares loss example:
Output of ANN:  (0.2, 0.8)
Target value:   (0.0, 1.0)
Sum of squares loss = (0.2 − 0.0)² + (0.8 − 1.0)² = 0.04 + 0.04 = 0.08
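The same computation as a short Python sketch, using the slide's example values:

```python
# Sum-of-squares loss: accumulate the squared differences between the
# network outputs and the target values.

def sum_of_squares_loss(outputs, targets):
    return sum((o - t) ** 2 for o, t in zip(outputs, targets))

loss = sum_of_squares_loss([0.2, 0.8], [0.0, 1.0])
print(round(loss, 2))  # 0.08
```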
How does a Neural Network learn?
Feedback
What do we have to decide?
Gradient Descent Optimization Algorithms
• Batch Gradient Descent
• Stochastic Gradient Descent (SGD)
• Momentum
• Nesterov Accelerated Gradient (NAG)
• Adagrad
• RMSProp
• AdaDelta
• Adam
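All of the optimizers listed above refine the basic gradient-descent update rule w ← w − lr · gradient. A toy sketch of my own (not from the slides), comparing plain SGD with the momentum variant on the quadratic loss f(w) = (w − 3)², whose gradient is 2(w − 3) and whose minimum is at w = 3:

```python
def grad(w):
    # Gradient of f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

lr = 0.1

# Plain (stochastic) gradient descent
w_sgd = 0.0
for _ in range(100):
    w_sgd -= lr * grad(w_sgd)

# Momentum: a velocity term accumulates past gradients.
w_mom, v = 0.0, 0.0
momentum = 0.9
for _ in range(100):
    v = momentum * v - lr * grad(w_mom)
    w_mom += v

print(round(w_sgd, 3), round(w_mom, 3))  # both approach the minimum at w = 3
```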
What do we have to decide?
Neural network structure
• VGG-19
• GoogLeNet
Training techniques
• Dropout
• Sparsity
Loss function and cost function
• Cross entropy
• Sum of squares
Optimization algorithm
• Adam
• SGD
Why is it hard to choose a loss function?
In classification.
[Slide diagram: the training loop. The input is fed through the NN to produce an output; the output is compared with the target to compute the loss; the loss is used to update the weights of the NN.]
Why is it hard to choose a loss function?
In classification.
Output of NN:  0.67  0.00  0.02  0.12  0.04  0.00  0.03  0.14
Target:        1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00

Loss:  Sum of L1 norm = 0.68    Cross entropy = 2.45
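These losses are easy to reproduce in code. The L1 sum below matches the slide's 0.68; the exact cross-entropy value depends on the convention used (the slide reports 2.45), so this sketch only shows the standard −log(probability of the true class) form as an illustration:

```python
import math

output = [0.67, 0.00, 0.02, 0.12, 0.04, 0.00, 0.03, 0.14]
target = [1.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00]

# Sum of absolute differences (L1 norm of the error vector)
l1 = sum(abs(o - t) for o, t in zip(output, target))
print(round(l1, 2))  # 0.68

# Cross entropy with a one-hot target reduces to -log of the
# probability assigned to the true class.
ce = -sum(t * math.log(o) for o, t in zip(output, target) if t > 0)
print(round(ce, 2))  # 0.4
```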
Why is it hard to choose a loss function?
When the output of the NN is an image.
[Slide images: Input / Ground truth / L1 result.]
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017.
Why is it hard to choose a loss function?
If the output is a digit:
Multiple choice questions
If the output is an image:
Essay questions
Art practical exam
The difficulty of assessment.
1. Generative Adversarial Networks
2. Training Tip
Generative Adversarial Nets
Ⅱ
Generative Adversarial Nets
Leonardo DiCaprio:
a counterfeiter
Tom Hanks:
an FBI agent – a counterfeit-money discriminator
Generative Adversarial Nets
Counterfeiter
(Generator)
FBI
(Discriminator)
50,000 won
Can you tell whether it is counterfeit or not?
I made counterfeit money!
Generative Adversarial Nets
Counterfeiter
(Generator)
FBI
(Discriminator)
Oh, no!
I can’t tell whether it is counterfeit or not.
Maybe it’s
counterfeit
money with 55%
probability!
50,000 won
Generative Adversarial Nets
FBI
(Discriminator)
50,000 won
Compare target and
output of generator.
Output of generator
target
Generative Adversarial Nets
Counterfeiter
(Generator)
FBI
(Discriminator)
50,000 won
Can you tell whether it is counterfeit or not?
I made counterfeit money!
Generative Adversarial Nets
Counterfeiter
(Generator)
FBI
(Discriminator)
50,000 won
It’s counterfeit
money with
99.9%
probability!
loss
Generative Adversarial Nets
FBI
(Discriminator)
50,000 won
Compare target and
output of generator.
Output of generator
target
Generative Adversarial Nets
Counterfeiter
(Generator)
FBI
(Discriminator)
Can you tell whether it is counterfeit or not?
I made counterfeit money again!
Generative Adversarial Nets
Counterfeiter
(Generator)
FBI
(Discriminator)
It’s counterfeit
money with
70.5%
probability!
loss
Generative Adversarial Nets
FBI
(Discriminator)
Compare target and
output of generator.
Output of generator
target
Generative Adversarial Nets
Counterfeiter
(Generator)
FBI
(Discriminator)
Can you tell whether it is counterfeit or not?
I made counterfeit money again!
Generative Adversarial Nets
Counterfeiter
(Generator)
FBI
(Discriminator)
Oh, no!
I can’t tell whether it is counterfeit or not.
loss
Maybe it’s
counterfeit
money with 50%
probability!
Generative Adversarial Nets
FBI
(Discriminator)
Compare target and
output of generator.
Output of generator
target
Generative Adversarial Nets
D tries to make D(G(z)) near 0, G tries to make D(G(z)) near 1
This image is captured from Ian J. Goodfellow, et al., “Generative Adversarial Nets”.
Training Tip
min_G max_D V(D, G) = 𝔼_{x∼p_data(x)}[log D(x)] + 𝔼_{z∼p_z(z)}[log(1 − D(G(z)))]

Early in training, log(1 − D(G(z))) saturates when D confidently rejects G’s samples, so in practice G is trained to maximize log D(G(z)) instead:

max_G V(D, G) = 𝔼_{z∼p_z(z)}[log D(G(z))]
max_D V(D, G) = 𝔼_{x∼p_data(x)}[log D(x)] + 𝔼_{z∼p_z(z)}[log(1 − D(G(z)))]

Equivalently, written as minimization problems:

min_G V(D, G) = −𝔼_{z∼p_z(z)}[log D(G(z))]
min_D V(D, G) = −(𝔼_{x∼p_data(x)}[log D(x)] + 𝔼_{z∼p_z(z)}[log(1 − D(G(z)))])
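A toy numerical sketch of my own (not from the slides) of why this training tip matters: when D confidently rejects a fake (D(G(z)) close to 0), the original generator objective log(1 − D(G(z))) yields a weak gradient, while the non-saturating objective −log D(G(z)) yields a large one:

```python
def grad_original(d):
    # Derivative of log(1 - d) with respect to d
    return -1.0 / (1.0 - d)

def grad_nonsaturating(d):
    # Derivative of -log(d) with respect to d
    return -1.0 / d

d = 0.01  # discriminator is confident the sample is fake
print(abs(grad_original(d)))       # ≈ 1.01: weak signal for G
print(abs(grad_nonsaturating(d)))  # ≈ 100: strong signal for G
```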
1. Introduction
2. Method
3. Experiments
Image-to-Image Translation
Ⅲ
Introduction
Conditional adversarial nets are a general-purpose solution for image-to-image translation.
Code: https://github.com/phillipi/pix2pix
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017.
Method
GAN
G: z → y
Conditional GAN
G: {x, z} → y
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017.
Method
Objective function for cGAN:
ℒ_cGAN(G, D) = 𝔼_{x,y}[log D(x, y)] + 𝔼_{x,z}[log(1 − D(x, G(x, z)))]

Objective function for an unconditional GAN (the discriminator does not observe x):
ℒ_GAN(G, D) = 𝔼_y[log D(y)] + 𝔼_{x,z}[log(1 − D(G(x, z)))]

L1 reconstruction loss:
ℒ_L1(G) = 𝔼_{x,y,z}[‖y − G(x, z)‖₁]

Final objective:
G* = arg min_G max_D ℒ_cGAN(G, D) + λ ℒ_L1(G)
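The generator side of the final objective can be sketched as follows. This is my own illustration, not the authors' code; λ = 100 is the weight the paper reports using for the L1 term:

```python
import math

def generator_loss(d_fake, fake_img, real_img, lam=100.0):
    """d_fake: discriminator scores on G's outputs, each in (0, 1).
    fake_img / real_img: flattened pixel lists of equal length."""
    eps = 1e-12
    # Adversarial term: non-saturating -log D(G(x, z)), averaged
    adv = -sum(math.log(d + eps) for d in d_fake) / len(d_fake)
    # L1 term: mean absolute difference from the ground truth
    l1 = sum(abs(r - f) for r, f in zip(real_img, fake_img)) / len(fake_img)
    return adv + lam * l1

fake = [0.5, 0.2, 0.1, 0.9]   # flattened toy "image" from G
real = [0.6, 0.2, 0.1, 1.0]   # ground truth
loss = generator_loss([0.4], fake, real)
print(round(loss, 3))  # 5.916
```

The large λ means the L1 term dominates early training, pushing outputs toward the ground truth while the adversarial term sharpens details.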
Method
Network architectures
Generator
Discriminator – Markovian discriminator (PatchGAN)
The PatchGAN classifies each N×N patch of the image as real or fake; this discriminator effectively models the image as a Markov random field.
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017.
Method
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Nets”, https://www.slideshare.net/xavigiro/imagetoimage-translation-with-conditional-adversarial-nets-upc-reading-group
This image is captured from http://ccvl.jhu.edu/datasets/
Experiments
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017.
Experiments
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017.
Experiments
This image is captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017.
Experiments
Patch size variations.
These images are captured from Phillip Isola, et al., “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR, 2017.
References
[1] Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley,
Sherjil Ozair, Aaron Courville, Yoshua Bengio, “Generative Adversarial Nets”, NIPS
2014
[2] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros, “Image-to-Image Translation with Conditional Adversarial Networks”, CVPR 2017
[3] Kwangil Kim, “Artificial Neural Networks”, Multimedia system lecture of KHU,
2017
[4] DL4J, “A Beginner’s Guide to Recurrent Networks and LSTMs”, 2017,
https://deeplearning4j.org/lstm.html. Accessed, 2018-01-29
[5] Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, “Image-to-Image Translation with Conditional Adversarial Nets”, Nov 25, 2016, https://www.slideshare.net/xavigiro/imagetoimage-translation-with-conditional-adversarial-nets-upc-reading-group. Accessed 2018-01-29
[6] CCVL, “Datasets: PASCAL Part Segmentation Challenge”, 2018
http://ccvl.jhu.edu/datasets/. Accessed, 2018-01-29