SlideShare a Scribd company logo
1 of 36
Download to read offline
TRAINING GENERATIVE ADVERSARIAL NETWORKS WITH
BINARY NEURONS BY END-TO-END BACKPROPAGATION
Hao-Wen Dong and Yi-Hsuan Yang
OUTLINES
 Backgrounds
 Generative Adversarial Networks
 Binary Neurons
 Straight-through Estimators
 BinaryGAN
 Experiments & Results
 Discussions & Conclusion
2
BACKGROUNDS
GENERATIVE ADVERSARIAL NETWORKS
 Goal—learn a mapping from the prior distribution to the data
distribution [2]
4
prior
distribution
data
distribution
GENERATIVE ADVERSARIAL NETWORKS
 Goal—learn a mapping from the prior distribution to the data
distribution [2]
5
prior
distribution
data
distribution
can be intractable
GENERATIVE ADVERSARIAL NETWORKS
 Use a deep neural network to learn an implicit mapping
generator
prior
distribution
model
distribution
data
distribution
6
GENERATIVE ADVERSARIAL NETWORKS
 Use another deep neural network to provide guidance/critics
generator
prior
distribution
model
distribution
data
distribution
discriminator
real/fake
7
BINARY NEURONS
 Definition: neurons that output binary-valued predictions
8
BINARY NEURONS
 Definition: neurons that output binary-valued predictions
 Deterministic binary neurons (DBNs):
𝑫𝑫𝑫𝑫𝑫𝑫 𝒙𝒙 ≡ �
𝟏𝟏, 𝒊𝒊𝒊𝒊 𝝈𝝈 𝒙𝒙 > 𝟎𝟎. 𝟓𝟓
𝟎𝟎, 𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐
9
(hard thresholding)
BINARY NEURONS
 Definition: neurons that output binary-valued predictions
 Deterministic binary neurons (DBNs):
𝑫𝑫𝑫𝑫𝑫𝑫 𝒙𝒙 ≡ �
𝟏𝟏, 𝒊𝒊𝒊𝒊 𝝈𝝈 𝒙𝒙 > 𝟎𝟎. 𝟓𝟓
𝟎𝟎, 𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐
 Stochastic binary neurons (SBNs):
𝑺𝑺𝑩𝑩𝑩𝑩 𝒙𝒙 ≡ �
𝟏𝟏, 𝒊𝒊𝒊𝒊 𝒛𝒛 < 𝝈𝝈 𝒙𝒙
𝟎𝟎, 𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐
, 𝒛𝒛~𝑼𝑼 𝟎𝟎, 𝟏𝟏
10
(hard thresholding)
(Bernoulli sampling)
BACKPROPAGATING THROUGH BINARY NEURONS
 Backpropagating through binary neurons is intractable
 For DBNs, it involves the nondifferentiable threshold function
 For SBNs, it requires the computation of expected gradients on all possible
combinations (exponential to the number of binary neurons) of values taken
by the binary neurons
11
BACKPROPAGATING THROUGH BINARY NEURONS
 Backpropagating through binary neurons is intractable
 For DBNs, it involves the nondifferentiable threshold function
 For SBNs, it requires the computation of expected gradients on all possible
combinations (exponential to the number of binary neurons) of values taken
by the binary neurons
 We can introduce gradient estimators for the binary neurons
 Examples include Straight-through [3,4] (the one adopted in this work),
REINFORCE [5], REBAR [6], RELAX [7] estimators
12
STRAIGHT-THROUGH ESTIMATORS
 Straight-through estimators: [3,4]
 Forward pass—use hard thresholding (DBNs) or Bernoulli sampling (SBNs)
 Backward pass—pretend it as an identity function
13
STRAIGHT-THROUGH ESTIMATORS
 Straight-through estimators: [3,4]
 Forward pass—use hard thresholding (DBNs) or Bernoulli sampling (SBNs)
 Backward pass—pretend it as an identity function
 Sigmoid-adjusted straight-through estimators
 Use the derivative of the sigmoid function in the backward pass
 Found to achieve better performance in a classification task presented in [4]
 Adopted in this work
14
BINARYGAN
BINARYGAN
 Use binary neurons at the output layer of the generator
 Use sigmoid-adjusted straight-through estimators to provide the
gradients for the binary neurons
 Train the whole network by end-to-end backpropagation
16
BINARYGAN
17
Generator Discriminator
real data
random neuron
binary neuron
normal neuron
(implemented by multilayer perceptrons)
BINARYGAN
18
Generator Discriminator
real data
random neuron
binary neuron
normal neuron
(implemented by multilayer perceptrons)
binary
neurons
EXPERIMENTS & RESULTS
TRAINING DATA
 Binarized MNIST handwritten digit database [8]
 Digits with nonzero intensities → 1
 Digits with zero intensities → 0
20
Sample binarized MNIST digits
IMPLEMENTATION DETAILS
 Batch size is 64
 Use WGAN-GP [9] objectives
 Use Adam optimizer [10]
 Apply batch normalization [11] to the generator (but not to the
discriminator)
 Binary neurons are implemented with the code kindly provided in
a blog post on the R2RT blog [12]
21
IMPLEMENTATION DETAILS
 Apply slope annealing trick [13]
 Gradually increase the slopes of the sigmoid functions used in the sigmoid-
adjusted straight-through estimators as the training proceeds
 We multiply the slopes by 1.1 after each epoch
 The slopes start from 1.0 and reach 6.1 at the end of 20 epochs
22
EXPERIMENT I—DBNS VS SBNS
 DBNs and SBNs can achieve similar qualities
 They show distinct characteristics on the preactivated outputs
23
DBNs
0 1
SBNs
EXPERIMENT I—DBNS VS SBNS
24
Histograms of the preactivated outputs
DBNs
– more values in the middle
– a notch at 0.5 (the threshold)
SBNs
– more values close to 0 and 1
preactivated values
density
0.0 0.2 0.4 0.6 0.8 1.0
10-4
10-3
10-2
10-1
100
DBNs
SBNs
z
EXPERIMENT II—REAL-VALUED MODEL
 Use no binary neurons
 Train the discriminator by the real-valued outputs of the generator
250
1
raw predictions
hard thresholding Bernoulli sampling
binarized results
EXPERIMENT II—REAL-VALUED MODEL
26
Histograms of the preactivated outputs
DBNs
– more values in the middle
– a notch at 0.5 (the threshold)
SBNs
– more values close to 0 and 1
Real-valued model
– even more U-shaped
preactivated values
density
0.0 0.2 0.4 0.6 0.8 1.0
10-4
10-3
10-2
10-1
100
DBNs
SBNs
z
Real-valued model
EXPERIMENT III—GAN OBJECTIVES
 WGAN [14] model can achieve similar qualities to the WGAN-GP
 GAN [2] model suffers from mode collapse issue
27
GAN + DBNs GAN + SBNs WGAN + DBNs WGAN + SBNs
EXPERIMENT IV—MLPS VS CNNS
 CNN model produces less artifacts even with a small number of
trainable parameters (MLP—0.53M; CNN—1.4M)
28
DBNs SBNsSBNs DBNs
MLP model CNN model
DISCUSSIONS & CONCLUSION
DISCUSSIONS
 Why is binary neurons important?
 Open the possibility of conditional computation graph [4,13]
 Move toward a stronger AI that can make reliable decisions
30
DISCUSSIONS
 Why is binary neurons important?
 Open the possibility of conditional computation graph [4,13]
 Move toward a stronger AI that can make reliable decisions
 Other approaches to model discrete distributions with GANs
 Replace the target discrete outputs with continuous relaxations
 View the generator as agent in reinforcement learning (RL) and introduce
RL-based training strategies
31
CONCLUSION
 A new GAN model that
 can generate binary-valued predictions without further post-processing
 can be trained by end-to-end backpropagation
 Experimentally compare
 deterministic and stochastic binary neurons
 the proposed model and the real-valued model
 GAN, WGAN, WGAN-GP objectives
 MLPs and CNNs
32
FUTURE WORK
 Examine the use of gradient estimators for training a GAN that has
a conditional computation graph
33
REFERENCES
[1] Hao-Wen Dong and Yi-Hsuan Yang. Training Generative Adversarial Networks with Binary Neurons by
End-to-end Backpropagation. arXiv Preprint arXiv:1810.04714, 2018
[2] Ian J. Goodfellow et al. Generative adversarial nets. In Proc. NIPS, 2014.
[3] Geoffrey Hinton. Neural networks for machine learning—Using noise as a regularizer (lecture 9c), 2012.
Coursera, video lectures. [Online] https://www.coursera.org/lecture/neural-networks/using-noise-as-a-regularizer-
7-min-wbw7b.
[4] Yoshua Bengio, Nicholas Léonard, and Aaron C. Courville. Estimating or propagating gradients
through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013.
[5] Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement
learning. Machine learning, 8(3-4):229–256, 1992.
[6] George Tucker, Andriy Mnih, Chris J Maddison, and Jascha Sohl-Dickstein. REBAR: Low-variance,
unbiased gradient estimates for discrete latent variable models. In Proc. NIPS, 2017.
[7] Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, and David Duvenaud. Backpropagation
through the Void: Optimizing control variates for black-box gradient estimation. In Proc. ICLR. 2018.
34
REFERENCES
[8] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to
document recognition. Proc. IEEE, 86(11):2278–2324, 1998.
[9] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron Courville. Improved
training of Wasserstein GANs. In Proc. NIPS, 2017.
[10] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint
arXiv:1412.6980, 2014.
[11] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by
reducing internal covariate shift. In Proc. ICML, 2015.
[12] Binary stochastic neurons in tensorflow, 2016. Blog post on the R2RT blog. [Online]
https://r2rt.com/binary-stochastic-neurons-in-tensorflow.
[13] Junyoung Chung, Sungjin Ahn, and Yoshua Bengio. Hierarchical multiscale recurrent neural networks.
In Proc. ICLR, 2017.
[14] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein generative adversarial networks. In
Proc. ICML, 2017.
35
Thank you for your attention

More Related Content

What's hot

APPLIED MACHINE LEARNING
APPLIED MACHINE LEARNINGAPPLIED MACHINE LEARNING
APPLIED MACHINE LEARNINGRevanth Kumar
 
Details of Lazy Deep Learning for Images Recognition in ZZ Photo app
Details of Lazy Deep Learning for Images Recognition in ZZ Photo appDetails of Lazy Deep Learning for Images Recognition in ZZ Photo app
Details of Lazy Deep Learning for Images Recognition in ZZ Photo appPAY2 YOU
 
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...Simplilearn
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakPyData
 
Kernels in convolution
Kernels in convolutionKernels in convolution
Kernels in convolutionRevanth Kumar
 
Image Compression Using Wavelet Packet Tree
Image Compression Using Wavelet Packet TreeImage Compression Using Wavelet Packet Tree
Image Compression Using Wavelet Packet TreeIDES Editor
 
NeuralProcessingofGeneralPurposeApproximatePrograms
NeuralProcessingofGeneralPurposeApproximateProgramsNeuralProcessingofGeneralPurposeApproximatePrograms
NeuralProcessingofGeneralPurposeApproximateProgramsMohid Nabil
 
Voice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency FilteringVoice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency FilteringTejus Adiga M
 
Introduction to Applied Machine Learning
Introduction to Applied Machine LearningIntroduction to Applied Machine Learning
Introduction to Applied Machine LearningSheilaJimenezMorejon
 
Discrete Cosine Transform Stegonagraphy
Discrete Cosine Transform StegonagraphyDiscrete Cosine Transform Stegonagraphy
Discrete Cosine Transform StegonagraphyKaushik Chakraborty
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Seonho Park
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network Yan Xu
 
Deep learning algorithms
Deep learning algorithmsDeep learning algorithms
Deep learning algorithmsRevanth Kumar
 
Deep neural networks & computational graphs
Deep neural networks & computational graphsDeep neural networks & computational graphs
Deep neural networks & computational graphsRevanth Kumar
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networksSi Haem
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learningJunaid Bhat
 

What's hot (20)

Deep learning (2)
Deep learning (2)Deep learning (2)
Deep learning (2)
 
APPLIED MACHINE LEARNING
APPLIED MACHINE LEARNINGAPPLIED MACHINE LEARNING
APPLIED MACHINE LEARNING
 
Details of Lazy Deep Learning for Images Recognition in ZZ Photo app
Details of Lazy Deep Learning for Images Recognition in ZZ Photo appDetails of Lazy Deep Learning for Images Recognition in ZZ Photo app
Details of Lazy Deep Learning for Images Recognition in ZZ Photo app
 
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
What Is Deep Learning? | Introduction to Deep Learning | Deep Learning Tutori...
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
 
Kernels in convolution
Kernels in convolutionKernels in convolution
Kernels in convolution
 
Image Compression Using Wavelet Packet Tree
Image Compression Using Wavelet Packet TreeImage Compression Using Wavelet Packet Tree
Image Compression Using Wavelet Packet Tree
 
NeuralProcessingofGeneralPurposeApproximatePrograms
NeuralProcessingofGeneralPurposeApproximateProgramsNeuralProcessingofGeneralPurposeApproximatePrograms
NeuralProcessingofGeneralPurposeApproximatePrograms
 
DCT
DCTDCT
DCT
 
Voice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency FilteringVoice Activity Detection using Single Frequency Filtering
Voice Activity Detection using Single Frequency Filtering
 
Mc culloch pitts neuron
Mc culloch pitts neuronMc culloch pitts neuron
Mc culloch pitts neuron
 
Introduction to Applied Machine Learning
Introduction to Applied Machine LearningIntroduction to Applied Machine Learning
Introduction to Applied Machine Learning
 
Discrete Cosine Transform Stegonagraphy
Discrete Cosine Transform StegonagraphyDiscrete Cosine Transform Stegonagraphy
Discrete Cosine Transform Stegonagraphy
 
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
Convolutional Neural Network for Alzheimer’s disease diagnosis with Neuroim...
 
Deep learning
Deep learningDeep learning
Deep learning
 
Convolutional neural network
Convolutional neural network Convolutional neural network
Convolutional neural network
 
Deep learning algorithms
Deep learning algorithmsDeep learning algorithms
Deep learning algorithms
 
Deep neural networks & computational graphs
Deep neural networks & computational graphsDeep neural networks & computational graphs
Deep neural networks & computational graphs
 
Deep neural networks
Deep neural networksDeep neural networks
Deep neural networks
 
Introduction to deep learning
Introduction to deep learningIntroduction to deep learning
Introduction to deep learning
 

Similar to Training Generative Adversarial Networks with Binary Neurons by End-to-end Backpropagation

Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentationOwin Will
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101Felipe Prado
 
Batch normalization: Accelerating Deep Network Training by Reducing Internal ...
Batch normalization: Accelerating Deep Network Training by Reducing Internal ...Batch normalization: Accelerating Deep Network Training by Reducing Internal ...
Batch normalization: Accelerating Deep Network Training by Reducing Internal ...ssuser6a46522
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017Shuai Zhang
 
Invertible Denoising Network: A Light Solution for Real Noise Removal
Invertible Denoising Network: A Light Solution for Real Noise RemovalInvertible Denoising Network: A Light Solution for Real Noise Removal
Invertible Denoising Network: A Light Solution for Real Noise Removalivaderivader
 
Recent advances of AI for medical imaging : Engineering perspectives
Recent advances of AI for medical imaging : Engineering perspectivesRecent advances of AI for medical imaging : Engineering perspectives
Recent advances of AI for medical imaging : Engineering perspectivesNamkug Kim
 
Data driven model optimization [autosaved]
Data driven model optimization [autosaved]Data driven model optimization [autosaved]
Data driven model optimization [autosaved]Russell Jarvis
 
Introduction to Deep Learning and neon at Galvanize
Introduction to Deep Learning and neon at GalvanizeIntroduction to Deep Learning and neon at Galvanize
Introduction to Deep Learning and neon at GalvanizeIntel Nervana
 
Neural Discrete Representation Learning - A paper review
Neural Discrete Representation Learning - A paper reviewNeural Discrete Representation Learning - A paper review
Neural Discrete Representation Learning - A paper reviewAbhishek Koirala
 
PPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at UberPPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at UberJisang Yoon
 
Convolutional Neural Networks CNN
Convolutional Neural Networks CNNConvolutional Neural Networks CNN
Convolutional Neural Networks CNNAbdullah al Mamun
 
Information processing with artificial spiking neural networks
Information processing with artificial spiking neural networksInformation processing with artificial spiking neural networks
Information processing with artificial spiking neural networksAdvanced-Concepts-Team
 
(Research Note) Delving deeper into convolutional neural networks for camera ...
(Research Note) Delving deeper into convolutional neural networks for camera ...(Research Note) Delving deeper into convolutional neural networks for camera ...
(Research Note) Delving deeper into convolutional neural networks for camera ...Jacky Liu
 
Forecasting of Sales using Neural network techniques
Forecasting of Sales using Neural network techniquesForecasting of Sales using Neural network techniques
Forecasting of Sales using Neural network techniquesHitesh Dua
 
Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)Fatimakhan325
 
A Survey of Deep Learning Algorithms for Malware Detection
A Survey of Deep Learning Algorithms for Malware DetectionA Survey of Deep Learning Algorithms for Malware Detection
A Survey of Deep Learning Algorithms for Malware DetectionIJCSIS Research Publications
 
Improving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN ApplicationsImproving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN ApplicationsChester Chen
 
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...GeeksLab Odessa
 

Similar to Training Generative Adversarial Networks with Binary Neurons by End-to-end Backpropagation (20)

Batch normalization presentation
Batch normalization presentationBatch normalization presentation
Batch normalization presentation
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101
 
Batch normalization: Accelerating Deep Network Training by Reducing Internal ...
Batch normalization: Accelerating Deep Network Training by Reducing Internal ...Batch normalization: Accelerating Deep Network Training by Reducing Internal ...
Batch normalization: Accelerating Deep Network Training by Reducing Internal ...
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017
 
Invertible Denoising Network: A Light Solution for Real Noise Removal
Invertible Denoising Network: A Light Solution for Real Noise RemovalInvertible Denoising Network: A Light Solution for Real Noise Removal
Invertible Denoising Network: A Light Solution for Real Noise Removal
 
Recent advances of AI for medical imaging : Engineering perspectives
Recent advances of AI for medical imaging : Engineering perspectivesRecent advances of AI for medical imaging : Engineering perspectives
Recent advances of AI for medical imaging : Engineering perspectives
 
Data driven model optimization [autosaved]
Data driven model optimization [autosaved]Data driven model optimization [autosaved]
Data driven model optimization [autosaved]
 
Introduction to Deep Learning and neon at Galvanize
Introduction to Deep Learning and neon at GalvanizeIntroduction to Deep Learning and neon at Galvanize
Introduction to Deep Learning and neon at Galvanize
 
Neural Discrete Representation Learning - A paper review
Neural Discrete Representation Learning - A paper reviewNeural Discrete Representation Learning - A paper review
Neural Discrete Representation Learning - A paper review
 
PPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at UberPPT - Deep and Confident Prediction For Time Series at Uber
PPT - Deep and Confident Prediction For Time Series at Uber
 
IROS 2017 Slides
IROS 2017 SlidesIROS 2017 Slides
IROS 2017 Slides
 
Thesis Presentation
Thesis PresentationThesis Presentation
Thesis Presentation
 
Convolutional Neural Networks CNN
Convolutional Neural Networks CNNConvolutional Neural Networks CNN
Convolutional Neural Networks CNN
 
Information processing with artificial spiking neural networks
Information processing with artificial spiking neural networksInformation processing with artificial spiking neural networks
Information processing with artificial spiking neural networks
 
(Research Note) Delving deeper into convolutional neural networks for camera ...
(Research Note) Delving deeper into convolutional neural networks for camera ...(Research Note) Delving deeper into convolutional neural networks for camera ...
(Research Note) Delving deeper into convolutional neural networks for camera ...
 
Forecasting of Sales using Neural network techniques
Forecasting of Sales using Neural network techniquesForecasting of Sales using Neural network techniques
Forecasting of Sales using Neural network techniques
 
Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)Types of Machine Learnig Algorithms(CART, ID3)
Types of Machine Learnig Algorithms(CART, ID3)
 
A Survey of Deep Learning Algorithms for Malware Detection
A Survey of Deep Learning Algorithms for Malware DetectionA Survey of Deep Learning Algorithms for Malware Detection
A Survey of Deep Learning Algorithms for Malware Detection
 
Improving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN ApplicationsImproving Hardware Efficiency for DNN Applications
Improving Hardware Efficiency for DNN Applications
 
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
AI&BigData Lab 2016. Артем Чернодуб: Обучение глубоких, очень глубоких и реку...
 

More from Hao-Wen (Herman) Dong

Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...Hao-Wen (Herman) Dong
 
Organizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository OrganizationOrganizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository OrganizationHao-Wen (Herman) Dong
 
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...Hao-Wen (Herman) Dong
 
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GANRecent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GANHao-Wen (Herman) Dong
 
Introduction to Deep Generative Models
Introduction to Deep Generative ModelsIntroduction to Deep Generative Models
Introduction to Deep Generative ModelsHao-Wen (Herman) Dong
 

More from Hao-Wen (Herman) Dong (6)

Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
Convolutional Generative Adversarial Networks with Binary Neurons for Polypho...
 
Organizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository OrganizationOrganizing Machine Learning Projects - Repository Organization
Organizing Machine Learning Projects - Repository Organization
 
What is Critical in GAN Training?
What is Critical in GAN Training?What is Critical in GAN Training?
What is Critical in GAN Training?
 
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic ...
 
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GANRecent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
Recent Progress on Utilizing Tag Information with GANs - StarGAN & TD-GAN
 
Introduction to Deep Generative Models
Introduction to Deep Generative ModelsIntroduction to Deep Generative Models
Introduction to Deep Generative Models
 

Recently uploaded

Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Silpa
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learninglevieagacer
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...Monika Rani
 
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry Areesha Ahmad
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...Scintica Instrumentation
 
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body Areesha Ahmad
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptxryanrooker
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....muralinath2
 
Chemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfChemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfSumit Kumar yadav
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.Silpa
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Serviceshivanisharma5244
 
Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Silpa
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIADr. TATHAGAT KHOBRAGADE
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxseri bangash
 
Atp synthase , Atp synthase complex 1 to 4.
Atp synthase , Atp synthase complex 1 to 4.Atp synthase , Atp synthase complex 1 to 4.
Atp synthase , Atp synthase complex 1 to 4.Silpa
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxSilpa
 

Recently uploaded (20)

Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.Porella : features, morphology, anatomy, reproduction etc.
Porella : features, morphology, anatomy, reproduction etc.
 
Module for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learningModule for Grade 9 for Asynchronous/Distance learning
Module for Grade 9 for Asynchronous/Distance learning
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICEPATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
 
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
GBSN - Biochemistry (Unit 2) Basic concept of organic chemistry
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
(May 9, 2024) Enhanced Ultrafast Vector Flow Imaging (VFI) Using Multi-Angle ...
 
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body GBSN - Microbiology (Unit 3)Defense Mechanism of the body
GBSN - Microbiology (Unit 3)Defense Mechanism of the body
 
Site Acceptance Test .
Site Acceptance Test                    .Site Acceptance Test                    .
Site Acceptance Test .
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Chemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdfChemistry 5th semester paper 1st Notes.pdf
Chemistry 5th semester paper 1st Notes.pdf
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.Phenolics: types, biosynthesis and functions.
Phenolics: types, biosynthesis and functions.
 
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIACURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
CURRENT SCENARIO OF POULTRY PRODUCTION IN INDIA
 
The Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptxThe Mariana Trench remarkable geological features on Earth.pptx
The Mariana Trench remarkable geological features on Earth.pptx
 
Atp synthase , Atp synthase complex 1 to 4.
Atp synthase , Atp synthase complex 1 to 4.Atp synthase , Atp synthase complex 1 to 4.
Atp synthase , Atp synthase complex 1 to 4.
 
Cyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptxCyanide resistant respiration pathway.pptx
Cyanide resistant respiration pathway.pptx
 

Training Generative Adversarial Networks with Binary Neurons by End-to-end Backpropagation

  • 1. TRAINING GENERATIVE ADVERSARIAL NETWORKS WITH BINARY NEURONS BY END-TO-END BACKPROPAGATION Hao-Wen Dong and Yi-Hsuan Yang
  • 2. OUTLINES  Backgrounds  Generative Adversarial Networks  Binary Neurons  Straight-through Estimators  BinaryGAN  Experiments & Results  Discussions & Conclusion 2
  • 4. GENERATIVE ADVERSARIAL NETWORKS  Goal—learn a mapping from the prior distribution to the data distribution [2] 4 prior distribution data distribution
  • 5. GENERATIVE ADVERSARIAL NETWORKS  Goal—learn a mapping from the prior distribution to the data distribution [2] 5 prior distribution data distribution can be intractable
  • 6. GENERATIVE ADVERSARIAL NETWORKS  Use a deep neural network to learn an implicit mapping generator prior distribution model distribution data distribution 6
  • 7. GENERATIVE ADVERSARIAL NETWORKS  Use another deep neural network to provide guidance/critics generator prior distribution model distribution data distribution discriminator real/fake 7
  • 8. BINARY NEURONS  Definition: neurons that output binary-valued predictions 8
  • 9. BINARY NEURONS  Definition: neurons that output binary-valued predictions  Deterministic binary neurons (DBNs): 𝑫𝑫𝑫𝑫𝑫𝑫 𝒙𝒙 ≡ � 𝟏𝟏, 𝒊𝒊𝒊𝒊 𝝈𝝈 𝒙𝒙 > 𝟎𝟎. 𝟓𝟓 𝟎𝟎, 𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐 9 (hard thresholding)
  • 10. BINARY NEURONS  Definition: neurons that output binary-valued predictions  Deterministic binary neurons (DBNs): 𝑫𝑫𝑫𝑫𝑫𝑫 𝒙𝒙 ≡ � 𝟏𝟏, 𝒊𝒊𝒊𝒊 𝝈𝝈 𝒙𝒙 > 𝟎𝟎. 𝟓𝟓 𝟎𝟎, 𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐  Stochastic binary neurons (SBNs): 𝑺𝑺𝑩𝑩𝑩𝑩 𝒙𝒙 ≡ � 𝟏𝟏, 𝒊𝒊𝒊𝒊 𝒛𝒛 < 𝝈𝝈 𝒙𝒙 𝟎𝟎, 𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐𝒐 , 𝒛𝒛~𝑼𝑼 𝟎𝟎, 𝟏𝟏 10 (hard thresholding) (Bernoulli sampling)
  • 11. BACKPROPAGATING THROUGH BINARY NEURONS  Backpropagating through binary neurons is intractable  For DBNs, it involves the nondifferentiable threshold function  For SBNs, it requires the computation of expected gradients on all possible combinations (exponential to the number of binary neurons) of values taken by the binary neurons 11
  • 12. BACKPROPAGATING THROUGH BINARY NEURONS  Backpropagating through binary neurons is intractable  For DBNs, it involves the nondifferentiable threshold function  For SBNs, it requires the computation of expected gradients on all possible combinations (exponential to the number of binary neurons) of values taken by the binary neurons  We can introduce gradient estimators for the binary neurons  Examples include Straight-through [3,4] (the one adopted in this work), REINFORCE [5], REBAR [6], RELAX [7] estimators 12
  • 13. STRAIGHT-THROUGH ESTIMATORS  Straight-through estimators: [3,4]  Forward pass—use hard thresholding (DBNs) or Bernoulli sampling (SBNs)  Backward pass—pretend it as an identity function 13
  • 14. STRAIGHT-THROUGH ESTIMATORS  Straight-through estimators: [3,4]  Forward pass—use hard thresholding (DBNs) or Bernoulli sampling (SBNs)  Backward pass—pretend it as an identity function  Sigmoid-adjusted straight-through estimators  Use the derivative of the sigmoid function in the backward pass  Found to achieve better performance in a classification task presented in [4]  Adopted in this work 14
  • 16. BINARYGAN  Use binary neurons at the output layer of the generator  Use sigmoid-adjusted straight-through estimators to provide the gradients for the binary neurons  Train the whole network by end-to-end backpropagation 16
  • 17. BINARYGAN 17 Generator Discriminator real data random neuron binary neuron normal neuron (implemented by multilayer perceptrons)
  • 18. BINARYGAN 18 Generator Discriminator real data random neuron binary neuron normal neuron (implemented by multilayer perceptrons) binary neurons
  • 20. TRAINING DATA  Binarized MNIST handwritten digit database [8]  Digits with nonzero intensities → 1  Digits with zero intensities → 0 20 Sample binarized MNIST digits
  • 21. IMPLEMENTATION DETAILS  Batch size is 64  Use WGAN-GP [9] objectives  Use Adam optimizer [10]  Apply batch normalization [11] to the generator (but not to the discriminator)  Binary neurons are implemented with the code kindly provided in a blog post on the R2RT blog [12] 21
  • 22. IMPLEMENTATION DETAILS  Apply slope annealing trick [13]  Gradually increase the slopes of the sigmoid functions used in the sigmoid- adjusted straight-through estimators as the training proceeds  We multiply the slopes by 1.1 after each epoch  The slopes start from 1.0 and reach 6.1 at the end of 20 epochs 22
  • 23. EXPERIMENT I—DBNS VS SBNS  DBNs and SBNs can achieve similar qualities  They show distinct characteristics on the preactivated outputs 23 DBNs 0 1 SBNs
  • 24. EXPERIMENT I—DBNS VS SBNS 24 Histograms of the preactivated outputs DBNs – more values in the middle – a notch at 0.5 (the threshold) SBNs – more values close to 0 and 1 preactivated values density 0.0 0.2 0.4 0.6 0.8 1.0 10-4 10-3 10-2 10-1 100 DBNs SBNs z
  • 25. EXPERIMENT II—REAL-VALUED MODEL  Use no binary neurons  Train the discriminator by the real-valued outputs of the generator 250 1 raw predictions hard thresholding Bernoulli sampling binarized results
  • 26. EXPERIMENT II—REAL-VALUED MODEL 26 Histograms of the preactivated outputs DBNs – more values in the middle – a notch at 0.5 (the threshold) SBNs – more values close to 0 and 1 Real-valued model – even more U-shaped preactivated values density 0.0 0.2 0.4 0.6 0.8 1.0 10-4 10-3 10-2 10-1 100 DBNs SBNs z Real-valued model
  • 27. EXPERIMENT III—GAN OBJECTIVES  WGAN [14] model can achieve similar qualities to the WGAN-GP  GAN [2] model suffers from mode collapse issue 27 GAN + DBNs GAN + SBNs WGAN + DBNs WGAN + SBNs
  • 28. EXPERIMENT IV—MLPS VS CNNS  CNN model produces less artifacts even with a small number of trainable parameters (MLP—0.53M; CNN—1.4M) 28 DBNs SBNsSBNs DBNs MLP model CNN model
  • 30. DISCUSSIONS  Why is binary neurons important?  Open the possibility of conditional computation graph [4,13]  Move toward a stronger AI that can make reliable decisions 30
  • 31. DISCUSSIONS  Why is binary neurons important?  Open the possibility of conditional computation graph [4,13]  Move toward a stronger AI that can make reliable decisions  Other approaches to model discrete distributions with GANs  Replace the target discrete outputs with continuous relaxations  View the generator as agent in reinforcement learning (RL) and introduce RL-based training strategies 31
  • 32. CONCLUSION  A new GAN model that  can generate binary-valued predictions without further post-processing  can be trained by end-to-end backpropagation  Experimentally compare  deterministic and stochastic binary neurons  the proposed model and the real-valued model  GAN, WGAN, WGAN-GP objectives  MLPs and CNNs 32
  • 33. FUTURE WORK  Examine the use of gradient estimators for training a GAN that has a conditional computation graph 33
  • 34. REFERENCES [1] Hao-Wen Dong and Yi-Hsuan Yang. Training Generative Adversarial Networks with Binary Neurons by End-to-end Backpropagation. arXiv Preprint arXiv:1810.04714, 2018 [2] Ian J. Goodfellow et al. Generative adversarial nets. In Proc. NIPS, 2014. [3] Geoffrey Hinton. Neural networks for machine learning—Using noise as a regularizer (lecture 9c), 2012. Coursera, video lectures. [Online] https://www.coursera.org/lecture/neural-networks/using-noise-as-a-regularizer- 7-min-wbw7b. [4] Yoshua Bengio, Nicholas Léonard, and Aaron C. Courville. Estimating or propagating gradients through stochastic neurons for conditional computation. arXiv preprint arXiv:1308.3432, 2013. [5] Ronald J. Williams. Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine learning, 8(3-4):229–256, 1992. [6] George Tucker, Andriy Mnih, Chris J Maddison, and Jascha Sohl-Dickstein. REBAR: Low-variance, unbiased gradient estimates for discrete latent variable models. In Proc. NIPS, 2017. [7] Will Grathwohl, Dami Choi, Yuhuai Wu, Geoffrey Roeder, and David Duvenaud. Backpropagation through the Void: Optimizing control variates for black-box gradient estimation. In Proc. ICLR. 2018. 34
  • 35. REFERENCES [8] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proc. IEEE, 86(11):2278–2324, 1998. [9] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron Courville. Improved training of Wasserstein GANs. In Proc. NIPS, 2017. [10] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014. [11] Sergey Ioffe and Christian Szegedy. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In Proc. ICML, 2015. [12] Binary stochastic neurons in tensorflow, 2016. Blog post on the R2RT blog. [Online] https://r2rt.com/binary-stochastic-neurons-in-tensorflow. [13] Junyoung Chung, Sungjin Ahn, and Yoshua Bengio. Hierarchical multiscale recurrent neural networks. In Proc. ICLR, 2017. [14] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein generative adversarial networks. In Proc. ICML, 2017. 35
  • 36. Thank you for your attention