SlideShare a Scribd company logo
CSC321 Lecture 19: Generative Adversarial Networks
Roger Grosse
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 1 / 25
Overview
In generative modeling, we’d like to train a network that models a
distribution, such as a distribution over images.
One way to judge the quality of the model is to sample from it.
This field has seen rapid progress:
2009 2015
2018
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 2 / 25
Overview
Four modern approaches to generative modeling:
Generative adversarial networks (today)
Reversible architectures (next lecture)
Autoregressive models (Lecture 7, and next lecture)
Variational autoencoders (CSC412)
All four approaches have different pros and cons.
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 3 / 25
Implicit Generative Models
Implicit generative models implicitly define a probability distribution
Start by sampling the code vector z from a fixed, simple distribution
(e.g. spherical Gaussian)
The generator network computes a differentiable function G mapping
z to an x in data space
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 4 / 25
Implicit Generative Models
A 1-dimensional example:
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 5 / 25
Implicit Generative Models
https://blog.openai.com/generative-models/
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 6 / 25
Implicit Generative Models
This sort of architecture sounded preposterous to many of us, but
amazingly, it works.
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 7 / 25
Generative Adversarial Networks
The advantage of implicit generative models: if you have some
criterion for evaluating the quality of samples, then you can compute
its gradient with respect to the network parameters, and update the
network’s parameters to make the sample a little better
The idea behind Generative Adversarial Networks (GANs): train two
different networks
The generator network tries to produce realistic-looking samples
The discriminator network tries to figure out whether an image came
from the training set or the generator network
The generator network tries to fool the discriminator network
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 8 / 25
Generative Adversarial Networks
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 9 / 25
Generative Adversarial Networks
Let D denote the discriminator’s predicted probability of being data
Discriminator’s cost function: cross-entropy loss for task of classifying
real vs. fake images
JD = Ex∼D[− log D(x)] + Ez[− log(1 − D(G(z)))]
One possible cost function for the generator: the opposite of the
discriminator’s
JG = −JD
= const + Ez[log(1 − D(G(z)))]
This is called the minimax formulation, since the generator and
discriminator are playing a zero-sum game against each other:
max
G
min
D
JD
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 10 / 25
Generative Adversarial Networks
Updating the discriminator:
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 11 / 25
Generative Adversarial Networks
Updating the generator:
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 12 / 25
Generative Adversarial Networks
Alternating training of the generator and discriminator:
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 13 / 25
A Better Cost Function
We introduced the minimax cost function for the generator:
JG = Ez[log(1 − D(G(z)))]
One problem with this is saturation.
Recall from our lecture on classification: when the prediction is really
wrong,
“Logistic + squared error” gets a weak gradient signal
“Logistic + cross-entropy” gets a strong gradient signal
Here, if the generated sample is really bad, the discriminator’s
prediction is close to 0, and the generator’s cost is flat.
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 14 / 25
A Better Cost Function
Original minimax cost:
JG = Ez[log(1 − D(G(z)))]
Modified generator cost:
JG = Ez[− log D(G(z))]
This fixes the saturation problem.
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 15 / 25
Generative Adversarial Networks
Since GANs were introduced in 2014, there have been hundreds of
papers introducing various architectures and training methods.
Most modern architectures are based on the Deep Convolutional GAN
(DC-GAN), where the generator and discriminator are both conv nets.
GAN Zoo: https://github.com/hindupuravinash/the-gan-zoo
Good source of horrible puns (VEEGAN, Checkhov GAN, etc.)
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 16 / 25
GAN Samples
Celebrities:
Karras et al., 2017. Progressive growing of GANs for improved quality, stability, and variation
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 17 / 25
GAN Samples
Bedrooms:
Karras et al., 2017. Progressive growing of GANs for improved quality, stability, and variation
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 18 / 25
GAN Samples
Objects:
Karras et al., 2017. Progressive growing of GANs for improved quality, stability, and variation
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 19 / 25
GAN Samples
GANs revolutionized generative modeling by producing crisp,
high-resolution images.
The catch: we don’t know how well they’re modeling the distribution.
Can’t measure the log-likelihood they assign to held-out data.
Could they be memorizing training examples? (E.g., maybe they
sometimes produce photos of real celebrities?)
We have no way to tell if they are dropping important modes from the
distribution.
See Wu et al., “On the quantitative analysis of decoder-based
generative models” for partial answers to these questions.
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 20 / 25
CycleGAN
Style transfer problem: change the style of an image while preserving the
content.
Data: Two unrelated collections of images, one for each style
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 21 / 25
CycleGAN
If we had paired data (same content in both styles), this would be a
supervised learning problem. But this is hard to find.
The CycleGAN architecture learns to do it from unpaired data.
Train two different generator nets to go from style 1 to style 2, and
vice versa.
Make sure the generated samples of style 2 are indistinguishable from
real images by a discriminator net.
Make sure the generators are cycle-consistent: mapping from style 1 to
style 2 and back again should give you almost the original image.
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 22 / 25
CycleGAN
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 23 / 25
CycleGAN
Style transfer between aerial photos and maps:
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 24 / 25
CycleGAN
Style transfer between road scenes and semantic segmentations (labels of
every pixel in an image by object category):
Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 25 / 25

More Related Content

Similar to lec19.pdf

Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Universitat Politècnica de Catalunya
 
Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)
Manohar Mukku
 
gan-190318135433 (1).pptx
gan-190318135433 (1).pptxgan-190318135433 (1).pptx
gan-190318135433 (1).pptx
kiran814572
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial Networks
MLReview
 
MODELLING AND SYNTHESIZING OF 3D SHAPE WITH STACKED GENERATIVE ADVERSARIAL NE...
MODELLING AND SYNTHESIZING OF 3D SHAPE WITH STACKED GENERATIVE ADVERSARIAL NE...MODELLING AND SYNTHESIZING OF 3D SHAPE WITH STACKED GENERATIVE ADVERSARIAL NE...
MODELLING AND SYNTHESIZING OF 3D SHAPE WITH STACKED GENERATIVE ADVERSARIAL NE...
Sangeetha Mam
 
Volodymyr Lyubinets “Generative models for images”
Volodymyr Lyubinets  “Generative models for images”Volodymyr Lyubinets  “Generative models for images”
Volodymyr Lyubinets “Generative models for images”
Lviv Startup Club
 
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Thang Nguyen
 
Alberto Massidda - Scenes from a memory - Codemotion Rome 2019
Alberto Massidda - Scenes from a memory - Codemotion Rome 2019Alberto Massidda - Scenes from a memory - Codemotion Rome 2019
Alberto Massidda - Scenes from a memory - Codemotion Rome 2019
Codemotion
 
Deep Generative Models II (DLAI D10L1 2017 UPC Deep Learning for Artificial I...
Deep Generative Models II (DLAI D10L1 2017 UPC Deep Learning for Artificial I...Deep Generative Models II (DLAI D10L1 2017 UPC Deep Learning for Artificial I...
Deep Generative Models II (DLAI D10L1 2017 UPC Deep Learning for Artificial I...
Universitat Politècnica de Catalunya
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.ppt
DivyaGugulothu
 
Safety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdfSafety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdf
Polytechnique Montréal
 
Generative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksGenerative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural Networks
Denis Dus
 
IMAGE GENERATION FROM CAPTION
IMAGE GENERATION FROM CAPTIONIMAGE GENERATION FROM CAPTION
IMAGE GENERATION FROM CAPTION
ijscai
 
Image Generation from Caption
Image Generation from Caption Image Generation from Caption
Image Generation from Caption
IJSCAI Journal
 
Vladislav Kolbasin “Introduction to Generative Adversarial Networks (GANs)”
Vladislav Kolbasin “Introduction to Generative Adversarial Networks (GANs)”Vladislav Kolbasin “Introduction to Generative Adversarial Networks (GANs)”
Vladislav Kolbasin “Introduction to Generative Adversarial Networks (GANs)”
Lviv Startup Club
 
Creating Objects for Metaverse using GANs and Autoencoders
Creating Objects for Metaverse using GANs and AutoencodersCreating Objects for Metaverse using GANs and Autoencoders
Creating Objects for Metaverse using GANs and Autoencoders
IRJET Journal
 
Domain adaptation for Image Segmentation
Domain adaptation for Image SegmentationDomain adaptation for Image Segmentation
Domain adaptation for Image Segmentation
Deepak Thukral
 
1809.05680.pdf
1809.05680.pdf1809.05680.pdf
1809.05680.pdf
AnkitBiswas31
 
A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representations
Devansh16
 

Similar to lec19.pdf (20)

Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
Generative Adversarial Networks GAN - Santiago Pascual - UPC Barcelona 2018
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)Generative Adversarial Networks (GAN)
Generative Adversarial Networks (GAN)
 
gan-190318135433 (1).pptx
gan-190318135433 (1).pptxgan-190318135433 (1).pptx
gan-190318135433 (1).pptx
 
Tutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial NetworksTutorial on Theory and Application of Generative Adversarial Networks
Tutorial on Theory and Application of Generative Adversarial Networks
 
MODELLING AND SYNTHESIZING OF 3D SHAPE WITH STACKED GENERATIVE ADVERSARIAL NE...
MODELLING AND SYNTHESIZING OF 3D SHAPE WITH STACKED GENERATIVE ADVERSARIAL NE...MODELLING AND SYNTHESIZING OF 3D SHAPE WITH STACKED GENERATIVE ADVERSARIAL NE...
MODELLING AND SYNTHESIZING OF 3D SHAPE WITH STACKED GENERATIVE ADVERSARIAL NE...
 
Volodymyr Lyubinets “Generative models for images”
Volodymyr Lyubinets  “Generative models for images”Volodymyr Lyubinets  “Generative models for images”
Volodymyr Lyubinets “Generative models for images”
 
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...Overlapping community detection in Large-Scale Networks using BigCLAM model b...
Overlapping community detection in Large-Scale Networks using BigCLAM model b...
 
Alberto Massidda - Scenes from a memory - Codemotion Rome 2019
Alberto Massidda - Scenes from a memory - Codemotion Rome 2019Alberto Massidda - Scenes from a memory - Codemotion Rome 2019
Alberto Massidda - Scenes from a memory - Codemotion Rome 2019
 
Deep Generative Models II (DLAI D10L1 2017 UPC Deep Learning for Artificial I...
Deep Generative Models II (DLAI D10L1 2017 UPC Deep Learning for Artificial I...Deep Generative Models II (DLAI D10L1 2017 UPC Deep Learning for Artificial I...
Deep Generative Models II (DLAI D10L1 2017 UPC Deep Learning for Artificial I...
 
Generation of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.pptGeneration of Deepfake images using GAN and Least squares GAN.ppt
Generation of Deepfake images using GAN and Least squares GAN.ppt
 
Safety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdfSafety Verification of Deep Neural Networks_.pdf
Safety Verification of Deep Neural Networks_.pdf
 
Generative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural NetworksGenerative modeling with Convolutional Neural Networks
Generative modeling with Convolutional Neural Networks
 
IMAGE GENERATION FROM CAPTION
IMAGE GENERATION FROM CAPTIONIMAGE GENERATION FROM CAPTION
IMAGE GENERATION FROM CAPTION
 
Image Generation from Caption
Image Generation from Caption Image Generation from Caption
Image Generation from Caption
 
Vladislav Kolbasin “Introduction to Generative Adversarial Networks (GANs)”
Vladislav Kolbasin “Introduction to Generative Adversarial Networks (GANs)”Vladislav Kolbasin “Introduction to Generative Adversarial Networks (GANs)”
Vladislav Kolbasin “Introduction to Generative Adversarial Networks (GANs)”
 
Creating Objects for Metaverse using GANs and Autoencoders
Creating Objects for Metaverse using GANs and AutoencodersCreating Objects for Metaverse using GANs and Autoencoders
Creating Objects for Metaverse using GANs and Autoencoders
 
Domain adaptation for Image Segmentation
Domain adaptation for Image SegmentationDomain adaptation for Image Segmentation
Domain adaptation for Image Segmentation
 
1809.05680.pdf
1809.05680.pdf1809.05680.pdf
1809.05680.pdf
 
A simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representationsA simple framework for contrastive learning of visual representations
A simple framework for contrastive learning of visual representations
 

Recently uploaded

Call Girls Chennai +91-8824825030 Vip Call Girls Chennai
Call Girls Chennai +91-8824825030 Vip Call Girls ChennaiCall Girls Chennai +91-8824825030 Vip Call Girls Chennai
Call Girls Chennai +91-8824825030 Vip Call Girls Chennai
paraasingh12 #V08
 
AN INTRODUCTION OF AI & SEARCHING TECHIQUES
AN INTRODUCTION OF AI & SEARCHING TECHIQUESAN INTRODUCTION OF AI & SEARCHING TECHIQUES
AN INTRODUCTION OF AI & SEARCHING TECHIQUES
drshikhapandey2022
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
uqyfuc
 
Beckhoff Programmable Logic Control Overview Presentation
Beckhoff Programmable Logic Control Overview PresentationBeckhoff Programmable Logic Control Overview Presentation
Beckhoff Programmable Logic Control Overview Presentation
VanTuDuong1
 
ITSM Integration with MuleSoft.pptx
ITSM  Integration with MuleSoft.pptxITSM  Integration with MuleSoft.pptx
ITSM Integration with MuleSoft.pptx
VANDANAMOHANGOUDA
 
Introduction to Artificial Intelligence.
Introduction to Artificial Intelligence.Introduction to Artificial Intelligence.
Introduction to Artificial Intelligence.
supriyaDicholkar1
 
paper relate Chozhavendhan et al. 2020.pdf
paper relate Chozhavendhan et al. 2020.pdfpaper relate Chozhavendhan et al. 2020.pdf
paper relate Chozhavendhan et al. 2020.pdf
ShurooqTaib
 
Applications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdfApplications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdf
Atif Razi
 
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
Kuvempu University
 
This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...
DharmaBanothu
 
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
upoux
 
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
DharmaBanothu
 
FULL STACK PROGRAMMING - Both Front End and Back End
FULL STACK PROGRAMMING - Both Front End and Back EndFULL STACK PROGRAMMING - Both Front End and Back End
FULL STACK PROGRAMMING - Both Front End and Back End
PreethaV16
 
Accident detection system project report.pdf
Accident detection system project report.pdfAccident detection system project report.pdf
Accident detection system project report.pdf
Kamal Acharya
 
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdfAsymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
felixwold
 
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
ydzowc
 
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
GiselleginaGloria
 
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASICINTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
GOKULKANNANMMECLECTC
 
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
PriyankaKilaniya
 
Supermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdfSupermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdf
Kamal Acharya
 

Recently uploaded (20)

Call Girls Chennai +91-8824825030 Vip Call Girls Chennai
Call Girls Chennai +91-8824825030 Vip Call Girls ChennaiCall Girls Chennai +91-8824825030 Vip Call Girls Chennai
Call Girls Chennai +91-8824825030 Vip Call Girls Chennai
 
AN INTRODUCTION OF AI & SEARCHING TECHIQUES
AN INTRODUCTION OF AI & SEARCHING TECHIQUESAN INTRODUCTION OF AI & SEARCHING TECHIQUES
AN INTRODUCTION OF AI & SEARCHING TECHIQUES
 
一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理一比一原版(USF毕业证)旧金山大学毕业证如何办理
一比一原版(USF毕业证)旧金山大学毕业证如何办理
 
Beckhoff Programmable Logic Control Overview Presentation
Beckhoff Programmable Logic Control Overview PresentationBeckhoff Programmable Logic Control Overview Presentation
Beckhoff Programmable Logic Control Overview Presentation
 
ITSM Integration with MuleSoft.pptx
ITSM  Integration with MuleSoft.pptxITSM  Integration with MuleSoft.pptx
ITSM Integration with MuleSoft.pptx
 
Introduction to Artificial Intelligence.
Introduction to Artificial Intelligence.Introduction to Artificial Intelligence.
Introduction to Artificial Intelligence.
 
paper relate Chozhavendhan et al. 2020.pdf
paper relate Chozhavendhan et al. 2020.pdfpaper relate Chozhavendhan et al. 2020.pdf
paper relate Chozhavendhan et al. 2020.pdf
 
Applications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdfApplications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdf
 
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
ELS: 2.4.1 POWER ELECTRONICS Course objectives: This course will enable stude...
 
This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...This study Examines the Effectiveness of Talent Procurement through the Imple...
This study Examines the Effectiveness of Talent Procurement through the Imple...
 
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
一比一原版(osu毕业证书)美国俄勒冈州立大学毕业证如何办理
 
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
A high-Speed Communication System is based on the Design of a Bi-NoC Router, ...
 
FULL STACK PROGRAMMING - Both Front End and Back End
FULL STACK PROGRAMMING - Both Front End and Back EndFULL STACK PROGRAMMING - Both Front End and Back End
FULL STACK PROGRAMMING - Both Front End and Back End
 
Accident detection system project report.pdf
Accident detection system project report.pdfAccident detection system project report.pdf
Accident detection system project report.pdf
 
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdfAsymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
Asymmetrical Repulsion Magnet Motor Ratio 6-7.pdf
 
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
原版制作(Humboldt毕业证书)柏林大学毕业证学位证一模一样
 
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
3rd International Conference on Artificial Intelligence Advances (AIAD 2024)
 
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASICINTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
INTRODUCTION TO ARTIFICIAL INTELLIGENCE BASIC
 
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
Prediction of Electrical Energy Efficiency Using Information on Consumer's Ac...
 
Supermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdfSupermarket Management System Project Report.pdf
Supermarket Management System Project Report.pdf
 

lec19.pdf

  • 1. CSC321 Lecture 19: Generative Adversarial Networks Roger Grosse Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 1 / 25
  • 2. Overview In generative modeling, we’d like to train a network that models a distribution, such as a distribution over images. One way to judge the quality of the model is to sample from it. This field has seen rapid progress: 2009 2015 2018 Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 2 / 25
  • 3. Overview Four modern approaches to generative modeling: Generative adversarial networks (today) Reversible architectures (next lecture) Autoregressive models (Lecture 7, and next lecture) Variational autoencoders (CSC412) All four approaches have different pros and cons. Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 3 / 25
  • 4. Implicit Generative Models Implicit generative models implicitly define a probability distribution Start by sampling the code vector z from a fixed, simple distribution (e.g. spherical Gaussian) The generator network computes a differentiable function G mapping z to an x in data space Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 4 / 25
  • 5. Implicit Generative Models A 1-dimensional example: Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 5 / 25
  • 6. Implicit Generative Models https://blog.openai.com/generative-models/ Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 6 / 25
  • 7. Implicit Generative Models This sort of architecture sounded preposterous to many of us, but amazingly, it works. Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 7 / 25
  • 8. Generative Adversarial Networks The advantage of implicit generative models: if you have some criterion for evaluating the quality of samples, then you can compute its gradient with respect to the network parameters, and update the network’s parameters to make the sample a little better The idea behind Generative Adversarial Networks (GANs): train two different networks The generator network tries to produce realistic-looking samples The discriminator network tries to figure out whether an image came from the training set or the generator network The generator network tries to fool the discriminator network Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 8 / 25
  • 9. Generative Adversarial Networks Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 9 / 25
  • 10. Generative Adversarial Networks Let D denote the discriminator’s predicted probability of being data Discriminator’s cost function: cross-entropy loss for task of classifying real vs. fake images JD = Ex∼D[− log D(x)] + Ez[− log(1 − D(G(z)))] One possible cost function for the generator: the opposite of the discriminator’s JG = −JD = const + Ez[log(1 − D(G(z)))] This is called the minimax formulation, since the generator and discriminator are playing a zero-sum game against each other: max G min D JD Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 10 / 25
  • 11. Generative Adversarial Networks Updating the discriminator: Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 11 / 25
  • 12. Generative Adversarial Networks Updating the generator: Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 12 / 25
  • 13. Generative Adversarial Networks Alternating training of the generator and discriminator: Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 13 / 25
  • 14. A Better Cost Function We introduced the minimax cost function for the generator: JG = Ez[log(1 − D(G(z)))] One problem with this is saturation. Recall from our lecture on classification: when the prediction is really wrong, “Logistic + squared error” gets a weak gradient signal “Logistic + cross-entropy” gets a strong gradient signal Here, if the generated sample is really bad, the discriminator’s prediction is close to 0, and the generator’s cost is flat. Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 14 / 25
  • 15. A Better Cost Function Original minimax cost: JG = Ez[log(1 − D(G(z)))] Modified generator cost: JG = Ez[− log D(G(z))] This fixes the saturation problem. Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 15 / 25
  • 16. Generative Adversarial Networks Since GANs were introduced in 2014, there have been hundreds of papers introducing various architectures and training methods. Most modern architectures are based on the Deep Convolutional GAN (DC-GAN), where the generator and discriminator are both conv nets. GAN Zoo: https://github.com/hindupuravinash/the-gan-zoo Good source of horrible puns (VEEGAN, Checkhov GAN, etc.) Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 16 / 25
  • 17. GAN Samples Celebrities: Karras et al., 2017. Progressive growing of GANs for improved quality, stability, and variation Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 17 / 25
  • 18. GAN Samples Bedrooms: Karras et al., 2017. Progressive growing of GANs for improved quality, stability, and variation Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 18 / 25
  • 19. GAN Samples Objects: Karras et al., 2017. Progressive growing of GANs for improved quality, stability, and variation Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 19 / 25
  • 20. GAN Samples GANs revolutionized generative modeling by producing crisp, high-resolution images. The catch: we don’t know how well they’re modeling the distribution. Can’t measure the log-likelihood they assign to held-out data. Could they be memorizing training examples? (E.g., maybe they sometimes produce photos of real celebrities?) We have no way to tell if they are dropping important modes from the distribution. See Wu et al., “On the quantitative analysis of decoder-based generative models” for partial answers to these questions. Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 20 / 25
  • 21. CycleGAN Style transfer problem: change the style of an image while preserving the content. Data: Two unrelated collections of images, one for each style Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 21 / 25
  • 22. CycleGAN If we had paired data (same content in both styles), this would be a supervised learning problem. But this is hard to find. The CycleGAN architecture learns to do it from unpaired data. Train two different generator nets to go from style 1 to style 2, and vice versa. Make sure the generated samples of style 2 are indistinguishable from real images by a discriminator net. Make sure the generators are cycle-consistent: mapping from style 1 to style 2 and back again should give you almost the original image. Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 22 / 25
  • 23. CycleGAN Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 23 / 25
  • 24. CycleGAN Style transfer between aerial photos and maps: Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 24 / 25
  • 25. CycleGAN Style transfer between road scenes and semantic segmentations (labels of every pixel in an image by object category): Roger Grosse CSC321 Lecture 19: Generative Adversarial Networks 25 / 25