Generative Adversarial Networks in Computer Vision
SHREE GOWRI RADHAKRISHNA
COMPUTER SCIENCE DEPARTMENT, SAN JOSE STATE UNIVERSITY
A review of,
Generative Adversarial Networks in Computer Vision: A
Survey and Taxonomy
ZHENGWEI WANG,
QI SHE,
TOMÁS E. WARD
https://arxiv.org/abs/1906.01529
Objective
• Introduce GAN
• Understand challenges of GANs and propose improvements
• Look at various GAN architectures from two perspectives:
  • Architecture-variant
  • Loss-variant
Architecture of GAN
• Two deep neural networks
  • Discriminator
  • Generator
• Discriminator optimized to distinguish real vs fake images
• Generator creates images to fool discriminator
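The two-player setup above can be illustrated with the original GAN objective. This is a minimal sketch, not a training loop: the function names and the sample discriminator outputs are assumptions made for illustration, and `d_real` / `d_fake` stand for the discriminator's probability that an input is real.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Negated discriminator objective: -(mean log D(x) + mean log(1 - D(G(z))))."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """Minimax generator loss: mean log(1 - D(G(z))). Lower means D is being fooled."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs for two real and two generated images.
d_real = [0.9, 0.8]
d_fake = [0.1, 0.2]
print(discriminator_loss(d_real, d_fake))  # small: D separates real from fake well
print(generator_loss(d_fake))              # close to 0: G is not fooling D yet
```

When the discriminator cannot tell the two apart (all outputs 0.5), its loss is 2·log 2, which is the equilibrium value of the original formulation.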
Figure: Architecture of a GAN
Applications of GAN
• Applications:
  • Image generation
  • Image-to-image translation
  • Image super-resolution
  • Image completion
• Advantages over traditional deep generative models (DGMs):
  • Produce better outputs than other DGMs
  • Can train any type of network
  • No restriction on the size of the latent variable
Challenges in GANs
• High quality image generation
• Diverse image generation
• Stable training
Two broad classifications of GANs
• Architecture-variant GANs
  • Focus on architectural improvements to solve issues
  • Network size and batch size
• Loss-variant GANs
  • Focus on modifying the loss function to improve performance
  • Normalization and regularization
Architecture-variant GANs
• Fully-connected GAN (FCGAN)
• Laplacian Pyramid of Adversarial Networks (LAPGAN)
• Deep Convolutional GAN (DCGAN)
• Boundary Equilibrium GAN (BEGAN)
• Progressive GAN (PROGAN)
• Self-attention GAN (SAGAN)
• BigGAN
Performance of architecture-variant GANs
Architecture-variant GAN comparison
Summary of architecture-variants
• All proposed architecture-variants are able to improve image
quality.
• SAGAN is proposed for improving the capacity of multi-class
learning in GANs, to produce more diverse images
• PROGAN and BigGAN are able to produce high resolution
images
• SAGAN and BigGAN are effective against the vanishing gradient
challenge
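PROGAN reaches high resolutions by growing the networks progressively: a new, higher-resolution layer is blended in gradually rather than switched on at once. The sketch below shows only that blending step on plain 2D lists. The helper names and the toy images are assumptions for illustration, and nearest-neighbour upsampling is one simple choice.

```python
def upsample_nearest(image):
    """Double the height and width of a 2D list by repeating each pixel."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def fade_in(low_res, high_res, alpha):
    """Blend the upsampled old output with the new layer's output.

    alpha ramps from 0 (old output only) to 1 (new layer only) during training.
    """
    up = upsample_nearest(low_res)
    return [
        [(1.0 - alpha) * u + alpha * h for u, h in zip(up_row, hi_row)]
        for up_row, hi_row in zip(up, high_res)
    ]

low = [[1.0, 2.0], [3.0, 4.0]]        # 2x2 output of the already-trained layers
high = [[0.0] * 4 for _ in range(4)]  # 4x4 output of the newly added layer
print(fade_in(low, high, alpha=0.0)[0])  # first row of the upsampled old output
print(fade_in(low, high, alpha=0.5)[0])  # halfway blend
```

Because the new layer starts with almost no influence, the already-trained lower-resolution layers are not disturbed abruptly, which is the stability argument usually given for progressive training.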
Loss-variant GANs
• Wasserstein GAN (WGAN)
• WGAN-GP
• Least Squares GAN (LSGAN)
• f-GAN
• Unrolled GAN (UGAN)
• Loss Sensitive GAN (LS-GAN)
• Mode Regularized GAN (MRGAN)
• Geometric GAN
• Relativistic GAN (RGAN)
• Spectral normalization GAN (SN-GAN)
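To make two of the listed losses concrete, here is a minimal sketch of the WGAN critic and generator losses (with the weight clipping used in the original WGAN) and of the hinge loss associated with Geometric GAN. Function names and sample scores are assumptions for illustration. The scores are raw, unbounded critic outputs, not sigmoid probabilities.

```python
def mean(values):
    return sum(values) / len(values)

def wgan_critic_loss(critic_real, critic_fake):
    """Critic minimises mean f(fake) - mean f(real), an estimate of the negated Wasserstein distance."""
    return mean(critic_fake) - mean(critic_real)

def wgan_generator_loss(critic_fake):
    """Generator tries to raise the critic's score on generated samples."""
    return -mean(critic_fake)

def clip_weights(weights, c=0.01):
    """Original WGAN constrains the critic by clipping each weight to [-c, c]."""
    return [max(-c, min(c, w)) for w in weights]

def hinge_discriminator_loss(d_real, d_fake):
    """Hinge loss: penalise real scores below +1 and fake scores above -1."""
    real_part = mean([max(0.0, 1.0 - s) for s in d_real])
    fake_part = mean([max(0.0, 1.0 + s) for s in d_fake])
    return real_part + fake_part

# Hypothetical critic scores.
print(wgan_critic_loss([2.0, 4.0], [0.0, 1.0]))      # -2.5
print(clip_weights([0.5, -0.5, 0.005]))              # [0.01, -0.01, 0.005]
print(hinge_discriminator_loss([2.0], [-2.0]))       # 0.0
```

WGAN-GP replaces the clipping step with a gradient penalty on the critic. That needs automatic differentiation and is left out of this sketch.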
Performance of loss-variant GANs
Summary of loss variants
• Losses of LSGAN, RGAN and WGAN are similar to the original
GAN loss
• LSGAN argues that the vanishing gradient is mainly caused by
the sigmoid function in the discriminator so it uses a least
squares loss to optimize the GAN
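The LSGAN argument can be checked numerically. The sketch below compares the generator's gradient with respect to the discriminator's raw output `a` for one generated sample under the sigmoid-based minimax loss and under a least squares loss. The function names and the target value of 1 are assumptions for illustration.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def grad_saturating_gan(a):
    """d/da of log(1 - sigmoid(a)), which simplifies to -sigmoid(a)."""
    return -sigmoid(a)

def grad_lsgan(a, target=1.0):
    """d/da of 0.5 * (a - target)**2, the least squares generator loss."""
    return a - target

# a << 0 means the discriminator confidently rejects the generated sample.
for a in (-10.0, -5.0, 0.0):
    print(a, grad_saturating_gan(a), grad_lsgan(a))
```

For strongly rejected samples the sigmoid-based gradient shrinks towards zero, while the least squares gradient grows with the distance from the target, so the generator keeps receiving a useful signal.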
Conclusion
• Reviewed GAN-variants based on performance improvement
• Stable training: improve loss functions
• Image quality: progressive training in PROGAN
• Spectral Normalization has good generalization
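Spectral normalization divides a weight matrix by its largest singular value so that the layer is roughly 1-Lipschitz. A minimal pure-Python sketch using power iteration follows. The helper names, the iteration count and the example matrix are assumptions for illustration. In practice the estimate is usually refreshed with very few iterations per training step and a persistent vector `u`.

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def transpose(W):
    return [list(col) for col in zip(*W)]

def l2_normalize(v, eps=1e-12):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]

def spectral_norm(W, n_iters=50):
    """Estimate the largest singular value of W by power iteration."""
    Wt = transpose(W)
    u = l2_normalize([1.0] * len(W))
    for _ in range(n_iters):
        v = l2_normalize(matvec(Wt, u))
        u = l2_normalize(matvec(W, v))
    return sum(a * b for a, b in zip(u, matvec(W, v)))

def spectrally_normalize(W, n_iters=50):
    """Rescale W so that its largest singular value is approximately 1."""
    sigma = spectral_norm(W, n_iters)
    return [[w / sigma for w in row] for row in W]

W = [[3.0, 0.0], [0.0, 1.0]]  # singular values are 3 and 1
print(spectral_norm(W))
print(spectrally_normalize(W))
```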
Thank you
