Read&Subscribe to my blogs@siddhithakkar.com. Details of slides below:
This document is aimed at explaining the following:
1) What use cases made GAN popular?
2) What is GAN?
3) How does it work?
2. WHY?
The POWER of GAN includes:
• Create photos and videos from text
• Create anime characters
• Pose guided person image generation
• Transform existing images
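[Figure: pose-guided person image generation, with panels "Real image", "Target pose" and "Result"]
[Figure: text-to-image generation, with panels "Text" and "Image"]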
Hello and welcome to this short explainer video about Generative Adversarial Networks (GANs).
I would like to begin by talking about some of the reasons why GANs are so popular.
The answer lies in the power of their real-world applications, which include:
1) Creating new photos and videos just from a text description. For instance, imagine uploading the script of a movie to a website and getting the resulting video back to download. Or imagine creating new pictures from their captions. This is possible because of GANs’ text-to-image capability: the text that I give as a user is converted into an image by the algorithm.
2) The second application is in the gaming industry, where expensive production artists are hired to create animated avatars. With the help of a GAN, it is possible to feed the personality traits of an animated character to an algorithm and get suggested avatars back as a result.
3) The next application is very relevant for the retail industry, where you have a photo shoot done for a model or a product. You take some of those pictures as input, combine them with a desired target pose, and run some GAN magic over them. The result is the same model or product in your desired pose.
4) It is also possible to change existing images so that either the foreground changes while everything else stays the same, or the other way around. An example would be converting a zebra into a horse, or changing a summer landscape into a snowy winter one.
You could also perform some sort of calculation on a group of photos, like you see here.
Now that we understand its power, let’s try to define what a GAN is. For the sake of simplicity, I have broken the term down into its three parts so that we can grasp the meaning of each one individually.
First, there is Generative. GANs are a specific class of machine learning methods based on the concept of generative models. Simply put, these models learn the patterns in some input samples and use them to produce the most likely outcome. For instance, they can predict what a user will search for next based on that user’s history or the histories of similar users.
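To make the search example concrete, here is a minimal sketch of a very simple generative model. It is a count-based predictor, not a GAN, and the function names and the sample histories are made up for illustration. It learns which search tends to follow which, then either predicts the most likely next search or generates one at random in proportion to what it has seen.

```python
import random
from collections import Counter, defaultdict


def fit_next_search_model(histories):
    """Count how often each search is directly followed by each other search."""
    model = defaultdict(Counter)
    for history in histories:
        for current, following in zip(history, history[1:]):
            model[current][following] += 1
    return model


def predict_next(model, current):
    """Return the most frequent follow-up search, or None if never seen."""
    if current not in model:
        return None
    return model[current].most_common(1)[0][0]


def sample_next(model, current, rng=random):
    """Generate a follow-up search at random, weighted by observed counts."""
    if current not in model:
        return None
    options = model[current]
    return rng.choices(list(options), weights=list(options.values()))[0]


# Hypothetical search histories of three users.
histories = [
    ["gan", "gan tutorial", "pytorch gan example"],
    ["gan", "gan tutorial", "dcgan"],
    ["gan", "what is a discriminator"],
]
model = fit_next_search_model(histories)
print(predict_next(model, "gan"))  # prints "gan tutorial"
```

A GAN follows the same idea of learning from samples and then producing new data, but it replaces the simple counts with neural networks.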
Adversarial comes from the fact that GANs consist of two networks that compete against each other. These two networks and their rivalry are extremely important to how GANs function.
The keyword Network is there because the competing networks we just spoke about are actually neural networks. Most simply, a neural network can be understood as a computer model or program that loosely imitates the human brain.
Let’s dig a bit deeper to see how it really works.
A GAN set-up typically consists of two models: one is called the Generator and the other the Discriminator.
The Generator is the one that generates new data, hence the name, whereas the Discriminator is the one that tries to tell real data from fake data.
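For those who prefer notation, the competition between the two models is usually written as a two-player game. The formula below is the standard GAN objective from the literature, not something shown on the slides. Here D(x) is the Discriminator's estimated probability that x is real, G(z) is the Generator's output for a random input z, and p_data and p_z are assumed names for the real-data and random-input distributions.

```latex
\min_G \max_D \; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```

The Discriminator tries to make this value large by scoring real data close to 1 and generated data close to 0, while the Generator tries to make it small by producing data that the Discriminator scores as real.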
The whole process runs as follows:
The Generator is fed some random variables at the very beginning, on the basis of which it generates certain images.
These generated images are then passed to the Discriminator along with some real-world images from a different source.
The Discriminator then tries to tell apart the images it received from the two different sources.
If it finds a discrepancy, feedback is sent back to the Generator so that it produces better samples next time.
As an example, imagine that the Discriminator got images of cats from both sides: from the Generator as well as from the real world. The Discriminator noticed that some cat images have three eyes while others have two. At that point, feedback is given to the Generator to stop creating images of cats with three eyes.
After receiving such feedback repeatedly, the Generator keeps improving and ultimately reaches a point where it generates images so realistic that the Discriminator can’t tell them apart from real ones anymore. Such images become the output of the GAN set-up.
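The steps above can be sketched as a tiny training loop. This is a toy illustration under strong assumptions, not a realistic image GAN: each "image" is a single number, the Generator only shifts random noise by one learnable value `theta`, and the Discriminator is a single logistic unit with weights `w` and `c`. All names, learning rates and step counts are assumptions picked for this example, and the learning rates are tuned for the default settings only.

```python
import math
import random


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def train_toy_gan(real_mean=0.0, start=3.0, steps=3000, batch=64,
                  lr_d=0.2, lr_g=0.05, seed=0):
    """Train a one-parameter Generator against a logistic Discriminator."""
    rng = random.Random(seed)
    theta = start      # Generator: fake sample = theta + random noise
    w, c = 0.0, 0.0    # Discriminator: D(x) = sigmoid(w * x + c)

    for _ in range(steps):
        # 1) The Generator turns random inputs into fake samples.
        fake = [theta + rng.gauss(0.0, 1.0) for _ in range(batch)]
        # 2) The Discriminator also receives real samples from another source.
        real = [rng.gauss(real_mean, 1.0) for _ in range(batch)]

        # 3) Discriminator step: push D(real) toward 1 and D(fake) toward 0
        #    by gradient descent on the binary cross-entropy loss.
        d_real = [sigmoid(w * x + c) for x in real]
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_w = (-sum((1 - d) * x for d, x in zip(d_real, real))
                  + sum(d * x for d, x in zip(d_fake, fake))) / batch
        grad_c = (-sum(1 - d for d in d_real) + sum(d_fake)) / batch
        w -= lr_d * grad_w
        c -= lr_d * grad_c

        # 4) Feedback to the Generator: shift theta so that the fakes score
        #    higher, by gradient descent on -log D(fake).
        d_fake = [sigmoid(w * x + c) for x in fake]
        grad_theta = -w * sum(1 - d for d in d_fake) / batch
        theta -= lr_g * grad_theta

    return theta, w, c


final_theta, final_w, final_c = train_toy_gan()
print(f"generator centre after training: {final_theta:.2f} (real data centre: 0.0)")
```

In this sketch the fake samples start out centred at 3 while the real ones are centred at 0. As training proceeds, `theta` should drift toward the real centre, and the Discriminator's weights should shrink toward zero, which corresponds to it no longer being able to tell the two sources apart.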
That’s all for today. I hope you found the topic and my explanation interesting!