Five years ago, Generative Adversarial Networks (GANs) arrived in the middle of the deep learning revolution and produced some major technological breakthroughs. Ian Goodfellow and his colleagues introduced them in a paper entitled "Generative Adversarial Networks" (https://arxiv.org/abs/1406.2661). Academia embraced GANs publicly, and industry welcomed them as well. The rise of GANs was inevitable.

First of all, the best thing about GANs is that their learning is unsupervised. GANs do not need labeled data, which makes them powerful because it removes the tedious work of data labeling.

Second, the potential use cases of GANs have made them the center of conversation. They can generate high-quality images, enhance photos, generate images from text, translate images from one domain to another, change the appearance of a face as it ages, and more. The list is endless. We will introduce some of the popular GAN architectures in this article.

Third, the endless research around GANs is so fascinating that it has attracted the attention of many other industries. We will discuss the major technological breakthroughs later in this article.


A Generative Adversarial Network, or GAN for short, is a setup of two networks: a generator network and a discriminator network. These two networks can be any kind of neural network, from convolutional neural networks and recurrent neural networks to autoencoders. In this setup, the two networks play a competitive game, each trying to outdo the other while, in doing so, pushing the other to get better at its own task. After thousands of iterations, if all goes well, the generator network becomes able to generate convincing fake images, and the discriminator network becomes very good at judging whether an image shown to it is fake or real. In other words, the generator network transforms a random noise vector from a latent space (not all GANs sample from a latent space) into a sample that resembles the real data set. Training a GAN is a very intuitive process.
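The adversarial game described above can be sketched in a few dozen lines. The example below is a minimal illustration on 1-D data, not a real image GAN: a linear "generator" and a logistic-regression "discriminator" alternate a discriminator step (label real vs. fake) with a generator step (try to fool the discriminator). All names and hyperparameters here are invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Real" data: samples from a 1-D Gaussian the generator must learn to imitate.
def sample_real(n):
    return rng.normal(loc=4.0, scale=0.5, size=(n, 1))

# Generator: a linear map from noise z to a sample. Discriminator: logistic
# regression estimating P(real). Real GANs use deep nets for both.
G_w, G_b = rng.normal(size=(1, 1)) * 0.1, np.zeros(1)
D_w, D_b = rng.normal(size=(1, 1)) * 0.1, np.zeros(1)

def generator(z):
    return z @ G_w + G_b

def discriminator(x):
    return 1.0 / (1.0 + np.exp(-(x @ D_w + D_b)))  # sigmoid

lr, batch = 0.05, 64
for step in range(2000):
    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    x = np.vstack([sample_real(batch), generator(rng.normal(size=(batch, 1)))])
    y = np.vstack([np.ones((batch, 1)), np.zeros((batch, 1))])
    grad_logit = (discriminator(x) - y) / len(x)   # d(BCE)/d(logit)
    D_w -= lr * x.T @ grad_logit
    D_b -= lr * grad_logit.sum(axis=0)

    # Generator step: push D(fake) toward 1, i.e. fool the discriminator.
    z = rng.normal(size=(batch, 1))
    x_fake = generator(z)
    grad_logit = (discriminator(x_fake) - 1.0) / batch
    grad_x = grad_logit @ D_w.T                    # backprop through D into x
    G_w -= lr * z.T @ grad_x
    G_b -= lr * grad_x.sum(axis=0)

# The generated distribution should drift toward the real one (mean near 4).
print(generator(rng.normal(size=(1000, 1))).mean())
```

With deep convolutional networks in place of these linear maps, and images in place of scalars, this same alternating loop is essentially how the architectures below are trained.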

GANs have a large number of practical use cases, such as image generation, artwork generation, music generation, and video generation. They can also improve image quality, stylize or colorize images, generate faces, and perform many other interesting tasks.

GAN network architecture
Image source: O'Reilly

The figure above shows the architecture of a vanilla GAN. First, a d-dimensional noise vector is sampled from the latent space and fed to the generator network. The generator converts the noise vector into an image. The generated image is then fed to the discriminator network for classification. The discriminator continually receives images from the real data set alongside images produced by the generator; its job is to distinguish between real and fake images. All GAN architectures follow this same basic design. That was the birth of GANs. Now let's explore their adolescence.


During its adolescence, GAN research produced widely popular architectures such as DCGAN, StyleGAN, BigGAN, StackGAN, pix2pix, Age-cGAN, and CycleGAN. The results of these architectures are very promising, and looking at them it is clear that GANs have come of age. Let's explore these architectures in detail.


DCGAN

DCGAN used convolutional neural networks inside a GAN for the first time and achieved impressive results. Before that, CNNs had achieved unprecedented results in supervised computer vision tasks, but within GANs convolutional networks were still undeveloped. DCGAN was introduced in a paper by Alec Radford, Luke Metz, and Soumith Chintala entitled "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks". This is an important milestone in GAN research because it introduced major architectural changes to address issues such as training instability, mode collapse, and internal covariate shift. Since then, many GAN architectures have been built on DCGAN.

GAN-generated bedroom images
Source: https://arxiv.org/pdf/1511.06434.pdf
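DCGAN's architectural recipe can be illustrated with a little shape arithmetic. A DCGAN-style generator projects the noise vector to a small feature map and then upsamples it to a full image with strided transposed convolutions. The sketch below (an illustration of the shape arithmetic, not code from the paper) checks that a stack of stride-2, kernel-4, padding-1 transposed convolutions doubles the spatial size at every layer, taking a 4×4 map to a 64×64 RGB image:

```python
def deconv_out(size, kernel=4, stride=2, padding=1):
    """Output size of a transposed convolution (no output_padding)."""
    return (size - 1) * stride - 2 * padding + kernel

# DCGAN-style generator: project noise to a 4x4x1024 map, then upsample.
size, channels = 4, 1024
for layer in range(4):                             # four upsampling layers
    size = deconv_out(size)                        # 4 -> 8 -> 16 -> 32 -> 64
    channels = 3 if layer == 3 else channels // 2  # halve channels, end at RGB
    print(f"layer {layer + 1}: {size}x{size}x{channels}")
```

Each layer doubles the resolution while halving the channel count, which is what lets the network trade spatial detail for feature depth as it "paints" the image.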


BigGAN

BigGAN is the latest development in GANs for image generation. A Google intern and two researchers at Google's DeepMind published a paper entitled "Large Scale GAN Training for High Fidelity Natural Image Synthesis", available at https://arxiv.org/abs/1809.11096. The paper is the work of Andrew Brock of Heriot-Watt University, written during an internship at DeepMind, together with Jeff Donahue and Karen Simonyan of DeepMind.

Source: https://arxiv.org/pdf/1809.11096.pdf

These images were generated by BigGAN, and as you can see, their quality is impressive. For the first time, a GAN produced images with high fidelity and a low variety gap. The previous best Inception Score (IS) was 52.52; BigGAN achieved an IS of 166.3, more than three times the prior state of the art (SOTA). In addition, they improved the Fréchet Inception Distance (FID) score from 18.65 to 9.6 (lower is better). These are very impressive results, and I hope to see more development in this area. Among the most important improvements is orthogonal regularization of the generator.

Images generated by BigGAN
Source: https://arxiv.org/pdf/1809.11096.pdf

Quite impressive, isn't it?


StyleGAN

StyleGAN is another major breakthrough in GAN research. StyleGAN was introduced by Nvidia researchers in a paper titled "A Style-Based Generator Architecture for Generative Adversarial Networks", available at https://arxiv.org/abs/1812.04948.

StyleGAN set new records in the face generation task
Source: https://medium.com/syncedreview/gan-2-0-nvidias-hyperrealistic-face-generator-e3439d33ebaf

StyleGAN set new records in the face generation task. The core of the algorithm is a style-transfer technique, also called style mixing. Beyond faces, it can also produce high-quality images of cars, bedrooms, and more. This is a major improvement in the field of GANs and a source of inspiration for deep learning researchers.


StackGAN

StackGAN was proposed by Han Zhang, Tao Xu, Hongsheng Li, and others in a paper entitled "StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks", available at https://arxiv.org/pdf/1612.03242.pdf. They used StackGAN to explore text-to-image synthesis, and the results were impressive. A StackGAN is a pair of networks that produces realistic images when given a textual description. My book "Generative Adversarial Networks Projects" has a chapter devoted to StackGANs.

StackGAN generates realistic bird images from textual descriptions
Source: https://arxiv.org/pdf/1612.03242.pdf

As you can see in the image above, StackGAN generates realistic bird images when given a textual description. Most importantly, the resulting images closely match the text provided. Text-to-image synthesis has many practical applications, such as generating an image from a textual description, or converting a text-based story into comic form by creating a visual representation of the description.


CycleGAN

CycleGAN has some very interesting use cases, such as converting a photo to a painting and vice versa, converting a summer photo to a winter one and vice versa, or converting a photo of a horse to a photo of a zebra and vice versa. CycleGAN was proposed by Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros in a paper entitled "Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks", available at https://arxiv.org/pdf/1703.10593. CycleGAN explores many different image-to-image translation use cases.

CycleGAN explores different image-to-image translation use cases
Source: https://arxiv.org/pdf/1703.10593.pdf


pix2pix

For image-to-image translation tasks, pix2pix also shows impressive results. Whether converting night images to day images, colorizing black-and-white images, or converting sketches to photos, pix2pix excels in all of these use cases. The pix2pix network was presented by Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros in their paper entitled "Image-to-Image Translation with Conditional Adversarial Networks", available at https://arxiv.org/abs/1611.07004.

The pix2pix interactive demo generates realistic images from sketches
Image source: https://phillipi.github.io/pix2pix/


Age-cGAN (Age Conditional Generative Adversarial Network)

There are many industry use cases for face aging, including cross-age face recognition, finding missing children, and entertainment. Grigory Antipov, Moez Baccouche, and Jean-Luc Dugelay proposed using conditional GANs for face aging in their paper entitled "Face Aging With Conditional Generative Adversarial Networks", available at https://arxiv.org/pdf/1702.01983.pdf.

How Age-cGAN converts from source age to target age

This figure shows how Age-cGAN transitions a face from a source age to a target age.

These are some of the most popular GAN architectures, but there are thousands more. Which architecture is right for you depends on your needs.


As the famous theoretical physicist Richard Feynman said: "What I cannot create, I do not understand."

The idea behind a GAN is to train a network that understands the data. GANs are now beginning to understand data, and with this understanding they are beginning to create realistic images. Let us witness the rise of GANs.

Edmond de Belamy

Edmond de Belamy, a portrait created by a generative adversarial network, was sold at a Christie's auction for $432,500. This is an important step in the progress of GANs: for the first time, the wider world witnessed GANs and their potential. Before this, GANs were mostly confined to research laboratories and machine learning engineers. This sale became GANs' entry point to the public.

Edmond de Belamy, created by a generative adversarial network, sold at Christie's for $432,500

You may be familiar with the website https://thispersondoesnotexist.com. Last month, it was all over the internet. The site was created by Uber software engineer Philip Wang, based on NVIDIA's released StyleGAN code. Every time you refresh the page, it generates a new fake face that is practically impossible to recognize as fake. It is scary and disruptive at the same time. This technology has the potential to create endless virtual worlds.

GAN generated face
Source: https://thispersondoesnotexist.com/


DeepFakes

DeepFakes is another scary but disruptive technology. Based on GANs, it can paste one person's face onto a target person in a video. DeepFakes have also spread across the internet, and people speculate about the downsides of this technology. But for AI researchers, it is a major breakthrough. The technique has the potential to save millions of dollars in the film industry, where hours of editing are otherwise needed to replace a stunt double's face with the actor's.

This technology can be scary, but we have a responsibility to use it for the good of society.

Donald Trump, generated by DeepFakes
Image source: https://thenextweb.com/artificial-intelligence/2018/02/21/deepfakes-algorithm-nails-donald-trump-in-most-convincing-fake-yet/


StyleGAN is currently the sixth most popular Python project on GitHub. Thousands of named GANs have been proposed so far. The hindupuravinash/the-gan-zoo repository maintains a list of named GANs and their respective papers: https://github.com/hindupuravinash/the-gan-zoo

In the real world

GANs have been used to enhance game graphics, and I am very excited about this use case. Recently, NVIDIA released a video showing how a GAN can recreate a game environment from real video footage.

In conclusion

In this article, we have seen how GANs became famous and turned into a global phenomenon. I hope we will see the democratization of GANs in the next few years. We started with the birth of GANs, then explored some of the popular GAN architectures, and finally witnessed the rise of GANs. It troubles me to see negative news around GANs. I believe we have a responsibility to make everyone aware of the impact of GANs and to use them as ethically as possible. Let us come together and spread enthusiasm for GANs. GANs have great potential to create new industries and jobs, and we must make sure this technology does not fall into the wrong hands.

This article originally appeared on usejournal.