Generative adversarial networks (GANs) have been among the most popular unsupervised learning algorithms of the last two years. They can generate very realistic photos, images, and even videos, and the photo-processing software on our mobile phones makes use of them.
This article introduces in detail the original motivation, basic principles, 10 typical algorithms, and 13 practical applications of generative adversarial networks (GANs).
The original motivation for GANs
In a word, the motivation for designing GANs is automation.
From manual feature extraction to automatic feature extraction
As we mentioned in "A text to understand deep learning (concept + advantages and disadvantages + typical algorithm)", the most distinctive and powerful aspect of deep learning is its ability to learn feature extraction on its own.
A machine's raw computing power can solve many problems that cannot be solved by hand. Once feature extraction is automated, both learning ability and adaptability improve.
From manual judgment of generated results to automatic judgment and optimization
As we mentioned in "Supervised learning", the training set requires a large amount of manually labeled data, which is costly and inefficient. Judging the quality of generated results by hand has the same problems of high cost and low efficiency.
A GAN can complete this process automatically and keep optimizing it, which is a very efficient and low-cost approach. How does a GAN achieve this automation? The principle is explained next.
Basic principles of generative adversarial networks (GANs)
Plain-language version
There is a very good explanation that everyone should be able to follow:
Suppose a city has poor public order, and soon countless thieves appear in it. Some of these thieves may be masters, and some may have no skill at all. If the city starts to restore law and order and suddenly launches a "campaign" against crime, the police resume their patrols. Before long, a batch of unskilled thieves is caught. Only the unskilled ones are caught because the police's own skills are not yet good. After this batch of low-level thieves is caught, it is hard to say how much the city's public order has changed, but the average skill level of the thieves still at large has clearly risen a great deal.
The police then begin to hone their crime-solving skills and go after the increasingly brazen thieves. As they arrest these professional repeat offenders, the police develop specialized skills of their own: they can quickly pick suspicious people out of a crowd and step forward to question them. The thieves now have a hard time, because the police have improved so much that anyone who still behaves furtively is quickly caught.
To avoid arrest, the thieves try to behave less suspiciously, but for every step the thieves advance, the police advance further. The police keep improving as well, trying to tell thieves apart from innocent ordinary people. Through this kind of "exchange" and "sparring" between police and thieves, the thieves become extremely cautious: they have superb stealing skills and behave exactly like ordinary people. The police, for their part, develop sharp eyes: once they spot a suspicious person, they can identify and detain them right away. In the end, we get the strongest thieves and the strongest police at the same time.
Technical version
A generative adversarial network (GAN) consists of two important parts:
- Generator: generates data (in most cases, images) by machine, with the goal of "fooling" the discriminator
- Discriminator: judges whether an image is real or machine-generated, with the goal of picking out the "fake data" produced by the generator
The following describes the process in detail:
Stage 1: Fix "Discriminator D" and train "Generator G"
We start with a passable discriminator and let "Generator G" keep producing "fake data", which is then handed to "Discriminator D" to judge.
In the beginning, "Generator G" is still weak, so its output is easily identified as fake.
With continued training, however, the skill of "Generator G" keeps improving, and eventually it fools "Discriminator D".
At this point, "Discriminator D" is essentially guessing: the probability that it correctly judges whether a sample is fake is 50%.
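The 50% figure has a standard formal counterpart. The article itself does not state a loss function; the formulas below follow the original GAN paper (Goodfellow et al., 2014) and are included only as a reference sketch. The first line is the minimax objective. The second is the optimal discriminator for a fixed generator, where p_data is the real data distribution and p_g is the distribution of generated samples. When the two distributions coincide, the optimal discriminator outputs 1/2 everywhere, which corresponds to the "guessing" state described above.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)},
\qquad
p_g = p_{\text{data}} \;\Rightarrow\; D^*(x) = \tfrac{1}{2}
```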
Stage 2: Fix "Generator G" and train "Discriminator D"
Once the first stage is complete, there is no point in continuing to train "Generator G". We now fix "Generator G" and start training "Discriminator D".
Through continued training, "Discriminator D" improves its discriminating ability until it can accurately identify all the fake images.
At this point, "Generator G" can no longer fool "Discriminator D".
Repeat Stage 1 and Stage 2
By repeating this cycle, both "Generator G" and "Discriminator D" become stronger and stronger.
In the end we obtain a very good "Generator G", which we can use to generate the images we want.
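The alternation of the two stages can be sketched in plain Python on a one-dimensional toy problem. Everything specific in this sketch is an assumption made for illustration and does not come from the article: the real data is drawn from a Gaussian centered at 4, the generator is the linear map G(z) = a*z + b, the discriminator is a logistic model D(x) = sigmoid(w*x + c), the discriminator is trained with binary cross-entropy, and the generator uses the common non-saturating loss -log D(G(z)). The function names and hyperparameters are hypothetical.

```python
import math
import random


def sigmoid(s):
    # Numerically stable logistic function.
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)


def generate(g, z):
    # Toy generator G(z) = a*z + b: maps noise z to a fake 1-D sample.
    return g["a"] * z + g["b"]


def discriminate(d, x):
    # Toy discriminator D(x) = sigmoid(w*x + c): probability that x is real.
    return sigmoid(d["w"] * x + d["c"])


def train_generator(g, d, rng, steps=50, batch=64, lr=0.05):
    # Stage 1: D is fixed; only G's parameters are updated.
    # Assumed loss: -mean(log D(G(z))), the non-saturating generator loss.
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for _ in range(batch):
            z = rng.gauss(0.0, 1.0)
            p = discriminate(d, generate(g, z))
            dx = -(1.0 - p) * d["w"]  # d(loss)/d(fake sample)
            grad_a += dx * z / batch
            grad_b += dx / batch
        g["a"] -= lr * grad_a
        g["b"] -= lr * grad_b


def train_discriminator(g, d, rng, real_mean=4.0, steps=50, batch=64, lr=0.05):
    # Stage 2: G is fixed; only D's parameters are updated.
    # Assumed loss: binary cross-entropy with real -> 1 and fake -> 0.
    for _ in range(steps):
        grad_w = grad_c = 0.0
        for _ in range(batch):
            x_real = rng.gauss(real_mean, 0.5)
            x_fake = generate(g, rng.gauss(0.0, 1.0))
            e_real = discriminate(d, x_real) - 1.0  # d(loss)/d(logit), real sample
            e_fake = discriminate(d, x_fake)        # d(loss)/d(logit), fake sample
            grad_w += (e_real * x_real + e_fake * x_fake) / batch
            grad_c += (e_real + e_fake) / batch
        d["w"] -= lr * grad_w
        d["c"] -= lr * grad_c


def train_gan(rounds=20, seed=0):
    rng = random.Random(seed)
    g = {"a": 1.0, "b": 0.0}   # generator starts far from the real data
    d = {"w": 1.0, "c": -2.0}  # a passable starting discriminator
    for _ in range(rounds):
        train_generator(g, d, rng)      # stage 1: fix D, train G
        train_discriminator(g, d, rng)  # stage 2: fix G, train D
    return g, d


if __name__ == "__main__":
    g, d = train_gan()
    print("generator:", g)
    print("discriminator:", d)
```

With a model this small, the two players may keep chasing each other instead of settling at an exact equilibrium, so the sketch is best read as an illustration of the alternating schedule and not as a recipe for training real image GANs.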
The practical applications section below shows many "stunning" examples.
If you are interested in the detailed technical principles of GAN, you can check out the following 2 articles:
Advantages and disadvantages of GANs
Advantages:
- Better modeling of the data distribution (sharper, clearer images)
- In theory, a GAN can train any kind of generator network. Other frameworks require the generator network to have a specific functional form, such as a Gaussian output layer.
- No Markov chain is needed for repeated sampling, no inference is needed during learning, and there is no complicated variational lower bound, which avoids the difficulty of approximating intractable probabilities.
Disadvantages:
- Hard to train and unstable. The generator and discriminator need to stay well synchronized, but in actual training it is easy for D to converge while G diverges, so training D and G requires careful design.
- Mode collapse. During training the generator may start to degenerate and keep producing the same sample points, so that learning cannot continue.
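One rough way to notice mode collapse on toy data is to check how many known modes of the real distribution are hit by the generated samples. The sketch below assumes one-dimensional data with known mode centers; the function name `mode_coverage`, the `radius` threshold, and the sample values are all made up for illustration. Real image GANs need different diagnostics.

```python
def mode_coverage(samples, modes, radius=0.5):
    # Fraction of known modes that have at least one sample within `radius`.
    covered = [any(abs(s - m) <= radius for s in samples) for m in modes]
    return sum(covered) / len(modes)


modes = [-4.0, 0.0, 4.0]                  # hypothetical centers of the real data
healthy = [-4.1, 0.2, 3.9, -3.8, 0.1]     # samples spread over all three modes
collapsed = [0.1, 0.0, -0.1, 0.05, 0.02]  # samples stuck on a single mode

print(mode_coverage(healthy, modes))    # all three modes covered
print(mode_coverage(collapsed, modes))  # only one of three modes covered
```

A coverage value that stays well below 1 while training continues is a hint that the generator keeps producing the same few outputs.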
Extended reading: "Why is it so difficult to train generative adversarial networks?" (reading this article requires a strong mathematical background).
10 typical GAN algorithms
There are hundreds of GAN algorithms, and research on GANs has grown exponentially. At present, hundreds of papers on adversarial networks are published every month.
The following figure shows the number of papers published on GANs per month:
If you are interested in GAN algorithms, you can find almost all of them in the "GANs Zoo". We have selected 10 of the more representative algorithms; technical readers can look at their papers and code.
|Algorithm|Paper|Code|
|GAN|Paper address|Code address|
|DCGAN|Paper address|Code address|
|CGAN|Paper address|Code address|
|CycleGAN|Paper address|Code address|
|CoGAN|Paper address|Code address|
|ProGAN|Paper address|Code address|
|WGAN|Paper address|Code address|
|SAGAN|Paper address|Code address|
|BigGAN|Paper address|Code address|
The above content is adapted from "Generative Adversarial Networks – The Story So Far". The original text gives a brief explanation of each algorithm; take a look if you are interested.
13 practical applications of GAN
GANs are not as intuitive as "speech recognition" or "text mining", but their applications have already entered our lives. Here are some practical applications of GANs.
Generating image datasets
Training AI models requires large datasets. Collecting and labeling all of that data by hand is very costly. GANs can generate some datasets automatically and so provide low-cost training data.
Generating face photos
Generating face photos is an application that everyone is familiar with, but what the generated photos are used for needs careful consideration, because such face photos still sit in a legal gray area.
Generate photos, cartoon characters
GAN can not only generate human faces, but also other types of photos, even comic characters.
Image to image conversion
Simply put, this converts one form of image into another, a bit like applying a magical filter. For example:
- Convert sketches into photos
- Convert satellite photos into Google Maps images
- Convert photos into oil paintings
- Convert daylight into night
Text to image conversion
The 2016 paper "StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks" demonstrates the use of GANs, specifically StackGAN, to generate realistic photos from textual descriptions of simple objects such as birds and flowers.
The 2017 paper "High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs" demonstrates the use of a conditional GAN to generate realistic images from a semantic image or sketch given as input.
Automatically generate models
The 2017 paper "Pose Guided Person Image Generation" shows that images of human models can be generated automatically in new poses.
Photo to Emojis
GANs can automatically generate corresponding emoticons (Emojis) from face photos.
GANs can be used to edit photos in specific ways, such as changing hair color, changing facial expressions, and even changing gender.
Predicting appearance at different ages
Given a face photo, a GAN can predict what the person will look like at different ages.
Improve photo resolution and make photos clearer
Given a photo, a GAN can generate a higher-resolution version of it, making the photo clearer.
If an area of a photo is damaged (for example, painted over or erased), a GAN can repair the area and restore its original appearance.
Automatically generate 3D models
Given multiple 2D images taken from different angles, a GAN can generate a 3D model.