Generative adversarial networks (GANs) have been among the most popular unsupervised learning algorithms of the last two years. They can generate very realistic photos, images, and even videos, and they are used in the photo-processing software on our mobile phones.
This article introduces in detail the design motivation, basic principles, 10 typical algorithms, and 13 practical applications of generative adversarial networks (GANs).
The design motivation behind GANs
In a word, the motivation for designing GANs is automation.
From manual feature extraction to automatic feature extraction
As mentioned in our article "A text to understand deep learning (concept + advantages and disadvantages + typical algorithm)", the most distinctive and powerful part of deep learning is its ability to learn feature extraction on its own.
The computing power of machines can solve many problems that cannot be solved by hand. Once feature extraction is automated, the model learns more and adapts better.
From manually judging the quality of generated results to automatic judgment and optimization
As mentioned in our article "Supervised learning", the training set requires a large amount of manually labeled data, which is costly and inefficient. Judging the quality of generated results by hand has the same problems of high cost and low efficiency.
A GAN can carry out this judgment automatically and keep optimizing against it, which is very efficient and low-cost. How does a GAN achieve this automation? The next section explains its principle.
Basic principles of generative adversarial networks (GANs)
Plain-language version
There is a very good plain-language explanation that everyone should be able to follow:
Suppose law and order in a city has broken down, and before long countless thieves appear. Some of them may be master thieves, while others have no skill at all. If the city then starts to restore order and launches a sudden "campaign" against crime, the police resume their patrols, and a batch of unskilled thieves is soon caught. Only the unskilled thieves are caught because the police's own skills are still poor. After this batch of low-level thieves has been caught, it is hard to say how much safer the city has become, but the average skill level of the city's thieves has clearly risen a great deal.
The police then begin to train their case-solving skills and to catch the increasingly rampant thieves. As they arrest these professional repeat offenders, the police pick up specialized skills of their own: they can quickly pick suspicious people out of a crowd and step forward to question them. The thieves now have a hard time, because the police have improved so much that anyone who still behaves as furtively as before is quickly caught.
To avoid arrest, the thieves work hard to look less suspicious, while the police, in a constant game of one-upmanship, keep raising their own level and try to tell thieves apart from innocent ordinary people. Through this back-and-forth "exchange" and "practice" between police and thieves, the thieves become extremely careful: their stealing skills are very high and they behave exactly like ordinary people. The police, for their part, develop a "sharp eye": once a suspicious person appears, they spot him immediately and bring him under control in time. In the end, we get the strongest thieves and the strongest police at the same time.
Technical version
A generative adversarial network (GAN) consists of two important parts:
- Generator: generates data by machine (in most cases, images), with the goal of fooling the discriminator.
- Discriminator: judges whether an image is real or machine-generated, with the goal of spotting the fake data produced by the generator.
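The two roles can be sketched on a toy one-dimensional problem. This is only an illustration under stated assumptions: the linear generator, the logistic discriminator, the parameter names `a`, `b`, `w`, `c`, and the sample values are all hypothetical and do not come from this article. The generator loss shown is the commonly used "non-saturating" form, which rewards the generator when the discriminator calls its sample real.

```python
import math
import random


def generator(z, a, b):
    """Hypothetical 1-D generator: maps a noise value z to a fake sample."""
    return a * z + b


def discriminator(x, w, c):
    """Hypothetical 1-D discriminator: estimated probability that x is real."""
    return 1.0 / (1.0 + math.exp(-(w * x + c)))


def discriminator_loss(d_real, d_fake):
    """Low when D scores real data near 1 and fake data near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Low when D is fooled, i.e. when it scores the fake sample near 1."""
    return -math.log(d_fake)


rng = random.Random(0)                      # fixed seed for repeatability
fake = generator(rng.gauss(0.0, 1.0), a=1.0, b=0.0)
real = 4.0                                  # one assumed "real" sample

d_real = discriminator(real, w=1.0, c=-2.0)
d_fake = discriminator(fake, w=1.0, c=-2.0)
print("D(real) =", d_real, " D(fake) =", d_fake)
print("D loss =", discriminator_loss(d_real, d_fake))
print("G loss =", generator_loss(d_fake))
```

The two losses pull in opposite directions on `d_fake`, which is the "adversarial" part: whatever lowers the generator loss raises the discriminator loss, and vice versa.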
The following describes the process in detail:
The first stage: fix discriminator D, train generator G
We start with a reasonably good discriminator D, let generator G keep producing fake data, and hand that data to discriminator D to judge.
At the beginning, generator G is still weak, so its fakes are easily spotted.
With continued training, however, generator G keeps improving and eventually fools discriminator D.
At this point discriminator D is essentially guessing blindly: the probability it assigns to a sample being fake is 50%.
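The 50% figure matches a standard result from the original GAN formulation, which this article does not spell out. As a sketch, with $p_{\text{data}}$ the real data distribution, $p_z$ the noise distribution, and $p_g$ the distribution of generated samples (notation assumed here, not taken from the article), the two networks play the following minimax game:

```latex
\min_G \max_D V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]

% For a fixed generator G, the discriminator that maximizes V is
D^{*}(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}

% so when the generator matches the data distribution, p_g = p_{\text{data}},
D^{*}(x) = \frac{1}{2} \quad \text{for every } x
```

In other words, once generated samples are distributed like real ones, even the best possible discriminator can do no better than a coin flip.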
The second stage: fix generator G, train discriminator D
Once the first stage is done, there is no point in continuing to train generator G. We now fix generator G and start training discriminator D.
Through continued training, discriminator D improves its ability to discriminate until it can accurately identify all the fake images.
At this point, generator G can no longer fool discriminator D.
Repeat the first and second stages
Through repeated cycles, both generator G and discriminator D grow stronger and stronger.
In the end we obtain a very good generator G, which we can use to generate the images we want.
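The alternating schedule described above can be sketched on a toy one-dimensional problem. Everything specific here is an assumption made for illustration, not the article's method: real data drawn from N(4, 1), a linear generator G(z) = a*z + b, a logistic discriminator D(x) = sigmoid(w*x + c), hand-derived gradients, and the hypothetical hyperparameters `rounds`, `steps`, and `lr`. Real GANs use deep networks and an autodiff framework, and this sketch makes no claim about how well it converges.

```python
import math
import random


def sigmoid(u):
    """Numerically stable logistic function."""
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def train(rounds=100, steps=5, lr=0.05, seed=0):
    """Alternate the two stages: train G with D fixed, then D with G fixed."""
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters (assumed linear form)
    w, c = 0.1, 0.0   # discriminator parameters (assumed logistic form)

    for _ in range(rounds):
        # First stage: D is fixed, only G's parameters (a, b) are updated.
        for _ in range(steps):
            z = rng.gauss(0.0, 1.0)
            fake = a * z + b
            d_fake = sigmoid(w * fake + c)
            # Gradient of -log D(G(z)) with respect to the fake sample.
            grad_fake = -(1.0 - d_fake) * w
            a -= lr * grad_fake * z
            b -= lr * grad_fake

        # Second stage: G is fixed, only D's parameters (w, c) are updated.
        for _ in range(steps):
            real = rng.gauss(4.0, 1.0)
            fake = a * rng.gauss(0.0, 1.0) + b
            d_real = sigmoid(w * real + c)
            d_fake = sigmoid(w * fake + c)
            # Gradient of -[log D(real) + log(1 - D(fake))].
            grad_w = -(1.0 - d_real) * real + d_fake * fake
            grad_c = -(1.0 - d_real) + d_fake
            w -= lr * grad_w
            c -= lr * grad_c

    return a, b, w, c


a, b, w, c = train()
print(f"generator: a={a:.3f}, b={b:.3f}; discriminator: w={w:.3f}, c={c:.3f}")
```

Note that in each stage only one network's parameters change, which is what "fixing" the other network means in practice. Practical implementations usually alternate after a few gradient steps, as here, rather than training either network all the way to convergence.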
The practical applications section below shows many "stunning" examples.
If you are interested in the detailed technical principles of GANs, see the following two articles:
"Generative Adversarial Network (GAN) Beginner's Guide – with code"
"A long read on the detailed principles of generative adversarial networks (GANs) (20-minute read)"
Advantages and disadvantages of GANs
3 advantages
- Better modeling of the data distribution (sharper, clearer images).
- In theory, a GAN can train any kind of generator network. Other frameworks require the generator network to take a specific functional form, such as a Gaussian output layer.
- There is no need for repeated sampling with Markov chains, no inference during learning, and no complicated variational lower bound, which sidesteps the difficulty of approximating intractable probabilities.
2 defects
- Hard to train and unstable. The generator and the discriminator need to stay well synchronized, but in practice D often converges while G diverges. Training D and G requires careful design.
- Mode collapse. During learning a GAN may drop modes: the generator degenerates, keeps producing the same sample points, and learning cannot continue.
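Mode collapse is easiest to see on a toy dataset whose modes are known in advance. The sketch below is a hypothetical diagnostic, not something from this article: the function name `mode_coverage`, the `tolerance` parameter, the three mode centers, and both sample lists are made up for illustration. It reports the fraction of known modes that receive at least one generated sample.

```python
def mode_coverage(samples, mode_centers, tolerance=0.5):
    """Fraction of known modes that have a generated sample within tolerance."""
    covered = set()
    for s in samples:
        for i, center in enumerate(mode_centers):
            if abs(s - center) <= tolerance:
                covered.add(i)
    return len(covered) / len(mode_centers)


# Assumed toy setup: the real data has three modes on the number line.
modes = [-4.0, 0.0, 4.0]

# Made-up generator outputs for illustration.
healthy = [-4.1, 0.2, 3.9, -3.8, 0.1]      # samples spread over all modes
collapsed = [3.9, 4.0, 4.1, 4.0, 3.95]     # every sample sits on one mode

print("healthy coverage:  ", mode_coverage(healthy, modes))
print("collapsed coverage:", mode_coverage(collapsed, modes))
```

A collapsed generator can still fool the discriminator on the one mode it produces, which is why sample quality alone does not reveal the problem. On real image data the modes are not known, so a check like this only applies to synthetic benchmarks.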
Further reading: "Why is it so difficult to train generative adversarial networks?" (this article is mathematically demanding).
10 typical GAN algorithms
There are hundreds of GAN algorithms, and research on GANs has grown exponentially. At present, hundreds of papers on adversarial networks are published each month.
The following figure shows the number of papers published on GANs per month:
If you are interested in GAN algorithms, you can find almost all of them in "GANs Zoo". From the many available, we have selected the more representative algorithms below; technical readers can look at their papers and code.
Algorithm | Paper | Code |
---|---|---|
GAN | Paper address | Code address |
DCGAN | Paper address | Code address |
CGAN | Paper address | Code address |
CycleGAN | Paper address | Code address |
CoGAN | Paper address | Code address |
ProGAN | Paper address | Code address |
WGAN | Paper address | Code address |
SAGAN | Paper address | Code address |
BigGAN | Paper address | Code address |
The content above is adapted from "Generative Adversarial Networks – The Story So Far". The original article gives a brief explanation of each algorithm; take a look if you are interested.
13 practical applications of GAN
GANs may not seem as intuitive as "speech recognition" or "text mining", but their applications have already entered our lives. Here are some practical applications of GANs.
Generate image dataset
Training artificial intelligence requires large amounts of data. If all of it is collected and labeled by hand, the cost is very high. A GAN can automatically generate datasets, providing low-cost training data.
Generating face photos
Generating face photos is an application everyone is familiar with, but what the generated photos are used for needs careful thought, because this kind of face photo is still in a legal gray area.
Generate photos, cartoon characters
GANs can generate not only human faces but also other kinds of photos, and even cartoon characters.
Image to image conversion
Simply put, this converts an image in one form into an image in another form, much like applying a magical filter. For example:
- Convert sketches into photos
- Convert satellite photos into Google Maps images
- Convert photos into oil paintings
- Convert daytime scenes into night scenes
Text to image conversion
The 2016 paper "StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks" demonstrates the use of GANs, specifically StackGAN, to generate realistic photos from textual descriptions of simple objects such as birds and flowers.
Semantic image to photo conversion
The 2017 paper "High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs" demonstrates using a conditional GAN to generate realistic images from a semantic image or sketch given as input.
Automatically generate human models
The 2017 paper "Pose Guided Person Image Generation" shows that images of human models can be generated automatically in new poses.
Photo to Emojis
GANs can automatically generate corresponding emoticons (Emojis) from face photos.
Photo editing
A GAN can generate photos with specific edits, such as a changed hair color, a changed facial expression, or even a changed gender.
Predict appearance at different ages
Given a face photo, a GAN can predict what the person will look like at different ages.
Improve photo resolution and make photos clearer
Give a GAN a photo and it can generate a higher-resolution version, making the photo clearer.
Photo repair
If part of a photo is damaged (for example, painted over or erased), a GAN can repair that area and restore it to its original state.
Automatically generate 3D models
Given multiple 2D images taken from different angles, a GAN can generate a 3D model.
Baidu Encyclopedia + Wikipedia
Generative adversarial networks (GANs) are a deep learning model and one of the most promising methods of recent years for unsupervised learning on complex distributions. The model produces fairly good output through the mutual game-playing of (at least) two modules in the framework: the generative model and the discriminative model. In the original GAN theory, G and D are not required to be neural networks; they only need to be functions that can fit the corresponding generation and discrimination. In practice, however, deep neural networks are generally used as G and D. A good GAN application needs a good training method; otherwise the output may be unsatisfactory because of the freedom of the neural network model.
A generative adversarial network (GAN) is a class of artificial intelligence algorithms used in unsupervised machine learning, implemented by a system of two neural networks contesting with each other in a zero-sum game framework. They were introduced by Ian Goodfellow et al. in 2014. This technique can generate photographs that look at least superficially authentic to human observers, having many realistic characteristics (though in tests people can tell real from generated in many cases).