Understanding deep learning in one article

Deep learning performs well and has led the third wave of artificial intelligence. Most of today's best-performing applications use deep learning, including the well-known AlphaGo.

This article introduces the basic concepts of deep learning, its advantages and disadvantages, and several mainstream algorithms.


Deep learning, neural network, machine learning, artificial intelligence

Deep learning, machine learning, artificial intelligence

Simply put:

  1. Deep learning is a branch of machine learning (the most important branch)
  2. Machine learning is a branch of artificial intelligence

The relationship between deep learning, machine learning, and artificial intelligence

Most of the best-performing applications today use deep learning. It is precisely this outstanding performance that triggered the third wave of artificial intelligence. For details, see "The history of artificial intelligence - 3 AI waves".


Learn more about artificial intelligence: "What is artificial intelligence? 2019 update (The essence of AI + development history + limitations)"

Learn more about machine learning: "Machine learning explained for everyone (75-page PDF, free download)"


Deep learning, neural network

The concept of deep learning stems from the study of artificial neural networks, but it is not identical to traditional neural networks.

In terms of naming, however, many deep learning algorithms include the words "neural network", for example the convolutional neural network and the recurrent neural network.

Deep learning can therefore be described as an upgrade built on traditional neural networks, and is roughly equivalent to neural networks.

The relationship between deep learning and neural networks


Explaining deep learning in plain language

I have read many explanations and found the one in Kai-Fu Lee's (Li Kaifu's) book "Artificial Intelligence" the easiest to understand, so I quote it directly below:

Let us take the example of identifying Chinese characters in pictures.

Assume that the information to be processed by deep learning is a "water flow", and that the deep learning network processing the data is a huge water network made of pipes and valves. The entrance to the network is a set of pipe openings, and the exit is also a set of pipe openings. The network has many layers, and each layer has many regulating valves that control the direction and volume of the flow. Depending on the task, the number of layers and the number of valves per layer can be combined in different ways. For complex tasks, the total number of valves can reach tens of thousands or more. In the water network, every valve in a layer is connected by a pipe to every valve in the next layer, forming a flow system that is fully connected from front to back, layer by layer.

Deep learning is similar to a water flow system

So how does a computer use this huge water network to learn to read characters?

For example, when a computer sees a picture of the character "田", it simply takes all the numbers that make up the picture (in a computer, every pixel of a picture is represented by numbers made of "0" and "1"), treats them as a flow of information, and pours them into the water network through the entrance.

Deep learning - digitize pictures
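
To make the "picture becomes numbers" step concrete, here is a minimal sketch. The 5x5 bitmap is a made-up, heavily simplified stand-in for a real image of "田", and the names `TIAN` and `to_input_vector` are illustrative, not from the book.

```python
# A tiny 5x5 black-and-white sketch of the character "田" (hypothetical data).
# "1" marks an ink pixel, "0" marks background.
TIAN = [
    "11111",
    "10101",
    "11111",
    "10101",
    "11111",
]

def to_input_vector(rows):
    """Flatten the rows of a bitmap into one flat list of ints, row by row."""
    return [int(ch) for row in rows for ch in row]

pixels = to_input_vector(TIAN)
print(len(pixels))  # one number per pixel
print(pixels)
```

Each number in `pixels` corresponds to one pipe opening at the entrance of the water network. A real image would usually have many more pixels and grayscale or color values rather than plain 0 and 1.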

We put a label at each exit of the water network in advance, one for every Chinese character we want the computer to recognize. Because the input is the character "田", once the water has flowed through the whole network, the computer checks whether the exit labeled "田" has the largest outflow. If so, the network meets the requirement. If not, it adjusts every regulating valve in the network until the exit labeled "田" has the largest outflow.

Now the computer is going to be busy for a while: there are so many valves to adjust! Fortunately, computers are fast, and brute-force computation combined with algorithmic optimization can usually find a solution quickly, adjusting all the valves so that the flow at the exits meets the requirement.

Deep learning - recognizing the character "田"

Next, to learn the character "申", we use a similar method: each picture of "申" is turned into a stream of numbers and poured into the water network, and we check whether the exit labeled "申" has the largest outflow. If not, we adjust all the valves again. This time we must make sure that the character "田" we have just learned is not affected, while the new character "申" is also handled correctly.

Deep learning - learning the character "申"

This is repeated until the water flow for every Chinese character passes through the whole network in the desired way. At that point we say the water network is a trained deep learning model. Once a large number of Chinese characters have been processed by the network and all the valves have been adjusted into place, the whole network can be used to recognize Chinese characters. We can then "weld" all the adjusted valves in place and wait for new water to arrive.

Deep learning - learning all Chinese characters

As in training, an unknown picture is converted by the computer into a stream of data and poured into the trained water network. The computer then only needs to see which exit has the largest outflow: the character on that exit is the one written in the picture.

Deep learning is roughly this: a semi-theoretical, semi-empirical way of modeling, in which an overall architecture built from human mathematical knowledge and computer algorithms is combined with as much training data as possible and large-scale computing power to adjust the internal parameters, so as to approximate the target of the problem as closely as possible.
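
The water-network story can be sketched in a few lines of code. This is a deliberately reduced illustration, not the model described in the book: it assumes a single fully connected layer instead of many, two made-up 5x5 bitmaps as the whole training set, and plain gradient descent on a softmax output as the "valve adjustment" rule. All names (`outflow`, `train`, `predict`) are hypothetical.

```python
import math

# Hypothetical 5x5 bitmaps standing in for real pictures.
TIAN = ["11111", "10101", "11111", "10101", "11111"]
SHEN = ["00100", "11111", "10101", "11111", "00100"]
LABELS = ["田", "申"]  # one exit per character

def flatten(rows):
    return [float(ch) for row in rows for ch in row]

def outflow(weights, x):
    """Forward pass: one score per exit, converted to shares of the total flow."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def train(samples, n_out, lr=0.5, epochs=200):
    """Adjust every 'valve' (weight) a little after each picture."""
    n_in = len(samples[0][0])
    weights = [[0.0] * n_in for _ in range(n_out)]
    for _ in range(epochs):
        for x, target in samples:
            probs = outflow(weights, x)
            for k in range(n_out):
                # Cross-entropy gradient w.r.t. score k: share minus desired share.
                error = probs[k] - (1.0 if k == target else 0.0)
                for i in range(n_in):
                    weights[k][i] -= lr * error * x[i]
    return weights

def predict(weights, rows):
    """Recognition: report the label on the exit with the largest outflow."""
    probs = outflow(weights, flatten(rows))
    return LABELS[probs.index(max(probs))]

samples = [(flatten(TIAN), 0), (flatten(SHEN), 1)]
weights = train(samples, n_out=len(LABELS))  # after this the valves are "welded"
print(predict(weights, TIAN), predict(weights, SHEN))
```

Both characters are trained in the same loop, which is one simple way to keep "田" from being forgotten while "申" is learned. A real network would stack several such layers with non-linearities between them and would be trained on many pictures per character.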


Traditional machine learning VS deep learning

Similarities between traditional machine learning and deep learning

The two are very similar in terms of data preparation and pre-processing.

Both may apply the following operations to the data:

  • Data cleaning
  • Data labeling
  • Normalization
  • Denoising
  • Dimensionality reduction
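
As a small illustration of two of these steps, the sketch below drops missing values (a simple form of data cleaning) and rescales the rest to the range 0 to 1 (min-max normalization, one common choice among several). The function names and sample values are made up for the example.

```python
def drop_missing(values):
    """Data cleaning: discard entries that were never recorded."""
    return [v for v in values if v is not None]

def min_max_normalize(values):
    """Normalization: map the smallest value to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

raw = [0, 64, None, 128, 255]   # hypothetical pixel intensities with one gap
cleaned = drop_missing(raw)
scaled = min_max_normalize(cleaned)
print(cleaned)
print(scaled)
```

The same preprocessing applies whether the data then goes into a traditional machine learning model or a deep learning model.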

If you are interested in data preprocessing, see "The 6 most common problems in AI datasets (with solutions)".


The core difference between traditional machine learning and deep learning

Feature extraction in traditional machine learning relies mainly on manual work. Extracting features by hand is simple and effective for specific, simple tasks, but it does not generalize.

Feature extraction in deep learning does not rely on manual work; the machine extracts features automatically. This is why deep learning is often said to have poor interpretability: it can sometimes perform well without our knowing the underlying principles.
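
For contrast, here is what manual feature extraction might look like on a toy problem. The two 5x5 bitmaps and the two features are invented for this illustration; the point is only that a person chose the features and the rule by looking at the data.

```python
# Hypothetical 5x5 bitmaps of the two characters.
TIAN = ["11111", "10101", "11111", "10101", "11111"]
SHEN = ["00100", "11111", "10101", "11111", "00100"]

def handcrafted_features(rows):
    """Two features a person might design after looking at the characters."""
    return {
        "ink_total": sum(row.count("1") for row in rows),
        "top_row_ink": rows[0].count("1"),
    }

def rule_based_classify(rows):
    # Hand-written rule: "田" has a full top stroke, while in "申" only the
    # vertical stem reaches the top row.
    return "田" if handcrafted_features(rows)["top_row_ink"] >= 3 else "申"

print(handcrafted_features(TIAN), rule_based_classify(TIAN))
print(handcrafted_features(SHEN), rule_based_classify(SHEN))
```

The rule works for these two characters but says nothing about a third one, which is the lack of generality described above. A deep learning model would instead learn its own internal features from the raw pixels.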


Advantages and disadvantages of deep learning

Advantage 1: strong learning ability

Judging by results, deep learning performs very well and has a strong capacity to learn.

Advantage 2: wide coverage and good adaptability

Deep neural networks have many layers and can be very wide. In theory they can approximate arbitrary functions, so they can solve very complex problems.

Advantage 3: data driven, high ceiling

Deep learning depends heavily on data, and the more data there is, the better it performs. On some tasks in image recognition, face recognition, and NLP it has even surpassed human performance. Its ceiling can also be raised further by tuning parameters.

Advantage 4: good portability

Because deep learning performs so well, many frameworks are available, such as TensorFlow and PyTorch. These frameworks are compatible with many platforms.


Disadvantage 1: heavy computation, poor portability

Deep learning requires a lot of data and a lot of computing power, so the cost is high. Many applications are also not yet suitable for mobile devices. Many companies and teams are already working on chips for portable devices, so this problem is expected to ease over time.

Disadvantage 2: high hardware requirements

Deep learning requires a lot of computing power, and ordinary CPUs can no longer meet its requirements. Mainstream computation runs on GPUs and TPUs, so hardware requirements and costs are high.

Disadvantage 3: complex model design

Model design in deep learning is very complicated, and developing new algorithms and models takes a lot of manpower, resources, and time. Most people can only use off-the-shelf models.

Disadvantage 4: no "human nature", prone to bias

Because deep learning relies on data and is not very interpretable, gender and racial discrimination can arise when the training data is unbalanced.


4 typical deep learning algorithms

Convolutional Neural Network-CNN

The value of CNN:

  1. It can effectively reduce a large amount of data to a small amount (without affecting the results)
  2. It preserves the features of the image, in a way similar to the principle of human vision

The basic principle of CNN:

  1. Convolution layer – its main role is to extract and preserve the features of the picture
  2. Pooling layer – its main role is to reduce the dimensionality of the data, which helps reduce overfitting
  3. Fully connected layer – output the results we want according to different tasks
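
The first two layer types can be sketched in plain Python. This is a minimal illustration under simplifying assumptions: one hand-picked 2x2 kernel rather than learned ones, no padding, stride 1, and non-overlapping 2x2 max pooling. The function names and the sample image are made up.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def max_pool(feature_map, size=2):
    """Keep the largest value in each size x size block; ragged edges are dropped."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]

# A 5x5 image containing one bright vertical bar.
image = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
]
# Kernel that responds where the image goes from dark to bright, left to right.
vertical_edge = [[-1, 1], [-1, 1]]

features = conv2d(image, vertical_edge)  # 4x4 feature map
pooled = max_pool(features)              # 2x2 summary
print(features)
print(pooled)
```

The convolution keeps the information "there is a left edge here" while the pooling step shrinks the 4x4 map to 2x2, which is the data reduction described above. In a trained CNN the kernel values are learned rather than written by hand.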

Practical application of CNN:

  1. Image classification, retrieval
  2. Object localization and detection
  3. Object segmentation
  4. Face recognition
  5. Skeleton (pose) recognition

Learn more: "Understanding the convolutional neural network in one article - CNN (basic principle + unique value + practical application)"


Recurrent Neural Network-RNN

RNN is an algorithm that can effectively process sequence data. For example: article text, speech audio, stock price trends...

It can process sequence data because earlier inputs in the sequence also affect later outputs, which amounts to having a "memory". But RNN has a serious short-term memory problem: inputs from far back have little influence, even when they carry important information.

Variants built on RNN, such as LSTM and GRU, were therefore developed. These variants have several main features:

  1. Long-term information can be effectively retained
  2. Important information is selected and kept, while unimportant information is "forgotten"
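
Both the "memory" and the short-term memory problem show up even in a one-number RNN. The sketch below assumes a scalar hidden state, fixed hand-picked weights (`w_x`, `w_h`), and a tanh activation; real RNNs use vectors and learned weight matrices.

```python
import math

def run_rnn(inputs, w_x=1.0, w_h=0.5):
    """Scalar RNN: h_t = tanh(w_x * x_t + w_h * h_(t-1)), starting from h_0 = 0."""
    h = 0.0
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h)
    return h

# The sequences differ only in their first element.
silent = run_rnn([0, 0, 0])
after_2_steps = run_rnn([1, 0, 0])         # the "1" is 2 steps in the past
after_10_steps = run_rnn([1] + [0] * 10)   # the "1" is 10 steps in the past

print(silent, after_2_steps, after_10_steps)
```

The final state is non-zero only when the first input was 1, so the early input does affect the later output. The trace of that input shrinks at every step, though, so after ten steps it has almost vanished, which is the short-term memory problem.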

A few typical applications of RNN are as follows:

  1. Text generation
  2. Speech recognition
  3. Machine translation
  4. Image captioning
  5. Video tagging

Learn more: "Understanding the recurrent neural network in one article - RNN (unique value + optimization algorithm + practical application)"


Generative Adversarial Networks – GANs

Suppose public order in a city breaks down; soon there will be countless thieves in the city. Some of them may be master thieves, and some may have no skill at all. Suppose the city then starts to restore order and suddenly launches a "campaign" against crime, with the police resuming patrols. Soon a batch of poorly skilled thieves is caught. Only the unskilled thieves are caught because the police are not very skilled either. After this batch of low-level thieves has been caught, it is hard to say how much public order has improved, but the average skill level of the thieves left in the city has clearly risen a great deal.

The police keep training their crime-solving skills and start catching thieves who are more and more cunning. As these professional repeat offenders are arrested, the police develop a special ability: they can quickly pick out suspicious people in a crowd, question them, and eventually arrest the suspects. Life gets harder for the thieves too. Because the police have improved so much, anyone who still acts as furtively as before is soon caught. To avoid arrest, the thieves try to look less "suspicious", but for every step the thieves take, the police take a bigger one, continually improving so that they can tell thieves apart from innocent ordinary people. Through this "exchange" and "sparring" between police and thieves, the thieves become extremely cautious: they have superb stealing skills and behave exactly like ordinary people. The police, for their part, develop a piercing eye and can spot and control a suspicious person immediately. In the end, we get both the strongest thieves and the strongest police.

In the end, we get both the strongest thieves and the strongest police.
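
In GAN terms, the thieves play the role of the generator, which produces fake samples, and the police play the role of the discriminator, which outputs the probability that a sample is real. A minimal sketch of the two competing objectives follows. It assumes the standard binary cross-entropy losses with the commonly used non-saturating generator loss, and the probability values are made up rather than produced by real networks.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Police objective: score real samples near 1 and fake samples near 0."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """Thief objective: get the discriminator to score fakes near 1."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs at two moments of training.
early = {"d_real": [0.9, 0.8], "d_fake": [0.1, 0.2]}   # clumsy thieves
late = {"d_real": [0.5, 0.5], "d_fake": [0.5, 0.5]}    # fakes look real

for name, scores in (("early", early), ("late", late)):
    print(name,
          discriminator_loss(scores["d_real"], scores["d_fake"]),
          generator_loss(scores["d_fake"]))
```

Early on the discriminator's loss is low and the generator's is high. As the generator improves, the discriminator can no longer tell real from fake, and its outputs drift toward 0.5. Training alternates between lowering one loss and then the other, which is the back-and-forth in the story.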

Learn more: "What is a Generative Adversarial Network - GAN? (Basic concept + working principle)"


Deep reinforcement learning-RL

The idea behind reinforcement learning algorithms is very simple. Take games as an example: if a certain strategy achieves a higher score in the game, that strategy is reinforced further so as to keep achieving good results. This is very similar to the various "performance rewards" in everyday life, and we often use the same approach to improve our own game skills.

In the game Flappy Bird, we control the bird with simple taps so that it avoids the pipes and flies as far as possible, because the farther it flies, the higher the score reward.

This is a typical reinforcement learning scenario:

  • The machine has a clear role, the bird - agent
  • The bird must be steered as far as possible - goal
  • The pipes must be avoided throughout the game - environment
  • The way to avoid the pipes is to make the bird flap upward - action
  • The farther it flies, the more points it earns - reward

The game is a typical reinforcement learning scenario

You will find that the biggest difference between reinforcement learning and supervised or unsupervised learning is that it does not need to be "fed" a large amount of data. Instead, it learns skills through continual trial and error.

Learn more: "Understanding reinforcement learning in one article (basic concept + application scenario + mainstream algorithm)"
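
The trial-and-error loop can be sketched with tabular Q-learning, one classic reinforcement learning algorithm. To keep the example short, Flappy Bird is replaced by a much simpler made-up environment: a corridor of five cells in which the agent starts at cell 0 and is rewarded only for reaching cell 4. The hyperparameters (`alpha`, `gamma`, `epsilon`) are illustrative choices, and this is not deep reinforcement learning, since a table stands in for the neural network.

```python
import random

N_STATES = 5                   # cells 0..4 on a line; cell 4 is the goal
ACTIONS = ["left", "right"]    # action index 0 or 1

def step(state, action):
    """Environment: move one cell; reward 1.0 only when the goal is reached."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def choose(q_row, rng, epsilon):
    """Epsilon-greedy: mostly exploit, but explore sometimes and on ties."""
    if rng.random() < epsilon or q_row[0] == q_row[1]:
        return rng.randrange(2)
    return 0 if q_row[0] > q_row[1] else 1

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # one value per (state, action)
    for _ in range(episodes):
        state = 0
        for _ in range(100):                    # cap on episode length
            action = choose(q[state], rng, epsilon)
            nxt, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[nxt])
            # Reinforce the action by how much better it turned out than expected.
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = nxt
            if done:
                break
    return q

q = q_learning()
policy = [ACTIONS[0 if row[0] > row[1] else 1] for row in q[:-1]]
print(policy)
```

No labeled dataset is involved: the agent only receives a reward signal from the environment and gradually strengthens the actions that led to it. Deep reinforcement learning keeps this loop but replaces the table `q` with a neural network, so that it can handle inputs as large as game screens.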


Final Thoughts

Deep learning belongs to the category of machine learning. It can be described as an upgrade built on traditional neural networks, and is roughly equivalent to neural networks.

Deep learning and traditional machine learning are similar in terms of data preprocessing. The core difference lies in feature extraction: in deep learning it is done by the machine itself, with no manual extraction required.


The advantages of deep learning:

  1. Strong learning ability
  2. Wide coverage and good adaptability
  3. Data driven, high ceiling
  4. Good portability

Disadvantages of deep learning:

  1. Heavy computation and poor portability
  2. High hardware requirements
  3. Complex model design
  4. No "human nature", prone to bias


4 typical deep learning algorithms:

  1. Convolutional Neural Network-CNN
  2. Recurrent Neural Network-RNN
  3. Generative Adversarial Networks – GANs
  4. Deep reinforcement learning-RL


Baidu Encyclopedia version + Wikipedia version

Baidu Encyclopedia version

The concept of deep learning stems from the study of artificial neural networks. A multilayer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features to form more abstract high-level representations of attribute categories or features, in order to discover distributed feature representations of data.

The concept of deep learning was proposed by Hinton et al. in 2006. They proposed an unsupervised greedy layer-wise training algorithm based on the Deep Belief Network (DBN), which offered hope for solving the optimization problems associated with deep structures, and later proposed the deep structure of the multi-layer autoencoder. In addition, the convolutional neural network proposed by LeCun et al. was the first true multi-layer structure learning algorithm; it uses spatial relationships to reduce the number of parameters and improve training performance.

Deep learning is a machine learning method based on learning representations of data. An observation (e.g., an image) can be represented in many ways, such as a vector of per-pixel intensity values, or more abstractly as a set of edges, regions of a particular shape, and so on. Some representations make it easier to learn tasks from examples (e.g., face recognition or facial expression recognition). The advantage of deep learning is that it replaces manually engineered features with efficient algorithms for unsupervised or semi-supervised feature learning and hierarchical feature extraction.

Deep learning is a new field in machine learning research. Its motivation is to build neural networks that simulate how the human brain analyzes and learns, mimicking the mechanisms of the human brain to interpret data such as images, sounds, and text.

Like machine learning methods in general, deep learning methods are divided into supervised and unsupervised learning, and the learning models built under different learning frameworks are very different. For example, Convolutional Neural Networks (CNNs) are a machine learning model under deep supervised learning, and Deep Belief Nets (DBNs) are a machine learning model under unsupervised learning.


Wikipedia version

Deep learning (also known as deep structured learning or hierarchical learning) is part of a broader family of machine learning methods based on learning data representations, rather than task-specific algorithms. Learning can be supervised, semi-supervised or unsupervised.

Deep learning architectures such as deep neural networks, deep belief networks and recurrent neural networks have been applied to fields including computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, bioinformatics, drug design, medical image analysis, material inspection and board game programs, where they have produced results comparable to, and in some cases superior to, those of human experts.

Deep learning models are loosely inspired by information processing and communication patterns in biological nervous systems, but they differ in various ways from the structural and functional properties of biological brains (especially the human brain), which makes them incompatible with neuroscience evidence.
