From a neuroscience perspective, even reading this article is a complex task. Right now you may be bombarded by emails, news, notifications on your mobile phone, the occasionally annoying colleague, and other distractions pulling your brain in multiple directions. To read this short article, or to perform many other cognitive tasks, you need to focus and pay attention.
Attention is a cognitive skill that is essential to the formation of knowledge. Yet the dynamics of attention puzzled neuroscientists for centuries, and only recently have we made major breakthroughs that help explain how it works. In the context of deep learning, building attention dynamics into a model seems like an obvious step toward improving what it learns and adapting it to different scenarios, and attention mechanisms have become a very active area of research. A few months ago, researchers on the Google Brain team published a paper detailing some of the key models that can be used to simulate attention in deep neural networks.
How does it work?
To understand attention in deep learning systems, it helps to look at how this cognitive phenomenon occurs in the human brain. From a neuroscience perspective, attention is the brain's ability to selectively focus on one aspect of the environment while ignoring others. Current research has identified two main types of attention, each associated with different regions of the brain. Object-based attention refers to the brain's ability to focus on a specific object, such as a picture or a section of this article. Space-based attention is concerned mainly with attending to specific locations. Both types are relevant to deep learning models: object-based attention can be used in systems such as image recognition or machine translation, while spatial attention is relevant to deep reinforcement learning scenarios such as autonomous vehicles.
Attention interface in deep neural networks
When it comes to deep learning systems, different techniques have been created to simulate different types of attention. The Google research paper focuses on four basic models related to recurrent neural networks (RNNs). Why RNNs? RNNs are networks designed to process sequential data and build higher-level knowledge from it, so they are often used to augment other neural network models, such as convolutional neural networks (CNNs) or generative interfaces, as a second stage. Adding an attention mechanism to an RNN can improve the understanding achieved by many different deep neural models. The Google Brain team identified four techniques for adding attention to RNN models:
· Neural Turing Machine: The simplest attention interface, the neural Turing machine (NTM), adds an external memory structure to a traditional RNN. The memory structure allows the NTM to specify an "attention distribution" that describes which areas of memory the model should focus on (a minimal sketch of such a read appears after this list). An NTM overview released last year explored some of its basic concepts. NTM implementations can be found in many popular deep learning frameworks, such as TensorFlow and Theano.
· Attentional interface: An attentional interface uses an RNN to focus on specific parts of another neural network. A classic example of this technique is an image recognition model built from a CNN-RNN combination: the RNN focuses on specific parts of the image representation generated by the CNN, refining it and improving the quality of the resulting knowledge (the second sketch below illustrates this).
· Adaptive computation time: This is a brand-new technique that allows an RNN to perform multiple computation steps within each time step. What does this have to do with attention? Quite simply, a standard RNN performs the same amount of computation at every step. Adaptive computation time uses an attention distribution over the number of steps to run each time, allowing the model to spend more computation on specific parts of the input (the third sketch below shows the halting logic).
· Neural programmer: The neural programmer is a fascinating new field in deep learning focused on learning to create programs to solve specific tasks. In fact, it learns to generate such programs without being given examples of correct programs; it discovers how to produce programs as a means of accomplishing certain tasks. Conceptually, neural programmer techniques attempt to bridge the gap between neural networks and traditional programming, and they can be used to develop attention mechanisms in deep learning models (the final sketch below gives a toy rendering of this idea).
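The sketches below are not from the paper; they are minimal NumPy illustrations of the ideas above, with all names, shapes, and weights invented for the examples. The first one shows an NTM-style content-based read: a softmax over the similarities between a controller-emitted key and the memory rows produces the attention distribution that decides which rows are read.

```python
# Minimal sketch of an NTM-style content-based read (hypothetical names/shapes).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def content_read(memory, key, sharpness=1.0):
    """Read from memory by attending to rows similar to `key`.

    memory: (rows, width) matrix of stored vectors
    key:    (width,) query emitted by the RNN controller
    """
    # Cosine similarity between the key and every memory row.
    sims = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    weights = softmax(sharpness * sims)      # the "attention distribution" over rows
    read_vector = weights @ memory           # weighted blend of memory rows
    return read_vector, weights

memory = np.random.randn(8, 16)              # 8 slots, width 16
key = memory[3] + 0.1 * np.random.randn(16)  # query close to slot 3
vec, w = content_read(memory, key, sharpness=5.0)
print(np.round(w, 2))                        # most weight should sit on slot 3
```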
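The second sketch illustrates the attentional interface in the same spirit: an RNN decoder state scores a set of feature vectors produced by another network (for instance, CNN feature vectors for image regions) and reads back a weighted summary. The projection matrices and dimensions here are arbitrary placeholders.

```python
# Minimal sketch of an attentional interface between two networks.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attend(decoder_state, features, W_q, W_k):
    """Dot-product attention of one decoder state over encoder features.

    decoder_state: (d_dec,)   current RNN hidden state
    features:      (n, d_enc) e.g. CNN feature vectors, one per image region
    W_q, W_k:      projections into a shared scoring space (made up here)
    """
    query = W_q @ decoder_state                  # (d,)
    keys = features @ W_k.T                      # (n, d)
    scores = keys @ query / np.sqrt(len(query))  # similarity of each region to the query
    weights = softmax(scores)                    # where the decoder "looks"
    context = weights @ features                 # summary fed back into the RNN step
    return context, weights

rng = np.random.default_rng(0)
features = rng.normal(size=(49, 512))            # e.g. a 7x7 CNN feature map, flattened
state = rng.normal(size=256)
W_q = rng.normal(size=(64, 256)) * 0.05
W_k = rng.normal(size=(64, 512)) * 0.05
context, w = attend(state, features, W_q, W_k)
print(context.shape, w.argmax())                 # region the decoder attends to most
```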
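The third sketch shows the halting logic behind adaptive computation time, again with a made-up RNN cell and halting network: for a single input, the model keeps updating its state until the accumulated halting probability approaches one, so harder inputs receive more internal steps.

```python
# Minimal sketch of Adaptive Computation Time (hypothetical halting network).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def act_step(state, x, cell, halt_weights, eps=0.01, max_ponder=10):
    """Run a variable number of internal updates for a single input x."""
    total_halt, remainder = 0.0, 1.0
    weighted_state = np.zeros_like(state)
    for n in range(max_ponder):
        state = cell(state, x)                   # one internal RNN update
        p = sigmoid(halt_weights @ state)        # halting probability for this update
        if total_halt + p >= 1.0 - eps or n == max_ponder - 1:
            weighted_state += remainder * state  # spend the leftover probability mass
            return weighted_state, n + 1
        weighted_state += p * state
        total_halt += p
        remainder -= p

rng = np.random.default_rng(1)
W_h, W_x = rng.normal(size=(32, 32)) * 0.1, rng.normal(size=(32, 8)) * 0.1
cell = lambda h, x: np.tanh(W_h @ h + W_x @ x)   # toy recurrent cell
halt_w = rng.normal(size=32)
state, steps = act_step(np.zeros(32), rng.normal(size=8), cell, halt_w)
print(steps, state.shape)                        # number of "pondering" steps used
```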
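Finally, a toy rendering of the neural-programmer idea, using an invented "instruction set": rather than committing to a discrete program, the model keeps a soft attention distribution over a handful of built-in operations and executes their weighted mixture at every step, which keeps the whole pipeline differentiable.

```python
# Toy sketch of soft program execution (operations and logits are invented).
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# A tiny "instruction set" operating on a column of numbers and an accumulator.
operations = {
    "sum":   lambda col, acc: acc + col.sum(),
    "max":   lambda col, acc: acc + col.max(),
    "count": lambda col, acc: acc + len(col),
    "reset": lambda col, acc: 0.0,
}

def run_soft_program(column, op_logits_per_step):
    """Execute a 'program' defined by per-step attention over operations."""
    acc = 0.0
    for logits in op_logits_per_step:            # logits would come from an RNN
        weights = softmax(logits)                # attention over operations
        acc = sum(w * op(column, acc)            # soft mixture of all op results
                  for w, op in zip(weights, operations.values()))
    return acc

column = np.array([3.0, 7.0, 2.0])
# Two steps whose logits strongly prefer "reset", then "sum".
logits = [np.array([-5.0, -5.0, -5.0, 5.0]), np.array([5.0, -5.0, -5.0, -5.0])]
print(run_soft_program(column, logits))          # ≈ column.sum() = 12.0
```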
Attention is one of the most complex cognitive abilities of the human brain. Simulating attention mechanisms in neural networks can open up fascinating new possibilities for the future of deep learning.
This article was translated from towardsdatascience. Original address