What is the relationship between Natural Language Understanding (NLU) and NLP? Why is NLU a hard problem in artificial intelligence? How did NLU develop, and what are the current state-of-the-art methods?
This article answers these questions and gives you a comprehensive overview of Natural Language Understanding (NLU).
To learn more about NLP-related content, visit the NLP topic page, where a free 59-page NLP document is available for download.
What is Natural Language Understanding (NLU)?
The term you hear most often is NLP; Natural Language Understanding (NLU) is a part of NLP:
What is natural language?
Natural language is the way people ordinarily express themselves in daily life; it is what is meant by "speaking plainly."
Natural language: "I've got a little camel's hump on my back" (unnatural phrasing: "my back is curved")
Natural language: "Baobao's agent slept with Baobao's baby" (a well-known Chinese example that is perfectly natural to say but hard to parse)
Natural language understanding means that a machine, like a human being, can understand what ordinary people say. Because natural language poses many difficulties for understanding (detailed below), NLU is still far from human performance.
Let's take a concrete look at natural language understanding (NLU):
Dialogue systems suddenly took off around 2015, mainly because of the rise of one technology: machine learning, especially deep learning, applied to NLU (Natural Language Understanding), whose main task is understanding what people mean.
The popularity of this technology enabled many teams to master a key pair of skills: intent recognition and entity extraction.
What does this mean? Let's look at an example.
In real life, if you want to book a flight, there are many natural ways to say it:
"Book a flight";
"Is there a flight to Shanghai?";
"Check flights departing for New York next Tuesday";
"I'm going on a trip, help me look up tickets";
There are effectively infinitely many natural expressions that all share the intent "book a flight." Any person hearing them understands immediately that they are about booking a ticket.
Understanding so many different expressions is a challenge for a machine. In the past, machines could only handle structured input (such as keywords), which meant that to be understood, you had to issue precise commands.
So whether you said "I want to travel" or "help me check flights to Beijing", if the utterance did not contain the preset keyword "book a ticket", the system could not handle it. Worse, any utterance containing the keyword, such as "I want to cancel my ticket booking", would also be processed as if the user wanted to book a ticket.
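The keyword approach just described can be sketched in a few lines, and both failure modes fall out immediately. The keywords and utterances below are invented for illustration only:

```python
# A minimal sketch of the old keyword-matching approach described above,
# showing why it fails in both directions: it misses paraphrases that lack
# the keyword, and it fires on utterances that contain the keyword but
# mean the opposite.

KEYWORDS = ["book a ticket", "book a flight"]

def keyword_intent(utterance: str) -> str:
    """Return 'book_ticket' iff a preset keyword appears as a substring."""
    text = utterance.lower()
    if any(kw in text for kw in KEYWORDS):
        return "book_ticket"
    return "unknown"

# Misses a genuine booking request phrased without the keyword:
print(keyword_intent("Is there a flight to Shanghai?"))          # unknown
# Fires on a request that actually cancels a booking:
print(keyword_intent("I want to cancel, do not book a ticket"))  # book_ticket
```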
With natural language understanding, the machine can tell which of these varied natural expressions belong to a given intent and which do not, instead of relying on rigid keywords. For example, after training, the machine can recognize that "recommend a nearby restaurant" is not an expression of the intent "book a flight."
Moreover, through training, the machine can also automatically extract "Shanghai" from a sentence as the destination (i.e., an entity), and "next Tuesday" as the departure time.
Put this way, it looks as if "the machine can understand people!"
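The trained behavior described above can be sketched as a toy intent classifier plus naive entity patterns. All names, training sentences, and regular expressions here are invented for illustration; real systems learn these from large labeled datasets:

```python
# A toy sketch of intent recognition plus entity extraction. The "model"
# is a bag-of-words centroid per intent; entities are pulled out with
# naive patterns. Everything here is illustrative, not production code.
import re
from collections import Counter

TRAINING = {
    "book_flight": [
        "book a flight",
        "is there a flight to shanghai",
        "check flights departing for new york next tuesday",
        "i am going on a trip help me look up tickets",
    ],
    "find_restaurant": [
        "recommend a nearby restaurant",
        "where can i eat around here",
    ],
}

def tokens(text):
    return re.findall(r"[a-z]+", text.lower())

# One bag-of-words "centroid" per intent.
CENTROIDS = {
    intent: Counter(t for ex in examples for t in tokens(ex))
    for intent, examples in TRAINING.items()
}

def classify(utterance):
    """Pick the intent whose training vocabulary overlaps the utterance most."""
    words = set(tokens(utterance))
    scores = {i: sum(c[w] for w in words) for i, c in CENTROIDS.items()}
    return max(scores, key=scores.get)

def extract_entities(utterance):
    """Pull out a destination and a date with naive patterns."""
    ents = {}
    m = re.search(r"\b(?:to|for)\s+([A-Z][a-z]+(?:\s[A-Z][a-z]+)*)", utterance)
    if m:
        ents["destination"] = m.group(1)
    m = re.search(r"\b(next\s+\w+day)\b", utterance, re.IGNORECASE)
    if m:
        ents["departure_time"] = m.group(1)
    return ents

print(classify("help me check flights to Beijing"))   # book_flight
print(extract_entities("a flight to Shanghai next Tuesday"))
```

Note that "help me check flights to Beijing" is classified correctly even though it shares no fixed keyword with the training phrases; the overlap of ordinary words ("help", "check", "flights") is enough in this toy setting.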
Application of Natural Language Understanding (NLU)
Almost all text- and speech-related applications use NLU. Here are some concrete examples.
Machine translation
Rule-based translation is often not very good; to improve translation quality, you must build on an understanding of the content.
Without understanding the context, jokes like the following appear:
"I like apple, it's so fast!"
Here "apple" means the phone brand "Apple", not the fruit, but a system that does not understand the context happily translates it as the fruit.
Machine customer service
If you want to build question answering, you must build on an understanding of multi-turn dialogue; natural language understanding is an indispensable capability.
Exchanges like the following are hard for machines:
Machine: "How can I help you?"
User: "Hello, I want to file a complaint."
Machine: "What is the license plate number you are complaining about?"
Machine: "May I ask what the problem is?"
User: "As soon as I got in the car, the rude taxi driver snapped at me."
In the original Chinese, the machine can easily mis-segment the user's last sentence: the characters for "the rude taxi driver (的哥)" can also be read as "the rude Gotham (哥谭) citizen", producing the nonsensical parse "bad attitude / Gotham / citizen / snapped at me".
NLU is also a key component of smart speakers. Many voice interactions are short phrases, and the speaker must recognize not only what the user said but also what the user intends.
User: "It's a bit cold in here."
Machine: "Turning the air conditioner up one degree for you."
The user never mentioned the air conditioner, but the machine must infer the intent: the air conditioning is making it cold and should be turned up.
Difficulties in Natural Language Understanding (NLU)
Here are some examples that machines find hard to understand (most are Chinese puns built on repeated characters and ambiguous word segmentation, so the wordplay is largely lost in translation):
- The principal said that nothing other than the school badge may be pinned on the uniform (in Chinese, 别 means both "don't" and "to pin", so the sentence stacks the same character several times).
- The weather has been bad day after day these days (in Chinese, the character 天 "day/weather" repeats four times in a row).
- Seeing the lamplight, Ximen Chuixue and Ye Gucheng traded quips about "blowing" it out, then blew out the lamp; the joke turns on the name "Chuixue", which literally means "blows snow".
- "Thanks to Hero Xie Xun stepping in to save me, I sincerely want to thank him" - the surname 谢 (Xie) also means "thank", so the sentence stacks "thank thank Xie...".
- Thanos pinned Captain America to the ground and "brainwashed" him; the battered Iron Man's reply in the original is a run of near-identical syllables that defeats word segmentation.
- "Auntie, guess what is bulging in my pocket - grain or mushrooms!" (a run of "gu"-sounding characters: 姑, 估, 鼓, 谷, 菇).
- "Have you seen Wang Gang?" "Wang Gang just left just now." (in Chinese, 王刚刚刚刚走 - the character 刚 appears both in the name and in the word for "just now").
- Zhang Jie took his two daughters to play hopscotch: "Pretty ones, let's not skip over the squares" (the original repeats the character for "jump").
So for a machine, the difficulties of NLU fall roughly into 5 classes:
Difficulty 1: the diversity of language
Natural language has no fully general rules; you can always find many exceptions.
In addition, natural language combines very flexibly: different combinations of characters, words, phrases, sentences, and paragraphs can express the same meaning in many ways. For example, all of the following are requests for the same song, 大王叫我来巡山 ("The King Bids Me to Patrol the Mountain"):
I want to listen to "The King Bids Me to Patrol the Mountain".
Put on "The King Bids Me to Patrol the Mountain" for me.
I want to hear the song "The King Bids Me to Patrol the Mountain".
Play "The King Bids Me to Patrol the Mountain" first.
Give me a song - "The King Bids Me to Patrol the Mountain".
Put some music on: "The King Bids Me to Patrol the Mountain".
Queue up "The King Bids Me to Patrol the Mountain".
Play "The King Bids Me to Patrol the Mountain" for your uncle here.
Difficulty 2: language ambiguity
Without context and without constraints from the environment, language is highly ambiguous. For example:
"I am going to Lhasa."
- Does the user need a train ticket?
- A plane ticket?
- Do they want to listen to music? (There is a well-known Chinese song about taking the train to Lhasa.)
- Or are they looking for tourist attractions?
Difficulty 3: language robustness
During input, and especially in text obtained through speech recognition, natural language suffers from extra words, missing words, typos, and noise. For example, these are all noisy variants of the same request:
The King bids me to come to Johor Bahru. (巡山 "patrol the mountain" misheard as 新山 "Johor Bahru")
The King bids me to come to the mountain.
The King bids me to patrol the mountain.
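One simple way to add robustness is to match the noisy utterance against canonical phrasings with a string-similarity measure instead of exact keywords. The sketch below uses Python's standard-library `difflib`; the intents and phrasings are invented, and production systems use phonetic features or learned models rather than character similarity:

```python
# A rough sketch of fuzzy intent matching that tolerates ASR noise:
# score the utterance against each canonical phrasing and keep the best
# match above a threshold. difflib.SequenceMatcher is stdlib.
from difflib import SequenceMatcher

CANONICAL = {
    "play_patrol_song": "the king bids me to patrol the mountain",
    "book_flight": "book a flight ticket",
}

def best_intent(utterance, threshold=0.6):
    """Return the intent whose canonical phrasing is most similar, or None."""
    text = utterance.lower()
    scored = {
        intent: SequenceMatcher(None, text, phrase).ratio()
        for intent, phrase in CANONICAL.items()
    }
    intent = max(scored, key=scored.get)
    return intent if scored[intent] >= threshold else None

# ASR garbled "patrol the mountain" into "come to the mountain",
# but the utterance is still closest to the song request:
print(best_intent("the king bids me to come to the mountain"))
```

The threshold is a judgment call: too low and unrelated utterances start matching, too high and legitimate mishearings are rejected.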
Difficulty 4: knowledge related to language
Language is a symbolic description of the world and is naturally tied to world knowledge. For example:
"Apple" can name a fruit, and also a restaurant.
"7 Days" can indicate a length of time, or the name of a hotel chain.
"Good Night" can be a greeting, but there is also a song called "Good Night".
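Resolving such words requires combining the sentence with world knowledge. The sketch below fakes that knowledge with a tiny hand-written table of cue words, purely for illustration; real systems consult knowledge graphs and use learned entity-linking models:

```python
# A toy sketch of disambiguating a word like "apple" by its surrounding
# context. The sense inventory and cue words are invented for this example.
SENSES = {
    "apple": {
        "fruit":      {"eat", "juice", "sweet", "fresh"},
        "phone":      {"fast", "screen", "buy", "new"},
        "restaurant": {"table", "reserve", "dinner", "menu"},
    }
}

def disambiguate(word, sentence):
    """Pick the sense whose cue words overlap the sentence most."""
    context = set(sentence.lower().split())
    senses = SENSES[word]
    return max(senses, key=lambda s: len(senses[s] & context))

print(disambiguate("apple", "i like apple it is so fast"))           # phone
print(disambiguate("apple", "reserve a table at apple for dinner"))  # restaurant
```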
Difficulty 5: the context of the language
"Context" covers many things: the context of the conversation, of the device, of the application, the user profile, and so on. In the following exchange, the agent must notice that the user has switched intents mid-conversation:
User: Buy a train ticket.
Agent: Where are you going?
User: Play a song first.
Agent: What song would you like to hear?
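The exchange above can be sketched as a tiny dialogue manager that tracks which intent is currently active, so a short answer like "Good Night" is interpreted against the most recent question rather than the first one. The intents, trigger words, and replies are all invented for illustration:

```python
# A minimal sketch of conversational context tracking: the manager keeps
# the active intent, lets a new utterance switch it mid-conversation, and
# treats anything else as a slot value for the active intent.
class DialogueManager:
    def __init__(self):
        self.active_intent = None

    def handle(self, utterance):
        text = utterance.lower()
        # A new utterance may switch the active intent mid-conversation.
        if "ticket" in text:
            self.active_intent = "buy_ticket"
            return "Where are you going?"
        if "song" in text or "music" in text:
            self.active_intent = "play_music"
            return "What song would you like to hear?"
        # Otherwise treat it as an answer filling a slot of the active intent.
        if self.active_intent == "buy_ticket":
            return f"Booking a ticket to {utterance}."
        if self.active_intent == "play_music":
            return f"Playing {utterance}."
        return "How can I help you?"

dm = DialogueManager()
print(dm.handle("Buy a train ticket"))   # Where are you going?
print(dm.handle("Play a song first"))    # What song would you like to hear?
print(dm.handle("Good Night"))           # Playing Good Night.
```

Without the stored `active_intent`, the final "Good Night" would be uninterpretable — which is exactly the point of Difficulty 5.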
The development history of Natural Language Understanding (NLU)
Natural language understanding has followed much the same path as artificial intelligence as a whole, passing through 3 generations:
- Rule-based methods
- Statistics-based methods
- Deep-learning-based methods
At first, people judged the intent behind natural language with hand-written rules. Common methods include CFG and JSGF.
Later came statistics-based NLU methods. Common methods include SVM and ME (maximum entropy).
With the explosion of deep learning, CNN, RNN, and LSTM became the new "rulers".
In 2019, the performance of BERT and GPT-2 shocked the industry; both are built on the Transformer. The rest of this article focuses on the Transformer, because it is currently the state-of-the-art method.
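The core operation of the Transformer is scaled dot-product self-attention, which can be sketched in plain Python with toy-sized vectors. Real models add learned Q/K/V projections, multiple heads, and run on matrix hardware; this sketch shows only the arithmetic:

```python
# A bare-bones sketch of scaled dot-product self-attention: every query
# mixes the value vectors, weighted by its similarity to each key.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(queries, keys, values):
    """For each query, mix the value vectors by similarity to the keys."""
    d = len(keys[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)  # the weights for one query sum to 1
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# Three 2-d token vectors attending over themselves (self-attention):
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(attention(x, x, x))
```

Because every token attends directly to every other token, the path length between any two positions is 1 — which is the intuition behind the long-distance results discussed below.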
Comparison of Transformer and CNN / RNN
The Transformer's inner workings are fairly involved, so they are not explained in detail here. Interested readers can consult the following article, which covers them thoroughly:
"BERT is on fire, but you still don't understand the Transformer? Reading this one article is enough."
Below, selected data from the paper "Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures" give an intuitive comparison of the 3 architectures.
Semantic feature extraction
In terms of semantic feature extraction, current experiments support the following conclusion: the Transformer significantly outperforms RNNs and CNNs (on WSD, a task that probes semantic capability, the Transformer exceeds RNNs and CNNs by 4-8 absolute percentage points), while RNNs and CNNs are roughly on par with each other.
Long-distance feature capture capability
The native CNN feature extractor is markedly weaker than RNNs and the Transformer in this respect. The Transformer is slightly better than the RNN (especially when the subject-predicate distance is less than 13 words), so the ranking from strong to weak is Transformer > RNN >> CNN. At longer distances (subject-predicate distance greater than 13), however, the RNN is slightly better than the Transformer. Overall, the Transformer and the RNN can be considered comparable here, while the CNN is clearly weaker than both.
Overall feature extraction ability on the task
The Transformer's overall ability is clearly stronger than RNNs and CNNs (bear in mind that, at the current stage of the technology, gaining even 1 absolute BLEU point is very hard), while RNNs and CNNs look roughly equal, with CNNs perhaps performing slightly better.
Parallel computing power and computational efficiency
Transformer Base is the fastest, CNN is second, Transformer Big is third, and the slowest is the RNN, which is between 3x and tens of times slower than the others.
For a more comprehensive picture of the Transformer, these excellent articles are recommended:
"Abandon your illusions and fully embrace the Transformer: a comparison of three NLP feature extractors (CNN/RNN/Transformer)"
"From Word Embedding to BERT: the history of pre-training techniques in natural language processing"
"The stunning GPT-2.0 model: what does it tell us?"
Definitions from Baidu Baike and Wikipedia
Baidu Baike: Natural Language Processing (NLP) is a set of techniques for communicating with computers in natural language. Because the key to processing natural language is letting computers "understand" it, natural language processing is also called natural language understanding (NLU, Natural Language Understanding); it is likewise known as computational linguistics. It is at once a branch of language information processing and one of the core topics of artificial intelligence (AI).
Wikipedia: Natural language understanding (NLU) or natural language interpretation (NLI) is a subtopic of natural language processing in artificial intelligence that deals with machine reading comprehension. Natural language understanding is considered an AI-hard problem.
The field has considerable commercial value because of its applications in automated reasoning, machine translation, question answering, news-gathering, text categorization, voice activation, archiving, and large-scale content analysis. NLU is the post-processing of text after NLP algorithms (part-of-speech tagging and the like) have run. It makes use of context from the recognition devices: automatic speech recognition (ASR), visual recognition, the previous conversation turn, misrecognitions from ASR, the personalized profile, microphone proximity, and so on. In all its forms, it distinguishes the meaning of fragments and full sentences in order to execute intents, usually issued as voice commands. An NLU typically has an ontology built around a specific product and uses it to compute the probability of each intent; it keeps a defined list of known intents and derives the message payload from designated sources of context information. From a single derived intent, the NLU can emit multiple output messages to different services (software) or resources (hardware): a visual or spoken response to the person who issued the voice command, and converted command messages consumed for M2M communication and actions.