Word2vec is one of the Word Embedding methods in the NLP field: a way of converting words into "computable", "structured" vectors. This article explains the principles, advantages, and disadvantages of Word2vec.
This approach was mainstream before 2018, but with the emergence of BERT and GPT-2.0, it is no longer the best-performing method.
What is Word2vec?
What is Word Embedding?
Before explaining Word2vec, we need to explain Word Embedding. It is the process of converting words, which are "uncomputable" and "unstructured", into "computable", "structured" vectors.
This step of transforming a real-world problem into a mathematical problem is a critical one in artificial intelligence.
To learn more, see this article: "A word to understand word embedding (compared with other text representations + 2 mainstream algorithms)"
Turning a real-world problem into a mathematical problem is only the first step; the mathematical problem still has to be solved afterwards. So the Word Embedding model itself is not what matters most. What matters is the result it produces, the word vectors, which are used directly in downstream tasks.
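The idea above can be sketched in a few lines: once training is done, an embedding is just a lookup table from words to dense vectors, and downstream tasks consume those vectors directly. The tiny vocabulary and the 2-dimensional vectors below are made-up illustrations, not real learned embeddings.

```python
import math

# Hypothetical toy embedding table; real embeddings are learned from text
# and typically have 100-300 dimensions.
embeddings = {
    "king":  [0.8, 0.3],
    "queen": [0.7, 0.9],
    "apple": [-0.6, 0.1],
}

def cosine(u, v):
    """Cosine similarity: a common way downstream tasks compare word vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# A downstream task uses the vectors directly, e.g. for similarity:
print(cosine(embeddings["king"], embeddings["queen"]))  # relatively high
print(cosine(embeddings["king"], embeddings["apple"]))  # relatively low
```

Note that the model that produced the table is gone at this point; only the vectors remain, which is exactly why the embedding model itself is "not important".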
What is Word2vec?
Word2vec is one of the Word Embedding methods. It was proposed by Mikolov's team at Google in 2013.
Word2vec sits at the representation layer of the overall NLP pipeline: it turns raw text into word vectors, which downstream models then consume.
Before the advent of Word2vec, there were already some Word Embedding methods, but they were immature and saw no large-scale application.
The training model and usage of Word2vec are detailed below.
Word2vec's 2 training modes
CBOW (Continuous Bag-of-Words Model) and Skip-gram (Continuous Skip-gram Model) are the two training modes of Word2vec. A brief explanation:

- CBOW: predicts the current word from its context. It is like deleting a word from a sentence and asking you to guess what it was.
- Skip-gram: uses the current word to predict its context. It is like giving you one word and asking you to guess which words may appear before and after it.
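The two modes above slice the same sentence into different training pairs. The sketch below (with an assumed window size of 2) shows how: CBOW builds (context words → center word) pairs, while Skip-gram builds (center word → one context word) pairs.

```python
# Sketch of how CBOW and Skip-gram build training pairs from one sentence.
sentence = "the quick brown fox jumps".split()
window = 2  # number of context words taken on each side (an assumption)

cbow_pairs = []      # (context words, center word)
skipgram_pairs = []  # (center word, one context word)
for i, center in enumerate(sentence):
    context = [sentence[j]
               for j in range(max(0, i - window), min(len(sentence), i + window + 1))
               if j != i]
    cbow_pairs.append((context, center))
    skipgram_pairs.extend((center, c) for c in context)

print(cbow_pairs[2])       # (['the', 'quick', 'fox', 'jumps'], 'brown')
print(skipgram_pairs[:2])  # [('the', 'quick'), ('the', 'brown')]
```

Both modes then train a shallow network on these pairs; the learned input weights become the word vectors.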
To increase speed, Word2vec often uses 2 acceleration methods:
- Negative Sampling
- Hierarchical Softmax
The details of these acceleration methods are not covered here; interested readers can look them up.
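To give a flavor of the first method anyway, here is a minimal sketch of negative sampling. Instead of computing a softmax over the whole vocabulary, each positive (center, context) pair is trained against a few "negative" words drawn from the unigram distribution raised to the 0.75 power, a trick used in the original Word2vec implementation. The tiny word counts below are made up for illustration.

```python
import random

# Hypothetical corpus word counts.
counts = {"the": 100, "fox": 10, "quantum": 1}

# Raising counts to 0.75 flattens the distribution, so rare words
# are sampled a bit more often than their raw frequency suggests.
weights = {w: c ** 0.75 for w, c in counts.items()}
total = sum(weights.values())

def sample_negatives(k, exclude):
    """Draw k negative words, skipping the true context word."""
    words, probs = zip(*((w, wt / total) for w, wt in weights.items()))
    negatives = []
    while len(negatives) < k:
        w = random.choices(words, probs)[0]
        if w != exclude:
            negatives.append(w)
    return negatives

print(sample_negatives(2, exclude="fox"))
```

The model then only needs to update the vectors of the positive word and the handful of sampled negatives, rather than the full vocabulary, which is where the speedup comes from.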
Advantages and disadvantages of Word2vec
It should be noted that Word2vec is a product of the previous generation (before 2018). To get the best results after 2018, you would no longer use this kind of Word Embedding, and therefore would not use Word2vec.

Advantages:
- Because Word2vec takes context into account, it performs better than earlier Embedding methods (though not as well as the post-2018 methods)
- It has lower dimensionality than earlier Embedding methods, so it is faster
- Versatile and can be used in a variety of NLP tasks
Things to note:
- Because words and vectors map one-to-one, the problem of polysemous words cannot be solved.
- Word2vec is a static method: although it is versatile, it cannot be dynamically optimized for a specific task.
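The polysemy limitation above can be shown concretely: a static embedding is a fixed lookup table keyed on the word form alone, so "bank" gets exactly the same vector whether the sentence is about money or about a river. The vector below is a made-up placeholder.

```python
# Hypothetical static embedding: one vector per word form, regardless of context.
embedding = {"bank": [0.2, 0.5]}

sentence_1 = "I deposited money at the bank".split()   # financial sense
sentence_2 = "We sat on the river bank".split()        # riverbank sense

v1 = embedding["bank"]  # lookup ignores sentence_1
v2 = embedding["bank"]  # lookup ignores sentence_2
print(v1 == v2)  # True: the surrounding context cannot change the vector
```

Contextual models such as BERT address exactly this by computing a different vector for each occurrence of a word based on its sentence.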