cs224d Day 7: Project 2 - Using a DNN to solve the NER problem

Course project description address

What is NER?

Named entity recognition (NER) is the task of identifying entities with specific meanings in text, mainly names of people, places, and institutions, and other proper nouns. It is a basic tool in applications such as information extraction, question-answering systems, syntactic analysis, and machine translation, and an important step in extracting structured information. (Excerpted from BosonNLP.)

How to identify?

I will first explain the logic of solving the problem and then walk through the main code. If you are interested, please go here to see the complete code. The code builds a DNN with only one hidden layer in TensorFlow to handle the NER problem.

1. Problem identification

NER is a classification problem. Given a word, we need to determine from its context which of the following four categories it belongs to. If it belongs to none of the four, its category is 0, meaning it is not an entity. This is therefore a 5-class classification problem:
Our training data has two columns: the first is the word and the second is its label.
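As a minimal sketch of how such two-column data might be read, the snippet below parses word/label lines and maps each label to an integer id. The tag names (O, LOC, MISC, ORG, PER), the tab separator, the blank line between sentences, and the helper name parse_two_column are all assumptions made for illustration; the article does not show its own loader.

```python
# Hypothetical tag set: "O" (not an entity, class 0) plus four entity classes.
TAG_NAMES = ["O", "LOC", "MISC", "ORG", "PER"]
TAG_TO_NUM = {tag: i for i, tag in enumerate(TAG_NAMES)}


def parse_two_column(text):
    """Split 'word<TAB>label' lines into sentences of (word, label_id) pairs.

    A blank line is assumed to mark the end of a sentence.
    """
    sentences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        word, label = line.split("\t")
        current.append((word.lower(), TAG_TO_NUM[label]))
    if current:
        sentences.append(current)
    return sentences


# Tiny made-up sample, only to exercise the parser.
sample = "EU\tORG\nrejects\tO\nGerman\tMISC\ncall\tO\n\nPeter\tPER\nBlackburn\tPER\n"
sentences = parse_two_column(sample)
print(sentences[0])
```

Keeping the "not an entity" tag at index 0 matches the convention in the text that category 0 means the word is not an entity.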
2. Model

Next we train on this data using a deep neural network. The model is as follows.

The input x^(t) is the context window of size 3 centered on x_t, where x_t is a one-hot vector. Multiplying x_t by the matrix L turns it into the corresponding word vector, whose length is d = 50.

We build a neural network with only one hidden layer. The hidden layer has dimension 100, and the prediction y^ has dimension 5.

Cross entropy is used to calculate the error J, and J is differentiated with respect to each parameter to obtain the gradients.

In TensorFlow, differentiation is performed automatically. Here the Adam optimization algorithm uses the gradients to update the parameters, iterating so that the loss keeps decreasing until convergence.

3. Specific implementation

In def test_NER(), we run max_epochs iterations. In each one we train the model on the training data and get a train_loss and train_acc pair. We then use the model to predict on the validation data and get a val_loss and predictions pair. We keep track of the smallest val_loss and save the corresponding weights. Finally, we use those weights to predict the category labels of the test data:
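The selection logic described above can be sketched without TensorFlow. In this sketch, train_with_best_validation, run_epoch, evaluate, and the ToyModel class are hypothetical stand-ins, and the validation losses are made-up numbers used only to show which epoch gets kept; the real test_NER() is in the author's full code.

```python
import copy


def train_with_best_validation(model, run_epoch, evaluate, max_epochs):
    """Run max_epochs epochs and keep the weights with the lowest val_loss.

    `model` is any object with a `weights` attribute. `run_epoch(model)` trains
    for one epoch and returns (train_loss, train_acc). `evaluate(model)` returns
    (val_loss, predictions). All three are hypothetical stand-ins.
    """
    best_val_loss = float("inf")
    best_weights = None
    best_epoch = -1
    for epoch in range(max_epochs):
        train_loss, train_acc = run_epoch(model)
        val_loss, _predictions = evaluate(model)
        if val_loss < best_val_loss:
            best_val_loss = val_loss
            best_epoch = epoch
            # Snapshot the weights, playing the role of saving a checkpoint.
            best_weights = copy.deepcopy(model.weights)
    return best_weights, best_val_loss, best_epoch


class ToyModel:
    def __init__(self):
        self.weights = [0.0]
        self.epochs_done = 0


TOY_VAL_LOSSES = [0.9, 0.5, 0.7]  # made-up values for the demo only


def toy_run_epoch(model):
    model.weights[0] += 1.0  # pretend the weights changed during training
    model.epochs_done += 1
    return 1.0 / model.epochs_done, 0.0  # placeholder (train_loss, train_acc)


def toy_evaluate(model):
    return TOY_VAL_LOSSES[model.epochs_done - 1], []


best_weights, best_val_loss, best_epoch = train_with_best_validation(
    ToyModel(), toy_run_epoch, toy_evaluate, max_epochs=3
)
print(best_epoch, best_val_loss, best_weights)
```

Copying the weights at the best epoch matters because training continues afterwards; the final weights are not necessarily the ones with the lowest validation loss.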
4. How is the model trained?

First, load the training, validation, and test data:
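Once the data is loaded, each word has to be paired with its neighbours to form the size-3 context window the model expects. The following is a rough sketch of that step, not the author's loader: the padding tokens "<s>" and "</s>", the unknown-word token "UUUNKKK", the toy vocabulary, and the function name sentence_to_windows are assumptions.

```python
def sentence_to_windows(words, word_to_num, window_size=3, unk="UUUNKKK"):
    """Return one list of word ids per word, centered on that word.

    The sentence is padded on both sides so that the first and last words
    also get a full window. Words missing from the vocabulary map to `unk`.
    """
    pad = window_size // 2
    padded = ["<s>"] * pad + list(words) + ["</s>"] * pad
    ids = [word_to_num.get(w, word_to_num[unk]) for w in padded]
    return [ids[i:i + window_size] for i in range(len(words))]


# Toy vocabulary for illustration only.
word_to_num = {"UUUNKKK": 0, "<s>": 1, "</s>": 2, "peter": 3, "lives": 4, "in": 5}
windows = sentence_to_windows(["peter", "lives", "in", "paris"], word_to_num)
print(windows)
```

Each window is later paired with the label of its center word, so a sentence of n words produces n training examples.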
After converting the words into one-hot vectors, convert them into word vectors:
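To make this step concrete, here is a small pure-Python sketch (not the author's TensorFlow code) showing that multiplying a one-hot vector by L simply selects one row of L, and that the three word vectors of a window are concatenated into a single input of length 3 * 50. The vocabulary size of 6, the random values in L, and the helper names are assumptions.

```python
import random


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_mat(vec, mat):
    """Row vector times matrix, written out with plain lists."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def embed_window(window_ids, L):
    """Look up the row of L for each word id and concatenate the rows."""
    out = []
    for idx in window_ids:
        out.extend(L[idx])
    return out


rng = random.Random(0)
vocab_size, d = 6, 50  # d = 50 as in the text; vocab_size is a toy value
L = [[rng.uniform(-0.1, 0.1) for _ in range(d)] for _ in range(vocab_size)]

# Multiplying by a one-hot vector is the same as picking row 3 of L.
same_as_row = vec_mat(one_hot(3, vocab_size), L) == L[3]

x_window = embed_window([1, 3, 4], L)
print(same_as_row, len(x_window))  # length is 3 * 50 = 150
```

This equivalence is why implementations use a table lookup instead of building one-hot vectors and doing the full matrix product.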
Build the neural network layers: initialize the first layer's weights with Xavier initialization, add L2 regularization, and use dropout to reduce overfitting:
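The pieces named here can be sketched in plain Python as follows. This is an illustration under assumptions, not the author's TensorFlow layer: the tanh activation, the keep probability of 0.9, the L2 strength of 0.001, the 0.5 factor in the penalty, and applying Xavier initialization to both weight matrices are choices made for the sketch.

```python
import math
import random


def xavier_init(fan_in, fan_out, rng):
    """Uniform weights in [-eps, eps] with eps = sqrt(6 / (fan_in + fan_out))."""
    eps = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-eps, eps) for _ in range(fan_out)] for _ in range(fan_in)]


def affine(x, W, b):
    """Compute x W + b for a single input row x."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j] for j in range(len(b))]


def dropout(x, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob
    and rescale the survivors by 1 / keep_prob."""
    return [xi / keep_prob if rng.random() < keep_prob else 0.0 for xi in x]


def l2_penalty(matrices, l2):
    """0.5 * l2 * sum of squared weights (biases are left out in this sketch)."""
    return 0.5 * l2 * sum(w * w for M in matrices for row in M for w in row)


def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
input_dim, hidden_dim, num_classes = 3 * 50, 100, 5  # sizes from the text
W, b1 = xavier_init(input_dim, hidden_dim, rng), [0.0] * hidden_dim
U, b2 = xavier_init(hidden_dim, num_classes, rng), [0.0] * num_classes

x = [rng.uniform(-0.1, 0.1) for _ in range(input_dim)]  # stand-in for an embedded window
h = [math.tanh(v) for v in affine(dropout(x, 0.9, rng), W, b1)]
y_hat = softmax(affine(dropout(h, 0.9, rng), U, b2))
reg = l2_penalty([W, U], l2=0.001)
print(len(h), len(y_hat), sum(y_hat))
```

Dropout is applied only during training; at prediction time the keep probability would be set to 1 so that no units are dropped. The penalty reg is added to the cross-entropy loss before optimization.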
For more on what L2 regularization and dropout are and how they reduce overfitting, please read this blog post, which summarizes them simply and clearly.

Use cross entropy to calculate the loss:
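As a small worked example of this loss, the sketch below computes the cross entropy for a single 5-class prediction from raw scores (logits), along with its gradient with respect to those scores, which for softmax followed by cross entropy is the predicted distribution minus the one-hot target. The function names and the example scores are illustrative assumptions.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Loss for one example: -log(softmax(logits)[label])."""
    return -math.log(softmax(logits)[label])


def grad_wrt_logits(logits, label):
    """dJ/dlogits = y_hat - y for softmax followed by cross entropy."""
    y_hat = softmax(logits)
    return [p - (1.0 if k == label else 0.0) for k, p in enumerate(y_hat)]


def mean_loss(batch_logits, labels):
    """Average the per-example losses over a batch."""
    return sum(cross_entropy(z, y) for z, y in zip(batch_logits, labels)) / len(labels)


# With equal scores every class gets probability 1/5, so the loss is log(5).
uniform_loss = cross_entropy([0.0] * 5, label=2)
# A large score on the correct class drives the loss toward 0.
confident_loss = cross_entropy([0.0, 0.0, 8.0, 0.0, 0.0], label=2)
print(uniform_loss, confident_loss)
```

The value log(5), about 1.61, is a useful sanity check: an untrained 5-class model should start near this loss.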
Then use Adam Optimizer to minimize the loss:
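TensorFlow hides the update rule inside the optimizer, so it may help to see one Adam step written out. The sketch below follows the standard Adam update with its usual default hyperparameters; the function name adam_step, the state dictionary, and the toy objective f(w) = (w - 3)^2 are assumptions for illustration and are unrelated to the NER model.

```python
import math


def adam_step(params, grads, state, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a flat list of parameters.

    `state` holds the step counter t and the running first and second
    moment estimates m and v; it is updated in place.
    """
    state["t"] += 1
    t = state["t"]
    new_params = []
    for i, (p, g) in enumerate(zip(params, grads)):
        state["m"][i] = beta1 * state["m"][i] + (1 - beta1) * g
        state["v"][i] = beta2 * state["v"][i] + (1 - beta2) * g * g
        m_hat = state["m"][i] / (1 - beta1 ** t)  # bias correction
        v_hat = state["v"][i] / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
    return new_params


# Toy problem: minimise f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
params = [0.0]
state = {"t": 0, "m": [0.0], "v": [0.0]}
first_step = None
for _ in range(2000):
    grads = [2.0 * (params[0] - 3.0)]
    params = adam_step(params, grads, state, lr=0.05)
    if first_step is None:
        first_step = params[0]
print(first_step, params[0])
```

On the very first step the bias-corrected ratio m_hat / sqrt(v_hat) has magnitude close to 1, so the parameter moves by roughly the learning rate regardless of how large the gradient is.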
After training, we obtain the weights that minimize the loss, and with them the NER classification problem is solved. Of course, to improve accuracy and address other issues, we still need to study the literature. Next time, we will implement an RNN.