The advent of the big data era has brought unprecedented data dividends to the rapid development of artificial intelligence. Fed by big data, artificial intelligence technology has made remarkable progress, most prominently in knowledge engineering, represented by knowledge graphs, and in machine learning, represented by deep learning. As the data dividend of deep learning is gradually exhausted, the performance ceiling of deep learning models is approaching. At the same time, a large number of knowledge graphs continue to emerge, and these treasure troves of human prior knowledge have not yet been effectively exploited by deep learning. Integrating knowledge graphs with deep learning has therefore become one of the important ideas for further improving the performance of deep learning models. Symbolism, represented by knowledge graphs, and connectionism, represented by deep learning, are increasingly departing from their originally independent development tracks and embarking on a new path of coordinated progress.

Historical background of the integration of knowledge graphs and deep learning

Big data brings unprecedented data dividends to machine learning, especially deep learning. Thanks to large-scale labeled data, deep neural networks can learn effective hierarchical feature representations and thus achieve excellent results in areas such as image recognition. However, as the data dividend fades, deep learning increasingly shows its limitations, especially its reliance on large-scale labeled data and its difficulty in effectively utilizing prior knowledge. These limitations hinder the further development of deep learning. Moreover, in the extensive practice of deep learning, people increasingly find that the outputs of deep learning models often conflict with prior knowledge or expert knowledge. How can deep learning shed its dependence on large-scale samples? How can deep learning models effectively utilize the large amount of existing prior knowledge? How can the outputs of deep learning models be made consistent with prior knowledge? These have become important open questions in the field of deep learning.

Human society has accumulated a vast amount of knowledge. In particular, in recent years, driven by knowledge graph technology, machine-friendly online knowledge graphs have emerged in large numbers. A knowledge graph is essentially a semantic network that expresses entities, concepts, and the semantic relationships among them. Compared with traditional knowledge representation forms (such as ontologies and traditional semantic networks), knowledge graphs offer high entity/concept coverage, diverse semantic relationships, a machine-friendly structure (usually expressed in RDF), and high quality, making them the most important knowledge representation method in the era of big data and artificial intelligence. Whether the knowledge contained in knowledge graphs can be used to guide the learning of deep neural network models, and thus improve model performance, has become one of the important questions in deep learning research.

At present, applying deep learning technology to knowledge graphs is relatively straightforward.
A large number of deep learning models can effectively complete end-to-end entity recognition, relation extraction, and relation completion tasks, and can in turn be used to build or enrich knowledge graphs. This paper mainly discusses the opposite direction: the application of knowledge graphs in deep learning models. The current literature suggests two main approaches. The first is to feed the semantic information of the knowledge graph into the deep learning model as input: the discrete knowledge graph is expressed as continuous vectors, so that the prior knowledge in the graph can serve as input to a deep model. The second is to use knowledge as a constraint on the optimization objective to guide the learning of the deep model; typically, the knowledge in the graph is expressed as a posterior regularization term of the optimization objective. The former has produced a large body of work and has become a research hotspot: vector representations of knowledge graphs have been effectively applied as important features in practical tasks such as question answering and recommendation. Research on the latter has only just started; this paper focuses on deep learning models constrained by first-order predicate logic.

Knowledge graphs as input to deep learning

Knowledge graphs are a typical example of recent progress in the symbolist branch of artificial intelligence. Entities, concepts, and relations in knowledge graphs are all given discrete, explicit symbolic representations. However, such discrete symbols are difficult to use directly in neural networks, which operate on continuous numerical representations. To enable neural networks to effectively utilize the symbolic knowledge in knowledge graphs, researchers have proposed a large number of representation learning methods for knowledge graphs. Representation learning of knowledge graphs aims to learn real-valued vector representations of the graph's constituent elements (nodes and edges). These continuous vector representations can be used as inputs to neural networks, allowing neural models to make full use of the large amount of prior knowledge in knowledge graphs. This trend has spawned a large body of research on knowledge graph representation learning. This chapter first briefly reviews representation learning of knowledge graphs and then introduces how the learned vector representations are applied in practical tasks based on deep learning models, especially question answering and recommendation.

1. Representation learning of knowledge graphs

Representation learning of knowledge graphs aims to learn vector representations of entities and relations. The key is to define a reasonable loss function f_r(h, t) over the facts (triples <h, r, t>) of the knowledge graph, where h and t denote the vector representations of the head entity h and the tail entity t of the triple. Usually, when the fact <h, r, t> holds, f_r(h, t) should be minimized. Considering all the facts in the knowledge graph, the vector representations of entities and relations can be learned by minimizing

min Σ_{<h,r,t> ∈ O} f_r(h, t),

where O denotes the set of all facts in the knowledge graph. Different representation learning methods define the corresponding loss function according to different principles. Here, the basic ideas of knowledge graph representation are introduced through distance-based and translation-based models [1].
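As a concrete illustration of this objective, below is a minimal sketch of how such a loss is typically minimized in practice. Because directly minimizing f_r over observed triples alone admits a degenerate all-zero solution, methods in this family usually minimize a margin-based ranking variant that also scores corrupted (negative) triples; the specific score function, margin, and toy data here are illustrative assumptions, not details from the cited papers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary: entity/relation ids -> randomly initialized embeddings.
n_entities, n_relations, dim = 5, 2, 8
E = rng.normal(scale=0.1, size=(n_entities, dim))   # entity embeddings
R = rng.normal(scale=0.1, size=(n_relations, dim))  # relation embeddings

def f_r(h, r, t):
    """Triple loss f_r(h, t); a TransE-style distance is assumed here."""
    return np.linalg.norm(E[h] + R[r] - E[t])

# O: the set of observed facts <h, r, t> in the knowledge graph.
O = [(0, 0, 1), (1, 1, 2), (2, 0, 3)]

def margin_ranking_loss(O, margin=1.0):
    """Margin-based surrogate for min Σ f_r(h, t): push observed triples
    below randomly corrupted ones by at least `margin`."""
    total = 0.0
    for (h, r, t) in O:
        t_neg = rng.integers(n_entities)   # corrupt the tail at random
        total += max(0.0, margin + f_r(h, r, t) - f_r(h, r, t_neg))
    return total / len(O)

print("loss before training:", margin_ranking_loss(O))
```

In a real implementation, the embeddings would then be updated by stochastic gradient descent on this loss, typically with entity vectors renormalized after each step.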
Distance-based models. The representative work is the SE model [2]. Its basic idea is that when two entities belong to the same triple <h, r, t>, their vector representations should also be close to each other in a projected space. The loss function is therefore defined as the distance between the projected vectors,

f_r(h, t) = ||W_{r,1} h − W_{r,2} t||,

where the matrices W_{r,1} and W_{r,2} project the head entity h and the tail entity t of the triple, respectively. However, because SE introduces two separate projection matrices, it has difficulty capturing the semantic correlation between entities and relations. To address this problem, Socher et al. replaced the linear transformation layer of the traditional neural network with a third-order tensor to characterize the scoring function. Bordes et al. proposed an energy matching model that captures the interaction between entity vectors and relation vectors by introducing the Hadamard product of multiple matrices.

Translation-based models. The representative work, the TransE model, describes the correlation between entities and relations as a translation between vectors in the embedding space [3]. The model assumes that if <h, r, t> holds, the embedding of the tail entity t should be close to the embedding of the head entity h plus the relation vector r, that is, h + r ≈ t. TransE therefore adopts

f_r(h, t) = ||h + r − t||

as its scoring function: when a triple holds, the score is low; otherwise the score is high. TransE is very effective for simple 1-1 relations (where the ratio of the numbers of entities connected at the two ends of the relation is 1:1), but its performance drops significantly on complex 1-N, N-1, and N-N relations. For such complex relations, Wang et al. proposed the TransH model, which learns different entity representations under different relations by projecting entities onto the hyperplane associated with each relation. Lin et al. proposed the TransR model, which projects entities into a relation-specific subspace through a projection matrix, likewise learning different entity representations under different relations.

Beyond these two typical families, there are many other representation learning models. For example, Sutskever et al. used tensor factorization and Bayesian clustering to learn relational structure, and Ranzato et al. introduced a three-way restricted Boltzmann machine, parameterized by a tensor, to learn vector representations of knowledge graphs. Current mainstream knowledge graph representation learning methods still suffer from various problems: they cannot adequately describe the semantic correlation between entities and relations, cannot handle representation learning for complex relations well, introduce so many parameters that the models become overly complex, or are too computationally inefficient to scale to large knowledge graphs. To better provide prior knowledge for machine learning and deep learning, representation learning of knowledge graphs remains a long-term research topic.
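To make the difference between these scoring functions concrete, the following sketch implements the TransE score and the TransH-style hyperplane projection side by side. This is a minimal numpy illustration under simplifying assumptions: the dimensions and data are made up, and details such as the norm choice and parameter constraints of the published models are omitted.

```python
import numpy as np

def transe_score(h, r, t, norm=1):
    """TransE: f_r(h, t) = ||h + r - t||; a low score means a plausible triple."""
    return np.linalg.norm(h + r - t, ord=norm)

def transh_score(h, r, t, w_r, norm=1):
    """TransH: project h and t onto the hyperplane with normal w_r, so an
    entity can take a different effective representation per relation."""
    w_r = w_r / np.linalg.norm(w_r)   # keep the hyperplane normal a unit vector
    h_p = h - (h @ w_r) * w_r         # projection of h onto the hyperplane
    t_p = t - (t @ w_r) * w_r
    return np.linalg.norm(h_p + r - t_p, ord=norm)

rng = np.random.default_rng(1)
dim = 8
h, r, t = rng.normal(size=(3, dim))   # toy head, relation, tail vectors
w_r = rng.normal(size=dim)            # toy relation-specific normal vector

print("TransE score:", transe_score(h, r, t))
print("TransH score:", transh_score(h, r, t, w_r))
```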
2. Applications of vectorized knowledge graph representations

Application 1: Question answering systems. Natural language question answering is an important form of human-computer interaction, and deep learning has made it possible to generate answers directly from question-answer corpora. However, most deep question answering models still find it difficult to use large amounts of knowledge to produce accurate answers. For simple factual questions, Yin et al. proposed a deep question answering model based on an encoder-decoder framework that can make full use of the knowledge in a knowledge graph [4]. In deep neural networks, the semantics of a question is typically represented as a vector, and questions with similar vectors are considered to have similar semantics; this is the typical connectionist approach. The knowledge representation of a knowledge graph, by contrast, is discrete: there are no gradual transitions between pieces of knowledge. This is the typical symbolist approach. By vectorizing the knowledge graph, a question can be matched against triples (that is, their vector similarity can be computed), so that the best-matching triple can be found in the knowledge base for a given question. The matching process is shown in Figure 1. For the question Q: "How tall is Yao Ming?", the words of the question are first represented as a vector array H_Q. Candidate triples that may match the question are then retrieved from the knowledge graph. Finally, the semantic similarity between the question and each candidate triple is computed with the following similarity formula:

S(Q, τ) = x_Q^T M u_τ,

where S(Q, τ) denotes the similarity between question Q and candidate triple τ, x_Q is the vector of the question (computed from H_Q), u_τ is the vector of the knowledge graph triple, and M is a parameter matrix to be learned.
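The following is a minimal sketch of this bilinear matching score. The mean-pooling of H_Q into x_Q and the toy data are illustrative assumptions; the actual model in [4] uses an encoder-decoder architecture and is considerably more elaborate.

```python
import numpy as np

def question_triple_similarity(H_Q, u_tau, M):
    """S(Q, tau) = x_Q^T M u_tau, with x_Q obtained here by mean-pooling the
    word vectors in H_Q (the pooling choice is an illustrative assumption)."""
    x_Q = H_Q.mean(axis=0)        # question vector derived from word vectors
    return x_Q @ M @ u_tau

rng = np.random.default_rng(2)
d_q, d_kg = 6, 8
H_Q = rng.normal(size=(5, d_q))          # toy word vectors of the question
M = rng.normal(size=(d_q, d_kg))         # learned bilinear parameter matrix
candidates = rng.normal(size=(3, d_kg))  # vectors u_tau of 3 candidate triples

scores = [question_triple_similarity(H_Q, u, M) for u in candidates]
best = int(np.argmax(scores))            # pick the best-matching triple
print("similarity scores:", np.round(scores, 3), "-> best candidate:", best)
```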
Application 2: Recommendation systems. Personalized recommendation is one of the important intelligent services of major social media and e-commerce websites. With the growing adoption of knowledge graphs, many studies have realized that the knowledge in knowledge graphs can improve the content (feature) descriptions of users and items in content-based recommendation systems, thereby improving recommendation quality. Meanwhile, recommendation algorithms based on deep learning increasingly outperform traditional collaborative filtering models [5]. However, research that integrates knowledge graphs into a deep learning framework for personalized recommendation is still relatively rare. Zhang et al. made such an attempt, making full use of three typical types of knowledge: structured knowledge (knowledge graphs), textual knowledge, and visual knowledge (images) [6]. The authors obtained vector representations of the structured knowledge through network embedding, used an SDAE (stacked denoising autoencoder) and a stacked convolutional autoencoder to extract textual and visual features respectively, and finally integrated the three types of features into a collaborative ensemble learning framework to achieve personalized recommendation. Experiments on movie and book datasets showed that this recommendation algorithm, which integrates deep learning and knowledge graphs, performs well.

Knowledge graphs as constraints on deep learning

Hu et al. proposed a model that integrates first-order predicate logic into deep neural networks and successfully applied it to problems such as sentiment classification and named entity recognition [7]. Logical rules are a flexible representation of high-level cognition and structured knowledge, and a typical form of knowledge representation. Introducing the logical rules that people have accumulated into deep neural networks, and using human intent and domain knowledge to guide neural models, is therefore of great significance. Some earlier work attempted to introduce logical rules into probabilistic graphical models, the representative example being Markov logic networks [8], but few works have managed to introduce logical rules into deep neural networks. The framework proposed by Hu et al. can be summarized as a "teacher-student network", as shown in Figure 2, consisting of two parts: a teacher network q(y|x) and a student network pθ(y|x). The teacher network is responsible for modeling the knowledge expressed by the logical rules, while the student network is trained by back-propagation under the constraints imposed by the teacher network, thereby absorbing the logical rules. This framework can add logical rules to most tasks that use deep neural networks as models, including sentiment analysis and named entity recognition, improving performance over the base neural model. The learning process mainly includes the following steps: in each iteration, the teacher network q(y|x) is first constructed by projecting the current student network pθ(y|x) into the subspace constrained by the logical rules; the parameters θ of the student network are then updated so that the student both fits the true labels and imitates the soft predictions of the teacher. Iterating these two steps gradually distills the knowledge encoded in the rules into the student network's parameters.
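A minimal sketch of the student update in such a rule-distillation scheme follows. The cross-entropy form, the imitation weight π, and the hand-made teacher distribution are illustrative assumptions; in the actual framework, the teacher is derived from the student and the rules by a closed-form projection rather than specified by hand.

```python
import numpy as np

def cross_entropy(target, pred, eps=1e-12):
    """H(target, pred) for discrete probability distributions."""
    return -np.sum(target * np.log(pred + eps))

def student_loss(y_true, p_student, q_teacher, pi=0.5):
    """Balance fitting the hard labels against imitating the rule-constrained
    teacher: (1 - pi) * H(y_true, p) + pi * H(q_teacher, p)."""
    return (1 - pi) * cross_entropy(y_true, p_student) \
         + pi * cross_entropy(q_teacher, p_student)

# Toy 3-class example: the teacher softens the student's prediction toward
# whatever the logic rules prefer (here just a hand-made distribution).
y_true = np.array([0.0, 1.0, 0.0])      # hard ground-truth label
p_student = np.array([0.2, 0.5, 0.3])   # current student prediction
q_teacher = np.array([0.1, 0.8, 0.1])   # rule-constrained teacher output

print("student objective:", round(student_loss(y_true, p_student, q_teacher), 4))
```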
Conclusion

With the further development of deep learning research, how to effectively utilize the large amount of existing prior knowledge and reduce models' dependence on large-scale labeled samples has gradually become one of the mainstream research directions. Representation learning of knowledge graphs has laid the necessary foundation for exploration in this direction, and recent pioneering work on integrating knowledge into deep neural network models is quite inspiring. Overall, however, current deep learning models still have very limited means of exploiting prior knowledge, and the academic community still faces huge challenges in exploring this direction.