Reposted from: Almost Human
On April 23, WWW-2021 (The Web Conference 2021: International World Wide Web Conference), the top conference on the World Wide Web, announced its Best Paper Award winner and runner-up. The paper "Heterogeneous Graph Neural Network via Attribute Completion", from the team of Associate Professor Jin Di at Tianjin University, won the Best Paper Award (Runner-Up).
WWW (now renamed TheWebConf) is the top conference in the World Wide Web field. It was founded by Turing Award winner Tim Berners-Lee, is rated a CCF-A conference by the China Computer Federation, and is held once a year. This year's WWW received 1736 submissions and accepted 357 papers, an acceptance rate of 20.6%. The Best Paper Award has one winner and one runner-up.
The WWW 2021 Best Paper Award (Runner-Up) went to the team of Associate Professor Jin Di at Tianjin University. The work proposes the attribute completion problem for heterogeneous information networks together with an efficient solution. The scheme is orthogonal to existing heterogeneous graph neural network frameworks and achieves strong results on multiple real-world heterogeneous datasets.
Paper link: https://dl.acm.org/doi/10.1145/3442381.3449914
Code link: https://github.com/jindi-tju/HGNN-AC
1. Abstract
Heterogeneous information networks (HINs), also known as heterogeneous graphs, are complex networks composed of multiple types of nodes and edges, and they carry comprehensive information and rich semantics. Graph neural networks (GNNs), a powerful tool for processing graph-structured data, perform very well on network analysis tasks, and many GNN-based heterogeneous graph models have recently been proposed with great success. GNNs learn graph representations by propagating and aggregating node attributes, so complete node attributes are a prerequisite for these algorithms. However, real-world data is often incomplete. In heterogeneous information networks this shows up as some node types having no attributes at all. For example, in the ACM citation network, which contains three types of nodes, only paper nodes have original attributes, while author and subject nodes have none. Unlike homogeneous networks, where attributes are missing for only some nodes or in only some dimensions, attribute missing in heterogeneous networks is more severe and more complex.
Existing heterogeneous network representation learning methods mainly improve the model to boost performance, and fill in missing attributes with simple handcrafted imputation (for example, average imputation or one-hot vectors). These methods separate attribute completion from graph representation learning and overlook how much accurate attributes matter for downstream tasks, so simply imputed attributes can hardly guarantee model performance. In fact, accurate input is the basis of performance for any model, and because attribute missing is more complex in heterogeneous networks, accurate attributes matter even more there. Therefore, alongside designing new models, principled and accurate completion of missing attributes should become another important research direction for heterogeneous network analysis, and attribute completion and model design can reinforce each other. Based on this, the paper completes missing attributes in a learnable way and, by letting attribute completion and the graph neural network model reinforce each other, builds a general heterogeneous graph neural network framework for attribute-missing heterogeneous networks (HGNN-AC).
HGNN-AC has four key designs: topology-based pre-learning of prior knowledge, attribute completion based on an attention mechanism, a weakly supervised reconstruction loss, and an end-to-end model. The paper runs extensive experiments on three real-world heterogeneous networks, and the results show that the proposed framework outperforms state-of-the-art baselines.
The proposed framework consists of four parts (as shown in the figure below). First, a classical heterogeneous network representation learning method uses the topology to obtain topological representations of nodes; this captures high-order topological relationships between nodes as prior knowledge for attribute completion. Second, based on these topological representations, the framework computes the relationship between each node without attributes and its directly connected nodes with attributes, and completes the missing attributes as a weighted aggregation of those neighbors' attributes. Third, the attributes of some attributed nodes are randomly dropped and then reconstructed with the proposed completion method, which yields a weakly supervised loss. Finally, attribute completion is combined with a GNN-based heterogeneous model so that the whole system is end-to-end and the completion is task-oriented.
1) Pre-learning of node topological representations
Because the semantic information carried by topology and by attributes in a network is often strongly similar, the paper argues that high-order heterogeneous relationships in the network topology help attribute completion. It therefore uses a classical heterogeneous network representation learning method (for example, metapath2vec) to capture relationships between nodes from the topology and learn node representations H, which serve as prior knowledge to guide attribute completion.
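To make the pre-learning step more concrete, here is a minimal sketch of the meta-path-guided random walk that metapath2vec-style methods use to collect node contexts from topology alone. The toy graph, the node names, and the function `metapath_walk` are hypothetical and only for illustration; the skip-gram training that would turn such walks into the representations H is omitted.

```python
import random

def metapath_walk(adj, node_type, start, metapath, walk_length, rng):
    """Generate one random walk whose node types follow a cyclic meta-path.

    adj:       dict mapping each node to a list of its neighbors
    node_type: dict mapping each node to a type label such as "A", "P", "S"
    metapath:  type pattern whose first and last types match, e.g. ["A", "P", "A"]
    """
    if node_type[start] != metapath[0] or metapath[0] != metapath[-1]:
        raise ValueError("start type must match a cyclic meta-path")
    pattern = metapath[:-1]  # drop the repeated last type so the pattern can cycle
    walk = [start]
    while len(walk) < walk_length:
        wanted = pattern[len(walk) % len(pattern)]
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk

# Hypothetical toy citation graph: authors (A), papers (P), one subject (S).
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "s1": "S"}
adj = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "s1"],
    "p2": ["a2", "s1"],
    "s1": ["p1", "p2"],
}

walk = metapath_walk(adj, node_type, "a1", ["A", "P", "A"], 7, random.Random(0))
print(walk)
```

Each walk alternates author and paper nodes, so authors that share papers end up in each other's context even though they have no attributes of their own.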
2) Attribute completion based on an attention mechanism
Nodes fall into two sets: nodes with attributes and nodes whose attributes are missing. Using the prior knowledge H obtained above, an attention mechanism computes the importance of each first-order neighbor of a target node with missing attributes. The attributes of the first-order neighbors that have attributes are then aggregated according to these importance coefficients to complete the attributes of the target node.
Specifically, given a node pair (v, u) with node representations h_v and h_u, where v is a target node without attributes and u belongs to the set of first-order neighbors of v that have attributes, the attention layer computes the importance coefficient of node u for node v.
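As a worked form of this coefficient, one common choice is a bilinear score between the two representations. The symbols W (a learnable matrix), \sigma (an activation function) and e_{vu} are notation assumed here for illustration and are not taken from the text above:

```latex
e_{vu} = \sigma\!\left( h_v^{\top} W h_u \right)
```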
The original attributes of these attributed neighbors are then aggregated according to the normalized coefficients to complete the attributes of the target node v.
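Written out, the normalization and aggregation take the usual softmax-attention form. Here e_{vu} stands for the unnormalized importance of u for v, \mathcal{N}_v^{+} for the first-order neighbors of v that have attributes, x_u for the original attribute vector of u, and x_v^{C} for the completed attribute of v; all of these symbols are assumed notation:

```latex
a_{vu} = \frac{\exp(e_{vu})}{\sum_{s \in \mathcal{N}_v^{+}} \exp(e_{vs})},
\qquad
x_v^{C} = \sum_{u \in \mathcal{N}_v^{+}} a_{vu}\, x_u
```

Because the coefficients a_{vu} are non-negative and sum to one, the completed attribute is a convex combination of the neighbors' attributes.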
To stabilize the learning process and reduce high variance, the paper finally uses a multi-head attention mechanism to complete the attributes.
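The completion step can be sketched in a few lines of dependency-free Python. This is an illustration under stated assumptions, not the authors' implementation: the bilinear score with a tanh activation, averaging over heads, and the names `complete_attribute`, `W_heads`, `H` and `X` are all choices made here, and the scoring matrices are fixed by hand rather than learned.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def bilinear(h_v, W, h_u):
    """Compute h_v^T W h_u for list-based vectors and a list-of-rows matrix."""
    return sum(h_v[i] * W[i][j] * h_u[j]
               for i in range(len(h_v)) for j in range(len(h_u)))

def complete_attribute(v, neighbors, H, X, W_heads):
    """Attention-weighted sum of attributed neighbors, averaged over heads.

    H: dict node -> topological representation (assumed pre-learned)
    X: dict node -> original attribute vector (attributed nodes only)
    W_heads: one scoring matrix per attention head (assumed, normally learned)
    """
    if not neighbors:
        raise ValueError("target node has no attributed neighbor")
    dim = len(X[neighbors[0]])
    completed = [0.0] * dim
    for W in W_heads:
        scores = [math.tanh(bilinear(H[v], W, H[u])) for u in neighbors]
        weights = softmax(scores)
        for w, u in zip(weights, neighbors):
            for k in range(dim):
                completed[k] += w * X[u][k] / len(W_heads)
    return completed

# Hypothetical toy data: author a1 wrote papers p1 and p2.
H = {"a1": [1.0, 0.0], "p1": [1.0, 0.0], "p2": [0.0, 1.0]}
X = {"p1": [1.0, 0.0, 0.0], "p2": [0.0, 1.0, 0.0]}
W_identity = [[1.0, 0.0], [0.0, 1.0]]
W_zero = [[0.0, 0.0], [0.0, 0.0]]

x_a1 = complete_attribute("a1", ["p1", "p2"], H, X, [W_identity, W_zero])
print(x_a1)
```

In this toy case the identity head favors p1, whose representation is closer to that of a1, while the zero head weights both papers equally, so the averaged result leans toward p1 without ignoring p2.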
3) Dropping original attributes to build a weakly supervised loss
To make the attribute completion process learnable and the completed attributes accurate, the nodes with original attributes are randomly divided into two subsets. The attributes of the nodes in one subset are dropped, and the attribute completion mechanism from the previous step is used to reconstruct them.
The weakly supervised loss for attribute completion is obtained by computing the Euclidean distance between the original attributes and the reconstructed attributes.
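The drop-and-reconstruct idea can be sketched as follows. The function names, the `drop_ratio` parameter, and the choice to average the distance over the dropped nodes are assumptions made for this illustration; the text above only states that a Euclidean distance is used.

```python
import math
import random

def split_attributed_nodes(attributed, drop_ratio, rng):
    """Randomly split attributed nodes into a dropped set and a kept set."""
    nodes = sorted(attributed)
    rng.shuffle(nodes)
    n_drop = int(len(nodes) * drop_ratio)
    return set(nodes[:n_drop]), set(nodes[n_drop:])

def completion_loss(X_true, X_rebuilt, dropped):
    """Mean Euclidean distance between original and reconstructed attributes."""
    if not dropped:
        raise ValueError("no dropped nodes to supervise on")
    total = 0.0
    for v in dropped:
        total += math.sqrt(sum((a - b) ** 2
                               for a, b in zip(X_true[v], X_rebuilt[v])))
    return total / len(dropped)

# Hypothetical example: ten attributed paper nodes, 30% of them dropped.
papers = [f"p{i}" for i in range(10)]
dropped, kept = split_attributed_nodes(papers, 0.3, random.Random(0))

# Hand-made vectors standing in for original and reconstructed attributes.
X_true = {"p1": [0.0, 0.0], "p2": [1.0, 1.0]}
X_rebuilt = {"p1": [3.0, 4.0], "p2": [1.0, 1.0]}
loss = completion_loss(X_true, X_rebuilt, {"p1", "p2"})
print(loss)  # (5.0 + 0.0) / 2 = 2.5
```

The supervision is "weak" in the sense that it comes from attributes the network already had, not from extra labels.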
4) Combining with a heterogeneous graph neural network model to build an end-to-end system
With the proposed attribute completion mechanism, the existing attributes and the completed attributes are combined into a complete attribute matrix.
The complete attribute matrix and the topology are fed into the graph neural network model to obtain the model's label prediction loss.
To make attribute completion task-oriented, the label prediction loss and the attribute completion loss are combined into an end-to-end system that optimizes the two jointly.
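One way to write this end-to-end objective is given below. All symbols are assumed notation: \mathcal{V}^{+} and \mathcal{V}^{-} are the nodes with and without attributes, x_i^{C} is a completed attribute, A is the topology, \Phi is any heterogeneous GNN, f is a classification loss such as cross-entropy, Y are the labels, and \lambda is a weighting coefficient introduced here as an assumption to balance the two terms:

```latex
\begin{aligned}
X^{\mathrm{new}} &= \{\, x_i^{C},\ x_j \mid i \in \mathcal{V}^{-},\ j \in \mathcal{V}^{+} \,\} \\
\mathcal{L}_{\mathrm{prediction}} &= f\big( \Phi(A, X^{\mathrm{new}}),\ Y \big) \\
\mathcal{L} &= \lambda\, \mathcal{L}_{\mathrm{completion}} + \mathcal{L}_{\mathrm{prediction}}
\end{aligned}
```

Because the completed attributes enter the prediction loss through X^{\mathrm{new}}, gradients from the downstream task also update the completion parameters, which is what makes the completion task-oriented.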
Experiments are carried out on three real-world heterogeneous network datasets. The dataset statistics are as follows:
1) Node classification results - The HGNN-AC framework is evaluated by combining it with two state-of-the-art heterogeneous graph neural network models (MAGNN and GTN):
2) Case study - Different attribute completion methods are compared experimentally. In the ACM dataset, paper nodes have attributes, while author and subject nodes have no original attributes. From left to right, the completion methods in the table below are: the attribute vectors of author and subject nodes are both the average of the attribute vectors of their directly connected paper nodes; the attribute vector of an author node is a one-hot vector, and the attribute vector of a subject node is the average of the attribute vectors of its directly connected paper nodes; the attributes of author and subject nodes are all one-hot vectors; the attributes of author nodes are completed by the proposed method, and the attributes of subject nodes are one-hot vectors; the attributes of author and subject nodes are both completed by the proposed method.
The paper finds that, given the complex attribute missing in heterogeneous networks, attribute completion becomes particularly important compared with the traditional focus on designing new models, and can be a new and more effective way to improve performance. It is the first to complete missing attributes in heterogeneous networks in a principled way, and it proposes a general framework for the attribute missing problem in heterogeneous graph neural network models.
Specifically, the framework first mines the relationships between nodes from meta-path-oriented high-order topological information and uses them as prior knowledge about the semantic relationships between nodes. It then provides an effective attention mechanism, guided by this prior information, for completing node attributes, and defines a weakly supervised loss by randomly dropping attributes, so that attribute completion becomes a sound, learnable process guided by prior knowledge. Finally, the attribute completion process and the target task are defined within the same graph neural network framework, building a task-oriented end-to-end framework in which the two reinforce each other. The framework is orthogonal to most heterogeneous graph neural network models and brings stable performance improvements to them. The authors also hope that this new view offers a new and effective direction for existing GNN-based research on heterogeneous networks.