Tianjin University: A New Heterogeneous Graph Neural Network Architecture Based on Attribute Completion

Doctor of artificial intelligence 2021-10-14 04:58:18

Reposted from: Almost Human

On April 23, the top World Wide Web conference WWW-2021 (The Web Conference 2021: International World Wide Web Conference) announced the winner and runner-up of its Best Paper Award. The paper 《Heterogeneous Graph Neural Network via Attribute Completion》, from the team of Associate Professor Jin Di at Tianjin University, won the Best Paper Award (Runner-Up).


WWW (now renamed TheWebConf) is the top conference in the World Wide Web field. It was founded by Turing Award winner Tim Berners-Lee, is rated a CCF-A conference by the China Computer Federation, and is held once a year. This year's WWW received 1,736 submissions and accepted 357 papers, an acceptance rate of 20.6%. The Best Paper Award had one winner and one runner-up.

The WWW 2021 Best Paper Award (Runner-Up) went to the team of Associate Professor Jin Di from Tianjin University. This research is the first to pose the attribute completion problem for heterogeneous information networks and proposes an efficient solution to it. The scheme is orthogonal to existing heterogeneous graph neural network frameworks and achieves excellent results on multiple real-world heterogeneous datasets.


  • Paper link: https://dl.acm.org/doi/10.1145/3442381.3449914

  • Code link: https://github.com/jindi-tju/HGNN-AC

1. Abstract

Heterogeneous information networks (HINs), also known as heterogeneous graphs, are complex networks composed of multiple types of nodes and edges, and they carry comprehensive information and rich semantics. Graph neural networks (GNNs), a powerful tool for processing graph-structured data, perform very well on network analysis tasks, and many GNN-based heterogeneous graph models have recently been proposed with great success. GNNs learn graph representations by propagating and aggregating node attributes, so complete node attributes are a prerequisite for these algorithms. However, most real-world scenarios suffer from incomplete information. In heterogeneous information networks this shows up as the attributes of certain node types being entirely missing. For example, in the ACM citation network, which contains three types of nodes, only paper nodes have original attributes, while author and subject nodes have none. Unlike homogeneous networks, where some nodes lack attributes or some attribute dimensions are missing, attribute missing in heterogeneous networks is more extensive and more complex.

Existing heterogeneous network representation learning methods mainly seek better performance by improving the model, and they fill in missing attributes with simple handcrafted imputation (for example, mean imputation or one-hot vectors). These methods separate attribute completion from graph representation learning and overlook how much accurate attributes matter for downstream tasks, so simply imputed attributes can hardly guarantee model performance. In fact, accurate input is the basis for improving any model, and accurate attributes matter even more given the more complex attribute missing in heterogeneous networks. Therefore, alongside designing new models, completing missing attributes in a principled and accurate way should become another important research direction for heterogeneous network analysis, and attribute completion and model design can reinforce each other. On this basis, the paper proposes a learnable way to complete missing attributes and, exploiting the mutual reinforcement between attribute completion and the graph neural network model, builds a general heterogeneous graph neural network framework for attribute-missing heterogeneous networks (HGNN-AC).
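
To make the two handcrafted baselines concrete, here is a minimal sketch of mean imputation and one-hot imputation on a toy graph. The node names, the adjacency dictionary, and the helper functions are illustrative assumptions, not code from the paper, and the sketch assumes every author has at least one attributed paper neighbor.

```python
# Toy heterogeneous graph: only paper nodes carry raw attributes.
paper_attrs = {
    "p1": [1.0, 0.0, 2.0],
    "p2": [3.0, 4.0, 0.0],
}
# Hypothetical adjacency: each author is linked to the papers they wrote.
author_to_papers = {
    "a1": ["p1", "p2"],
    "a2": ["p2"],
}


def mean_imputation(neighbors, attrs):
    """Fill a node's attributes with the mean of its attributed neighbors."""
    dim = len(next(iter(attrs.values())))
    total = [0.0] * dim
    for n in neighbors:
        for d in range(dim):
            total[d] += attrs[n][d]
    return [t / len(neighbors) for t in total]


def one_hot_imputation(index, num_nodes):
    """Fill a node's attributes with a one-hot identity vector."""
    return [1.0 if i == index else 0.0 for i in range(num_nodes)]


mean_attrs = {a: mean_imputation(ps, paper_attrs)
              for a, ps in author_to_papers.items()}
one_hot_attrs = {a: one_hot_imputation(i, len(author_to_papers))
                 for i, a in enumerate(sorted(author_to_papers))}

print(mean_attrs["a1"])     # [2.0, 2.0, 1.0]
print(one_hot_attrs["a2"])  # [0.0, 1.0]
```

Neither rule has trainable parameters, which is why such imputation cannot adapt to the downstream task.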

HGNN-AC has four key designs: topology-based pre-learning of prior knowledge, attention-based attribute completion, a weakly supervised reconstruction loss, and an end-to-end model. Extensive experiments on three real-world heterogeneous networks show that the proposed framework outperforms state-of-the-art baselines.

2. Method

The proposed framework consists of four main parts (see the framework figure in the original paper). First, a classical heterogeneous network representation learning method uses the network topology to obtain topological representations of nodes, capturing the high-order topological relationships between nodes as prior knowledge for attribute completion. Second, based on these topological representations, the framework computes the relationship between each node without attributes and its directly connected nodes with attributes, and completes the missing attributes by a weighted aggregation of the attributed neighbors' attributes. Third, the attributes of some attributed nodes are randomly dropped and then reconstructed with the proposed completion method, which yields a weakly supervised loss. Finally, attribute completion is combined with a GNN-based heterogeneous model so that the whole system is end-to-end and the completion is task-oriented.


1) Pre-learning of node topological representations

Because the semantic information carried by topology and by attributes in a network is often strongly similar, the paper argues that the high-order heterogeneous relationships in the network topology help attribute completion. It therefore uses a classical heterogeneous network representation learning method (for example, metapath2vec) to capture the relationships between nodes from the topology and learn the node representations H, which serve as prior knowledge to guide attribute completion.
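
As a rough illustration of how a method such as metapath2vec sees the topology, the sketch below generates metapath-guided random walks on a toy graph. The graph, the type labels, and the function name `metapath_walk` are hypothetical. The skip-gram training step that would turn such walks into the representations H is omitted.

```python
import random

# Toy heterogeneous graph (hypothetical): authors a*, papers p*, subjects s*.
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "s1": "S"}
edges = [("a1", "p1"), ("a1", "p2"), ("a2", "p2"), ("p1", "s1"), ("p2", "s1")]

neighbors = {n: [] for n in node_type}
for u, v in edges:
    neighbors[u].append(v)
    neighbors[v].append(u)


def metapath_walk(start, metapath, walk_length, rng):
    """Random walk whose node types follow the given metapath cyclically.

    `metapath` must start and end with the same type, e.g. ["A", "P", "A"].
    """
    assert node_type[start] == metapath[0]
    cycle = metapath[:-1]  # drop the repeated last type so the pattern can loop
    walk = [start]
    while len(walk) < walk_length:
        wanted = cycle[len(walk) % len(cycle)]
        candidates = [n for n in neighbors[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk


rng = random.Random(0)
walks = [metapath_walk(a, ["A", "P", "A"], 5, rng) for a in ("a1", "a2")]
for w in walks:
    print(w, [node_type[n] for n in w])
```

Each walk alternates author and paper nodes, so nodes that co-occur in walks are related through the author-paper-author metapath rather than through arbitrary edges.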

2) Attribute completion based on attention mechanism

Let V+ denote the set of nodes with attributes and V- the set of nodes whose attributes are missing. Using the prior knowledge H obtained above, an attention mechanism computes the importance of each first-order neighbor of a target node with missing attributes. The attributes of the first-order neighbors that have attributes (nodes in V+) are then aggregated according to these importance coefficients to complete the attributes of the target node (a node in V-).

Specifically, given a node pair (v, u) and the corresponding node representations h_v and h_u (where v is a target node without attributes and u belongs to N_v^+, the set of attributed nodes among the first-order neighbors of v), the importance coefficient of node u for node v is computed as:
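
The formula image is missing from this copy. One form consistent with the description, given here as an assumption rather than a quotation, is a bilinear attention score, where e_{vu} is the importance of u for v, W is a learnable matrix, and \sigma is an activation function:

```latex
e_{vu} = \sigma\left( h_v^{\top} W h_u \right)
```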


Normalize :
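
The normalization formula is also missing here. Assuming the usual softmax over the attributed first-order neighbors N_v^{+} of v, with e_{vu} the unnormalized importance of u for v, it would read:

```latex
a_{vu} = \mathrm{softmax}(e_{vu}) = \frac{\exp(e_{vu})}{\sum_{s \in N_v^{+}} \exp(e_{vs})}
```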


The original attributes of the nodes in N_v^+ (the attributed first-order neighbors of v) are aggregated according to the normalized coefficients to complete the attributes of the target node v:
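
With the formula image missing, the weighted aggregation described above can be written as follows, where x_u is the original attribute vector of neighbor u, a_{vu} its normalized coefficient, and x_v^{C} the completed attribute of v (the superscript C is notation assumed here):

```latex
x_v^{C} = \sum_{u \in N_v^{+}} a_{vu}\, x_u
```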


To stabilize the learning process and reduce high variance, the paper finally uses a multi-head attention mechanism to complete the attributes:
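
The whole completion step can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' implementation: the bilinear score with a sigmoid activation, the averaging over heads, the fixed toy matrices standing in for learned parameters, and all function names are choices made for this example.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(W, x):
    return [dot(row, x) for row in W]


def attention_scores(h_v, neighbor_embs, W):
    """Score each neighbor as sigmoid(h_v^T W h_u) (assumed bilinear form)."""
    return [1.0 / (1.0 + math.exp(-dot(h_v, matvec(W, h_u))))
            for h_u in neighbor_embs]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def complete_attribute(h_v, neighbor_embs, neighbor_attrs, heads):
    """Average over heads the attention-weighted sum of neighbor attributes."""
    dim = len(neighbor_attrs[0])
    completed = [0.0] * dim
    for W in heads:
        weights = softmax(attention_scores(h_v, neighbor_embs, W))
        for a, x_u in zip(weights, neighbor_attrs):
            for d in range(dim):
                completed[d] += a * x_u[d] / len(heads)
    return completed


# Toy inputs: topological embeddings (from H) and raw neighbor attributes.
h_v = [1.0, 0.0]
neighbor_embs = [[1.0, 0.0], [0.0, 1.0]]
neighbor_attrs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
heads = [
    [[1.0, 0.0], [0.0, 1.0]],  # head 1: identity (stand-in for learned W)
    [[0.5, 0.0], [0.0, 0.5]],  # head 2: scaled identity
]
x_v = complete_attribute(h_v, neighbor_embs, neighbor_attrs, heads)
print(x_v)
```

In this toy case the first neighbor has the embedding closer to h_v, so it receives the larger weight and dominates the completed attribute vector.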


3) Dropping original attributes to build a weakly supervised loss

To make the attribute completion process learnable while keeping the completed attributes accurate, the paper randomly divides the nodes with original attributes into two subsets, V+_drop and V+_keep. The attributes of the nodes in V+_drop are dropped, and the completion mechanism from the previous step is used to reconstruct them:


The weakly supervised loss for attribute completion is obtained by computing the Euclidean distance between the original attributes and the reconstructed attributes:
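
A minimal sketch of this drop-and-reconstruct loss follows. The toy graph, the drop ratio, and the averaging over dropped nodes are assumptions made for illustration, and `complete` is a uniform-mean stand-in for the learned attention-based completion.

```python
import math
import random

# Toy data (hypothetical): attributed nodes and their original attributes.
attrs = {
    "p1": [1.0, 0.0],
    "p2": [0.0, 1.0],
    "p3": [1.0, 1.0],
    "p4": [0.0, 0.0],
}
# Hypothetical first-order neighbors among the attributed nodes.
neighbors = {
    "p1": ["p2", "p3"],
    "p2": ["p1", "p4"],
    "p3": ["p1", "p4"],
    "p4": ["p2", "p3"],
}


def split_nodes(nodes, drop_ratio, rng):
    """Randomly split attributed nodes into a dropped set and a kept set."""
    shuffled = list(nodes)
    rng.shuffle(shuffled)
    n_drop = max(1, int(len(shuffled) * drop_ratio))
    return shuffled[:n_drop], shuffled[n_drop:]


def complete(node, kept, attrs, neighbors):
    """Stand-in for the learned completion: uniform mean over kept neighbors."""
    sources = [u for u in neighbors[node] if u in kept]
    dim = len(attrs[node])
    if not sources:
        return [0.0] * dim  # no attributed neighbor left to aggregate
    return [sum(attrs[u][d] for u in sources) / len(sources) for d in range(dim)]


def completion_loss(dropped, kept, attrs, neighbors):
    """Mean Euclidean distance between original and reconstructed attributes."""
    kept_set = set(kept)
    total = 0.0
    for v in dropped:
        rebuilt = complete(v, kept_set, attrs, neighbors)
        total += math.dist(attrs[v], rebuilt)
    return total / len(dropped)


rng = random.Random(0)
dropped, kept = split_nodes(sorted(attrs), drop_ratio=0.25, rng=rng)
loss = completion_loss(dropped, kept, attrs, neighbors)
print(dropped, kept, loss)
```

Because the dropped nodes still have known ground-truth attributes, the distance gives a training signal for the completion parameters without any extra labels.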


4) Combining with a heterogeneous graph neural network model to build an end-to-end system

With the proposed attribute completion mechanism, the paper combines the existing attributes with the completed attributes to obtain the complete attribute matrix:
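
Assembling the complete attribute matrix amounts to taking the original row for every attributed node and the completed row for every other node. The sketch below shows this with hypothetical node names and values. It assumes original and completed attributes share one dimension, which holds when completion aggregates attributes of a single node type.

```python
# Hypothetical inputs: original attributes for nodes that have them,
# completed attributes for nodes that lacked them.
original = {"p1": [1.0, 0.0], "p2": [0.0, 1.0]}
completed = {"a1": [0.5, 0.5], "s1": [0.2, 0.8]}
node_order = ["p1", "a1", "p2", "s1"]  # row order expected by the downstream GNN


def build_attribute_matrix(node_order, original, completed):
    """Stack one attribute row per node, preferring original attributes."""
    rows = []
    for node in node_order:
        if node in original:
            rows.append(list(original[node]))
        else:
            rows.append(list(completed[node]))
    return rows


X_new = build_attribute_matrix(node_order, original, completed)
print(X_new)  # [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.8]]
```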


The complete attribute matrix and the topology are fed into the graph neural network model to obtain the model's label prediction loss:


To achieve task-oriented attribute completion, the paper combines the label prediction loss with the attribute completion loss and builds an end-to-end system that optimizes the two jointly:
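
A joint objective of this kind is commonly a weighted sum of the two losses. The sketch below assumes cross-entropy as the prediction loss and a balancing weight called `lam`. Both are illustrative choices, as are the toy probabilities and the completion loss value.

```python
import math


def cross_entropy(probs, labels):
    """Mean negative log-likelihood of the true class over labeled nodes."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def joint_loss(prediction_loss, completion_loss, lam):
    """Weighted sum of both objectives; lam balances completion vs. prediction."""
    return prediction_loss + lam * completion_loss


# Hypothetical values: class probabilities from the GNN for two labeled nodes,
# and a completion loss computed on the nodes whose attributes were dropped.
probs = [[0.9, 0.1], [0.2, 0.8]]
labels = [0, 1]
l_pred = cross_entropy(probs, labels)
l_comp = 0.5
total = joint_loss(l_pred, l_comp, lam=0.3)
print(l_pred, total)
```

Since both terms depend on the completion parameters, gradients from the prediction loss also shape how attributes are completed, which is what makes the completion task-oriented.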


3. Experiments

Experiments are conducted on three real-world heterogeneous network datasets. The dataset statistics are as follows:


1) Node classification results - To evaluate the HGNN-AC framework, it is combined with two state-of-the-art heterogeneous graph neural network models (MAGNN and GTN):




2) Case study - Different attribute completion methods are compared experimentally. In the ACM dataset, paper nodes have attributes, while author and subject nodes have no original attributes. From left to right, the attribute completion methods in the following table are:

  • author and subject nodes both use the mean of the attribute vectors of their directly connected paper nodes;

  • author nodes use one-hot vectors, and subject nodes use the mean of the attribute vectors of their directly connected paper nodes;

  • author and subject nodes both use one-hot vectors;

  • author nodes are completed by the proposed method, and subject nodes use one-hot vectors;

  • author and subject nodes are both completed by the proposed method.


4. Summary

The paper finds that, given the complex attribute missing in heterogeneous networks, attribute completion is particularly important compared with the traditional focus on designing new models, and it can be a new and more effective way to improve performance. The paper is the first to complete missing attributes in heterogeneous networks in a principled way, and it proposes a general framework to address attribute missing in heterogeneous graph neural network models.

Specifically, the framework first mines the relationships between nodes from meta-path-oriented high-order topological information and uses them as prior knowledge about the semantic relationships between nodes. It then provides an effective attention mechanism, guided by this prior information, for completing node attributes, and defines a weakly supervised loss by randomly dropping attributes, so that attribute completion becomes a sound, learnable process guided by prior knowledge. Finally, the attribute completion process and the target task are defined within the same graph neural network framework to build a task-oriented end-to-end framework in which the two reinforce each other. The framework is orthogonal to most heterogeneous graph neural network models and brings stable performance improvements to them. The authors also hope that this view can offer a new and effective direction for existing GNN-based research on heterogeneous networks.

