Qinze 2021-06-04 10:18:43

[Paper reading notes] Fast Network Embedding Enhancement via High Order Proximity Approximation


The structure of this paper

  1. Problem addressed
  2. Main contributions
  3. Main content
  4. Reference

(1) Problem addressed

Most previous work either does not take the high-order proximity of the network into account (spectral clustering, DeepWalk, LINE, node2vec), or does take it into account but so inefficiently that the algorithm cannot scale to large networks (e.g. GraRep).


(2) Main contributions

Contribution 1. Many existing NRL (network representation learning) algorithms are summarized under a unified framework (similarity matrix construction followed by dimension reduction), and the conclusion is drawn that the representations of an NRL algorithm improve if higher-order proximity information is incorporated into the similarity matrix.

Contribution 2. An enhancement strategy, NEU, is proposed to improve the representations of existing NRL algorithms: the embedding matrix R post-processed by NEU provably (approximately) incorporates higher-order proximity between nodes. Experiments on multi-label classification and link prediction show that the algorithm is not only efficient in time but also brings a large gain in accuracy.


(3) Main content

1. Preliminaries

  • k-order proximity: first-order proximity can be expressed as the edge weight between two nodes, and second-order proximity as the number of common neighbors of two nodes. How can this be extended to higher orders? Consider another interpretation of second-order proximity: the probability that node v_i reaches node v_j in two steps. First- and second-order proximity then extend naturally to k-order proximity, i.e. the probability that node v_i reaches node v_j in k steps. Let A be the normalized adjacency matrix (the one-step transition probability matrix, capturing first-order proximity); then the k-order proximity matrix is the k-step transition probability matrix A^k, whose entry (A^k)_{ij} is the probability that node v_i reaches node v_j in k steps.

(Personal understanding: why does higher-order proximity help? Real networks are usually sparse, meaning the number of edges is often of the same order as the number of nodes. The first-order proximity matrix of a real network is therefore very sparse, and first-order proximity alone is not enough to reflect the relationships between nodes, so higher-order proximity needs to be brought in; see the small sketch below.)
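A minimal sketch of this k-step transition view, on a toy adjacency matrix made up for illustration:

```python
import numpy as np

# Toy adjacency matrix of a small undirected graph (illustrative only).
adj = np.array([[0, 1, 1, 0],
                [1, 0, 1, 0],
                [1, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# Row-normalize to obtain the one-step transition matrix A (first-order proximity).
A = adj / adj.sum(axis=1, keepdims=True)

# k-order proximity matrix A^k: entry (i, j) is the probability that a random
# walk starting at node i is at node j after exactly k steps.
k = 3
Ak = np.linalg.matrix_power(A, k)
print(Ak[0])  # 3-step transition probabilities starting from node 0
```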

2. Unified framework

The paper proposes a unified framework of dimension reduction (matrix factorization) over a similarity matrix, and shows that existing algorithms can be subsumed under it.

The unified framework of dimension reduction (matrix factorization) over a similarity matrix consists of two steps:

  • Step 1: construct the similarity matrix M (e.g. the adjacency matrix, the Laplacian matrix, a k-order proximity matrix, etc.).
  • Step 2: reduce the dimension of the similarity matrix, i.e. factorize it, e.g. by eigendecomposition or SVD.
    The goal is to factorize M ≈ RCᵀ, that is, to find matrices R and C such that RCᵀ approximates M; the deviation between M and RCᵀ can be measured by a matrix norm of their difference. Here R is the matrix of center (node) vectors and C is the matrix of context vectors. A code sketch of this step is given below.
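As an illustration of Step 2, here is a minimal sketch (my own, not from the paper) that factorizes a similarity matrix with a truncated SVD to obtain R and C:

```python
import numpy as np

def factorize(M, dim):
    """Truncated SVD: return R, C such that R @ C.T approximates M."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    R = U[:, :dim] * np.sqrt(s[:dim])   # center (node) vectors
    C = Vt[:dim].T * np.sqrt(s[:dim])   # context vectors
    return R, C

# Example: factorize the 2-step proximity matrix built from A above.
# M = np.linalg.matrix_power(A, 2)
# R, C = factorize(M, dim=2)
# print(np.linalg.norm(M - R @ C.T))   # approximation error
```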

The following examples show how existing algorithms conform to the unified framework above:

Example 1: Spectral Clustering (SC)

Similarity matrix M: the normalized Laplacian matrix (first-order proximity).
Dimension reduction method: eigendecomposition.
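A minimal sketch of the SC instance of the framework (my own illustration on the toy adjacency matrix above, not code from the paper; keeping the eigenvectors of the smallest non-trivial eigenvalues is one common choice):

```python
import numpy as np

def spectral_embedding(adj, dim):
    """Node embedding from eigenvectors of the normalized Laplacian."""
    d = adj.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(adj)) - D_inv_sqrt @ adj @ D_inv_sqrt  # normalized Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
    return eigvecs[:, 1:dim + 1]           # smallest non-trivial eigenvectors
```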

Example 2: Graph Factorization (GF)

Similarity matrix M: the normalized adjacency matrix (first-order proximity).
Dimension reduction method: SVD.

Example 3: DeepWalk

Similarity matrix M: under the matrix-factorization view, DeepWalk implicitly factorizes M = (A + A^2 + … + A^w)/w, where w is the window size of the random walks. In practice the algorithm approximates this high-order proximity by sampling random walks and never computes the k-order proximity matrices exactly.

Dimension reduction method: objective-function optimization, namely optimizing the SkipGram objective with SGD, which searches for matrices R and C such that RCᵀ approximates M.
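A small sketch of the implicit similarity matrix attributed to DeepWalk in this matrix-factorization view (window size w; toy transition matrix A as above; this matrix is never formed explicitly by DeepWalk itself):

```python
import numpy as np

def deepwalk_similarity(A, window):
    """M = (A + A^2 + ... + A^w) / w -- the matrix DeepWalk implicitly
    factorizes under the matrix-factorization view."""
    M = np.zeros_like(A)
    Ak = np.eye(len(A))
    for _ in range(window):
        Ak = Ak @ A        # next power of the transition matrix
        M += Ak
    return M / window
```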

Example 4: GraRep

Algorithm principle:
GraRep computes the 1st- through k-th-order proximity matrices exactly (k similarity matrices in total), learns a separate representation from each of them (via SVD), and finally concatenates the k representations.

It is therefore also based on factorizing similarity matrices and belongs to the proposed unified framework. However, GraRep does not scale to large networks: its computational cost is too high.
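A rough sketch of the GraRep recipe as described above (not the authors' implementation; it reuses the factorize helper from the SVD sketch and omits GraRep's log-transform of the transition probabilities):

```python
import numpy as np

def grarep_embedding(A, k_max, dim):
    """Simplified GraRep sketch: embed each A^k separately, then concatenate."""
    parts = []
    Ak = np.eye(len(A))
    for _ in range(k_max):
        Ak = Ak @ A                       # exact k-order proximity matrix
        R, _ = factorize(Ak, dim)         # per-order embedding via truncated SVD
        parts.append(R)
    return np.concatenate(parts, axis=1)  # final embedding of size k_max * dim
```

Forming and factorizing every A^k exactly is what makes GraRep accurate but expensive, which motivates the approximation studied next.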

3. Algorithm principle

Given the problems of the algorithms above, this paper studies how to learn network representations effectively from an approximate higher-order proximity matrix, so that the algorithm is both efficient and effective.

Suppose an algorithm from the NRL framework above has already been used to learn R and C such that RCᵀ approximates a relatively low-order similarity matrix f(A). On this basis, the goal is to learn better matrices R' and C' whose product R'C'ᵀ approximates a higher-order matrix g(A), whose degree is higher than that of f(A).

Definition of f(A) (the similarity matrix): a polynomial in the powers A^1, …, A^K, i.e. f(A) = a_1 A + a_2 A^2 + … + a_K A^K. The degree K of f(A) is the highest order of proximity considered in the polynomial, i.e. the largest power of A. For instance, DeepWalk's similarity matrix M above is such an f(A), with degree equal to the window size w.

Note that the main purpose of the NEU algorithm is to enhance the embeddings produced by other representation learning models: starting from embedding vectors that already contain low-order information, it fuses in higher-order information to generate better embedding vectors. The principle is very simple: the representation (embedding) matrix obtained by another algorithm is post-processed with the iterative update

R' = R + λ (A · R),    C' = C + λ (Aᵀ · C),    (3)

where λ > 0 is a small weight (λ ≤ 1/2 below).

A question: how does this iterative update of R and C take higher-order proximity into account?

Theorem
Given a node representation matrix R and a context representation matrix C (learned by any other representation algorithm), suppose RCᵀ approximates the similarity matrix M = f(A) with error bound

‖RCᵀ − f(A)‖ ≤ r,

and that f(A) has degree K. Then the matrices R' and C' obtained from the update formula (3) are such that the product R'C'ᵀ approximates the matrix

g(A) = f(A) + 2λ·A·f(A) + λ^2·A^2·f(A),

which has degree K + 2, with approximation error bound (1 + λ)^2 · r, i.e. at most 2.25 r when λ ≤ 1/2.

The conclusion from this theorem: each update iteration raises the degree of the approximated similarity matrix by 2, but the corresponding error bound grows by a factor of up to 2.25, so higher-order proximity information must be balanced against the accumulated error.
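The degree and error claims can be checked by expanding the product directly (a sketch, assuming a submultiplicative matrix norm with ‖A‖ ≤ 1, as holds for the row-normalized adjacency matrix under the infinity norm, and using RCᵀ ≈ f(A) together with the fact that f(A) commutes with A):

```latex
\begin{aligned}
R'C'^{\top}
  &= (R + \lambda A R)\,(C + \lambda A^{\top} C)^{\top} \\
  &= RC^{\top} + \lambda\,A\,RC^{\top} + \lambda\,RC^{\top}A + \lambda^{2}\,A\,RC^{\top}A \\
  &\approx f(A) + 2\lambda\,A f(A) + \lambda^{2} A^{2} f(A) \;=\; g(A), \\[4pt]
\|R'C'^{\top} - g(A)\|
  &\le (1 + 2\lambda + \lambda^{2})\,\|RC^{\top} - f(A)\|
   = (1+\lambda)^{2}\, r \;\le\; 2.25\, r \quad (\lambda \le \tfrac{1}{2}).
\end{aligned}
```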

A variant of the iterative formula:

R' = R + λ₁ (A · R) + λ₂ A · (A · R),    C' = C + λ₁ (Aᵀ · C) + λ₂ Aᵀ · (Aᵀ · C)

It can be shown in the same way that this variant obtains higher-order proximity within a single iteration: each iteration raises the degree by 4, i.e. 2 orders more than one iteration of formula (3). (More complex update formulas that gain still higher orders per iteration can be generalized analogously, again at the cost of a larger error bound.)

Summary: by iteratively updating the embedding matrix produced by another representation learning algorithm in this way, higher-order information can be fused into the embedding vectors.
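A minimal sketch of this post-processing step (my own illustration, not the authors' code; it uses the variant update with example weights lam1 and lam2, and assumes A is the row-normalized adjacency matrix and R an embedding matrix produced by any base method such as DeepWalk or LINE):

```python
import numpy as np

def neu_update(R, A, lam1=0.5, lam2=0.25, iterations=1):
    """Fuse higher-order proximity into an existing embedding matrix R.

    Each iteration applies R <- R + lam1 * (A @ R) + lam2 * (A @ (A @ R)),
    which, per the theorem above, raises the order of the proximity matrix
    that the embeddings implicitly approximate (at the cost of a larger
    approximation error bound).
    """
    for _ in range(iterations):
        AR = A @ R
        R = R + lam1 * AR + lam2 * (A @ AR)
    return R

# Usage sketch: R_base could come from DeepWalk, LINE, node2vec, etc.
# R_enhanced = neu_update(R_base, A, iterations=1)
```

Since each update is only a couple of matrix-by-embedding multiplications (sparse in practice, so linear in the number of edges), the enhancement remains fast even on large networks.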


(4) Reference

Yang C., Sun M., Liu Z., et al. Fast Network Embedding Enhancement via High Order Proximity Approximation. In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). AAAI Press, 2017.

