## [Paper Reading Notes] Fast Network Embedding Enhancement via High Order Proximity Approximation

## The structure of this paper

- Problem addressed
- Main contributions
- Main content
- Reference

### (1) Problem addressed

Most previous work either ignores the high-order proximity of the network (Spectral Clustering, DeepWalk, LINE, Node2Vec), or takes it into account but with very low efficiency, so it cannot scale to large networks (e.g., GraRep).

### (2) Main contributions

**Contribution 1.** Many existing NRL algorithms are summarized into a unified framework (similarity matrix construction followed by dimension reduction), and the conclusion is drawn that if higher-order proximity information is incorporated into the similarity matrix, the representation quality of an NRL algorithm improves.

**Contribution 2.** The NEU enhancement strategy is proposed to improve the representations learned by existing NRL algorithms: the representation matrix R processed by NEU provably incorporates (an approximation of) higher-order node proximity. Finally, experiments on multi-label classification and link prediction show that the algorithm is not only time-efficient but also substantially more accurate.

### (3) Main content

#### 1. Preliminary knowledge

- k-order proximity: first-order proximity can be expressed as the edge weight between two nodes, and second-order proximity as the number of common neighbors of two nodes. How can this be extended to higher orders? Consider another interpretation of second-order proximity: the probability that node v_i reaches node v_j in two steps. First- and second-order proximity then extend naturally to k-order proximity, i.e., the probability that node v_i reaches node v_j in k steps. Let A be the normalized adjacency matrix (the first-order transition probability matrix); then the k-order transition probability matrix is A^k, whose entry A^k_{ij} is the probability that v_i reaches v_j in k steps.

(**Personal understanding:** why does higher-order proximity help? Real networks are often sparse, which means the number of edges is often on the same order as the number of nodes. The first-order proximity matrix of a real network is therefore usually very sparse, and first-order proximity alone is not enough to reflect the relationships between nodes, so higher-order node proximity must be incorporated.)
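As a minimal sketch of the k-order transition matrix (the 4-node path graph is a made-up toy example), one can row-normalize the adjacency matrix and raise it to the k-th power:

```python
import numpy as np

# Toy undirected graph: a 4-node path 0-1-2-3 (hypothetical example).
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)

# A: normalized adjacency = first-order transition probability matrix.
A = adj / adj.sum(axis=1, keepdims=True)

# A^k: k-step transition probabilities; A^k[i, j] is the probability
# that a random walk starting at node i sits at node j after k steps.
k = 3
Ak = np.linalg.matrix_power(A, k)

print(Ak.round(3))
print(Ak.sum(axis=1))  # each row is still a probability distribution
```

Note how A^k is much denser than the sparse first-order A, which is exactly why it carries more relational information.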

#### 2. Unified framework

The paper proposes **a unified framework of similarity-matrix-based dimension reduction (matrix factorization)** and shows that existing algorithms fit into it.

The unified framework of similarity-matrix-based dimension reduction (matrix factorization) consists of two steps:

**Step 1**: Construct the similarity matrix M (e.g., the adjacency matrix, the Laplacian matrix, a k-order proximity matrix).

**Step 2**: Reduce the dimension of the similarity matrix, i.e., factorize it, e.g., by eigendecomposition or SVD.

**Goal:** factorize the matrix M = RC^T, i.e., find matrices R and C whose product approximates M; the deviation between M and RC^T can be measured by the matrix norm of their difference. Here, R is the representation matrix of center (vertex) vectors and C is the representation matrix of context vectors.
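The two steps can be sketched with truncated SVD (the similarity matrix M and rank d below are made-up toy values): after M ≈ U_d Σ_d V_d^T, taking R = U_d Σ_d^{1/2} and C = V_d Σ_d^{1/2} gives RC^T ≈ M.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 2                      # toy graph: 6 nodes, embedding dimension 2

# Step 1: a toy symmetric similarity matrix M (hypothetical values).
B = rng.random((n, n))
M = (B + B.T) / 2

# Step 2: truncated SVD, keeping the d largest singular values.
U, s, Vt = np.linalg.svd(M)
R = U[:, :d] * np.sqrt(s[:d])    # center (vertex) vectors
C = Vt[:d].T * np.sqrt(s[:d])    # context vectors

# Deviation ||M - R C^T|| (Frobenius norm); it shrinks as d grows.
err = np.linalg.norm(M - R @ C.T)
print(R.shape, C.shape, round(err, 3))
```

For the Frobenius norm this truncation is the best possible rank-d approximation, with error equal to the root sum of squares of the discarded singular values.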

**Examples showing that existing algorithms fit the unified framework:**

**Example 1**: Spectral Clustering (SC)

Similarity matrix M: the normalized Laplacian matrix (first-order proximity).

Dimension reduction: eigendecomposition.

**Example 2**: Graph Factorization (GF)

Similarity matrix M: the normalized adjacency matrix (first-order proximity).

Dimension reduction: SVD.

**Example 3**: DeepWalk

Similarity matrix M: DeepWalk approximates higher-order proximity through random-walk sampling and never explicitly computes a k-order proximity matrix in practice.

Dimension reduction: objective-function optimization, i.e., optimizing the SkipGram objective with SGD, which searches for matrices R and C such that RC^T approximates M.

**Example 4**: GraRep

Algorithm principle: GraRep exactly computes the k proximity matrices of orders 1, ..., k, learns a separate representation from each via SVD, and finally concatenates the k representations.

In essence it is also based on similarity matrix factorization and belongs to the proposed unified framework. However, GraRep does not work for large-scale networks; its computational cost is too high.

#### 3. Algorithm principle

**Motivated by the problems of the algorithms above**, the paper studies how to learn network representations effectively from an **approximate higher-order** proximity matrix (so that the algorithm is both efficient and effective).

Suppose we have already used one of the NRL algorithms in the framework above to learn an approximation RC^T of a relatively low-order similarity matrix f(A). On this basis, our goal is to learn better matrices R' and C' such that R'C'^T approximates a higher-order matrix g(A) whose degree is higher than that of f(A).

**Definition of f(A) (the similarity matrix):** a polynomial in the powers A^1, ..., A^k of A. The degree k of f(A) is the highest order of proximity considered in the polynomial, i.e., the greatest power of A; for instance, DeepWalk's similarity matrix is such a polynomial, f(A) = M.

**Note that the main purpose of NEU is to enhance the embedding results of other representation learning models**, i.e., starting from embedding vectors that contain low-order information, it fuses in higher-order information to produce better embedding vectors. The principle is very simple: apply a post-processing step to the embedding matrices obtained by another algorithm. The iterative update (formula (3) in the paper) is:

R' = R + λ A · R
C' = C + λ A^T · C

where λ is a weight hyperparameter with λ ≤ 1/2.
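A minimal sketch of one such post-processing step, assuming the one-step update R' = R + λA·R, C' = C + λAᵀ·C from formula (3) (the base embeddings R, C and the graph here are random toy matrices standing in for the output of any NRL algorithm):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, lam = 5, 3, 0.5

# Toy normalized adjacency A (row-stochastic) and base embeddings R, C.
adj = rng.random((n, n))
np.fill_diagonal(adj, 0.0)
A = adj / adj.sum(axis=1, keepdims=True)
R = rng.standard_normal((n, d))
C = rng.standard_normal((n, d))

# One NEU update: fold one more power of A into each factor.
R_new = R + lam * (A @ R)
C_new = C + lam * (A.T @ C)

# Sanity check of the algebra: R'C'^T = (I + lam*A) R C^T (I + lam*A),
# so the decomposed product now carries two extra powers of A.
lhs = R_new @ C_new.T
rhs = (np.eye(n) + lam * A) @ (R @ C.T) @ (np.eye(n) + lam * A)
print(np.allclose(lhs, rhs))
```

The check makes the mechanism concrete: if RC^T approximated f(A), the updated product approximates (I + λA) f(A) (I + λA), a polynomial of degree two higher.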

**A question**: how does this iterative update of R and C take higher-order proximity into account?

**Theorem**: Given the network representation matrix R and the context representation matrix C (which can be learned by any other representation algorithm), suppose RC^T approximates the similarity matrix M = f(A) with error bound ||f(A) − RC^T|| ≤ r, where f(A) has degree K. Then the product R'C'^T of the matrices R' and C' updated by the iterative formula (3) approximates the matrix g(A) = (I + λA) f(A) (I + λA), which has degree K + 2, with error bound ||g(A) − R'C'^T|| ≤ (1 + λ)^2 · r ≤ 2.25 r for λ ≤ 1/2.

**Conclusion from the theorem:** each iteration of the update raises the degree of the approximated similarity matrix by 2, but the corresponding error bound grows by a factor of up to 2.25, so one must balance higher-order proximity information against the accumulated approximation error.

A variant of the iterative formula:

R' = R + λ_1 A · R + λ_2 A · (A · R)
C' = C + λ_1 A^T · C + λ_2 A^T · (A^T · C)

It can be deduced that this variant obtains higher-order proximity in a single iteration: one iteration raises the degree by 4, i.e., 2 orders more than one iteration of formula (3). (Even more complex update formulas that obtain still higher orders per iteration can be generalized in the same way.)
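The variant R' = R + λ_1·A·R + λ_2·A·(A·R) (and symmetrically for C) can be sketched the same way; λ_1 = 0.5 and λ_2 = 0.25 below are illustrative values, and the matrices are random toys:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 5, 3
lam1, lam2 = 0.5, 0.25           # illustrative hyperparameter values

adj = rng.random((n, n))
np.fill_diagonal(adj, 0.0)
A = adj / adj.sum(axis=1, keepdims=True)   # normalized adjacency
R = rng.standard_normal((n, d))
C = rng.standard_normal((n, d))

# Variant update: fold A and A^2 into each factor in one step,
# so the decomposed product gains 4 degrees per iteration instead of 2.
R_new = R + lam1 * (A @ R) + lam2 * (A @ (A @ R))
C_new = C + lam1 * (A.T @ C) + lam2 * (A.T @ (A.T @ C))

# Sanity check: R'C'^T = P (R C^T) P with P = I + lam1*A + lam2*A^2.
P = np.eye(n) + lam1 * A + lam2 * (A @ A)
print(np.allclose(R_new @ C_new.T, P @ (R @ C.T) @ P))
```

Since the update is just a few sparse matrix-vector products over the embedding matrix, the post-processing cost stays linear in the number of edges, which is what makes NEU fast.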

**Summary**: by iteratively updating the embedding matrices obtained by other representation learning algorithms in this way, higher-order information can be fused into the embedding vectors.

### (4) reference

Yang C, Sun M, Liu Z, et al. Fast Network Embedding Enhancement via High Order Proximity Approximation. In: Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI). AAAI Press, 2017.