iMeta | Shen Qirong's team at Nanjing Agricultural University releases ggClusterNet, an R package for microbial network analysis and visualization

Liu Yongxin (Adam), 2022-06-23 18:10:56

ggClusterNet: an R package with a variety of microbial network mining algorithms and modularity-based visual layouts



Volume 1, Issue 3


●  On June 13, 2022, Shen Qirong's team at Nanjing Agricultural University published an article online in iMeta entitled "ggClusterNet: An R package for microbiome network analysis and modularity-based multiple network layouts".

●  The article presents ggClusterNet, an R package that displays the modular information of microbial networks and supports both microbial network data mining and cross-domain network data mining.

●  First authors: Wen Tao, Xie Penghao

●  Corresponding authors: Liu Yongxin, Yuan Jun

●  Co-authors: Yang Shengdie, Niu Guoqing, Liu Xiaoyu, Ding Zhexu, Xue Chao, Shen Qirong

Abstract

Network analysis is increasingly valued by ecologists and is finding ever wider use in ecology, so more powerful and convenient network analysis tools are needed. We therefore developed an R package, ggClusterNet, to make network data analysis, mining, and visualization easier. The package provides ten network layout algorithms designed to better display the modular information of microbial networks (randomClusterG, PolygonClusterG, PolygonRrClusterG, ArtifCluster, randSNEClusterG, PolygonModsquareG, PolyRdmNotdCirG, model_Gephi.2, model_igraph, and model_maptree). For the convenience of researchers, it also includes a variety of functions for microbial network data mining, such as correlation calculation (corMicro()), network property calculation (net_properties()), node property calculation (node_properties()), and random network calculation and comparison (random_Net_compate()). To speed up network mining further, these functions are integrated into two wrapper functions, network.2() and corBionetwork(), which are used for microbial network data mining and cross-domain network data mining, respectively. ggClusterNet is currently available on GitHub and Gitee. A complete description and examples can be found on the wiki pages.

Keywords: microbiome, network analysis, R package, visualization

Highlights


●  ggClusterNet mines both microbial networks and cross-domain networks, and provides multiple network layout algorithms built around the display of modules, making it well suited to microbiome network mining.


Full text interpretation

Development and status of microbiome network analysis methods

Network analysis not only plays an important role in exploring the mathematical, statistical, and structural properties of a set of items (nodes) and the connections between them (edges); it is also widely used to explore co-occurrence patterns among microbial taxa in complex communities. For example, Ma Bin et al. used network analysis to identify interaction patterns of the microbiome in different environments and emphasized the importance of network analysis for microbial community analysis; Yuan et al. found through network analysis that climate warming enhanced the complexity and stability of microbial networks; and Franciska et al. found through network analysis that drought strongly disturbs soil bacterial networks but has little effect on the fluctuation of fungal networks.

Tools for microbiome network analysis mainly include the web-based analysis tool MENA (MENAP), R packages (WGCNA, igraph, ggraph, SpiecEasi), interactive software (Cytoscape and Gephi), and Python modules (NetworkX and SparCC). The main tools for network computation are MENA, WGCNA, and SpiecEasi. MENA calculates the correlation matrix of a microbial network using a method based on random matrix theory (RMT), which is robust to noise. WGCNA constructs a weighted network with scale-free topology based on a soft threshold; its weighting, which takes into account correlations beyond the node pair itself, is more biologically meaningful. SpiecEasi combines data transformations developed for this kind of data with a graphical model inference framework, yielding a set of computational tools that cope well with small sample sizes and reduce false positives in correlation calculations. The main tools for network visualization are Cytoscape, Gephi, and R packages (igraph, ggraph, etc.). Cytoscape not only provides a powerful visualization scheme but also lets users add new functions through a variety of plug-ins, including microbial network mining. Gephi can produce attractive network graphics in a few steps. The visualization tools among the R packages (igraph, ggraph) can also display a network quickly from the command line and are highly reproducible. Although each of these tools has strengths in some aspect of network analysis, none of them meets our requirements for fast computation, reproducibility, and in-depth network mining at the same time. For example, the Cytoscape workflow is cumbersome and hard to reproduce, while the layouts of igraph and ggraph are not visually appealing.

More and more researchers now focus on the interactions between modules of a microbial network in order to explore their functions. However, the network contains a great many connections between microbial species, and there is no suitable visualization scheme or software that clearly shows the relationships between modules, so existing tools fall short of what researchers need in practice. To meet this growing demand, we developed the R package ggClusterNet. It provides a fast, reproducible, and easy-to-use network analysis workflow, together with a variety of powerful modularity-based visual layouts. ggClusterNet has the following characteristics: 1) the network analysis workflow can be completed quickly; 2) the workflow offers a variety of attractive visual layouts; 3) the workflow can be completed with a small amount of code and is highly reproducible.

ggClusterNet development environment and workflow

ggClusterNet is a network analysis workflow tool developed in R. For network computation it mainly calls the cor() function (R package WGCNA), the sparccboot() function (R package SpiecEasi), and the corr.test() function (R package psych). Some network layouts draw on the ggraph and sna packages. The network property calculations mainly call functions from the igraph package (average.path.length(), edge_connectivity(), no.clusters(), centralization.closeness(), etc.). ggClusterNet is open source; it can be downloaded from GitHub or installed from the R command line (devtools::install_github("taowenmicro/ggClusterNet")).

To enrich ggClusterNet's support for modular layouts, ten modularity-based network visualization layout functions were developed: randomClusterG, PolygonClusterG, PolygonRrClusterG, ArtifCluster, randSNEClusterG, PolygonModsquareG, PolyRdmNotdCirG, model_Gephi.2, model_igraph, and model_maptree. These algorithms take a correlation matrix and the module membership of the nodes as input and can compute network layouts in various styles. The resulting layouts can then be visualized with the R package ggplot2.
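As a rough illustration of the correlation step described above, the following base-R sketch builds a thresholded Spearman correlation matrix that could serve as a network adjacency matrix. It is not the package's own code: the function name cor_threshold_network, the simulated abundance table otu, and the cut-off arguments r.threshold and p.threshold are assumptions made for this example.

```r
# Minimal base-R sketch of a correlation network (illustrative only).
set.seed(1)

# Hypothetical abundance table: 20 samples (rows) x 6 taxa (columns).
otu <- matrix(rpois(20 * 6, lambda = 50), nrow = 20,
              dimnames = list(NULL, paste0("OTU", 1:6)))
# Make OTU2 track OTU1 so that at least one strong edge exists.
otu[, "OTU2"] <- otu[, "OTU1"] + rpois(20, lambda = 2)

cor_threshold_network <- function(x, r.threshold = 0.6, p.threshold = 0.05) {
  n <- ncol(x)
  r <- cor(x, method = "spearman")             # pairwise Spearman correlations
  p <- matrix(1, n, n, dimnames = dimnames(r)) # matching matrix of p-values
  for (i in seq_len(n - 1)) {
    for (j in (i + 1):n) {
      # Ties make the exact test unavailable; silence that warning.
      p[i, j] <- p[j, i] <- suppressWarnings(
        cor.test(x[, i], x[, j], method = "spearman")$p.value)
    }
  }
  # Keep only strong, significant correlations and drop self-loops.
  r[abs(r) < r.threshold | p > p.threshold] <- 0
  diag(r) <- 0
  r
}

adj <- cor_threshold_network(otu)
```

Non-zero entries of adj are the retained edges, with the sign of the correlation distinguishing positive from negative associations. The cut-offs here are arbitrary and would need to be chosen for real data.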


Figure 1. Overview of ggClusterNet functions and features

Using ggClusterNet

●  ggClusterNet workflow

In ggClusterNet, corMicro() (correlation matrix calculation for microbial networks) or corBiostripe() (correlation matrix calculation for bipartite networks) is used to compute the correlation matrix. The layout algorithms then compute a visual layout from the correlation matrix, and the result is drawn with ggplot2. net_properties(), node_properties(), and ZiPiPlot() are used, respectively, to calculate network properties, to calculate node properties, and to classify the roles of nodes according to their modules. random_Net_compate() builds a null model by generating a random network and compares its properties with those of the microbial network. All of these functions are wrapped in network.2() (the microbiome network workflow) or corBionetwork() (the bipartite network workflow), so the whole network analysis can be run with a single function call. In short, ggClusterNet can quickly complete the full analysis of microbiome and bipartite networks, including correlation calculation, network visualization, network property calculation, node property calculation, and the construction and comparison of random networks.
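To make the module-based node roles more concrete, the sketch below computes the two quantities usually plotted in a Zi-Pi diagram: the within-module degree z-score, z_i = (k_is - mean(k_s)) / sd(k_s), and the participation coefficient, P_i = 1 - sum_m (k_im / k_i)^2. It is a base-R illustration under stated assumptions, not the code behind ZiPiPlot(): the function zi_pi, the toy adjacency matrix, and the module labels are hypothetical, and the sample standard deviation is used for the z-score.

```r
# Within-module degree (z) and participation coefficient (p) in base R.
zi_pi <- function(adj, module) {
  a <- (adj != 0) * 1                 # binary, unweighted adjacency
  diag(a) <- 0
  mods <- sort(unique(module))
  # k_im[i, m]: number of links from node i to nodes in module m.
  k_im <- sapply(mods, function(m) rowSums(a[, module == m, drop = FALSE]))
  k <- rowSums(a)                     # total degree of each node
  # Links of each node to its own module.
  k_in <- k_im[cbind(seq_along(module), match(module, mods))]
  # z-score of k_in within each module (0 if the module has no variation).
  z <- ave(k_in, module, FUN = function(v) {
    s <- sd(v)
    if (is.na(s) || s == 0) rep(0, length(v)) else (v - mean(v)) / s
  })
  # Participation coefficient; isolated nodes get 0.
  p <- ifelse(k > 0, 1 - rowSums((k_im / k)^2), 0)
  data.frame(node = rownames(a), module = module, z = z, p = p)
}

# Toy network: module A is a star around n1, module B is a triangle,
# and a single edge n4-n5 bridges the two modules.
nodes <- paste0("n", 1:7)
adj <- matrix(0, 7, 7, dimnames = list(nodes, nodes))
edges <- rbind(c(1, 2), c(1, 3), c(1, 4),
               c(5, 6), c(5, 7), c(6, 7),
               c(4, 5))
adj[edges] <- 1
adj[edges[, 2:1]] <- 1                # make the matrix symmetric
module <- rep(c("A", "B"), times = c(4, 3))

res <- zi_pi(adj, module)
```

In this toy network the hub n1 gets the highest z within module A, while the bridging nodes n4 and n5 are the only ones with a non-zero participation coefficient. The thresholds used to label nodes as hubs or connectors are a separate choice and are not part of this sketch.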


Figure 2. Usage and output of ggClusterNet functions

●  Network layout

To better visualize microbial networks and highlight the interactions between modules, ten visual layout algorithms were developed. They are described below.

1) randomClusterG: The nodes of each module (group) are arranged in a ring. The modules are drawn as rings of equal radius, and the rings are placed at random on the drawing panel.

2) PolygonClusterG: The nodes of each module (group) are arranged in a ring. The modules are drawn as rings of equal radius, and the rings are placed on the vertices of a polygon that has as many sides as there are modules.

3) PolygonRrClusterG: The nodes of each module (group) are arranged in a ring. The rings have different radii depending on the number of nodes in the module (the more nodes, the larger the radius). The rings are then placed regularly on the vertices of a polygon centered on the origin (the number of sides equals the number of modules).

4) ArtifCluster: The nodes of each module (group) are arranged in a ring, with all rings sharing the same radius. The rings are positioned manually by setting coordinate values.

5) randSNEClusterG: The node coordinates within each module (group) are computed with a layout from the sna package, and the modules are placed at random on the drawing panel.

6) PolygonModsquareG: The nodes of each module (group) are arranged in a ring. The rings have different radii (the more nodes, the larger the radius) and are arranged in one or more rows, controlled by a few parameters.

7) PolyRdmNotdCirG: Based on the module information, the nodes are distributed at random within circles of different radii (the more nodes, the larger the radius). The modules are placed regularly on the vertices of a polygon centered on the origin (the number of sides equals the number of modules).

8) model_Gephi.2: Coordinates are first computed for all nodes arranged in a circle, and are then reassigned according to the microbial clustering results.

9) model_igraph: Uses the Fruchterman and Reingold algorithm and places all nodes on the plane according to their module membership.

10) model_maptree: The network is first analyzed for modularity and the nodes are grouped by module; the grouping is then used to compute the coordinates. The relative positions of the nodes are calculated with the algorithm developed by Wang Weixin et al., which tries to place nodes of the same module close to one another based on the module information.
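The ring-on-polygon idea shared by several of these layouts can be sketched in a few lines of base R. The example below follows the description of layout 3 (ring radius grows with module size, ring centers sit on the vertices of a regular polygon around the origin), but it is only an illustration and not the package's implementation: the function polygon_ring_layout, its spacing argument, and the module labels are assumptions.

```r
# Place each module's nodes on a ring; put the rings on polygon vertices.
polygon_ring_layout <- function(module, spacing = 3) {
  sizes <- table(module)
  mods <- names(sizes)
  m <- length(mods)
  # Ring radius is proportional to module size (largest module: radius 1).
  radius <- as.numeric(sizes) / max(sizes)
  # Ring centers lie on the vertices of a regular m-gon around the origin.
  centre_angle <- 2 * pi * (seq_len(m) - 1) / m
  cx <- spacing * cos(centre_angle)
  cy <- spacing * sin(centre_angle)
  coords <- data.frame(node = names(module), module = unname(module),
                       x = NA_real_, y = NA_real_)
  for (k in seq_len(m)) {
    idx <- which(module == mods[k])
    # Spread the module's nodes evenly around its ring.
    theta <- 2 * pi * (seq_along(idx) - 1) / length(idx)
    coords$x[idx] <- cx[k] + radius[k] * cos(theta)
    coords$y[idx] <- cy[k] + radius[k] * sin(theta)
  }
  coords
}

# Hypothetical module membership for nine nodes (named character vector).
module <- c(a = "M1", b = "M1", c = "M1", d = "M1",
            e = "M2", f = "M2",
            g = "M3", h = "M3", i = "M3")
lay <- polygon_ring_layout(module)
```

The returned data frame has one row per node with x and y columns, so it can be passed to any plotting function, for example plot(lay$x, lay$y, asp = 1) in base R or a point layer in ggplot2. Edges would be drawn by joining this table to an edge list.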


Figure 3. Visual layouts in ggClusterNet

ggClusterNet features and future development

Network analysis is increasingly favored by microbial ecologists, and many powerful tools have been developed in recent years, such as Cytoscape, Gephi, and igraph. For visualization, Cytoscape and Gephi are known for their interactive graphical user interfaces and attractive results. Cytoscape provides powerful functions for network analysis, but many parameters have to be adjusted to visualize the modules in a network. Gephi can display the modules quickly with default parameters, but offers limited support for in-depth network mining. ggClusterNet combines the strengths of igraph, ggraph, and sna, provides a variety of layout algorithms, and offers workflow functions (network.2() and network()) that both display the modules in a network quickly and mine the network in depth.

Future work will continue to develop and optimize the ggClusterNet package. To strengthen microbiome network analysis, network stability will be added to the workflow and the ecological functions of modules will be explored in more depth. A user-friendly interface will be built with Shiny so that more researchers can explore networks in depth. The analysis of bipartite networks will also be extended, with more visual layout algorithms suited to them. In short, ggClusterNet aims to keep improving its network analysis capabilities and to be of help to researchers in the field.

Citation

Tao Wen, Penghao Xie, Shengdie Yang, Guoqing Niu, Xiaoyu Liu, Zhexu Ding, Chao Xue, Yong-Xin Liu, Qirong Shen, Jun Yuan. 2022. ggClusterNet: An R package for microbiome network analysis and modularity-based multiple network layouts. iMeta 1: e32.

About the authors


Wen Tao (first author)

●  Zhongshan Young Researcher at Nanjing Agricultural University and member of the iMeta Youth Editorial Board

His work focuses on the microbial processes behind soil-borne diseases, and he is skilled at applying bioinformatics tools to ecological problems. He has developed R packages such as ggClusterNet and EasyStat, as well as EasyAmplicon, EasyMetabolome, and other tools. He has published many articles in journals including iMeta, ISME, Microbiome, Fundamental Research, Horticulture Research, SEL, and BMC Plant Biology.


Yuan Jun (corresponding author)

●  Associate professor at Nanjing Agricultural University and academician's assistant. His research focuses on rhizosphere interactions mediated by rhizosphere metabolites.

●  His main research topics are: 1. plant-microbe interactions during the development of soil-borne diseases; 2. big-data integration for environmental microorganisms; 3. regulation mechanisms and technology development for rhizosphere microecology. As first or corresponding author he has published more than 20 articles in journals including ISME J, Microbiome, Fundamental Research, iMeta, PCE, SBB, Horticulture Research, SEL, and BMC Plant Biology, with more than 1,500 citations.

