Null Model and Community Structure in Multiplex Networks

Zhai, Xuemeng; Zhou, Wanlei; Fei, Gaolei; Liu, Weiyi; Xu, Zhoujun; Jiao, Chengbo; Lu, Cai; Hu, Guangmin

doi:10.1038/s41598-018-21286-0

Open access
Published: 19 February 2018

Scientific Reports volume 8, Article number: 3245 (2018)

Abstract

The multiple relationships among objects in complex systems can be described well by multiplex networks, which contain rich information of the connections between objects. The null model of networks, which can be used to quantify the specific nature of a network, is a powerful tool for analysing the structural characteristics of complex systems. However, the null model for multiplex networks remains largely unexplored. In this paper, we propose a null model for multiplex networks based on the node redundancy degree, which is a natural measure for describing the multiple relationships in multiplex networks. Based on this model, we define the modularity of multiplex networks to study the community structures in multiplex networks and demonstrate our theory in practice through community detection in four real-world networks. The results show that our model can reveal the community structures in multiplex networks and indicate that our null model is a useful approach for providing new insights into the specific nature of multiplex networks, which are difficult to quantify.

Introduction

Network science is a fundamental tool for modelling and analysing complex systems^1,2,3. The general theories and approaches that have emerged from network science have provided guidelines and resulted in applications for analysis of the objects in the systems^4,5,6. Therefore, research on the quantitative and qualitative features of network science has always been a focus for improving the scientific understanding of complex systems^7,8,9,10.

Because network models capture the common features of complex systems, many network models have been proposed for modelling real-world systems^11,12,13. These single-network models provide a general framework for systems from different fields such as social science¹⁴, Internet topology¹⁵, bioscience¹⁶, engineering¹⁷, economics¹⁸ and education¹⁹. In network science, null models are especially notable because they reveal important network properties that cannot be directly quantified due to the complexity of the studied systems^20,21. The null model concept was proposed by Maslov and Sneppen²² and consists of a network that matches one specific graph in some of its structural features but that is otherwise taken to be a random network instance. The null model is used in comparisons to quantify complex network properties such as community structure^23,24, assortativity^25,26, degree correlation²⁷, epidemic spreading rate²⁸, motif identification^29,30, routing efficiency³¹, pattern detection³² and microbial diversification³³, all of which have been shown to be significant in various complex networks. Therefore, the null model of single networks has been a powerful tool over the past few decades for analysing the modelling, structure and dynamics of complex networks^34,35,36.

However, the limitations of single networks have become increasingly evident over the past few years with the emergence of many complex systems with multiple interaction layers, which are almost impossible to represent using isolated networks. Multiple relationships among objects give rise to multiplex networks in real-world systems that consist of multiple layers^37,38,39. In such networks, all the relationship types are constrained by the same objects and are therefore not completely independent. Each type of relationship among nodes is described by one layer of the multiplex network, and every layer contains the same set of nodes. Examples of such multiplex systems include social networks involving multiple relationships from different social platforms such as Twitter, YouTube and Facebook⁴⁰, epidemic networks with multiple diseases⁴¹, and Internet topologies with multiple levels from the router level to the AS level⁴². Therefore, multiplex networks, including multilayer networks^43,44, multiscale networks^45,46, and time-dependent networks^47,48,49, are a general framework for modelling and analysing the new phenomena emerging from these multi-layered systems. The research on multiplex networks, including community detection⁵⁰, link prediction⁵¹, epidemic spreading⁴¹, controllability⁵², synchronization⁵³, and network evolution⁵⁴, has illustrated that obvious differences exist between multiplex networks and isolated networks. For example, the synchronization state of the entire system is influenced by each layer; thus, a globally unstable state may be caused by the interactions among various stable layers⁵⁵. Because the layers are interrelated, it is not possible to build a single-network null model for each layer separately. However, the null model of multiplex networks remains largely unexplored, as there are few effective stochastic models that can be used to quantify the specific nature of multiplex networks.

In multiplex networks, the rich node connection information leads to redundancies in the networks, meaning that edges between the same pair of nodes can appear repeatedly in different network layers⁵⁶. Nodes connected by many repeated edges are more likely to belong to the same community. For example, close friends may contact each other using different social networks such as WeChat, Twitter, and Facebook; intuitively, such nodes are likely to belong to the same community. Without redundancy, the connection tightness between objects in multiplex networks cannot be represented effectively and accurately. Moreover, edge redundancy leads to node redundancy in multiplex networks. Because of this redundancy, the node degree of a single network cannot be used directly in a multiplex network. Therefore, a new measure is needed to replace the node degree and to constrain the null model of multiplex networks.

In this paper, we propose a new general measure of nodes to fill this gap and generate a novel Null Model with Redundancy (NMR) for multiplex networks. Our goal is to describe the redundant connection relationships among nodes and provide a general framework to quantify the specific nature of multiplex networks. To achieve this, two measures, the Node Redundancy Degree (NRD) and Edge RedundanCy (ERC), are calculated based on the redundancies in multiplex networks. We build the NMR with the same NRD that exists in the original multiplex network through a configuration method. The NMR can also be explained using the traditional random-walk method. The final result is a model with an explicit edge probability under Laplacian dynamics that provides new insight into the specific nature of multiplex networks, which are difficult to quantify. Our model requires no preconditions on the systems and applies to both directed and undirected systems. We demonstrate the performance of our model by building the modularity⁵⁷ of multiplex networks to study the community structure. The experimental results show that the community structure of multiplex networks can effectively be exposed through the NMR. Our findings fill the gap in the field of null modelling of multiplex networks and provide a powerful tool for modelling and analysing complex systems with multiple relationships in many general scientific fields.

Results

The Basic Model and Redundancy

In this paper, we choose an adjacency matrix to represent a network because it contains all the connection relationships in the network. A multiplex network consists of a set of networks. Therefore, we use the set of adjacency matrices representing each isolated network in a multiplex network to preserve the complete connection information of the multiplex network. That is, MN = {A₁, A₂, …, A_k, …, A_M}, k ≤ M, where M denotes the number of networks in the multiplex network, A_k = (a_ij)_{N × N} is the adjacency matrix of single network k, and N is the number of nodes in the network.

Because more than one network exists in a multiplex network, an edge between node i and node j could exist in duplicate. This redundancy represents the degree of repetition of a graph structure; therefore, this measure captures the phenomenon that a set of nodes constituting a community in one network tend to also constitute a community in other networks. Such redundancy is a basic attribute of multiplex networks. Here, we first define ERC (Fig. 1a–c) as follows:

Definition 1.

Edge Redundancy (ERC): The ERC refers to the number of duplicates of an edge in a multiplex network. We use m_ij to represent this measure:

$${m}_{ij}=|\{k|\exists {e}_{ij}\in {E}_{k}\}|-1,$$

(1)

where E_k is the set of edges in layer k and e_ij is the edge between node i and node j. Formula (1) counts the layers that contain an edge between node i and node j and subtracts one, so an edge that appears in a single layer has an ERC of 0. To a certain extent, the ERC captures the phenomenon that an edge that exists in one network tends to appear in other networks. Intuitively, edges with high ERC values should lie within a community rather than between communities. Naturally, we can divide the edges into M groups according to the ERC.

We also define the NRD as follows:

Definition 2.

Node Redundancy Degree (NRD): The NRD of node i counts the neighbours j of node i separately for each ERC value of the connecting edge. We use ${r}_{i}^{m}$ to represent the m-order NRD of node i, which denotes the number of connected nodes j for which the ERC m_ij is equal to m:

$${r}_{i}^{m}=|\{j|\exists {m}_{ij}=m\,\}|,\,0\le m < M$$

(2)

where M denotes the number of single networks in the multiplex network. The NRD ${r}_{i}^{m}$ is thus the number of edges incident to node i whose ERC equals m (Fig. 1a–c). When the multiplex network degenerates to a single network (M = 1), only m = 0 is possible and the NRD ${r}_{i}^{0}$ becomes the degree k_i of node i. Therefore, the NRD is a new parameter that measures the degrees of nodes in multiplex networks.

Definition 3.

Redundant Relation Matrix (RM): The RM is a matrix that describes the redundant connections between two nodes:

$$RM=\sum _{k}{A}_{k},\,k\le M$$

(3)

where A_k = (a_ij)_{N × N} represents the adjacency matrix of each single network k. The element rm_ij in the matrix refers to the number of occurrences of the edge between node i and node j. That is, rm_ij = m_ij + 1. Using the RM, we can simplify the calculation of the ERC and the NRD as follows:

$${m}_{ij}=r{m}_{ij}-1$$

(4)

$${r}_{i}^{m}=|\{j|\exists r{m}_{ij}=m+1\}|.$$

(5)
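Definitions 1 to 3 can be illustrated with a short sketch. It assumes undirected, unweighted layers without self-loops, stored as plain nested lists; the function names and the three-node toy network are hypothetical and chosen only for illustration.

```python
from collections import Counter


def redundancy_matrix(layers):
    """Eq. (3): rm[i][j] is the number of layers containing the edge (i, j)."""
    n = len(layers[0])
    return [[sum(layer[i][j] for layer in layers) for j in range(n)]
            for i in range(n)]


def edge_redundancy(rm):
    """Eq. (4): ERC m_ij = rm_ij - 1, reported for connected pairs only."""
    n = len(rm)
    return {(i, j): rm[i][j] - 1
            for i in range(n) for j in range(n)
            if i != j and rm[i][j] > 0}


def node_redundancy_degree(rm, num_layers):
    """Eq. (5): nrd[i][m] is the number of neighbours j with rm_ij = m + 1."""
    n = len(rm)
    nrd = []
    for i in range(n):
        counts = Counter(rm[i][j] - 1 for j in range(n)
                         if j != i and rm[i][j] > 0)
        nrd.append([counts.get(m, 0) for m in range(num_layers)])
    return nrd


# Toy multiplex network (hypothetical): 3 nodes, 2 layers.
layer_1 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # edges 0-1 and 1-2
layer_2 = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]  # edge 0-1 only
layers = [layer_1, layer_2]

rm = redundancy_matrix(layers)
erc = edge_redundancy(rm)
nrd = node_redundancy_degree(rm, len(layers))
```

In this toy network the edge 0-1 appears in both layers and therefore has an ERC of 1, whereas the edge 1-2 appears once and has an ERC of 0; node 1 consequently has one neighbour in each ERC group.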

Null Model with Redundancy for Multiplex Networks

One of the null models of a single network, proposed by Newman, uses the node degree k_i to determine the structure of random networks; later, Mahadevan proposed higher-order representations of this model (see Supplementary Note 1). Because the NRD extends the concept of node degree to multiplex networks, we use the NRD to define null models and their higher-order representations in multiplex networks. The null model with redundancy for multiplex networks is based on the configuration model of a single network^58,59 and dK-graphs²⁷. The model can also be explained by Laplacian dynamics⁶⁰ and random walks³⁵. In this model, the edge probabilities of the configuration method and the random-walk method are unified. Based on the null model of a single network, we introduce our null model with redundancy for multiplex networks (NMR):

Definition 4.

Null Model with Redundancy for Multiplex Networks (NMR): The NMR is a network model that matches the original multiplex network in NRD but is otherwise taken to be a random network instance.

Definition 5.

k-Order NMR: This network model matches the original multiplex network in size and k-order NRD distribution P(r) but is otherwise taken to be an instance of a random network.

The 1-order NMR is shown in Fig. 2. A 1-order NMR is a random model for the whole multiplex network rather than for each layer. Therefore, the aggregated information can be encoded into the multiplex structure. However, the NMR is not only a randomized aggregate version of the original network but also of each layer of the network under the constraint of the NRD. In Fig. 2, each network layer is connected differently between the NMR and the original network. The connections in each layer are also randomized—but they are not completely random. The NRD is a measure that applies to the whole multiplex network. It describes the relationships among each layer and ensures that they are not completely independent in the multiplex network. Therefore, randomization under the constraint of the NRD can randomize both the aggregated information and the information in each layer while preserving the basic relationships among each layer in a multiplex network.

Note that in multiplex networks, we use the NRD distribution instead of the degree distribution, and matching in size means that the model has the same number of nodes N and number of networks M as the original multiplex network. The details of the k-order NMR are given below and summarized in Table 1.

Table 1 The summary of k-order null model with redundancy of multiplex networks.

0-Order: A random network with the same number of nodes N, number of networks M, and average NRD as in the original multiplex network.

1-Order: A random network with the same number of nodes N, number of networks M, and NRD distribution P₁(r) as in the original multiplex network.

2-Order: A random network with the same number of nodes N, number of networks M, and 2-order NRD distribution P₂(r₁, r₂) as in the original multiplex network.

n-Order: A random network with the same number of nodes N, number of networks M, and n-order NRD distribution P_n(r₁, r₂, …, r_n) as in the original multiplex network.

Modularity of Multiplex Networks

In this study, with the NMR, we propose the modularity of a multiplex network. Based on the modularity of a single network (see Supplementary Note 5), the modularity of a multiplex network refers to the actual number of edges within communities minus the expected number of such edges in the 1-order NMR.

In a multiplex network, the actual number of edges between node i and node j is rm_ij in the RM, and the expected number of such edges in the first-order NMR is

$${P}_{ij}=2\mu \times p(i,j)=2\mu M\sum _{m}\frac{{C}_{m+1}^{1}}{{C}_{M}^{m+1}}\frac{{(m+1)}^{2}{r}_{i}^{m}{r}_{j}^{m}}{{(2{\mu }^{m})}^{2}},$$

(6)

where μ is the total number of edges, p(i, j) is the probability of there being an edge between node i and node j in the NMR (see Methods), M is the number of layers in the original multiplex network and μ^m is the number of m-ERC edges, that is, edges whose ERC equals m. According to this definition, we can obtain the modularity function of a multiplex network:

$$\begin{array}{rcl}{Q}_{M} & = & \frac{1}{2\mu }\sum _{ij}[r{m}_{ij}-{P}_{ij}]\delta ({g}_{i},{g}_{j})\\ & = & \,\frac{1}{2\mu }\sum _{ij}[r{m}_{ij}-2\,\mu M\sum _{m}\frac{{C}_{m+1}^{1}}{{C}_{M}^{m+1}}\frac{{(m+1)}^{2}{r}_{i}^{m}{r}_{j}^{m}}{{(2{\mu }^{m})}^{2}}]\delta ({g}_{i},{g}_{j})\end{array}$$

(7)

where g_i refers to the community to which node i belongs, δ(g_i, g_j) = 1 if g_i = g_j, and δ(g_i, g_j) = 0 otherwise. When the number of networks M is 1, the multiplex network degenerates into a single network and the modularity function of the multiplex network automatically becomes the single-network modularity proposed by Newman. Thus, we can consider the modularity function of a multiplex network as an extension of single-network modularity to multiple networks. Compared with the modularity of a multi-slice network, this function focuses on the impact of the NRD instead of on virtual connections, which do not exist in reality. Thus, our framework is more in line with the actual structures of multiplex networks and is a more acceptable measure for analysing multiplex networks.
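The modularity function can be evaluated directly from the layer adjacency matrices. The following is a minimal sketch, not the authors' implementation. It applies Eqs. (6) and (7) literally and rests on several assumptions: the layers are undirected and free of self-loops, the sum runs over all ordered node pairs (as in Newman's single-network modularity), and μ^m counts every layer copy of an edge, so that a node pair with ERC m contributes m + 1 to μ^m. The helper names `adjacency` and `multiplex_modularity` are hypothetical.

```python
from math import comb


def adjacency(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    a = [[0] * n for _ in range(n)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1
    return a


def multiplex_modularity(layers, communities):
    """Evaluate Q_M of Eq. (7); communities[i] is the community label of node i."""
    M, n = len(layers), len(layers[0])
    # Redundant relation matrix, Eq. (3).
    rm = [[sum(layer[i][j] for layer in layers) for j in range(n)]
          for i in range(n)]
    # r[i][m]: m-order NRD of node i, Eq. (5).
    r = [[sum(1 for j in range(n) if j != i and rm[i][j] == m + 1)
          for m in range(M)] for i in range(n)]
    # ASSUMPTION: mu_m[m] counts edge copies, m + 1 per node pair with ERC m.
    mu_m = [(m + 1) * sum(r[i][m] for i in range(n)) // 2 for m in range(M)]
    mu = sum(mu_m)  # total number of edges over all layers

    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] != communities[j]:
                continue  # delta(g_i, g_j) = 0
            s = 0.0
            for m in range(M):
                if mu_m[m] == 0:
                    continue  # no edges with this ERC, so the term vanishes
                weight = comb(m + 1, 1) / comb(M, m + 1)
                s += weight * (m + 1) ** 2 * r[i][m] * r[j][m] / (2 * mu_m[m]) ** 2
            q += rm[i][j] - 2 * mu * M * s  # rm_ij - P_ij, Eq. (6)
    return q / (2 * mu)


# Single-layer check: two triangles joined by one edge.
single = [adjacency(6, [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)])]
q_single = multiplex_modularity(single, [0, 0, 0, 1, 1, 1])

# Two-layer toy example: the second layer repeats one of the triangles.
two = single + [adjacency(6, [(0, 1), (1, 2), (0, 2)])]
q_two = multiplex_modularity(two, [0, 0, 0, 1, 1, 1])
```

For the single-layer input, the expected-edge term reduces to k_i k_j / (2μ), so `q_single` equals Newman's modularity of the two-triangle partition, 5/14. The sketch does not guard against an empty network (μ = 0).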

Community Detection in Multiplex Networks

We first give a definition of community in a multiplex network:

Definition 6.

Community in a Multiplex Network: In a multiplex network, a community consists of a group of nodes that are tightly connected. Here, the tight connection means that many more edges exist within the community than among the communities. Note that each layer of a multiplex network contains the same nodes but the edges are different; the number of edges between two nodes should be calculated from all layers of the network.

We executed several community-detection algorithms on the Twitter event networks, Noordin terrorist relationship networks, student-cooperation social networks and global terrorism networks. These algorithms are BGLL for multiplex networks (BGLLMN)^61,62, bridge detection (BD)⁶³, tensor decomposition for multiplex networks (TD)⁶⁴, Modularity-driven Ensemble-Based Community Detection (M-EMCD)⁶⁵, Multidimensional Label Propagation Algorithm (MDLPA)⁶⁶, Multilayer Local Community Detection (ML-LCD)⁶⁷ and our modularity function for multiplex networks (see Supplementary Note 2). Figure 3 shows the results of this quantitative comparison (see Supplementary Note 3) on three of the tested networks and indicates that the modularity function for multiplex networks yields higher-quality communities than the other tested methods. In addition, the results in Fig. 3 show that communities in real networks always have much higher redundancy, which verifies the importance of checking the redundancy in multiplex networks.

Twitter Event Networks

We analysed the relationships among events detected from Twitter. The tweet stream is captured through the Tweet API⁶⁸. Tweets are clustered using similar keywords to detect the Twitter events. Each node in the network represents a Twitter event. We build three networks to construct the multiplex relationship among Twitter events (see Supplementary Note 4). The results of the four community-detection algorithms are visualized in Fig. 4. To facilitate the visualization, we combined the three networks into one network. Nodes of the same colour represent a community, meaning that these nodes correspond to the same event. In Fig. 4, the BD results are not obviously better than those of the six other algorithms, and the six other algorithms cannot be ranked by visual inspection alone. Therefore, we present the community quality measures in Table 2. As listed in Table 2, the three measures for our method are much higher than those of the other methods, especially redundancy, which is 0.16, and the ground truth, which is 0.2. The high redundancy and node similarity lead to the high accuracy (72%) of our method, which is considerably higher than the accuracies achieved by BD and TD. BGLLMN also attains high accuracy (72%) because it is based on the novel modularity of a single network. When we combine these three networks into one network, some connection information is lost, but these losses are determined by the network structures. When the losses are relatively low, BGLLMN can exhibit good performance; however, the community redundancy of BGLLMN (0.07) is still much lower than that of our method (0.16). In addition, the three newer algorithms (M-EMCD, MDLPA and ML-LCD) achieve relatively high accuracy but low redundancy, which indicates that our modularity for multiplex networks captures the redundancy of the network.

Table 2 Community Quality Measures of Twitter Event Networks.

Noordin Terrorist Relationship Networks

Using the Noordin terrorist network data^69,70, we constructed the multiplex terrorist relationship networks based on six relationships between terrorists. Each node in the network represents a terrorist (see Supplementary Note 4). The results of the four community-detection algorithms are visualized in Fig. 5. To facilitate visualization, we combined the six networks into one network. Nodes of the same colour represent a community, meaning that these nodes likely belong to the same terrorist organization. Figure 5 leads to almost the same conclusion as the previous test: the results of BD are not better than those of the three other methods. The number of communities found by BGLLMN is smaller than that found by TD and by our method. This may explain its high node similarity (shown in Table 3), because most of the nodes are assigned to the same community, which results in pairs of nodes having more common neighbours (see Supplementary Note 3). However, according to Table 3, our method still has the highest community redundancy (0.30) and accuracy (27%), which again shows that the communities in real multiplex networks always have high redundancy. The three newer algorithms again perform well on node similarity and accuracy but have low redundancy. In addition, the accuracy of all the algorithms is low because the data are noisy and the ground truth may not agree with the network structure. Therefore, the algorithms can be judged only relative to one another. Based on the results, our method performs better than the others (see Table 3).

Table 3 Community Quality Measures of Terrorist Relationship Networks.

Students’ Cooperation Social Networks

The Students’ Cooperation Social Networks dataset is constructed based on a Computer and Network Security course given at Ben-Gurion University of the Negev⁷¹ in which students are required to submit a paper to specific web sites. We built the students’ cooperation social networks based on the course website log. Each node in the network represents a student (see Supplementary Note 4). The results of the four community-detection algorithms are visualized in Fig. 6. To facilitate the visualization, we combined the three networks into one network. Nodes of the same colour represent a community, meaning that these nodes likely belong to the same group. In the Students’ Cooperation Social Networks, the first network represents the partner relationships between pairs of students (see Supplementary Note 4). We use these disconnected communities as the ground truth and the other two networks as noise data. Intuitively, BGLLMN and our method perform better than BD and TD, as shown in Fig. 6, because the community discrimination in the BD and TD results is insufficient. From the measure comparison in Table 4, we can see directly that BD and TD have lower values on all three measures than the other five methods. Although MDLPA has the highest redundancy (0.21), its accuracy (45%) is much lower than that of our method (57%). This is because MDLPA detects 29 communities, fewer than our method (49), whereas there are 51 communities in the real network. Our method has the highest values of node similarity (0.34) and accuracy (59%). The results show that in an environment with noisy networks, our method demonstrates a strong anti-noise capability.

Table 4 Community Quality Measures of Students’ Cooperation Social Networks.

Global Terrorism Networks

From the database of global terrorism⁷², we created four networks in which one terrorist organization is connected to another if they both performed an attack in the same country during the same year. Each node in the network represents a terrorist organization (see Supplementary Note 4). Nodes of the same colour represent a community, meaning that these nodes performed an attack in the same country. In Fig. 7a, there are four complete sub-graphs in each network. The other single nodes in the networks are organizations that did not attack during this year and in this country; therefore, there is no connection between them. We find that the six other community-detection algorithms (BGLLMN, BD, TD, M-EMCD, MDLPA, ML-LCD) obtained the same results: the community is divided into these four networks, as shown in Fig. 7b. When we combine these four networks into one network, none of the edges are redundant except for the edges in the red box in Fig. 7c, which displays the results of our algorithm. The nodes in the red box are connected to each other by edges with a weight of 2. The four nodes are divided into different communities, meaning that our algorithm can reveal the organizations that performed attacks twice in two countries. More generally, our community detection function captures the edges with high redundancy, leading to the high redundancy of communities. This explains why we achieved high accuracy on the three multiplex networks described above (see Tables 2, 3, and 4).

Discussion

The results reported in the preceding section demonstrate the advantageous community detection performance on real-world multiplex networks based on the NMR. In all three networks, our algorithm obtained considerably higher values on all three measures: node similarity, community redundancy and accuracy with ground truth. In turn, the meaningful community structures with different redundant parts of multiplex networks are revealed by our NMR, as demonstrated on the fourth multiplex network. Therefore, we have shown that our framework accurately reflects the community quality and that it maximally preserves the community redundancy, which indicates that it could be a reasonable function for community detection in multiplex networks.

The general conclusion from the results presented in this paper is that communities in real-world networks always have much higher redundancy, which verifies the importance of capturing the NRD in multiplex networks. Both the theoretical and experimental results show that NRD is a reasonable measure for describing the connection relationships of multiplex networks. With regard to a single network, NRD automatically degenerates to the node degree. Therefore, NRD is a more general and fundamental measure that includes the node degree as a specific case for single networks. Indeed, this measure can be used in systems with arbitrary nodes, edges and layers—not only in social networks as described above but also in other multi-layer networks such as traffic networks, metabolic networks, epidemic networks and Internet topology.

In a more general sense, the NMR is a general null model for any multiple-relationship system such as the social networks utilized above. We developed the NMR and its higher-order representation using the basic configuration method based on the NRD. The rationality of the NMR can also be explained by the traditional random-walk theory. The connection between the 0-order NMR and the original networks is almost completely random, except for size. As the order increases, the model gradually becomes closer to the original multiplex network, and as more attributes match those of the original network, the model becomes the same as the original multiplex network. For different purposes, the order of the NMR can be controlled to guarantee the connection similarity to the original network, and other properties of the original network can be exposed by the comparison.

The general significance of the NMR is that in addition to community structure, many other specific properties can be revealed through the different orders of the model. These properties, including motif identities, propagation-rate threshold, redundancy-distribution correlations and synchronization-state stability, have already been shown to be important in network science. Additionally, the NMR can be used in directed networks based on in-and-out NRD. For example, a comparison of the number of structures appearing in the NMR with the same in-and-out NRD distribution may help researchers determine whether this higher-order structure is the most important motif in the original multiplex network. Our future work will build on such extensions of the NMR and its higher-order representations, which raise open problems in applying the model to systems with multiple relationships that can be described by multiplex networks.

Finally, our null model of multiplex networks provides a powerful tool for the structure analysis of complex systems with multiple relationships. Through comparisons, the specific nature of these multi-relationship systems can be exposed quantitatively by the NMR. We believe that the NMR can give rise to much stronger and more general applications in many areas, including social science, Internet topology, bioscience, engineering, economics, and education, where multi-relationship systems can be described by multiplex networks. To accomplish this, much more work needs to be done to gain a deeper understanding of the model and its high-order representations, such as a determination of the NRD distribution law. We hope that many more attributes of multi-relationship systems can be modelled and analysed through the null model with redundancy for multiplex networks.

Methods

Generation of the 1-Order Null Model with Redundancy for Multiplex Network

To generate the one-order NMR, we introduce the random configuration model of multiplex networks based on the configuration model in single networks.

The random configuration model is constructed as follows:

(1) Divide the edges into M groups according to their ERC values. Because each edge e_ij may have different ERC values between different nodes, we should assign them separately according to the ERC to ensure that the NRD of each node is the same as that of the original one. The number of edges in each group μ^m is defined by the number of edges in the m-ERC, where 0 ≤ m < M.

(2) Assign the M groups of edges randomly to the models that have the same size as the original multiplex network. There are M layers of the network and m + 1 edges in the (m + 1)-th groups in which the m-ERC equals m (0 ≤ m < M). Therefore, the edge assignment process is an (m + 1)-combination problem. The total number of assignments in (m + 1) groups is ${C}_{M}^{m+1}=\frac{M!}{[M-(m+1)]!\times (m+1)!}$, which can also be written as $(\begin{array}{c}M\\ m+1\end{array})$. The number of assignments in a specific layer of network A_i is ${C}_{m+1}^{1}$. Therefore, the probability that network A_i is assigned an m-ERC edge is ${p}_{m}({A}_{i})=\frac{{C}_{m+1}^{1}}{{C}_{M}^{m+1}}$.

(3) Assign the m-ERC edges to nodes i and j. For node i, there are $(m+1){r}_{i}^{m}$ edges that can be assigned for the m-ERC edges. For the entire network, a total of 2 μ^m edges can be assigned for the m-ERC edges. We consider that the process of one edge selecting the two end nodes is independent. Therefore, the probability that node i and node j in network A_i are assigned an m-ERC edge is ${p}_{{A}_{i}}(i,j)=\frac{(m+1){r}_{i}^{m}}{2{\mu }^{m}}\times \frac{(m+1){r}_{j}^{m}}{2{\mu }^{m}}$.

(4) Assign all the edges to the model. Note that the processes of edge assignment to networks and to nodes are independent. Thus, at the end of the assignment processes, the probability of an edge existing between a node i and node j in one network is

$$p(i,j)^{\prime} =\sum _{m}{p}_{m}({A}_{i})\times {p}_{{A}_{i}}(i,j)=\sum _{m}\frac{{C}_{m+1}^{1}}{{C}_{M}^{m+1}}\frac{{(m+1)}^{2}{r}_{i}^{m}{r}_{j}^{m}}{{(2{\mu }^{m})}^{2}},0\le m < M$$

(8)

Thus, in M networks, the probability of an edge existing between node i and node j is:

$$p(i,j)=M\sum _{m}\frac{{C}_{m+1}^{1}}{{C}_{M}^{m+1}}\frac{{(m+1)}^{2}{r}_{i}^{m}{r}_{j}^{m}}{{(2{\mu }^{m})}^{2}},0\le m < M$$

(9)
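As an illustration, Eq. (9) can be evaluated numerically once the NRD values and the group sizes are known. The following Python sketch assumes that `r[m][v]` stores the m-order NRD of node v and that `mu[m]` stores μ^m. These container names, the function name `edge_probability` and the toy values are illustrative assumptions and are not taken from the data sets used in this paper. The function evaluates the expression as written and skips groups that contain no edges.

```python
from math import comb


def edge_probability(r, mu, i, j):
    """Evaluate Eq. (9) for the node pair (i, j).

    r[m][v] -- assumed layout: m-order NRD of node v
    mu[m]   -- assumed layout: number of m-ERC edges
    """
    M = len(mu)
    total = 0.0
    for m in range(M):
        if mu[m] == 0:
            # An empty group contributes nothing (and would divide by zero).
            continue
        # Layer term C(m+1, 1) / C(M, m+1) from step (2).
        layer_term = comb(m + 1, 1) / comb(M, m + 1)
        # Node term (m+1)^2 r_i^m r_j^m / (2 mu^m)^2 from step (3).
        node_term = (m + 1) ** 2 * r[m][i] * r[m][j] / (2 * mu[m]) ** 2
        total += layer_term * node_term
    return M * total


# Hypothetical toy input: M = 3 layers, 4 nodes, no 2-ERC edges.
r = [[2, 2, 2, 2], [1, 1, 1, 1], [0, 0, 0, 0]]
mu = [4, 2, 0]
print(edge_probability(r, mu, 0, 1))
```

Because the node term is symmetric in i and j, the function returns the same value for (i, j) and (j, i).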

Explanation of Random Travel

We can also generate our NMR based on a random walk under Laplacian dynamics⁶⁰. Here, we suppose there is a traveller who travels randomly from any one node to any other node in a multiplex network, even if the two nodes are in different networks. In contrast to a random walk on a single network, the traveller can travel between different networks only when two nodes are connected in any network. Thus, we call the agent a “traveller” rather than a “walker”.

Because the edges can be divided into M groups according to their ERC values, we can divide the multiplex network into M layers in which the ERC is the same for all edges in each layer. Thus, the traveller can travel among all the layers of the multiplex network, which means that the traveller can choose edges with different ERC values to travel between layers. In the m-layer, where the ERC values of all the edges are equal to m, the probability of the traveller travelling from node i to node j in the model is

$$p{(j|i)}_{m}=\frac{{C}_{m+1}^{1}}{{C}_{M}^{m+1}}\frac{(m+1){r}_{j}^{m}}{2{\mu }^{m}}$$

(10)

The random travel process, similar to the random walk process, is a Markov process. When the process is stable in each layer, the steady-state probability distribution is

$${p}_{i}^{m\ast }=\frac{(m+1){r}_{i}^{m}}{2{\mu }^{m}}$$

(11)

Thus, the joint probability of the traveller travelling from node i to node j in one network in the model is

$$p(i,j)^{\prime} =\sum _{m}p{(j|i)}_{m}\times {p}_{i}^{m\ast }=\sum _{m}\frac{{C}_{m+1}^{1}}{{C}_{M}^{m+1}}\frac{{(m+1)}^{2}{r}_{i}^{m}{r}_{j}^{m}}{{(2{\mu }^{m})}^{2}},0\le m < M$$

(12)

In M networks, the probability of an edge existing between node i and node j is

$$p(i,j)=M\sum _{m}\frac{{C}_{m+1}^{1}}{{C}_{M}^{m+1}}\frac{{(m+1)}^{2}{r}_{i}^{m}{r}_{j}^{m}}{{(2{\mu }^{m})}^{2}},0\le m < M$$

(13)

The probability of each edge occurring in the random travel model is the same as that in the random configuration model. Thus, the two models are unified for multiplex networks, which verifies the correctness and validity of our NMR.
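The agreement between the two derivations can be checked numerically. The Python sketch below computes Eq. (13) as the product of Eq. (10) and Eq. (11) and compares it with a direct evaluation of Eq. (9). The layout of `r` and `mu`, the function names and the toy values are illustrative assumptions; groups without edges are skipped.

```python
from math import comb


def transition_prob(r, mu, M, m, j):
    """Eq. (10): p(j|i)_m; the expression does not depend on the start node i."""
    return comb(m + 1, 1) / comb(M, m + 1) * (m + 1) * r[m][j] / (2 * mu[m])


def stationary_prob(r, mu, m, i):
    """Eq. (11): steady-state probability of node i in the m-layer."""
    return (m + 1) * r[m][i] / (2 * mu[m])


def travel_probability(r, mu, i, j):
    """Eq. (13): M times the sum over m of p(j|i)_m * p_i^{m*}."""
    M = len(mu)
    return M * sum(
        transition_prob(r, mu, M, m, j) * stationary_prob(r, mu, m, i)
        for m in range(M)
        if mu[m] > 0
    )


def configuration_probability(r, mu, i, j):
    """Eq. (9), evaluated directly for comparison."""
    M = len(mu)
    return M * sum(
        comb(m + 1, 1) / comb(M, m + 1)
        * (m + 1) ** 2 * r[m][i] * r[m][j] / (2 * mu[m]) ** 2
        for m in range(M)
        if mu[m] > 0
    )


# Hypothetical toy input: r[m][v] is the m-order NRD of node v, mu[m] is mu^m.
r = [[2, 1, 3, 2], [1, 2, 0, 1], [0, 1, 1, 0]]
mu = [4, 2, 1]
max_gap = max(
    abs(travel_probability(r, mu, i, j) - configuration_probability(r, mu, i, j))
    for i in range(4)
    for j in range(4)
)
print(max_gap)
```

The two functions differ only in how the factors are grouped, so any remaining gap is due to floating-point rounding.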

Fast Algorithm of Community Detection based on the Multiplex Networks Modularity Function

In the era of big data, the scale of networks is becoming increasingly large. Thus, we propose a fast algorithm for community detection in large networks based on the multiplex networks modularity function (FCDMNN). The algorithm builds on the work of Blondel et al.⁶¹. The steps in the algorithm are as follows:

(1)
Initialization: We regard each node in the multiplex network as a community. Thus, the number of communities is N, which also denotes the number of nodes.
(2)
Traverse each node i in the multiplex network and find all the nodes connected with node i. For each neighbouring node j of node i, compute the modularity increment ΔQ_ij obtained by moving node i from its current community g_i to the community g_j of node j. ΔQ_ij is defined as follows:
$${\rm{\Delta }}{Q}_{ij}=\frac{1}{2\mu }\{\sum _{k\in {g}_{j}}[{w}_{ik}-P(i,k)]-\sum _{k\in {g}_{i}}[{w}_{ik}-P(i,k)]\}$$
(14)
where $P(i,k)=2\mu M\sum _{m}\frac{{C}_{m+1}^{1}}{{C}_{M}^{m+1}}\frac{{(m+1)}^{2}{r}_{i}^{m}{r}_{k}^{m}}{{(2{\mu }^{m})}^{2}}$ is the expected number of edges between node i and node k in the 1-order NMR.
(3)
Find the neighbouring node j whose community g_j gives the maximum ΔQ_ij, and add node i to community g_j.
(4)
Repeat steps (2) and (3) until the communities no longer change.
(5)
When step (4) is complete, regard each community as a node. The edges within each community can be regarded as the loopback weighted edges of the new node. Here, the weight is the number of edges within the community to which the node belongs.
(6)
The edges between two communities can be regarded as the weighted edges of the two new nodes. Here, the weight is the total number of edges between the two communities to which the nodes belong.
(7)
Repeat steps (2)–(6) until the communities no longer change.
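Steps (1)–(4) above can be sketched in Python as follows. The sketch takes the weight matrix `w`, the expected-edge matrix `P` and the normaliser 2μ as inputs; the function names and this calling convention are illustrative assumptions. Two further assumptions are made that the steps do not state explicitly: node i itself is skipped when summing over its own community, and a move is made only when the best increment is positive. The aggregation in steps (5)–(7) is not covered. The toy example uses a degree-based null model as a stand-in for P(i,k), purely to keep the example short.

```python
def delta_q(i, target, comm, w, P, two_mu):
    """Eq. (14): increment from moving node i into community `target`."""
    own = comm[i]
    n = len(comm)
    gain_in = sum(w[i][k] - P[i][k] for k in range(n) if comm[k] == target)
    # Assumption: node i is excluded from the sum over its own community.
    loss_out = sum(
        w[i][k] - P[i][k] for k in range(n) if comm[k] == own and k != i
    )
    return (gain_in - loss_out) / two_mu


def local_moving(w, P, two_mu, max_sweeps=100):
    n = len(w)
    comm = list(range(n))  # step (1): every node starts as its own community
    for _ in range(max_sweeps):  # step (4): repeat until nothing changes
        moved = False
        for i in range(n):  # step (2): traverse every node
            candidates = {
                comm[k] for k in range(n) if w[i][k] > 0 and comm[k] != comm[i]
            }
            best, best_gain = comm[i], 0.0
            for c in sorted(candidates):
                gain = delta_q(i, c, comm, w, P, two_mu)
                # Assumption: only strictly positive increments trigger a move.
                if gain > best_gain:
                    best, best_gain = c, gain
            if best != comm[i]:  # step (3): join the best community
                comm[i] = best
                moved = True
        if not moved:
            break
    return comm


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
n = 6
w = [[0.0] * n for _ in range(n)]
for a, b in edges:
    w[a][b] = w[b][a] = 1.0
deg = [sum(row) for row in w]
two_mu = sum(deg)
# Stand-in null model (degree product), NOT the NMR expression P(i,k).
P = [[deg[a] * deg[b] / two_mu for b in range(n)] for a in range(n)]
comm = local_moving(w, P, two_mu)
print(comm)
```

To apply the sketch to a multiplex network, `P` would be filled with the values of P(i,k) defined in step (2) instead of the degree-based stand-in.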

The time complexity of FCDMNN is $O(N\times \sum _{m}{r}_{max}^{m})$, where ${r}_{max}^{m}$ denotes the maximum m-order NRD and N refers to the number of nodes. This is slightly higher than the time complexity of the BGLL algorithm. However, for large networks, $\sum _{m}{r}_{max}^{m}$ is far less than the number of nodes N, so the time complexities of the two algorithms are both O(N). Our algorithm is therefore acceptable for multiplex networks, and the quality of the resulting communities is better than that of other multiplex-network community detection algorithms.

References

Börner, K., Sanyal, S. & Vespignani, A. Network science. Annual Review of Information Science & Technology 41(1), 537–607 (2010).
Thiery, J. P. & Sleeman, J. P. Complex networks orchestrate epithelial–mesenchymal transitions. Nature reviews Molecular cell biology 7(2), 131–142 (2006).
Ravasz, E. & Barabási, A.-L. Hierarchical organization in complex networks. Physical Review E 67(2), 026112 (2003).
Brito, S. G. A., de Silva, L. R. & Tsallis, C. Role of dimensionality in complex networks: Connection with nonextensive statistics. Physics 6 (2015).
Leitold, D., Vathy-Fogarassy, Á. & Abonyi, J. Controllability and observability in complex networks–the effect of connection types. Scientific Reports 7(1), 151 (2017).
Li, D., Wang, X. & Huang, P. A fractal growth model: Exploring the connection pattern of hubs in complex networks. Physica A: Statistical Mechanics and its Applications 471, 200–211 (2017).
Morone, F. & Makse, H. A. Influence maximization in complex networks through optimal percolation. Nature 527(7579), 544 (2015).
Kleinberg, J. M. Navigation in a small world. Nature 406(6798), 845 (2000).
Song, C., Havlin, S. & Makse, H. A. Self-similarity of complex networks. Nature 433(7024), 392–5 (2005).
Tan, F., Xia, Y. & Zhu, B. Link prediction in complex networks: a mutual information perspective. Plos One 9(9), e107056 (2014).
Erdös, P. & Rényi, A. On Random Graphs. Publicationes Mathematicae 6, 290–291 (1959).
Watts, D. J. & Strogatz, S. H. Collective dynamics of ‘small-world’ networks. Nature, 440–442 (1998).
Barabasi, A. L. & Albert, R. Emergence of Scaling in Random Networks. Science 286(5439), 509–512 (1999).
Chiasserini, C., Fabiana, Garetto, M. & Leonardi, E. Social Network De-Anonymization Under Scale-Free User Relations. IEEE/ACM Transactions on Networking 24(6), 3756–3769 (2016).
Faloutsos, M., Faloutsos, P. & Faloutsos, C. On power-law relationships of the Internet topology. ACM SIGCOMM Computer Communication Review 29(4), 251–262 (2003).
Koonin, E. V., Wolf, Y. I. & G. P. K. D. Sci. Power Laws, Scale-Free Networks and Genome Biology. Springer Berlin (2006).
Wen, L., Dromey, R. G. & Kirk, D. Software engineering and scale-free networks. IEEE Transactions on Systems Man & Cybernetics Part B Cybernetics A Publication of the IEEE Systems Man & Cybernetics. Society 39(4), 845 (2009).
Kurakin, A. Scale-free flow of life: on the biology, economics, and physics of the cell. Theoretical Biology and Medical Modelling 6(1), 6 (2009).
Wang, J. & Liu, F. Complex networks and its application in education resources network. International Symposium on Information Technology in Medicine and Education IEEE, 345–348 (2012).
Newman, M. E., Strogatz, S. H. & Watts, D. J. Random graphs with arbitrary degree distributions and their applications. Physical Review E Statistical Nonlinear & Soft Matter Physics 64(2 Pt 2), 359–382 (2001).
Karrer, B. & Newman, M. E. J. Stochastic blockmodels and community structure in networks. Physical Review E Statistical Nonlinear & Soft Matter Physics 83(1Pt 2), 016107 (2010).
Maslov, S. & Sneppen, K. Specificity and stability in topology of protein networks. Science 296(5569), 910–913 (2002).
Newman, M. E. J. Finding community structure in networks using the eigenvectors of matrices. Physical Review E Statistical Nonlinear & Soft Matter Physics 74(3 Pt 2), 92–100 (2006).
Cazabet, R., Borgnat, P. & Jensen, P. Enhancing Space-Aware Community Detection Using Degree Constrained Spatial Null Model. Workshop on Complex Networks Comple Net. Springer, Cham 47–55 (2017).
Pastor-Satorras, R., Vázquez, A. & Vespignani, A. Dynamical and Correlation Properties of the Internet. Physical Review Letters 87(25), 258701 (2001).
Foster, J. G. & Stanley, H. E. Edge direction and the structure of networks. Proceedings of the National Academy of Sciences of the United States of America 107(24), 10815–20 (2010).
Mahadevan, P. et al. Systematic topology analysis and generation using degree correlations. Acm Sigcomm Computer Communication Review 36(4), 135–146 (2006).
Estrada, E. et al. Epidemic spreading in random rectangular networks. Phys. rev. E 94(5)-1, 052316 (2016).
Juszczyszyn, K., Kazienko, P. & Musiał, K. Local topology of social network based on motif analysis. Knowledge-based intelligent information and engineering systems. Springer Berlin/Heidelberg (2008).
Schlauch, W. E. & Zweig, K. A. Influence of the null-model on motif detection. IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining. ACM 514–519 (2015).
Nian, X. & Fu, H. Efficient routing on two layer degree-coupled networks. Physica A Statistical Mechanics & Its Applications 410(12), 421–427 (2014).
Ulrich, W. & Gotelli, N. J. Pattern detection in null model analysis. Oikos 122(1), 2–18 (2013).
Straub, T. J. & Zhaxybayeva, O. A null model for microbial diversification. Proceedings of the National Academy of Sciences: 201619993 (2017).
Wang, Z., Wang, Y. & Liu, Y. Global synchronization for discrete-time stochastic complex networks with randomly occurred nonlinearities and mixed time delays. IEEE Transactions on Neural Networks 21(1), 11–25 (2010).
Noh, J. D. & Rieger, H. Random walks on complex networks. Physical review letters 92(11), 118701 (2004).
Barabási, A.-L., Albert, R. & Jeong, H. Scale-free characteristics of random networks: the topology of the world-wide web. Physica A: statistical mechanics and its applications 281(1), 69–77 (2000).
Nicosia, V. et al. Growing multiplex networks. Physical review letters 111(5), 058701 (2013).
Bianconi, G. Statistical mechanics of multiplex networks: Entropy and overlap. Physical Review E 87(6), 062806 (2013).
Menichetti, G. et al. Weighted multiplex networks. PloS one 9(6), e97857 (2014).
Magnani, M. & Rossi, L. The ml-model for multi-layer social networks. Advances in Social Networks Analysis and Mining (ASONAM), 2011 International Conference on. IEEE (2011).
Granell, C., Gómez, S. & Arenas, A. Dynamical interplay between awareness and epidemic spreading in multiplex networks. Physical review letters 111(12), 128701 (2013).
Andreopoulos, B., An, A. & Wang, X. Clustering the internet topology at multiple layers. WSEAS Transactions on Information Science and Applications 2(10), 1625–1634 (2005).
Boccaletti, S. et al. The structure and dynamics of multilayer networks. Physics Reports 544(1), 1–122 (2014).
Kivelä, M. et al. Multilayer networks. Ssrn Electronic Journal 2(3), 261–268 (2013).
Foti, N. J., James, M. H. & Daniel, N. R. Nonparametric sparsification of complex multiscale networks. PloS one 6(2), e16431 (2011).
Balcan, D. et al. Multiscale mobility networks and the spatial spreading of infectious diseases. Proceedings of the National Academy of Sciences 106(51), 21484–21489 (2009).
Dean, B. C. Algorithms for minimum‐cost paths in time‐dependent networks with waiting policies. Networks 44(1), 41–46 (2004).
Brodal, G. S. & Jacob, R. Time-dependent networks as models to achieve fast exact time-table queries. Electronic Notes in Theoretical Computer Science 92, 3–15 (2004).
Gao, S. & Chabini, I. Optimal routing policy problems in stochastic time-dependent networks. Transportation Research Part B: Methodological 40(2), 93–122 (2006).
Mucha, P. J. et al. Community structure in time-dependent, multiscale, and multiplex networks. Science 328(5980), 876–878 (2010).
Pujari, M. & Kanawati, R. Link prediction in multiplex networks. NHM 10(1), 17–35 (2015).
Yuan, Z. et al. Exact controllability of multiplex networks. New Journal of Physics 16(10), 103036 (2014).
Gambuzza, L., Valentina, M. F. & Gomez-Gardeñes, J. Intra-layer synchronization in multiplex networks. EPL (Europhysics Letters) 110(2), 20010 (2015).
Gómez-Gardenes, J. et al. Evolution of cooperation in multiplex networks. Scientific reports 2, 620 (2012).
Genio, D. et al. Synchronization in networks with multiple interaction layers. Science Advances 2(11), e1601679 (2016).
Berlingerio, M., Coscia, M. & Giannotti, F. Finding and Characterizing Communities in Multidimensional Networks, International Conference on Advances in Social Networks Analysis and Mining. IEEE Computer Society, 490–494 (2011).
Newman, M. E. Modularity and community structure in networks. Proceedings of the National Academy of Sciences of the United States of America 103(23), 8577–8582 (2006).
Baldick, R. et al. Initial review of methods for cascading failure analysis in electric power transmission systems IEEE PES CAMS task force on understanding, prediction, mitigation and restoration of cascading failures. Power and Energy Society General Meeting-Conversion and Delivery of Electrical Energy in the 21st Century, 2008 IEEE (2008).
Buldyrev, S. V. et al. Catastrophic cascade of failures in interdependent networks. Nature 464(7291), 1025–8 (2010).
Lambiotte, R., Delvenne, J. C. & Barahona, M. Laplacian Dynamics and Multiscale Modular Structure in Networks. Physics (2008).
Blondel, V. D. et al. Fast unfolding of communities in large networks. Journal of Statistical Mechanics Theory & Experiment 2008(10), 155–168 (2008).
Loe, C. W. & Jensen, H. J. Comparison of communities detection algorithms for multiplex. Physica A Statistical Mechanics & Its Applications 431, 29–45 (2015).
Bródka, P., Filipowski, T. & Kazienko, P. An introduction to community detection in multi-layered social network. World Summit on Knowledge Society. Springer Berlin Heidelberg (2011).
Rodriguez, M. A. & Shinavier, J. Exposing multi-relational networks to single-relational network analysis algorithms. Computer Science 4(1), 29–41 (2009).
Tagarelli, A., Amelio, A. & Gullo, F. Ensemble-based community detection in multilayer networks. Data Mining & Knowledge Discovery 3, 1–38 (2017).
Boutemine, O. & Bouguessa, M. Mining Community Structures in Multidimensional Networks. Acm Transactions on Knowledge Discovery from Data 11(4), 1–36 (2017).
Interdonato, R. et al. Local community detection in multilayer networks. Data Mining and Knowledge Discovery 31(5), 1444–1479 (2017).
Twitter, Inc. Twitter API Developer. http://dev.twitter.com/ (2017).
Khan, M. S. et al. Terrorist attacks in the largest metropolitan city of Pakistan: Profile of soft tissue and skeletal injuries from a single trauma center. World Journal of Emergency Medicine 66(33), 217–220 (2015).
Roberts, N. & Everton, S. The Noordin Top Terrorist Network Data. http://www.thearda.com/Archive/Files/Descriptions/TERRNET.asp (2011)
Fire, M. et al. Predicting student exam’s scores by analyzing social network data. Active Media Technology. Springer Berlin Heidelberg, 584–595 (2012).
National Consortium for the Study of Terrorism and Responses to Terrorism (START). Global Terrorism Database. Retrieved from http://www.start.umd.edu/gtd (2016).

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 61471101, 61301274 and 61571094). The data sets used to obtain the results in this manuscript are available at http://www.thearda.com/Archive/Files/Descriptions/TERRNET.asp, http://proj.ise.bgu.ac.il/sns/students.html and http://www.start.umd.edu/gtd.

Author information

Guangmin Hu
Present address: Center for Information Geoscience, University of Electronic Science and Technology of China, Chengdu, China

Authors and Affiliations

School of Communication and Information Engineering, University of Electronic Science and Technology of China, Chengdu, China
Xuemeng Zhai, Gaolei Fei, Weiyi Liu, Cai Lu & Guangmin Hu
Faculty of Science, Engineering and Built Environment, Deakin University, 221 Burwood Highway, Burwood, VIC 3125, Australia
Wanlei Zhou
Beijing Information Technology Institute, Beijing, China
Zhoujun Xu & Chengbo Jiao

Contributions

Xuemeng Zhai and Guangmin Hu designed and performed research and wrote the main manuscript. Wanlei Zhou discussed the results and wrote the main manuscript. Gaolei Fei and Weiyi Liu collected the data and conducted the experiment. Zhoujun Xu, Chengbo Jiao and Cai Lu prepared all figures and wrote the supplementary information.

Corresponding author

Correspondence to Guangmin Hu.

Ethics declarations

Competing Interests

The authors declare no competing interests.

Additional information

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhai, X., Zhou, W., Fei, G. et al. Null Model and Community Structure in Multiplex Networks. Sci Rep 8, 3245 (2018). https://doi.org/10.1038/s41598-018-21286-0

Received: 02 August 2017
Accepted: 31 January 2018
Published: 19 February 2018
DOI: https://doi.org/10.1038/s41598-018-21286-0
