Introduction

Data amount and variety have soared as never seen before, offering a unique opportunity to better understand complex systems. Among the different modes of representation of data, networks appear as particularly successful. Networks are indeed interesting to refine raw data and extract relevant features, patterns, and classes. They are exploited for years to study complex systems, and a wide and powerful range of tools from graph theory are available for their exploration.

However, the integrated exploration of large multidimensional datasets remains a major challenge in many scientific fields. For instance, a comprehensive understanding of biological systems would require the integrated analysis of dozens of different datasets produced at different molecular, cellular or tissular scales. Recently, multilayer networks emerged as essential players in the analysis of such complex systems. Multilayer networks allow integrating more than one network in a unified formalism, in which the different networks are considered as layers1. For instance, Duran-Frigola et al.2 combined 25 different networks of chemical compounds and their relationships, gathering relationships from chemical structures to clinical outcomes. This multilayer framework allows an integrated study of chemical compounds and their biological activities. Another example is given by the Hetionet project. The authors collected dozen of heterogeneous networks, i.e networks with various types of nodes such as genes, drugs or diseases, to prioritize drugs for repurposing3.

Several definitions of multilayer networks have been proposed, based on the (in)homogeneity of the layers and the properties of the connections between layers4,5,6. For instance, multiplex networks are multilayer networks composed of different layers containing the same nodes (called replica nodes) but different types of edges, and thereby different topologies. Heterogeneous networks link networks composed of different types of nodes thanks to bipartite interactions. Temporal networks follow the dynamic of a network over time: all the layers have the same nodes, but each layer represents the interaction state at a given time7. We will here consider universal multilayer networks, which can be defined as multilayer networks composed of any number of multiplex (or monoplex) networks (with edges that can be directed and/or weighted), linked by bipartite networks (with edges that can be directed and/or weighted) (Fig. 1). A wide range of methods have been developed in the recent years to analyze multilayer networks. For instance, different network metrics have been adapted to multilayer networks8, as well as various network clustering algorithms for community detection9,10,11 or random walk for network exploration12,13,14,15.

Fig. 1: A universal multilayer network.
figure 1

A universal multilayer network composed of three multiplex networks (green, blue and red multiplex networks). Each multiplex network contains different types of nodes (denoted 1 to 4, α to ϵ, and a to d, respectively). Their corresponding Supra-adjacency matrices are denoted by \({{{{{{{{\mathcal{A}}}}}}}}}_{i}\). The three multiplex networks are linked by six bipartite networks (represented here as bipartite interactions for the sake of visualization). The corresponding Bipartite network matrices are denoted by Bi,j. It is to note that a connection between a node i in a first multiplex network and α and a node j in a second multiplex network β imposes the creation of edges between all replicas of node i present in the different layers of the multiplex network α and all replicas of node j present in the different layers of multiplex network β. All the edges of the universal multilayer networks can be weighted and/or directed.

Random walks are iterative stochastic processes widely used to explore network topologies. They can be described as simulated particles that walk iteratively from one node to one of its neighbors with some probability16. The PageRank algorithm, for instance, is based on a random walk simulating the behavior of an internet user walking from one page to another thanks to hyper-links. The user can also restart the walk on any arbitrary page17. In this particular random walk strategy, the restart prevents the random walker from being trapped in dead-ends18. An interesting alternative strategy restricts the restart to specific node(s), called the seed(s)19. In this strategy, named Random Walk with Restart (RWR) or Personalized PageRank, the random walk represents a measure of proximity from all the nodes in the network to the seed(s). RWR can also be described as a diffusion process, in which the objective is to determine the steady-state of an initial probability distribution20.

RWR are widely used to exploit large-scale networks. In computational biology, for instance, RWR strategies have been shown to significantly outperform methods based on local distance measures for the prioritization of gene-disease associations21. Importantly, different upgrades of the RWR approach have been implemented during the last decade, including its extension to (i) heterogeneous networks12, (ii) multiplex networks13 and (iii) multiplex-heterogeneous networks15. In RWR, the degrees of freedom are summarized in the Transition rate matrix, and correspond to the available transitions between the different nodes of the graph. The extensions of RWR are challenging because the Transition rate matrices need to be normalized. To the best of our knowledge, this normalization is currently only solved for multilayer networks composed of two heterogeneous multiplex networks15,22 and the more universal case of N multiplex networks remains unsolved.

We propose here MultiXrank, a framework composed of a method and a Python package to execute RWR on universal multilayer networks. We first introduce the mathematical bases of this RWR for universal multilayer networks, which correspond to a generalization of the approach from12. We evaluate MultiXrank with leave-one-out cross-validation and link prediction protocols. These evaluations reveal that more network data is not always better and highlight the critical influence of the bipartite networks. We finally present an in-depth exploration of the parameter space to measure the stability of the RWR output scores under variations of the input parameters. The MultiXrank Python package is freely available at https://github.com/anthbapt/multixrank, with an optimized implementation allowing its application to large multilayer networks.

Results

Random walk with restart (RWR)

Let us consider an irreducible and aperiodic Markov chain, for instance a network composed of a giant component with undirected edges, G = (V, E), where V is the set of vertices and E (V × V) is the set of edges. In the case of irreducible and aperiodic Markov chains, a stationary probability p* exists and satisfies the following properties:

$$\left\{\begin{array}{ll}{{{{{{{{\bf{p}}}}}}}}}^{* }(i)\ > \ 0;\forall i\in V \hfill \\ {\sum }_{i\in V}{{{{{{{{\bf{p}}}}}}}}}^{* }(i)=1\hfill\end{array}\right.$$
(1)

We next introduce the probability defining the walk from one node to another. Let us define x, a particle that explores the network, xt its position at time t and xt+1 its position at time t + 1. Considering two nodes i and j:

$${\mathbb{P}}({x}_{t+1}=j\,| \,{x}_{t}=i)=\left\{\begin{array}{l}\frac{1}{{d}_{i}}\,\,\,\,\,{{{{{{{\rm{if}}}}}}}}\ (i,j)\in E\\ 0\,\,\,\,\,\,{{{{{{{\rm{Otherwise}}}}}}}}\end{array}\right.$$
(2)

with di being the degree of the node i. All the normalized possible transitions can be included in the Transition rate matrix. This Transition rate matrix, noted M, can be seen as the matrix of the degrees of freedom of the particle in the system. It is useful to note that the Transition rate matrix is equal to the column-normalized Adjacency matrix. The distribution denoted by \({{{{{{{{\bf{p}}}}}}}}}_{t}={({{{{{{{{\bf{p}}}}}}}}}_{t}(i))}_{i\in V}\) describes the probability of being in the node i at time t, and the stationary distribution p* is obtained thanks to the homogeneous linear difference equation [3]18,23:

$${{{{{{{{\bf{p}}}}}}}}}_{t+1}^{T}=M{{{{{{{{\bf{p}}}}}}}}}_{t}^{T}$$
(3)

with \({{{{{{{{\bf{p}}}}}}}}}_{t}^{T}\) denoting the transpose of the vector pt. Moreover, we can introduce a non-homogeneous linear difference equation [4]23 to take into account the restart on the seed(s). When the Transition rate matrix is a Stochastic matrix, the stationary distribution is reached18 (Supplementary Note 1.A.1 for elements of proof of convergence) and this distribution can be seen as a measure of proximity of all the network nodes with respect to the seed(s).

$${{{{{{{{\bf{p}}}}}}}}}_{t+1}^{T}=(1-r)M{{{{{{{{\bf{p}}}}}}}}}_{t}^{T}+r{{{{{{{{\bf{p}}}}}}}}}_{0}^{T}$$
(4)

The distribution p0 corresponds to the initial probability distribution, where only the seed(s) have non-zero values; r represents the restart probability.

RWR on multiplex networks

The RWR method has been extended to multiplex networks, i.e., multilayer networks with a one-to-one mapping between the (replica) nodes of the different layers (Fig. 1)1,13,14. Multiplex networks can be represented by Supra-adjacency matrices, which correspond to a generalization of the standard Adjacency matrix. In the following, we will use several multiplex networks, indexed by k. We denoted by \({{{{{{{{\mathcal{A}}}}}}}}}_{k}\) the Supra-adjacency matrix of the multiplex network indexed by k. The Adjacency matrix of the layer l of the multiplex network k is denoted by \({A}_{k}^{[l]}\). The element of this adjacency matrix from node i to node j is defined as \({({A}_{k}^{[l]})}_{i,j}\ge 0\). The dimension of the Supra-adjacency matrix \({{{{{{{{\mathcal{A}}}}}}}}}_{k}\) of the multiplex network k is equal to (Lk*nk)*(Lk*nk), with nk the number of nodes in each layer of the multiplex network k and Lk the number of layers in the multiplex network k. The Supra-adjacency matrix \({{{{{{{{\mathcal{A}}}}}}}}}_{k}\) is defined as follows:

$${({{{{{{{{\mathcal{A}}}}}}}}}_{k})}_{{i}_{l},{j}_{m}}=\left\{\begin{array}{ll}{\left({A}_{k}^{[l]}\right)}_{i,j}&{{{{{{{\rm{if}}}}}}}}\,l=m\\ {\delta }_{i,j}\hfill&{{{{{{{\rm{if}}}}}}}}\,l\ \ne\ m\hfill\end{array}\right.$$
(5)

where δ defines the Kronecker delta (i.e., 1 if i equal j and 0 otherwise), and l and m represent the layers of the multiplex network k. We can also define a multiplex network as a set of nodes, \({V}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}}\) and a set of edges, \({E}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}}\):

$$\left\{\begin{array}{l}{G}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}}=({V}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}},{E}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}})\hfill \\ {V}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}}=\{{v}_{i}^{l},i=1,\ldots ,{n}_{k},l=1,\ldots ,{L}_{k}\}\hfill\\ {E}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}}=\{{e}_{i,j}^{ll},i,j=1,\ldots ,{n}_{k},l=1,\ldots ,{L}_{k},{({A}_{k}^{[l]})}_{i,j}\ \ne\ 0\}\\ \cup \{{e}_{i,i}^{lm},i=1,\ldots ,{n}_{k},l\ \ne\ m\}\hfill\end{array}\right.$$
(6)

Importantly, we need to column-normalize the Supra-adjacency matrix defined in the equations [56] in order to converge to the steady-state, as defined in15. This normalization requires including the parameters δk related to the jumps from one layer to another inside the matrix representation, as described in13 (Fig. 2). In the next section, we need to index by k all the parameters that are dedicated to the multiplex network k. The Supra-adjacency matrix representing the multiplex network k can be written as described in equation [7]. The matrix Ik represents the Identity matrix of size nk.

$${{{{{{{{\mathcal{A}}}}}}}}}_{k}=\left[\begin{array}{cccc}(1-{\delta }_{k}){A}_{k}^{[1]}&\frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}&\ldots &\frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}\\ \frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}&(1-{\delta }_{k}){A}_{k}^{[2]}&\ldots &\frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}\hfill\\ \vdots &\vdots &\ddots &\vdots \\ \frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}&\frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}&\ldots &(1-{\delta }_{k}){A}_{k}^{[{L}_{k}]}\end{array}\right]$$
(7)
Fig. 2: MultiXrank Random Walk with Restart parameters.
figure 2

Parameters of the Random Walk with Restart allowing to explore universal multilayer networks composed of N multiplex networks (each composed of several layers containing the same set of (replica) nodes but different edges). The parameters δ are associated with the probability to jump from one layer to another in a given multiplex network, λ with the probability to jump from one multiplex network to another multiplex network, τ with the probability to restart in a given layer of a given multiplex network, and η with the probability to restart in a given multiplex network.

RWR on universal multilayer networks

We here define a RWR method that can be applied to universal multilayer networks. Universal multilayer networks are composed of any combination of multiplex networks, linked by any combination of bipartite networks (Fig. 1). All network edges can also be weighted and/or directed. The formalism for the application of RWR on multiplex networks is described in the previous section. We will now detail the Bipartite network matrices, and how to combine intra- and inter- multiplex networks information to obtain the Supra-heterogeneous adjacency matrix. The Supra-heterogeneous adjacency matrix will embed all the possible transitions in a universal multilayer network.

Bipartite networks connect heterogeneous nodes

The Bipartite network matrices contain the transitions between different types of nodes present in different networks. If the network α has nα nodes, and the network β has nβ nodes, the Bipartite network matrix denoted bα,β has a size equal to nα*nβ. Now, let us define \({{{{{{{{\mathcal{A}}}}}}}}}_{\alpha }\) and \({{{{{{{{\mathcal{A}}}}}}}}}_{\beta }\), two Supra-adjacency matrices representing the multiplex networks α and β. The Bipartite network matrix Bα,β represents the transitions from the nodes of the multiplex network α to the nodes of the multiplex network β. The size of the Bipartite network matrix Bα,β is equal to (Lα*nα)*(Lβ*nβ). The Bipartite network matrices are composed of (Lα*Lβ) times the Bipartite network matrix bα,β (equation [8]). The matrix bα,β is composed of all the transitions from one layer of the multiplex network α to one layer of the multiplex network β. We extended the formalism used in15 in order to consider more than two different multiplex networks.

$${B}_{\alpha ,\beta } = \underbrace{\left[\begin{array}{cccc}{b}_{\alpha ,\beta }&{b}_{\alpha ,\beta }&\ldots &{b}_{\alpha ,\beta }\\ {b}_{\alpha ,\beta }&{b}_{\alpha ,\beta }&\ldots &{b}_{\alpha ,\beta }\\ \vdots &\vdots &\ddots & \vdots \\ {b}_{\alpha ,\beta }&{b}_{\alpha ,\beta }&\ldots &{b}_{\alpha ,\beta } \end{array}\right]}_{{L}_{\beta } \, {{{{{{{\rm{times}}}}}}}}}\left.\vphantom{\begin{array}{c}1\\ 1\\ 1\\ 1\\ 1\\ 1\end{array}}\right\}\scriptstyle{{L}_{\alpha }} \, {{{{{{{\rm{times}}}}}}}}$$
(8)

The representation of the bipartite networks as a set of nodes \({V}_{{{{{{{{\mathcal{B}}}}}}}}}\) and a set of edges \({E}_{{{{{{{{\mathcal{B}}}}}}}}}\) can be written as:

$$\left\{\begin{array}{l}{G}_{{{{{{{{\mathcal{B}}}}}}}}}=({V}_{{{{{{{{\mathcal{B}}}}}}}}},{E}_{{{{{{{{\mathcal{B}}}}}}}}})\hfill\\ {V}_{{{{{{{{\mathcal{B}}}}}}}}}=\{{v}_{k}^{\alpha },k=1,\ldots ,{n}_{\alpha }\}\cup \{{v}_{l}^{\beta },l=1,\ldots ,{n}_{\beta }\}\hfill\\ {E}_{{{{{{{{\mathcal{B}}}}}}}}}=\{{e}_{k,l}^{\alpha \beta }\,k=1,\ldots ,{n}_{\alpha }\,,\,l=1,\ldots ,{n}_{\beta }\,;\,{({b}_{\alpha ,\beta })}_{k,l}\ \ne\ 0\}\end{array}\right.$$
(9)

It is to note that if the bipartite networks are undirected, \({b}_{\beta ,\alpha }^{T}={b}_{\alpha ,\beta }\) and \({B}_{\beta ,\alpha }^{T}={B}_{\alpha ,\beta }\).

Universal multilayer networks unify the representation of heterogeneous multiplex networks

We previously defined the Supra-adjacency matrices of each multiplex network and the Bipartite network matrices connecting the different multiplex networks. We now introduce the Supra-heterogeneous adjacency matrix, denoted by \({{{{{{{\mathcal{S}}}}}}}}\). This matrix, defined in equation [10], collects the N Supra-adjacency matrices representing each multiplex network, \({{{{{{{{\mathcal{A}}}}}}}}}_{1},{{{{{{{{\mathcal{A}}}}}}}}}_{2},\ldots ,{{{{{{{{\mathcal{A}}}}}}}}}_{N}\), and the N*(N − 1) Bipartite network matrices connecting each multiplex network, B1,2, B1,3, …, B1,N, B2,1, …, BN,N−1.

$${{{{{{{\mathcal{S}}}}}}}}=\left[\begin{array}{cccc}{{{{{{{{\mathcal{A}}}}}}}}}_{1}&{B}_{1,2}&\ldots &{B}_{1,N}\\ {B}_{2,1}&{{{{{{{{\mathcal{A}}}}}}}}}_{2}&\ldots &{B}_{2,N}\\ \vdots &\vdots &\ddots &\vdots \\ {B}_{N,1}&{B}_{N,2}&\ldots &{{{{{{{{\mathcal{A}}}}}}}}}_{N}\end{array}\right]$$
(10)

We can also define the Supra-heterogeneous adjacency matrix as a set of nodes and edges:

$$\left\{\begin{array}{rcl}&{G}_{{{{{{{{\mathcal{S}}}}}}}}}=\left({V}_{{{{{{{{\mathcal{S}}}}}}}}},{E}_{{{{{{{{\mathcal{S}}}}}}}}}\right)\hfill\\ &{V}_{{{{{{{{\mathcal{S}}}}}}}}}=\mathop{\bigcup }\limits_{k=1}^{N}\{{v}_{k,i}^{{\alpha }_{k}},i=1,\ldots ,{n}_{k},{\alpha }_{k}=1,\ldots ,{L}_{k}\}\hfill\\ &{E}_{{{{{{{{\mathcal{S}}}}}}}}}=\mathop{\bigcup }\limits_{k=1}^{N}\left(\{{e}_{i,j}^{{\alpha }_{k},{\alpha }_{k}},i,j=1,\ldots ,{n}_{k},{\left({A}_{k}^{[{\alpha }_{k}]}\right)}_{i,j}\ \ne\ 0\}\right.\hfill\\ &\quad\quad\ \cup \left.\{{e}_{i,i}^{{\alpha }_{k},{\beta }_{k}},i=1,\ldots ,{n}_{k},{\alpha }_{k}\ \ne\ {\beta }_{k}\,,\,{\alpha }_{k},{\beta }_{k}=1,\ldots ,{L}_{k}\}\right)\hfill\\ &\quad\quad\ \ \cup \mathop{\bigcup }\limits_{k,l=1;k\ne l}^{N}\{{e}_{i,j}^{{\alpha }_{k},{\alpha }_{l}},i=1,\ldots ,{n}_{k},j=1,\ldots ,{n}_{l},{\left({B}_{k,l}\right)}_{i,j}\ \ne\ 0\}\end{array}\right.$$
(11)

The normalization of the Supra-heterogeneous adjacency matrix ensures the convergence of the RWR to the steady-state

The most complex issue is the normalization of the Supra-heterogeneous adjacency matrix into a Transition rate matrix that can be used in equation [4]. The normalization allows obtaining a Stochastic matrix that guarantees the convergence of the RWR to the steady-state18 (see elements of proof in Supplementary Note 1.A.1). It is important to note that we have chosen a column normalization. The resulting normalized matrix, denoted by \(\widehat{{{{{{{{\mathcal{S}}}}}}}}}\) is defined in equation [12]. We generalized the formalism of Li and Patra12 established for two heterogeneous monoplex networks (Supplementary Note 1.D). This generalization to universal multilayer networks is done thanks to the intra- and inter- multiplex network normalizations defined in equations [1314], with α [[1, N]], β [[1, N]]. In addition, \({c}_{{i}_{\alpha }}\) is the number of bipartite networks in which the node iα appears as source of the multiplex network α denoted by Mα.

$$\widehat{{{{{{{{\mathcal{S}}}}}}}}}=\left[\begin{array}{cccc}{\widehat{S}}_{11}&{\widehat{S}}_{12}&\ldots &{\widehat{S}}_{1N}\\ {\widehat{S}}_{21}&{\widehat{S}}_{22}&\ldots &{\widehat{S}}_{2N}\\ \vdots &\vdots &\ddots &\vdots \\ {\widehat{S}}_{N2}&{\widehat{S}}_{N2}&\ldots &{\widehat{S}}_{NN}\end{array}\right]$$
(12)

In equation [13], \({\widehat{S}}_{\alpha \alpha }\) defines the transition probabilities inside a given multiplex network. In the case of a multiplex network, if a node has no bipartite interactions with nodes from another multiplex networks, we can use the standard normalization. If bipartite interactions exist, then the normalization takes into account the probability that the walker can stay in the multiplex network \((1-\mathop{\sum }\nolimits_{\beta = 1}^{{c}_{{i}_{\alpha }}}{\lambda }_{\alpha \beta })\). In equation [14], \({\widehat{S}}_{\alpha \beta }\) defines the transition probability between two different multiplex networks. There are here three possibilities. If the node has no bipartite interactions, the transition probability is equal to zero. If the node has bipartite interactions, the transition probability is equal to the standard normalization weighted by the jump probability (λαβ). Finally, if the node exists only in the bipartite network, the normalization corresponds to the standard normalization weighted by a modified jump probability. This normalization takes into account all the bipartite interactions of the considered node.

$${\widehat{S}}_{\alpha \alpha }({i}_{\alpha },{j}_{\alpha })=\left\{\begin{array}{l}\frac{{A}_{\alpha }({i}_{\alpha },{j}_{\alpha })}{\mathop{\sum }\limits_{{k}_{\alpha } = 1}^{{n}_{\alpha }}{A}_{\alpha }({i}_{\alpha },{k}_{\alpha })} \; \; \; {{{{{{{\rm{if}}}}}}}}\ \forall \beta \,:\,\mathop{\sum }\limits_{{k}_{\beta }=1}^{{n}_{\beta }}{B}_{\alpha ,\beta }({i}_{\alpha },{k}_{\beta })=0\\ \frac{\left(1-\mathop{\sum }\limits_{\beta = 1}^{{c}_{{i}_{\alpha }}}{\lambda }_{\alpha \beta }\right)* {A}_{\alpha }({i}_{\alpha },{j}_{\alpha })}{\mathop{\sum }\limits_{{k}_{\alpha } = 1}^{{n}_{\alpha }}{A}_{\alpha }({i}_{\alpha },{k}_{\alpha })} \; \; \; {{{{{{{\rm{Otherwise}}}}}}}}\hfill\end{array}\right.$$
(13)
$${\widehat{S}}_{\alpha \beta }({i}_{\alpha },{j}_{\beta })=\left\{\begin{array}{l}\frac{{\lambda }_{\alpha \beta }{B}_{\alpha ,\beta }({i}_{\alpha },{j}_{\beta })}{\mathop{\sum }\limits_{{k}_{\beta } = 1}^{{n}_{\beta }}{B}_{\alpha ,\beta }({i}_{\alpha },{k}_{\beta })} \; \; \; {{{{{{{\rm{if}}}}}}}}\mathop{\sum }\limits_{{k}_{\beta }=1}^{{n}_{\beta }}{B}_{\alpha ,\beta }({i}_{\alpha },{k}_{\beta })\ \ne\ 0\\ \frac{\frac{{\lambda }_{\alpha \beta }}{\mathop{\sum }\limits_{\beta = 1}^{c}{\lambda }_{\alpha \beta }}\mathop{\sum }\limits_{{i}_{\alpha }=1}^{c}{B}_{\alpha ,\beta }({i}_{\alpha },{j}_{\beta })}{\mathop{\sum }\limits_{{i}_{\alpha }=1}^{c}\mathop{\sum }\limits_{{k}_{\beta }=1}^{{n}_{\beta }}{B}_{\alpha ,\beta }({i}_{\alpha },{k}_{\beta })} \; \; \;{{{{{{{\rm{if}}}}}}}}\,{i}_{\alpha }\,{{{{{{{\rm{not}}}}}}}}\,{{{{{{{\rm{in}}}}}}}}\,{M}_{\alpha }\\ 0 \; \; \; {{{{{{{\rm{Otherwise}}}}}}}}\end{array}\right.$$
(14)

The normalization allows including the parameters λαβ to jump between the multiplex networks (Fig. 2). In other words, these parameters weight the jumps from one multiplex network α to another multiplex network β, if the bipartite interaction exists. Moreover, the standard probability condition of normalization imposes that \(\mathop{\sum }\nolimits_{\alpha = 1}^{N}{\lambda }_{\alpha \beta }=1,\forall \,\beta\), where N represents the number of multiplex networks. Finally, the RWR equation on universal multilayer networks is defined as:

$${{{{{{{{\bf{p}}}}}}}}}_{t+1}^{T}=(1-r)\widehat{S}{{{{{{{{\bf{p}}}}}}}}}_{t}^{T}+r{{{{{{{{\bf{p}}}}}}}}}_{0}^{T}.$$
(15)

RWR initial probability distribution in universal multilayer networks

The initial probability distribution p0 from equation [15], which contains the probabilities to restart on the seed(s), can be written in its general form as follows:

$${{{{{{{{\bf{p}}}}}}}}}_{0}^{T}=\left[\begin{array}{c}{\eta }_{1}{\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{1}\\ {\eta }_{2}{\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{2}\\ \ldots \\ {\eta }_{N}{\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{N}\end{array}\right]$$
(16)

where ηk is the probability to restart in one of the layers of the multiplex network k, and \({\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{k}\) is the initial probability distribution of the multiplex network k. The size of \({\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{k}\) is equal to (Lk*nk), where Lk is the number of layers in the multiplex network k and nk is the number of nodes in the multiplex network k. We constraint the parameter η with the standard condition of normalization of the probability that imposes \(\mathop{\sum }\nolimits_{k = 1}^{N}{\eta }_{k}=1\). We defined another parameter, τ, to take into account the probability of restarting in the different layers of a given multiplex network. This parameter includes τkj, where k corresponds to the index of the multiplex network, and j to the index of the layer of the multiplex network k (Fig. 2). In other words, τkj corresponds to the probability to restart in the jth layer of the multiplex network k. Finally, \({\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{k}\) is defined as follows: \({\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{k}={[{\tau }_{k1}{{{{{{{{\bf{v}}}}}}}}}_{0}^{k},{\tau }_{k2}{{{{{{{{\bf{v}}}}}}}}}_{0}^{k},\ldots , {\tau }_{k{L}_{k}}{{{{{{{{\bf{v}}}}}}}}}_{0}^{k}]}^{T}\), with \({{{{{{{{\bf{v}}}}}}}}}_{0}^{k}\) being a vector with 1/ωk in the position(s) of seed(s) and zeros elsewhere, and ωk being the number of seeds in the multiplex network k. The standard condition of normalization of the probability gives the constraint: \(\mathop{\sum }\nolimits_{j = 1}^{{L}_{k}}{\tau }_{kj}=1\),   k.

Numerical implementation: multiXrank

Our RWR on universal multilayer networks is implemented as a Python package called MultiXrank (Supplementary Note 2). MultiXrank has an optimized implementation. Default parameters allow exploring homogeneously the multilayer network (Supplementary Note 1.B). The running time of the package depends on the number of edges of the multilayer network (complexity analyses in Supplementary Note 2.A). The package is available on GitHub https://github.com/anthbapt/multixrank, and can be installed with standard pip installation command: https://pypi.org/project/MultiXrank.

Evaluations

We evaluated the performances of MultiXrank using two different multilayer networks. The first one is a large biological multilayer network composed of two multiplex networks and one monoplex network. It contains a gene multiplex network gathering gene physical and functional relationships, a drug multiplex network containing drug clinical and chemical relationships, and a disease monoplex network representing disease phenotypic similarities. Each monoplex/multiplex network is connected to the others thanks to bipartite networks containing gene-disease, drug-gene, and drug-disease interactions (Supplementary Note 3.B). The second multilayer network is composed of three multiplex networks. It contains a French airports multiplex network, a British airports multiplex network, and a German airports multiplex network. In each multiplex network, the nodes represent the airports of each country and the edges represent the national flight connections between these airports for three different airline companies. The three multiplex networks are linked with bipartite networks corresponding to transnational flight connections (Supplementary Note 3.A).

We designed a Leave-One-Out Cross-Validation (LOOCV) protocol inspired by F.Mordelet and J.P.Vert24 and A.Valdeolivas et al.15. In this protocol, we systematically leave-out some known associations and assess the reconstruction of this left-out data using the data remaining in the network (Supplementary Note 4.A and Fig. S9). In the case of the biological multilayer network,we systematically left-out known gene-disease associations. More specifically, for each disease associated with at least two genes, each gene is remove one-by-one and considered as the left-out gene. The remaining gene(s) associated with the same disease are used as seed(s). When the disease network is considered in the evaluation, the disease node is used as seed together with the gene node(s). The RWR algorithm is then applied, and all the network nodes are scored according to their proximity to the seed(s). The rank of the gene node that was left-out in the ongoing run is recorded. The perfect ranking for the left-out gene is 1; the closer the rank is to 1, the better the prediction. The gene left-out process is repeated iteratively for all the genes. Finally, the Cumulative Distribution Function (CDF) of the ranks of the left-out genes is plotted (Fig. 3). The CDF displays the ratio of left-out genes that are ranked by the RWR within the top-K ranked gene nodes. The CDFs are used to evaluate and compare the performance of the RWR applied to different combinations of biological networks: the protein-protein interactions (PPI) network alone, the gene multiplex network, the multilayer network composed of the gene multiplex and the disease monoplex networks, and the multilayer network composed of the gene and drug multiplex networks and the disease monoplex network (Fig. 3a).

Fig. 3: Evaluation and comparison of multiXrank performances on different combinations of multilayer networks.
figure 3

a, b Cumulative Distribution Functions (CDFs) representing the ranks of the left-out nodes in the Leave-One-Out Cross-Validation (LOOCV) protocol. a: focus on different combinations of biological networks: protein-protein interactions network alone (PPI), gene multiplex network (multi-1), multilayer network composed of the gene multiplex network and the disease monoplex network (multi-2), and multilayer network composed of the gene and drug multiplex networks and the disease monoplex network, for two different sets of parameters (multi-3, multi-3 bis). The multilayer networks are connected by the bipartite networks described in the Evaluations section. b: focus on different combinations of airports networks: French multiplex network (multi-1), multilayer network composed of the French and British airports multiplex networks (multi-2), and multilayer network composed of the French, British, and German airports multiplex networks, for two different sets of parameters (multi-3, multi-3 bis). These multilayer networks are connected by the bipartite networks described in the Evaluations section. c, d CDFs representing the ranks of the left-out nodes in the LOOCV protocol for the multi-3 multilayer networks described previously with artificially increased connectivity in the gene-drug and disease-drug bipartite networks. c The connectivity is artificially increased thanks to the addition of 1 (multi3+1), 2 (multi3+2) or 5 (multi3+5) transit drug nodes for each gene-disease association. d: In the airport multilayer network, the connectivity is artificially increased in the French-German and British-German bipartite networks thanks to the addition of 1 (multi3+1), 2 (multi3+2) or 5 (multi3+5) transit German nodes for each French-British airports association. The parameters of the Random Walk with Restart (RWR) are detailed in Supplementary Tables S5S6.

We observed that considering multiple sources of network data is always better than considering the PPI alone. In addition, considering multilayer information is better than considering only the gene multiplex network. However, the increased performances in the LOOCV seem to arise only from combining the gene multiplex network with the disease monoplex network (and associated gene-disease bipartite network). Indeed, the addition of the drug multiplex network (and associated drug-gene and drug-disease bipartite networks) to the multilayer system does not increase the performances (Fig. 3a).

We repeated the same LOOCV protocol for the airports multilayer network, in which the left-out nodes are French airport nodes associated with a given British airport node. Here, the behavior is different, as adding the third multiplex network containing German airports connections (and associated French-German and British-German bipartite networks) increases the performances of the RWR to predict the associations between French and British airports (Fig. 3b).

To better understand these different behaviors, we examined in detail the amount of common nodes (called overlaps) existing between the nodes of the different bipartite networks. We observed that only 23% of the genes from the gene-disease bipartite network are present in the drug-gene bipartite network. Similarly, only 5% of the diseases from the gene-disease bipartite network are present in the disease-drug bipartite network (Fig. S10). Given these low overlaps, the drug multiplex network might not contribute significantly to connecting gene and disease nodes during the random walks. This might explain why adding the drug multiplex network does not improve the performances of the LOOCV. Contrarily, the bipartite networks of the airport multilayer network displays high overlaps (Fig. S10). These high overlaps might explain why the addition of the third multiplex network in this case increases the predictive power (Fig. 3b).

To validate the proposed central role of bipartite networks in the RWR performances, we artificially increased the connectivity of the gene-drug and disease-drug bipartite networks before applying the same LOOCV protocol. To this goal, we added artificial transit drug nodes linking existing gene-disease associations (strategy described in Supplementary Note 4C and Fig. S12). We observed that these artificially added transit nodes increased drastically the performances of the LOOCV (Fig. 3c). The same phenomenon is observed for the airports multilayer network (Fig. 3d). In addition, we checked if random perturbations in these artificially enhanced bipartite networks would decrease the performances of the LOOCV. To do so, we progressively randomized the edges in the bipartite networks with artificially increased connectivity, until obtaining completely random bipartite networks. We observed that the progressive randomization of the bipartite networks continuously decreases the predictive power of the RWR up to obtaining the same performances as with only two multiplex networks (Fig. S13.A for the airport multilayer networks and S13.B for the biological multilayer networks).

Finally, we repeated all these evaluations using a standard Link Prediction (LP) protocol (Supplementary Note 4.B). LP has already been used to measure the predictive power of RWR methods25. In the LP protocol, we systematically removed gene-disease edges from the gene-disease bipartite network, and predicted the rank of the removed gene using the disease as seed in the RWR. The LP protocol is applied on the airport multilayer network by removing a French-British edge from the French-British bipartite network, and predicting the rank of the French airport using the British airport node as a seed in the RWR. We overall observed similar behaviors as in the LOOCV (Fig. S11 and S14).

Importantly, the LOOCV and LP protocols can be used to evaluate the pertinence of adding new multiplex networks in a multilayer network or new network layers in a multiplex network. Both evaluation protocols are available within the MultiXrank package.

Parameter space exploration

We next evaluated the stability of MultiXrank output scores upon variations of the input parameters. We illustrate this exploration of the parameter space with the biological multilayer network composed of the gene multiplex network and the disease monoplex network. We first compared the top-5 and top-100 gene and disease nodes prioritized by MultiXrank using 125 different sets of parameters (see Supplementary Note 5 for the definition of the sets of parameters). We observed that the top-ranked gene nodes vary more depending on the input parameters than the top-ranked disease nodes (Fig. 4a).

Fig. 4: Exploration of multiXrank parameter space.
figure 4

a Comparison of the top-5 and top-100 nodes ranked by MultiXrank using a biological multilayer networks composed of the gene multiplex network and the disease monoplex network for 125 different sets of parameters. The top-5 or top-100 ranked nodes for each set of parameters are merged, and the number of occurrences of each node are counted. The nodes are represented in bars colored in red when the node is found in all top-5 or top-100 scores, and in blue otherwise. b Clustering in the Principal Component Analysis (PCA) space of the output scores obtained with MultiXrank on the biological multilayer network composed of the gene multiplex network and the disease monoplex network using 125 different sets of parameters. The zoom-in emphasizes the clusters number 4 and 6. c Comparison of the top-100 nodes retrieved for the sets of parameters belonging to clusters 4 and 6 defined in (b). The bar is colored in red when a node is found in all top-100 scores, and in blue otherwise. The parameters of the Random Walk with Restart (RWR) are detailed in Supplementary Table S7.

To better understand the stability of the output scores upon variations of the input parameters, we proposed a protocol based on 5 successive steps: (i) definition of the sets of parameters, (ii) construction of a matrix containing the similarities of the RWR output scores obtained with each set of input parameters, using a the similarity measure defined in equation [17]. The similarities are computed for each type of node independently (i.e., for gene and disease nodes independently).

$${{{\Theta }}}_{\gamma \sigma }^{k}=\mathop{\sum }\limits_{j=1}^{{n}_{k}}\frac{\sqrt{{\left(\frac{{1}}{\left[{\left({{{{{{{{\bf{r}}}}}}}}}_{\gamma }^{k}\right)}_{j}-{\left({{{{{{{{\bf{r}}}}}}}}}_{\gamma \sigma }^{k}\right)}_{j}\right]}\right)}^{2}+{\left(\frac{{1}}{\left[{\left({{{{{{{{\bf{r}}}}}}}}}_{\sigma }^{k}\right)}_{j}-{\left({{{{{{{{\bf{r}}}}}}}}}_{\sigma \gamma }^{k}\right)}_{j}\right]}\right)}^{2}}}{{\left(\frac{{\left({{{{{{{{\bf{r}}}}}}}}}_{\gamma }^{k}\right)}_{j}\ +\ {\left({{{{{{{{\bf{r}}}}}}}}}_{\sigma }^{k}\right)}_{j}}{{2}}\right)}^{2}}$$
(17)

where γ and σ define two sets of parameters, nk is the number of nodes associated with the multiplex network k. In addition, \({{{{{{{{\bf{r}}}}}}}}}_{\gamma }^{k}\) (resp. \({{{{{{{{\bf{r}}}}}}}}}_{\sigma }^{k}\)) is the rank output scores distribution that associates with each node its rank given by the RWR with the set of parameters γ (resp. σ) for the multiplex network k. Finally, \({{{{{{{{\bf{r}}}}}}}}}_{\gamma \sigma }^{k}\) (resp. \({{{{{{{{\bf{r}}}}}}}}}_{\sigma \gamma }^{k}\)) gives to each node of the output scores distribution obtained by the set of parameters γ (resp. σ) (in the multiplex network k) their rank in the distribution σ (resp. γ).

We next computed a consensus Similarity matrix with a normalized euclidean norm of each individual Similarity matrix (equation [18]).

$${{{\Theta }}}_{\gamma \sigma }=\sqrt{\mathop{\sum }\limits_{k=1}^{N}\frac{{({{{\Theta }}}_{\gamma \sigma }^{k})}^{2}}{{n}_{k}}}$$
(18)

where N is the number of multiplex networks.

The next step is (iii) projection of the consensus Similarity matrix into a Principal Component Analysis (PCA) space (Fig. 4b). In this PCA space, each dot represents the output scores resulting from a set of parameters. Then, (iv) clustering (using k-means on the two first principal components) to identify sub-regions containing similar RWR output scores. Finally, (v) comparing the top-ranked nodes obtained with the set of parameters belonging to each cluster (Fig 4c, Supplementary Note 5).

We applied this protocol to evaluate the output scores obtained by MultiXrank on the previously defined biological multilayer network composed of the gene multiplex network and the disease monoplex network, using 125 different combinations of parameters (Fig. 4, supplementary Fig. S16). We projected the consensus Similarity matrix into a PCA space and identified 8 clusters (Fig. 4b). To illustrate the behavior inside clusters, we concentrated our analyses on the two clusters defined in the bottom left subspace (clusters number 4 and 6, zoom-in Fig. 4b). The top-100 ranked gene and disease nodes inside each of the two clusters are overall similar (Fig. 4c). This means that, even if the node prioritization can be sensitive to input parameters, we can identify regions of stability in the parameter space. Moreover, the protocol allows identifying the monoplex/multiplex networks that generate most variability in the output scores upon changes in the input parameters.

We applied the parameter space exploration protocol to other multilayer networks and observed diverse behaviors, from highly variable top-rankings and scattered projections in the PCA space for the airport multilayer network (Supplementary Fig. S15) to robust top-rankings with well-clustered projections in the PCA space for the biological multilayer network composed of 3 types of nodes (genes, diseases and drugs, Supplementary Fig. S16). Overall, our parameter space study reveals different sensitivities to input parameters depending on the multilayer network explored. The protocol is available within the MultiXrank package and can be used to characterize in-depth the sensitivity to input parameters of any multilayer network.

Discussion

Multilayer networks are nowadays very popular, in particular because they allow capturing a larger part of real and engineered systems. In biology, multilayer networks integrating multiscale sources of heterogeneous interactions provide a more comprehensive picture of biological system functionalities. However, data representation as multilayer networks must be accompanied by the development of tools allowing their exploration. Many efforts are thereby dedicated to extend classical network theory algorithms to multilayer systems5,26. These algorithms include for instance clustering algorithms27, Graph Convolutional Networks28,29 or meta-path based methods3,30. Other important network exploration algorithms, such as diffusion kernels or methods based on random walk, are based on the principle of network propagation26. The methods based on random walk, such as PageRank, biased random walk or Random Walk with Restart (RWR), are widely used in network science. They are indeed versatile: the random walk output scores can be used directly for node prioritization and subnetwork extraction, but can also be used as input for downstream analyses, for instance for supervised classification or node embedding22.

Different random walk methods have been adapted to consider multilayer networks. However, a large variety of multilayer networks exist, from multiplex to temporal networks, for instance. To the best of our knowledge, network exploration algorithms that have been adapted to handle multilayer networks can usually be applied only to specific categories of multilayer networks, such as multiplex networks composed of the same set of nodes.

We present here MultiXrank, a tool that proposes an optimized and general formalism for RWR on universal multilayer networks. MultiXrank can be applied to explore multilayer networks composed of any combination of multiplex, monoplex or bipartite networks, and all the network edges can be directed and/or weighted. To the best of our knowledge, any type of multilayer networks could be represented with our formalism, even if it might sometimes require some adaptations. We illustrated the use of MultiXrank with RWR on biological and airport multilayer networks and thereby provide guidelines for users. Even if one’s initial intuition in data analysis could be that “more data is better”, the addition of interaction network layers also brings additional degrees of freedom5. To evaluate the pertinence of the addition of multiplex networks or the addition of layers in a multilayer system, MultiXrank includes a systematic evaluation protocol based on Leave-One-Out-Cross-Validation and Link Prediction. Overall, our results show that adding networks data does not always increase the predictive power of the RWR, as already suggested by previous studies11. Our evaluation protocol can be used, for the first time to our knowledge, to evaluate in-depth the signal-to-noise of multilayer system combinations. Finally, we complemented MultiXrank with a parameter space exploration protocol to measure the influence of varying the input parameters on the global stability of the output scores. It is to note that this parameter space exploration protocol is universal and can be used to study any complex system exploration approach providing scores as outputs.

The output scores of MultiXrank can be used in a wide variety of downstream analyses. For instance, shallow embedding methods need similarity measures for the optimization of the loss function22,31. MultiXrank can produce such a similarity measure respecting the global topology of the multilayer network. An interesting application could be to use MultiXrank output scores for embedding and evaluate the predictive power of the gene-disease association prediction task. Indeed, the embedding is expected to be more robust to the noise than the direct network space32.

The MultiXrank package can be applied to any kind of multilayer network such as social, economic, or ecological multilayer networks. MultiXrank is optimized and can handle multilayer networks containing up to millions edges. To consider billion-scale network problems, several strategies could be considered, such as the Block Elimination Approach for RWR (BEAR) that can be exact or approximate33 or the Best of Preprocessing and Iterative approaches (BEPI) that is an approximate approach34.