Universal multilayer network exploration by random walk with restart

Baptista, Anthony; Gonzalez, Aitor; Baudot, Anaïs

doi:10.1038/s42005-022-00937-9

Article
Open access
Published: 01 July 2022

Communications Physics volume 5, Article number: 170 (2022)

Abstract

The amount and variety of data have been increasing drastically for several years. These data are often represented as networks and explored with approaches arising from network theory. Recent years have witnessed the extension of network exploration approaches to capitalize on more complex and richer network frameworks. Random walks, for instance, have been extended to explore multilayer networks. However, current random walk approaches are limited in the combination and heterogeneity of networks they can handle. New analytical and numerical random walk methods are needed to cope with the increasing diversity and complexity of multilayer networks. We propose here MultiXrank, a method and associated Python package that enables Random Walk with Restart on any kind of multilayer network. We evaluate MultiXrank with leave-one-out cross-validation and link prediction, and measure the impact of the addition or removal of network data on prediction performances. Finally, we measure the sensitivity of MultiXrank to input parameters by in-depth exploration of the parameter space.

Introduction

Data amount and variety have soared as never seen before, offering a unique opportunity to better understand complex systems. Among the different modes of data representation, networks appear particularly successful. Networks are indeed useful to refine raw data and extract relevant features, patterns, and classes. They have been exploited for years to study complex systems, and a wide and powerful range of tools from graph theory is available for their exploration.

However, the integrated exploration of large multidimensional datasets remains a major challenge in many scientific fields. For instance, a comprehensive understanding of biological systems would require the integrated analysis of dozens of different datasets produced at different molecular, cellular or tissular scales. Recently, multilayer networks have emerged as essential players in the analysis of such complex systems. Multilayer networks allow integrating more than one network in a unified formalism, in which the different networks are considered as layers¹. For instance, Duran-Frigola et al.² combined 25 different networks of chemical compounds and their relationships, gathering relationships from chemical structures to clinical outcomes. This multilayer framework allows an integrated study of chemical compounds and their biological activities. Another example is given by the Hetionet project. The authors collected dozens of heterogeneous networks, i.e., networks with various types of nodes such as genes, drugs or diseases, to prioritize drugs for repurposing³.

Several definitions of multilayer networks have been proposed, based on the (in)homogeneity of the layers and the properties of the connections between layers^4,5,6. For instance, multiplex networks are multilayer networks composed of different layers containing the same nodes (called replica nodes) but different types of edges, and thereby different topologies. Heterogeneous networks link networks composed of different types of nodes thanks to bipartite interactions. Temporal networks follow the dynamics of a network over time: all the layers have the same nodes, but each layer represents the interaction state at a given time⁷. We will here consider universal multilayer networks, which can be defined as multilayer networks composed of any number of multiplex (or monoplex) networks (with edges that can be directed and/or weighted), linked by bipartite networks (with edges that can be directed and/or weighted) (Fig. 1). A wide range of methods has been developed in recent years to analyze multilayer networks. For instance, different network metrics have been adapted to multilayer networks⁸, as well as various network clustering algorithms for community detection^9,10,11 or random walks for network exploration^12,13,14,15.

**Fig. 1: A universal multilayer network.**

Random walks are iterative stochastic processes widely used to explore network topologies. They can be described as simulated particles that walk iteratively from one node to one of its neighbors with some probability¹⁶. The PageRank algorithm, for instance, is based on a random walk simulating the behavior of an internet user walking from one page to another thanks to hyper-links. The user can also restart the walk on any arbitrary page¹⁷. In this particular random walk strategy, the restart prevents the random walker from being trapped in dead-ends¹⁸. An interesting alternative strategy restricts the restart to specific node(s), called the seed(s)¹⁹. In this strategy, named Random Walk with Restart (RWR) or Personalized PageRank, the random walk represents a measure of proximity from all the nodes in the network to the seed(s). RWR can also be described as a diffusion process, in which the objective is to determine the steady-state of an initial probability distribution²⁰.

RWR are widely used to exploit large-scale networks. In computational biology, for instance, RWR strategies have been shown to significantly outperform methods based on local distance measures for the prioritization of gene-disease associations²¹. Importantly, different upgrades of the RWR approach have been implemented during the last decade, including its extension to (i) heterogeneous networks¹², (ii) multiplex networks¹³ and (iii) multiplex-heterogeneous networks¹⁵. In RWR, the degrees of freedom are summarized in the Transition rate matrix, and correspond to the available transitions between the different nodes of the graph. The extensions of RWR are challenging because the Transition rate matrices need to be normalized. To the best of our knowledge, this normalization is currently only solved for multilayer networks composed of two heterogeneous multiplex networks^15,22 and the more universal case of N multiplex networks remains unsolved.

We propose here MultiXrank, a framework composed of a method and a Python package to execute RWR on universal multilayer networks. We first introduce the mathematical bases of this RWR for universal multilayer networks, which correspond to a generalization of the approach from¹². We evaluate MultiXrank with leave-one-out cross-validation and link prediction protocols. These evaluations reveal that more network data is not always better and highlight the critical influence of the bipartite networks. We finally present an in-depth exploration of the parameter space to measure the stability of the RWR output scores under variations of the input parameters. The MultiXrank Python package is freely available at https://github.com/anthbapt/multixrank, with an optimized implementation allowing its application to large multilayer networks.

Results

Random walk with restart (RWR)

Let us consider an irreducible and aperiodic Markov chain, for instance a network composed of a giant component with undirected edges, G = (V, E), where V is the set of vertices and E ⊆ (V × V) is the set of edges. In the case of irreducible and aperiodic Markov chains, a stationary probability p* exists and satisfies the following properties:

$$\left\{\begin{array}{ll}{{{{{{{{\bf{p}}}}}}}}}^{* }(i)\ > \ 0;\forall i\in V \hfill \\ {\sum }_{i\in V}{{{{{{{{\bf{p}}}}}}}}}^{* }(i)=1\hfill\end{array}\right.$$

(1)

We next introduce the probability defining the walk from one node to another. Let us define x, a particle that explores the network, x_t its position at time t and x_{t+1} its position at time t + 1. Considering two nodes i and j:

$${\mathbb{P}}({x}_{t+1}=j\,| \,{x}_{t}=i)=\left\{\begin{array}{l}\frac{1}{{d}_{i}}\,\,\,\,\,{{{{{{{\rm{if}}}}}}}}\ (i,j)\in E\\ 0\,\,\,\,\,\,{{{{{{{\rm{Otherwise}}}}}}}}\end{array}\right.$$

(2)

with d_i being the degree of the node i. All the normalized possible transitions can be included in the Transition rate matrix. This Transition rate matrix, noted M, can be seen as the matrix of the degrees of freedom of the particle in the system. It is useful to note that the Transition rate matrix is equal to the column-normalized Adjacency matrix. The distribution denoted by ${{{{{{{{\bf{p}}}}}}}}}_{t}={({{{{{{{{\bf{p}}}}}}}}}_{t}(i))}_{i\in V}$ describes the probability of being in the node i at time t, and the stationary distribution p* is obtained thanks to the homogeneous linear difference equation [3]^18,23:

$${{{{{{{{\bf{p}}}}}}}}}_{t+1}^{T}=M{{{{{{{{\bf{p}}}}}}}}}_{t}^{T}$$

(3)

with ${{{{{{{{\bf{p}}}}}}}}}_{t}^{T}$ denoting the transpose of the vector p_t. Moreover, we can introduce a non-homogeneous linear difference equation [4]²³ to take into account the restart on the seed(s). When the Transition rate matrix is a Stochastic matrix, the stationary distribution is reached¹⁸ (Supplementary Note 1.A.1 for elements of proof of convergence) and this distribution can be seen as a measure of proximity of all the network nodes with respect to the seed(s).

$${{{{{{{{\bf{p}}}}}}}}}_{t+1}^{T}=(1-r)M{{{{{{{{\bf{p}}}}}}}}}_{t}^{T}+r{{{{{{{{\bf{p}}}}}}}}}_{0}^{T}$$

(4)

The distribution p₀ corresponds to the initial probability distribution, where only the seed(s) have non-zero values; r represents the restart probability.
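The iteration of equation [4] can be illustrated with a minimal NumPy sketch on a toy monoplex network. The helper names (`column_normalize`, `rwr`), the path graph, the restart probability r = 0.7 and the convergence tolerance are assumptions chosen for illustration; this is not the MultiXrank implementation.

```python
import numpy as np

def column_normalize(adjacency):
    """Divide each column by its sum so that every column sums to 1."""
    col_sums = adjacency.sum(axis=0)
    if np.any(col_sums == 0):
        raise ValueError("this sketch assumes every node has at least one edge")
    return adjacency / col_sums

def rwr(transition, p0, r=0.7, tol=1e-10, max_iter=10_000):
    """Iterate p_{t+1} = (1 - r) M p_t + r p_0 until the L1 change is below tol."""
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - r) * transition @ p + r * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    raise RuntimeError("RWR did not converge within max_iter iterations")

# Toy example: undirected path graph 0 - 1 - 2 - 3, with node 0 as the only seed.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
M = column_normalize(A)              # Transition rate matrix
p0 = np.array([1.0, 0.0, 0.0, 0.0])  # restart only on the seed
p_star = rwr(M, p0, r=0.7)
```

In this toy case the stationary scores decrease with the distance to the seed, which is the proximity interpretation given above.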

RWR on multiplex networks

The RWR method has been extended to multiplex networks, i.e., multilayer networks with a one-to-one mapping between the (replica) nodes of the different layers (Fig. 1)^1,13,14. Multiplex networks can be represented by Supra-adjacency matrices, which correspond to a generalization of the standard Adjacency matrix. In the following, we will use several multiplex networks, indexed by k. We denote by $\mathcal{A}_{k}$ the Supra-adjacency matrix of the multiplex network indexed by k. The Adjacency matrix of the layer l of the multiplex network k is denoted by ${A}_{k}^{[l]}$. The element of this adjacency matrix from node i to node j is defined as ${({A}_{k}^{[l]})}_{i,j}\ge 0$. The dimension of the Supra-adjacency matrix $\mathcal{A}_{k}$ of the multiplex network k is equal to (L_k*n_k)*(L_k*n_k), with n_k the number of nodes in each layer of the multiplex network k and L_k the number of layers in the multiplex network k. The Supra-adjacency matrix $\mathcal{A}_{k}$ is defined as follows:

$${({{{{{{{{\mathcal{A}}}}}}}}}_{k})}_{{i}_{l},{j}_{m}}=\left\{\begin{array}{ll}{\left({A}_{k}^{[l]}\right)}_{i,j}&{{{{{{{\rm{if}}}}}}}}\,l=m\\ {\delta }_{i,j}\hfill&{{{{{{{\rm{if}}}}}}}}\,l\ \ne\ m\hfill\end{array}\right.$$

(5)

where δ defines the Kronecker delta (i.e., 1 if i equals j and 0 otherwise), and l and m represent the layers of the multiplex network k. We can also define a multiplex network as a set of nodes, ${V}_{\mathcal{A}_{k}}$, and a set of edges, ${E}_{\mathcal{A}_{k}}$:

$$\left\{\begin{array}{l}{G}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}}=({V}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}},{E}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}})\hfill \\ {V}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}}=\{{v}_{i}^{l},i=1,\ldots ,{n}_{k},l=1,\ldots ,{L}_{k}\}\hfill\\ {E}_{{{{{{{{{\mathcal{A}}}}}}}}}_{k}}=\{{e}_{i,j}^{ll},i,j=1,\ldots ,{n}_{k},l=1,\ldots ,{L}_{k},{({A}_{k}^{[l]})}_{i,j}\ \ne\ 0\}\\ \cup \{{e}_{i,i}^{lm},i=1,\ldots ,{n}_{k},l\ \ne\ m\}\hfill\end{array}\right.$$

(6)

Importantly, we need to column-normalize the Supra-adjacency matrix defined in the equations [5–6] in order to converge to the steady-state, as defined in¹⁵. This normalization requires including the parameters δ_k related to the jumps from one layer to another inside the matrix representation, as described in¹³ (Fig. 2). In the next section, we need to index by k all the parameters that are dedicated to the multiplex network k. The Supra-adjacency matrix representing the multiplex network k can be written as described in equation [7]. The matrix I_k represents the Identity matrix of size n_k.

$${{{{{{{{\mathcal{A}}}}}}}}}_{k}=\left[\begin{array}{cccc}(1-{\delta }_{k}){A}_{k}^{[1]}&\frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}&\ldots &\frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}\\ \frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}&(1-{\delta }_{k}){A}_{k}^{[2]}&\ldots &\frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}\hfill\\ \vdots &\vdots &\ddots &\vdots \\ \frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}&\frac{{\delta }_{k}}{({L}_{k}-1)}{I}_{k}&\ldots &(1-{\delta }_{k}){A}_{k}^{[{L}_{k}]}\end{array}\right]$$

(7)
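As a hedged illustration of the block structure of equation [7], the sketch below assembles a Supra-adjacency matrix from a list of layer adjacency matrices with NumPy. The function name `supra_adjacency`, the toy layers and δ = 0.5 are assumptions; the sketch follows equation [7] literally and applies no column normalization. Returning the single layer unchanged for a monoplex network is also an assumption of this sketch.

```python
import numpy as np

def supra_adjacency(layers, delta):
    """Assemble the block matrix of equation (7) from a list of n x n layer matrices."""
    n_layers = len(layers)
    n_nodes = layers[0].shape[0]
    if n_layers == 1:
        # Monoplex network: there is no other layer to jump to (assumption).
        return layers[0].astype(float)
    identity = np.eye(n_nodes)
    blocks = [
        [(1.0 - delta) * layers[l] if l == m else delta / (n_layers - 1) * identity
         for m in range(n_layers)]
        for l in range(n_layers)
    ]
    return np.block(blocks)

# Toy multiplex network: 3 replica nodes, 2 layers with different edges.
A1 = np.array([[0, 1, 0],
               [1, 0, 1],
               [0, 1, 0]], dtype=float)
A2 = np.array([[0, 0, 1],
               [0, 0, 0],
               [1, 0, 0]], dtype=float)
supra = supra_adjacency([A1, A2], delta=0.5)
```

The diagonal blocks hold the intra-layer edges scaled by (1 − δ), and each off-diagonal block links every node to its replica in another layer with weight δ/(L_k − 1).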

**Fig. 2: MultiXrank Random Walk with Restart parameters.**

RWR on universal multilayer networks

We here define a RWR method that can be applied to universal multilayer networks. Universal multilayer networks are composed of any combination of multiplex networks, linked by any combination of bipartite networks (Fig. 1). All network edges can also be weighted and/or directed. The formalism for the application of RWR on multiplex networks is described in the previous section. We will now detail the Bipartite network matrices, and how to combine intra- and inter- multiplex networks information to obtain the Supra-heterogeneous adjacency matrix. The Supra-heterogeneous adjacency matrix will embed all the possible transitions in a universal multilayer network.

Bipartite networks connect heterogeneous nodes

The Bipartite network matrices contain the transitions between different types of nodes present in different networks. If the network α has n_α nodes, and the network β has n_β nodes, the Bipartite network matrix denoted b_α,β has a size equal to n_α*n_β. Now, let us define ${{{{{{{{\mathcal{A}}}}}}}}}_{\alpha }$ and ${{{{{{{{\mathcal{A}}}}}}}}}_{\beta }$, two Supra-adjacency matrices representing the multiplex networks α and β. The Bipartite network matrix B_α,β represents the transitions from the nodes of the multiplex network α to the nodes of the multiplex network β. The size of the Bipartite network matrix B_α,β is equal to (L_α*n_α)*(L_β*n_β). The Bipartite network matrices are composed of (L_α*L_β) times the Bipartite network matrix b_α,β (equation [8]). The matrix b_α,β is composed of all the transitions from one layer of the multiplex network α to one layer of the multiplex network β. We extended the formalism used in¹⁵ in order to consider more than two different multiplex networks.

$${B}_{\alpha ,\beta } = \underbrace{\left[\begin{array}{cccc}{b}_{\alpha ,\beta }&{b}_{\alpha ,\beta }&\ldots &{b}_{\alpha ,\beta }\\ {b}_{\alpha ,\beta }&{b}_{\alpha ,\beta }&\ldots &{b}_{\alpha ,\beta }\\ \vdots &\vdots &\ddots & \vdots \\ {b}_{\alpha ,\beta }&{b}_{\alpha ,\beta }&\ldots &{b}_{\alpha ,\beta } \end{array}\right]}_{{L}_{\beta } \, {{{{{{{\rm{times}}}}}}}}}\left.\vphantom{\begin{array}{c}1\\ 1\\ 1\\ 1\\ 1\\ 1\end{array}}\right\}\scriptstyle{{L}_{\alpha }} \, {{{{{{{\rm{times}}}}}}}}$$

(8)

The representation of the bipartite networks as a set of nodes ${V}_{{{{{{{{\mathcal{B}}}}}}}}}$ and a set of edges ${E}_{{{{{{{{\mathcal{B}}}}}}}}}$ can be written as:

$$\left\{\begin{array}{l}{G}_{{{{{{{{\mathcal{B}}}}}}}}}=({V}_{{{{{{{{\mathcal{B}}}}}}}}},{E}_{{{{{{{{\mathcal{B}}}}}}}}})\hfill\\ {V}_{{{{{{{{\mathcal{B}}}}}}}}}=\{{v}_{k}^{\alpha },k=1,\ldots ,{n}_{\alpha }\}\cup \{{v}_{l}^{\beta },l=1,\ldots ,{n}_{\beta }\}\hfill\\ {E}_{{{{{{{{\mathcal{B}}}}}}}}}=\{{e}_{k,l}^{\alpha \beta }\,k=1,\ldots ,{n}_{\alpha }\,,\,l=1,\ldots ,{n}_{\beta }\,;\,{({b}_{\alpha ,\beta })}_{k,l}\ \ne\ 0\}\end{array}\right.$$

(9)

Note that if the bipartite networks are undirected, ${b}_{\beta ,\alpha }^{T}={b}_{\alpha ,\beta }$ and ${B}_{\beta ,\alpha }^{T}={B}_{\alpha ,\beta }$.
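The tiling of equation [8] can be sketched with `numpy.tile`. The function name `bipartite_supra` and the toy matrix are assumptions made for illustration only.

```python
import numpy as np

def bipartite_supra(b, n_layers_alpha, n_layers_beta):
    """Tile the n_alpha x n_beta matrix b into L_alpha x L_beta identical blocks."""
    return np.tile(b, (n_layers_alpha, n_layers_beta))

# Toy bipartite matrix: 3 nodes in network alpha, 2 nodes in network beta.
b = np.array([[1, 0],
              [0, 1],
              [1, 1]], dtype=float)
B_alpha_beta = bipartite_supra(b, n_layers_alpha=2, n_layers_beta=3)

# Undirected case: the reverse matrix is obtained from the transpose of b.
B_beta_alpha = bipartite_supra(b.T, n_layers_alpha=3, n_layers_beta=2)
```

With L_α = 2 and L_β = 3, the result has size (2·3) × (3·2), and in the undirected case the transpose relation stated above holds block by block.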

Universal multilayer networks unify the representation of heterogeneous multiplex networks

We previously defined the Supra-adjacency matrices of each multiplex network and the Bipartite network matrices connecting the different multiplex networks. We now introduce the Supra-heterogeneous adjacency matrix, denoted by ${{{{{{{\mathcal{S}}}}}}}}$. This matrix, defined in equation [10], collects the N Supra-adjacency matrices representing each multiplex network, ${{{{{{{{\mathcal{A}}}}}}}}}_{1},{{{{{{{{\mathcal{A}}}}}}}}}_{2},\ldots ,{{{{{{{{\mathcal{A}}}}}}}}}_{N}$, and the N*(N − 1) Bipartite network matrices connecting each multiplex network, B_1,2, B_1,3, …, B_1,N, B_2,1, …, B_N,N−1.

$${{{{{{{\mathcal{S}}}}}}}}=\left[\begin{array}{cccc}{{{{{{{{\mathcal{A}}}}}}}}}_{1}&{B}_{1,2}&\ldots &{B}_{1,N}\\ {B}_{2,1}&{{{{{{{{\mathcal{A}}}}}}}}}_{2}&\ldots &{B}_{2,N}\\ \vdots &\vdots &\ddots &\vdots \\ {B}_{N,1}&{B}_{N,2}&\ldots &{{{{{{{{\mathcal{A}}}}}}}}}_{N}\end{array}\right]$$

(10)

We can also define the Supra-heterogeneous adjacency matrix as a set of nodes and edges:

$$\left\{\begin{array}{rcl}&{G}_{{{{{{{{\mathcal{S}}}}}}}}}=\left({V}_{{{{{{{{\mathcal{S}}}}}}}}},{E}_{{{{{{{{\mathcal{S}}}}}}}}}\right)\hfill\\ &{V}_{{{{{{{{\mathcal{S}}}}}}}}}=\mathop{\bigcup }\limits_{k=1}^{N}\{{v}_{k,i}^{{\alpha }_{k}},i=1,\ldots ,{n}_{k},{\alpha }_{k}=1,\ldots ,{L}_{k}\}\hfill\\ &{E}_{{{{{{{{\mathcal{S}}}}}}}}}=\mathop{\bigcup }\limits_{k=1}^{N}\left(\{{e}_{i,j}^{{\alpha }_{k},{\alpha }_{k}},i,j=1,\ldots ,{n}_{k},{\left({A}_{k}^{[{\alpha }_{k}]}\right)}_{i,j}\ \ne\ 0\}\right.\hfill\\ &\quad\quad\ \cup \left.\{{e}_{i,i}^{{\alpha }_{k},{\beta }_{k}},i=1,\ldots ,{n}_{k},{\alpha }_{k}\ \ne\ {\beta }_{k}\,,\,{\alpha }_{k},{\beta }_{k}=1,\ldots ,{L}_{k}\}\right)\hfill\\ &\quad\quad\ \ \cup \mathop{\bigcup }\limits_{k,l=1;k\ne l}^{N}\{{e}_{i,j}^{{\alpha }_{k},{\alpha }_{l}},i=1,\ldots ,{n}_{k},j=1,\ldots ,{n}_{l},{\left({B}_{k,l}\right)}_{i,j}\ \ne\ 0\}\end{array}\right.$$

(11)

The normalization of the Supra-heterogeneous adjacency matrix ensures the convergence of the RWR to the steady-state

The most complex issue is the normalization of the Supra-heterogeneous adjacency matrix into a Transition rate matrix that can be used in equation [4]. The normalization allows obtaining a Stochastic matrix that guarantees the convergence of the RWR to the steady-state¹⁸ (see elements of proof in Supplementary Note 1.A.1). It is important to note that we have chosen a column normalization. The resulting normalized matrix, denoted by $\widehat{{{{{{{{\mathcal{S}}}}}}}}}$ is defined in equation [12]. We generalized the formalism of Li and Patra¹² established for two heterogeneous monoplex networks (Supplementary Note 1.D). This generalization to universal multilayer networks is done thanks to the intra- and inter- multiplex network normalizations defined in equations [13–14], with α ∈ [[1, N]], β ∈ [[1, N]]. In addition, ${c}_{{i}_{\alpha }}$ is the number of bipartite networks in which the node i_α appears as source of the multiplex network α denoted by M_α.

$$\widehat{\mathcal{S}}=\left[\begin{array}{cccc}{\widehat{S}}_{11}&{\widehat{S}}_{12}&\ldots &{\widehat{S}}_{1N}\\ {\widehat{S}}_{21}&{\widehat{S}}_{22}&\ldots &{\widehat{S}}_{2N}\\ \vdots &\vdots &\ddots &\vdots \\ {\widehat{S}}_{N1}&{\widehat{S}}_{N2}&\ldots &{\widehat{S}}_{NN}\end{array}\right]$$

(12)

In equation [13], ${\widehat{S}}_{\alpha \alpha }$ defines the transition probabilities inside a given multiplex network. In the case of a multiplex network, if a node has no bipartite interactions with nodes from another multiplex network, we can use the standard normalization. If bipartite interactions exist, then the normalization takes into account the probability that the walker can stay in the multiplex network $(1-\mathop{\sum }\nolimits_{\beta = 1}^{{c}_{{i}_{\alpha }}}{\lambda }_{\alpha \beta })$. In equation [14], ${\widehat{S}}_{\alpha \beta }$ defines the transition probability between two different multiplex networks. There are here three possibilities. If the node has no bipartite interactions, the transition probability is equal to zero. If the node has bipartite interactions, the transition probability is equal to the standard normalization weighted by the jump probability (λ_αβ). Finally, if the node exists only in the bipartite network, the normalization corresponds to the standard normalization weighted by a modified jump probability. This normalization takes into account all the bipartite interactions of the considered node.

$${\widehat{S}}_{\alpha \alpha }({i}_{\alpha },{j}_{\alpha })=\left\{\begin{array}{l}\frac{{A}_{\alpha }({i}_{\alpha },{j}_{\alpha })}{\mathop{\sum }\limits_{{k}_{\alpha } = 1}^{{n}_{\alpha }}{A}_{\alpha }({i}_{\alpha },{k}_{\alpha })} \; \; \; {{{{{{{\rm{if}}}}}}}}\ \forall \beta \,:\,\mathop{\sum }\limits_{{k}_{\beta }=1}^{{n}_{\beta }}{B}_{\alpha ,\beta }({i}_{\alpha },{k}_{\beta })=0\\ \frac{\left(1-\mathop{\sum }\limits_{\beta = 1}^{{c}_{{i}_{\alpha }}}{\lambda }_{\alpha \beta }\right)* {A}_{\alpha }({i}_{\alpha },{j}_{\alpha })}{\mathop{\sum }\limits_{{k}_{\alpha } = 1}^{{n}_{\alpha }}{A}_{\alpha }({i}_{\alpha },{k}_{\alpha })} \; \; \; {{{{{{{\rm{Otherwise}}}}}}}}\hfill\end{array}\right.$$

(13)

$${\widehat{S}}_{\alpha \beta }({i}_{\alpha },{j}_{\beta })=\left\{\begin{array}{l}\frac{{\lambda }_{\alpha \beta }{B}_{\alpha ,\beta }({i}_{\alpha },{j}_{\beta })}{\mathop{\sum }\limits_{{k}_{\beta } = 1}^{{n}_{\beta }}{B}_{\alpha ,\beta }({i}_{\alpha },{k}_{\beta })} \; \; \; {{{{{{{\rm{if}}}}}}}}\mathop{\sum }\limits_{{k}_{\beta }=1}^{{n}_{\beta }}{B}_{\alpha ,\beta }({i}_{\alpha },{k}_{\beta })\ \ne\ 0\\ \frac{\frac{{\lambda }_{\alpha \beta }}{\mathop{\sum }\limits_{\beta = 1}^{c}{\lambda }_{\alpha \beta }}\mathop{\sum }\limits_{{i}_{\alpha }=1}^{c}{B}_{\alpha ,\beta }({i}_{\alpha },{j}_{\beta })}{\mathop{\sum }\limits_{{i}_{\alpha }=1}^{c}\mathop{\sum }\limits_{{k}_{\beta }=1}^{{n}_{\beta }}{B}_{\alpha ,\beta }({i}_{\alpha },{k}_{\beta })} \; \; \;{{{{{{{\rm{if}}}}}}}}\,{i}_{\alpha }\,{{{{{{{\rm{not}}}}}}}}\,{{{{{{{\rm{in}}}}}}}}\,{M}_{\alpha }\\ 0 \; \; \; {{{{{{{\rm{Otherwise}}}}}}}}\end{array}\right.$$

(14)

The normalization allows including the parameters λ_αβ to jump between the multiplex networks (Fig. 2). In other words, these parameters weight the jumps from one multiplex network α to another multiplex network β, if the bipartite interaction exists. Moreover, the standard probability condition of normalization imposes that $\mathop{\sum }\nolimits_{\alpha = 1}^{N}{\lambda }_{\alpha \beta }=1,\forall \,\beta$, where N represents the number of multiplex networks. Finally, the RWR equation on universal multilayer networks is defined as:

$${{{{{{{{\bf{p}}}}}}}}}_{t+1}^{T}=(1-r)\widehat{S}{{{{{{{{\bf{p}}}}}}}}}_{t}^{T}+r{{{{{{{{\bf{p}}}}}}}}}_{0}^{T}.$$

(15)
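To make the role of the jump probabilities concrete, here is a deliberately simplified sketch restricted to two monoplex networks (the case of Li and Patra¹²). It is not the general normalization of equations [13–14] and not the MultiXrank code: the function name, the scalar parameters `jump1` and `jump2`, and the toy networks are assumptions, and the case of a node that exists only in the bipartite network is not handled. Column j of each block is taken to hold the transitions leaving node j.

```python
import numpy as np

def normalize_two_networks(A1, A2, B12, B21, jump1, jump2):
    """Column-normalize S = [[A1, B12], [B21, A2]] for two monoplex networks.

    jump1 (jump2) is the probability that a walker on a node of network 1 (2)
    that has bipartite edges moves to the other network.
    """
    def normalize_pair(intra, inter, jump):
        intra_sums = intra.sum(axis=0)
        inter_sums = inter.sum(axis=0)
        if np.any(intra_sums == 0):
            raise ValueError("sketch assumes every node has an intra-network edge")
        has_bipartite = inter_sums > 0
        # Nodes without bipartite edges keep the standard normalization.
        stay = np.where(has_bipartite, 1.0 - jump, 1.0)
        intra_hat = intra * (stay / intra_sums)
        safe_sums = np.where(has_bipartite, inter_sums, 1.0)
        inter_hat = inter * (np.where(has_bipartite, jump, 0.0) / safe_sums)
        return intra_hat, inter_hat

    A1_hat, B21_hat = normalize_pair(A1, B21, jump1)  # columns: nodes of network 1
    A2_hat, B12_hat = normalize_pair(A2, B12, jump2)  # columns: nodes of network 2
    return np.block([[A1_hat, B12_hat], [B21_hat, A2_hat]])

# Toy example: a triangle (network 1) and a single edge (network 2),
# linked by one undirected bipartite edge between their first nodes.
A1 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]], dtype=float)
A2 = np.array([[0, 1], [1, 0]], dtype=float)
B12 = np.array([[1, 0], [0, 0], [0, 0]], dtype=float)
B21 = B12.T
S_hat = normalize_two_networks(A1, A2, B12, B21, jump1=0.5, jump2=0.5)
```

Every column of `S_hat` sums to 1, so the matrix is column-stochastic and can be used as the Transition rate matrix in the RWR iteration.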

RWR initial probability distribution in universal multilayer networks

The initial probability distribution p₀ from equation [15], which contains the probabilities to restart on the seed(s), can be written in its general form as follows:

$${{{{{{{{\bf{p}}}}}}}}}_{0}^{T}=\left[\begin{array}{c}{\eta }_{1}{\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{1}\\ {\eta }_{2}{\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{2}\\ \ldots \\ {\eta }_{N}{\bar{{{{{{{{\bf{v}}}}}}}}}}_{0}^{N}\end{array}\right]$$

(16)

where η_k is the probability to restart in one of the layers of the multiplex network k, and $\bar{\mathbf{v}}_{0}^{k}$ is the initial probability distribution of the multiplex network k. The size of $\bar{\mathbf{v}}_{0}^{k}$ is equal to (L_k*n_k), where L_k is the number of layers in the multiplex network k and n_k is the number of nodes in each layer of the multiplex network k. We constrain the parameter η with the standard condition of normalization of the probability that imposes $\mathop{\sum }\nolimits_{k = 1}^{N}{\eta }_{k}=1$. We defined another parameter, τ, to take into account the probability of restarting in the different layers of a given multiplex network. This parameter includes τ_kj, where k corresponds to the index of the multiplex network, and j to the index of the layer of the multiplex network k (Fig. 2). In other words, τ_kj corresponds to the probability to restart in the j-th layer of the multiplex network k. Finally, $\bar{\mathbf{v}}_{0}^{k}$ is defined as follows: $\bar{\mathbf{v}}_{0}^{k}={[{\tau }_{k1}\mathbf{v}_{0}^{k},{\tau }_{k2}\mathbf{v}_{0}^{k},\ldots ,{\tau }_{k{L}_{k}}\mathbf{v}_{0}^{k}]}^{T}$, with $\mathbf{v}_{0}^{k}$ being a vector with 1/ω_k in the position(s) of seed(s) and zeros elsewhere, and ω_k being the number of seeds in the multiplex network k. The standard condition of normalization of the probability gives the constraint: $\mathop{\sum }\nolimits_{j = 1}^{{L}_{k}}{\tau }_{kj}=1$, ∀ k.
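The construction of p₀ from η, τ and the seeds can be sketched as follows. The function name `initial_distribution`, its argument layout and the toy values are assumptions made for illustration; the sketch also assumes that η_k is set to 0 for any multiplex network without seeds, so that p₀ sums to 1.

```python
import numpy as np

def initial_distribution(seeds, n_nodes, tau, eta):
    """Build p0 of equation (16).

    seeds[k]   : list of seed node indices in multiplex network k (may be empty)
    n_nodes[k] : number of nodes in each layer of multiplex network k
    tau[k]     : restart probabilities over the layers of multiplex network k
    eta[k]     : restart probability of multiplex network k
    """
    parts = []
    for k in range(len(n_nodes)):
        v0 = np.zeros(n_nodes[k])
        if seeds[k]:
            v0[seeds[k]] = 1.0 / len(seeds[k])   # 1/omega_k on each seed
        v_bar = np.concatenate([t * v0 for t in tau[k]])  # one copy per layer
        parts.append(eta[k] * v_bar)
    return np.concatenate(parts)

# Toy example: multiplex 0 has 3 nodes, 2 layers and two seeds;
# multiplex 1 is a monoplex network with 2 nodes and one seed.
p0 = initial_distribution(
    seeds=[[0, 2], [1]],
    n_nodes=[3, 2],
    tau=[[0.5, 0.5], [1.0]],
    eta=[0.5, 0.5],
)
```

Each seed of the first multiplex network receives η₁ · τ₁ⱼ / ω₁ = 0.5 · 0.5 / 2 in each layer, and the single seed of the second network receives η₂ = 0.5.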

Numerical implementation: MultiXrank

Our RWR on universal multilayer networks is implemented as a Python package called MultiXrank (Supplementary Note 2). MultiXrank has an optimized implementation. Default parameters allow a homogeneous exploration of the multilayer network (Supplementary Note 1.B). The running time of the package depends on the number of edges of the multilayer network (complexity analyses in Supplementary Note 2.A). The package is available on GitHub (https://github.com/anthbapt/multixrank) and on PyPI (https://pypi.org/project/MultiXrank), from which it can be installed with the standard pip installation command.

Evaluations

We evaluated the performances of MultiXrank using two different multilayer networks. The first one is a large biological multilayer network composed of two multiplex networks and one monoplex network. It contains a gene multiplex network gathering gene physical and functional relationships, a drug multiplex network containing drug clinical and chemical relationships, and a disease monoplex network representing disease phenotypic similarities. Each monoplex/multiplex network is connected to the others thanks to bipartite networks containing gene-disease, drug-gene, and drug-disease interactions (Supplementary Note 3.B). The second multilayer network is composed of three multiplex networks. It contains a French airports multiplex network, a British airports multiplex network, and a German airports multiplex network. In each multiplex network, the nodes represent the airports of each country and the edges represent the national flight connections between these airports for three different airline companies. The three multiplex networks are linked with bipartite networks corresponding to transnational flight connections (Supplementary Note 3.A).

We designed a Leave-One-Out Cross-Validation (LOOCV) protocol inspired by F. Mordelet and J.P. Vert²⁴ and A. Valdeolivas et al.¹⁵. In this protocol, we systematically leave out some known associations and assess the reconstruction of this left-out data using the data remaining in the network (Supplementary Note 4.A and Fig. S9). In the case of the biological multilayer network, we systematically left out known gene-disease associations. More specifically, for each disease associated with at least two genes, each gene is removed one by one and considered as the left-out gene. The remaining gene(s) associated with the same disease are used as seed(s). When the disease network is considered in the evaluation, the disease node is used as seed together with the gene node(s). The RWR algorithm is then applied, and all the network nodes are scored according to their proximity to the seed(s). The rank of the gene node that was left out in the ongoing run is recorded. The perfect ranking for the left-out gene is 1; the closer the rank is to 1, the better the prediction. The gene left-out process is repeated iteratively for all the genes. Finally, the Cumulative Distribution Function (CDF) of the ranks of the left-out genes is plotted (Fig. 3). The CDF displays the ratio of left-out genes that are ranked by the RWR within the top-K ranked gene nodes. The CDFs are used to evaluate and compare the performance of the RWR applied to different combinations of biological networks: the protein-protein interactions (PPI) network alone, the gene multiplex network, the multilayer network composed of the gene multiplex and the disease monoplex networks, and the multilayer network composed of the gene and drug multiplex networks and the disease monoplex network (Fig. 3a).
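The bookkeeping of this protocol (ranking the left-out node, then building the CDF of the ranks) can be sketched in plain Python. The function names, the gene labels and the scores are hypothetical, and excluding the seeds from the ranking is an assumption of this sketch rather than a detail stated above.

```python
def rank_of(left_out, scores, seeds):
    """Rank (1 = best) of the left-out node among non-seed nodes, by decreasing score."""
    candidates = [node for node in scores if node not in seeds]
    ordered = sorted(candidates, key=lambda node: scores[node], reverse=True)
    return ordered.index(left_out) + 1

def empirical_cdf(ranks, max_rank):
    """Fraction of left-out nodes ranked within the top K, for K = 1..max_rank."""
    return [sum(r <= k for r in ranks) / len(ranks) for k in range(1, max_rank + 1)]

# Hypothetical RWR output scores for one run, with g1 as seed and g3 left out.
scores = {"g1": 0.4, "g2": 0.3, "g3": 0.2, "g4": 0.1}
rank = rank_of("g3", scores, seeds={"g1"})

# Hypothetical ranks collected over four leave-one-out runs.
cdf = empirical_cdf([1, 2, 2, 5], max_rank=5)
```

In a full run, `rank_of` would be called once per left-out gene with the scores of the corresponding RWR, and the collected ranks would be passed to `empirical_cdf` to obtain the curve that is plotted.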

**Fig. 3: Evaluation and comparison of MultiXrank performances on different combinations of multilayer networks.**

We observed that considering multiple sources of network data is always better than considering the PPI alone. In addition, considering multilayer information is better than considering only the gene multiplex network. However, the increased performances in the LOOCV seem to arise only from combining the gene multiplex network with the disease monoplex network (and associated gene-disease bipartite network). Indeed, the addition of the drug multiplex network (and associated drug-gene and drug-disease bipartite networks) to the multilayer system does not increase the performances (Fig. 3a).

We repeated the same LOOCV protocol for the airports multilayer network, in which the left-out nodes are French airport nodes associated with a given British airport node. Here, the behavior is different, as adding the third multiplex network containing German airports connections (and associated French-German and British-German bipartite networks) increases the performances of the RWR to predict the associations between French and British airports (Fig. 3b).

To better understand these different behaviors, we examined in detail the number of common nodes (called overlaps) existing between the nodes of the different bipartite networks. We observed that only 23% of the genes from the gene-disease bipartite network are present in the drug-gene bipartite network. Similarly, only 5% of the diseases from the gene-disease bipartite network are present in the disease-drug bipartite network (Fig. S10). Given these low overlaps, the drug multiplex network might not contribute significantly to connecting gene and disease nodes during the random walks. This might explain why adding the drug multiplex network does not improve the performances of the LOOCV. In contrast, the bipartite networks of the airport multilayer network display high overlaps (Fig. S10). These high overlaps might explain why the addition of the third multiplex network in this case increases the predictive power (Fig. 3b).
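The overlap measure used here can be read as the fraction of nodes of one bipartite network that also appear in another. A minimal sketch, with a hypothetical function name and toy gene labels that do not come from the study:

```python
def overlap_fraction(reference_nodes, other_nodes):
    """Fraction of the reference nodes that are also present in the other node set."""
    reference = set(reference_nodes)
    return len(reference & set(other_nodes)) / len(reference)

# Toy gene sets: genes seen in a gene-disease and in a drug-gene bipartite network.
genes_in_gene_disease = {"G1", "G2", "G3", "G4"}
genes_in_drug_gene = {"G2", "G9"}
overlap = overlap_fraction(genes_in_gene_disease, genes_in_drug_gene)
```

The measure is not symmetric: the denominator is the size of the reference set, which matches the phrasing "genes from the gene-disease bipartite network that are present in the drug-gene bipartite network".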

To validate the proposed central role of bipartite networks in the RWR performances, we artificially increased the connectivity of the gene-drug and disease-drug bipartite networks before applying the same LOOCV protocol. To this end, we added artificial transit drug nodes linking existing gene-disease associations (strategy described in Supplementary Note 4.C and Fig. S12). We observed that these artificially added transit nodes drastically increased the performances of the LOOCV (Fig. 3c). The same phenomenon is observed for the airports multilayer network (Fig. 3d). In addition, we checked if random perturbations in these artificially enhanced bipartite networks would decrease the performances of the LOOCV. To do so, we progressively randomized the edges in the bipartite networks with artificially increased connectivity, until obtaining completely random bipartite networks. We observed that the progressive randomization of the bipartite networks continuously decreases the predictive power of the RWR up to obtaining the same performances as with only two multiplex networks (Fig. S13.A for the airport multilayer networks and S13.B for the biological multilayer networks).

Finally, we repeated all these evaluations using a standard Link Prediction (LP) protocol (Supplementary Note 4.B). LP has already been used to measure the predictive power of RWR methods²⁵. In the LP protocol, we systematically removed gene-disease edges from the gene-disease bipartite network, and predicted the rank of the removed gene using the disease as seed in the RWR. The LP protocol is applied to the airport multilayer network by removing a French-British edge from the French-British bipartite network, and predicting the rank of the French airport using the British airport node as a seed in the RWR. Overall, we observed behaviors similar to those obtained with the LOOCV (Fig. S11 and S14).
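The logic of the LP protocol can be sketched on a toy graph. The code below is a simplified illustration, not the MultiXrank implementation: it runs a plain RWR by power iteration on a single undirected adjacency matrix, and the names `rwr_scores` and `link_prediction_rank`, the restart probability of 0.7 and the toy graph are all assumptions made for the example.

```python
import numpy as np


def rwr_scores(adjacency, seed_index, restart=0.7, tol=1e-10, max_iter=1000):
    """Random walk with restart on a single graph, solved by power iteration."""
    adjacency = np.asarray(adjacency, dtype=float)
    column_sums = adjacency.sum(axis=0)
    column_sums[column_sums == 0] = 1.0  # avoid dividing by zero for isolated nodes
    transition = adjacency / column_sums  # column-normalized transition matrix
    restart_vector = np.zeros(adjacency.shape[0])
    restart_vector[seed_index] = 1.0
    scores = restart_vector.copy()
    for _ in range(max_iter):
        updated = (1 - restart) * transition @ scores + restart * restart_vector
        if np.abs(updated - scores).sum() < tol:
            break
        scores = updated
    return updated


def link_prediction_rank(adjacency, seed_index, target_index, candidates):
    """Remove the seed-target edge, rerun the RWR, and rank the target."""
    held_out = np.array(adjacency, dtype=float)
    held_out[seed_index, target_index] = 0.0
    held_out[target_index, seed_index] = 0.0
    scores = rwr_scores(held_out, seed_index)
    ordered = sorted(candidates, key=lambda node: -scores[node])
    return ordered.index(target_index) + 1  # rank 1 is the best


# Toy graph: nodes 0-2 are genes, nodes 3-4 are diseases (hypothetical data).
toy_edges = [(0, 1), (1, 2), (0, 3), (1, 3), (2, 4)]
toy_adjacency = np.zeros((5, 5))
for u, v in toy_edges:
    toy_adjacency[u, v] = toy_adjacency[v, u] = 1.0

# Hold out the gene 0 - disease 3 association and rank gene 0 among the genes.
rank = link_prediction_rank(toy_adjacency, seed_index=3, target_index=0, candidates=[0, 1, 2])
print("rank of the held-out gene:", rank)
```

Repeating this over all edges of a bipartite network and collecting the ranks gives the kind of distribution that the evaluation summarizes.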

Importantly, the LOOCV and LP protocols can be used to evaluate the pertinence of adding new multiplex networks in a multilayer network or new network layers in a multiplex network. Both evaluation protocols are available within the MultiXrank package.

Parameter space exploration

We next evaluated the stability of MultiXrank output scores upon variations of the input parameters. We illustrate this exploration of the parameter space with the biological multilayer network composed of the gene multiplex network and the disease monoplex network. We first compared the top-5 and top-100 gene and disease nodes prioritized by MultiXrank using 125 different sets of parameters (see Supplementary Note 5 for the definition of the sets of parameters). We observed that the top-ranked gene nodes vary more depending on the input parameters than the top-ranked disease nodes (Fig. 4a).
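As a rough illustration of comparing top-ranked nodes across parameter sets, the sketch below counts how often each node reaches the top k. It is only one simple way to summarize such a comparison and is not necessarily how Fig. 4a was produced; the helper names `top_k` and `top_k_frequency` and the toy scores are hypothetical.

```python
from collections import Counter


def top_k(scores, k):
    """Return the k highest-scoring node names (ties broken by name)."""
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [node for node, _ in ordered[:k]]


def top_k_frequency(score_tables, k):
    """Count, for each node, in how many parameter sets it reaches the top k."""
    counts = Counter()
    for scores in score_tables:
        counts.update(top_k(scores, k))
    return counts


# Toy RWR output scores for three parameter sets (hypothetical values).
score_tables = [
    {"g1": 0.40, "g2": 0.30, "g3": 0.20, "g4": 0.10},
    {"g1": 0.35, "g2": 0.15, "g3": 0.30, "g4": 0.20},
    {"g1": 0.10, "g2": 0.20, "g3": 0.30, "g4": 0.40},
]
frequency = top_k_frequency(score_tables, k=2)
print(frequency.most_common())
```

Nodes that appear in the top k for every parameter set indicate a stable prioritization, while a flat frequency profile indicates a ranking that depends strongly on the parameters.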

**Fig. 4: Exploration of multiXrank parameter space.**

To better understand the stability of the output scores upon variations of the input parameters, we propose a protocol based on 5 successive steps. The first two steps are (i) definition of the sets of parameters and (ii) construction of a matrix containing the similarities of the RWR output scores obtained with each set of input parameters, using the similarity measure defined in equation [17]. The similarities are computed for each type of node independently (i.e., for gene and disease nodes independently).

$$\Theta_{\gamma\sigma}^{k}=\sum_{j=1}^{n_{k}}\frac{\sqrt{\left(\frac{1}{\left[\left(\mathbf{r}_{\gamma}^{k}\right)_{j}-\left(\mathbf{r}_{\gamma\sigma}^{k}\right)_{j}\right]}\right)^{2}+\left(\frac{1}{\left[\left(\mathbf{r}_{\sigma}^{k}\right)_{j}-\left(\mathbf{r}_{\sigma\gamma}^{k}\right)_{j}\right]}\right)^{2}}}{\left(\frac{\left(\mathbf{r}_{\gamma}^{k}\right)_{j}+\left(\mathbf{r}_{\sigma}^{k}\right)_{j}}{2}\right)^{2}}$$

(17)

where γ and σ denote two sets of parameters and n_k is the number of nodes associated with the multiplex network k. In addition, $\mathbf{r}_{\gamma}^{k}$ (resp. $\mathbf{r}_{\sigma}^{k}$) is the rank distribution of the output scores, which associates with each node its rank given by the RWR with the set of parameters γ (resp. σ) for the multiplex network k. Finally, $\mathbf{r}_{\gamma\sigma}^{k}$ (resp. $\mathbf{r}_{\sigma\gamma}^{k}$) gives each node of the output scores distribution obtained with the set of parameters γ (resp. σ) in the multiplex network k its rank in the distribution σ (resp. γ).

We next computed a consensus Similarity matrix as a normalized Euclidean norm of the individual Similarity matrices (equation [18]).

$$\Theta_{\gamma\sigma}=\sqrt{\sum_{k=1}^{N}\frac{\left(\Theta_{\gamma\sigma}^{k}\right)^{2}}{n_{k}}}$$

(18)

where N is the number of multiplex networks.
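Equation (18) can be written compactly with NumPy. The sketch below assumes that the individual Similarity matrices have already been computed and stacked into one array; the function name `consensus_similarity` and the toy values are hypothetical.

```python
import numpy as np


def consensus_similarity(per_network_similarity, node_counts):
    """Combine the per-network matrices Theta^k into Theta following Eq. (18)."""
    stacked = np.asarray(per_network_similarity, dtype=float)  # shape (N, P, P)
    counts = np.asarray(node_counts, dtype=float)              # shape (N,)
    if stacked.shape[0] != counts.shape[0]:
        raise ValueError("one node count is needed per multiplex network")
    # Square each entry, divide by n_k, sum over the N networks, take the root.
    return np.sqrt((stacked ** 2 / counts[:, None, None]).sum(axis=0))


# Toy example with N = 2 networks and P = 2 parameter sets (hypothetical values).
theta_gene = [[0.0, 6.0], [6.0, 0.0]]
theta_disease = [[0.0, 8.0], [8.0, 0.0]]
consensus = consensus_similarity([theta_gene, theta_disease], node_counts=[4, 4])
print(consensus)
```

In this toy case the off-diagonal entry is sqrt(36/4 + 64/4) = 5, which can be checked by hand against equation (18).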

The next step is (iii) projection of the consensus Similarity matrix into a Principal Component Analysis (PCA) space (Fig. 4b). In this PCA space, each dot represents the output scores resulting from a set of parameters. Then, (iv) clustering (using k-means on the first two principal components) identifies sub-regions containing similar RWR output scores. Finally, (v) the top-ranked nodes obtained with the sets of parameters belonging to each cluster are compared (Fig. 4c, Supplementary Note 5).
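Steps (iii) and (iv) can be sketched with NumPy alone. This is a minimal illustration under assumptions: each row of the consensus matrix is treated as the feature vector of one parameter set, PCA is computed by singular value decomposition, and k-means uses a deterministic farthest-point initialization chosen here for reproducibility. None of these choices is confirmed by the article, and the function names are hypothetical.

```python
import numpy as np


def pca_project(matrix, n_components=2):
    """Project the rows of `matrix` onto their first principal components."""
    centered = matrix - matrix.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def farthest_point_init(points, k):
    """Pick k initial centers: the first point, then the farthest remaining ones."""
    centers = [points[0]]
    for _ in range(1, k):
        diffs = points[:, None, :] - np.array(centers)[None, :, :]
        nearest = np.linalg.norm(diffs, axis=2).min(axis=1)
        centers.append(points[nearest.argmax()])
    return np.array(centers)


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    centers = farthest_point_init(points, k)
    for _ in range(n_iter):
        diffs = points[:, None, :] - centers[None, :, :]
        labels = np.linalg.norm(diffs, axis=2).argmin(axis=1)
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Toy consensus matrix for 6 parameter sets forming two groups (hypothetical values).
toy_consensus = np.array([
    [0, 1, 1, 10, 10, 10],
    [1, 0, 1, 10, 10, 10],
    [1, 1, 0, 10, 10, 10],
    [10, 10, 10, 0, 1, 1],
    [10, 10, 10, 1, 0, 1],
    [10, 10, 10, 1, 1, 0],
], dtype=float)

projected = pca_project(toy_consensus, n_components=2)
labels = kmeans(projected, k=2)
print(labels)
```

Parameter sets that receive the same label fall in the same sub-region of the PCA space, and their top-ranked nodes can then be compared as in step (v).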

We applied this protocol to evaluate the output scores obtained by MultiXrank on the previously defined biological multilayer network composed of the gene multiplex network and the disease monoplex network, using 125 different combinations of parameters (Fig. 4, Supplementary Fig. S16). We projected the consensus Similarity matrix into a PCA space and identified 8 clusters (Fig. 4b). To illustrate the behavior inside clusters, we concentrated our analyses on the two clusters defined in the bottom left subspace (clusters number 4 and 6, zoom-in Fig. 4b). The top-100 ranked gene and disease nodes inside each of the two clusters are overall similar (Fig. 4c). This means that, even if the node prioritization can be sensitive to input parameters, we can identify regions of stability in the parameter space. Moreover, the protocol allows identifying the monoplex/multiplex networks that generate the most variability in the output scores upon changes in the input parameters.

We applied the parameter space exploration protocol to other multilayer networks and observed diverse behaviors, from highly variable top-rankings and scattered projections in the PCA space for the airport multilayer network (Supplementary Fig. S15) to robust top-rankings with well-clustered projections in the PCA space for the biological multilayer network composed of 3 types of nodes (genes, diseases and drugs, Supplementary Fig. S16). Overall, our parameter space study reveals different sensitivities to input parameters depending on the multilayer network explored. The protocol is available within the MultiXrank package and can be used to characterize in-depth the sensitivity to input parameters of any multilayer network.

Discussion

Multilayer networks are nowadays very popular, in particular because they allow capturing a larger part of real and engineered systems. In biology, multilayer networks integrating multiscale sources of heterogeneous interactions provide a more comprehensive picture of biological system functionalities. However, data representation as multilayer networks must be accompanied by the development of tools allowing their exploration. Many efforts are therefore dedicated to extending classical network theory algorithms to multilayer systems⁵,²⁶. These algorithms include, for instance, clustering algorithms²⁷, Graph Convolutional Networks²⁸,²⁹ or meta-path based methods³,³⁰. Other important network exploration algorithms, such as diffusion kernels or methods based on random walk, are based on the principle of network propagation²⁶. The methods based on random walk, such as PageRank, biased random walk or Random Walk with Restart (RWR), are widely used in network science. They are indeed versatile: the random walk output scores can be used directly for node prioritization and subnetwork extraction, but can also be used as input for downstream analyses, for instance for supervised classification or node embedding²².

Different random walk methods have been adapted to consider multilayer networks. However, a large variety of multilayer networks exist, from multiplex to temporal networks, for instance. To the best of our knowledge, network exploration algorithms that have been adapted to handle multilayer networks can usually be applied only to specific categories of multilayer networks, such as multiplex networks composed of the same set of nodes.

We present here MultiXrank, a tool that proposes an optimized and general formalism for RWR on universal multilayer networks. MultiXrank can be applied to explore multilayer networks composed of any combination of multiplex, monoplex or bipartite networks, and all the network edges can be directed and/or weighted. To the best of our knowledge, any type of multilayer network could be represented with our formalism, even if it might sometimes require some adaptations. We illustrated the use of MultiXrank with RWR on biological and airport multilayer networks and thereby provide guidelines for users. Even if one’s initial intuition in data analysis could be that “more data is better”, the addition of interaction network layers also brings additional degrees of freedom⁵. To evaluate the pertinence of the addition of multiplex networks or the addition of layers in a multilayer system, MultiXrank includes a systematic evaluation protocol based on Leave-One-Out Cross-Validation and Link Prediction. Overall, our results show that adding network data does not always increase the predictive power of the RWR, as already suggested by previous studies¹¹. Our evaluation protocol can be used, for the first time to our knowledge, to evaluate in depth the signal-to-noise ratio of multilayer system combinations. Finally, we complemented MultiXrank with a parameter space exploration protocol to measure the influence of varying the input parameters on the global stability of the output scores. Note that this parameter space exploration protocol is universal and can be used to study any complex system exploration approach providing scores as outputs.

The output scores of MultiXrank can be used in a wide variety of downstream analyses. For instance, shallow embedding methods need similarity measures for the optimization of the loss function²²,³¹. MultiXrank can produce such a similarity measure respecting the global topology of the multilayer network. An interesting application could be to use MultiXrank output scores for embedding and evaluate the predictive power on the gene-disease association prediction task. Indeed, the embedding is expected to be more robust to noise than the direct network space³².

The MultiXrank package can be applied to any kind of multilayer network such as social, economic, or ecological multilayer networks. MultiXrank is optimized and can handle multilayer networks containing up to millions of edges. To address billion-scale network problems, several strategies could be considered, such as the Block Elimination Approach for RWR (BEAR), which can be exact or approximate³³, or the Best of Preprocessing and Iterative approaches (BEPI), which is an approximate approach³⁴.

Data availability

All the data and the code used in the article are available on an OSF repository: https://osf.io/zsmua (DOI 10.17605/OSF.IO/ZSMUA). This repository includes all the results obtained in the article.

Code availability

The package is available on GitHub (https://github.com/anthbapt/multixrank), can be installed from PyPI with the standard pip installation command (https://pypi.org/project/MultiXrank), and is associated with complete documentation (https://multixrank-doc.readthedocs.io/en/latest).

References

1. Bianconi, G. Multilayer Networks: Structure and Function. (Oxford University Press, Oxford, 2018).
2. Duran-Frigola, M. et al. Extending the small-molecule similarity principle to all levels of biology with the chemical checker. Nat. Biotechnol. 38, 1087–1096 (2020).
3. Himmelstein, D. S. et al. Systematic integration of biomedical knowledge prioritizes drugs for repurposing. eLife 6, e26726 (2017).
4. De Domenico, M. et al. Mathematical formulation of multilayer networks. Phys. Rev. X 3, 041022 (2013).
5. Kivelä, M. et al. Multilayer networks. J. Complex Netw. 2, 203–271 (2014).
6. Lee, B., Zhang, S., Poleksic, A. & Xie, L. Heterogeneous multi-layered network model for omics data integration and analysis. Front. Genet. 10, 1381 (2020).
7. Holme, P. & Saramäki, J. Temporal networks. Phys. Rep. 519, 97–125 (2012).
8. Battiston, F., Nicosia, V. & Latora, V. Structural measures for multiplex networks. Phys. Rev. E 89, 032804 (2014).
9. Mucha, P. J., Richardson, T., Macon, K., Porter, M. A. & Onnela, J.-P. Community structure in time-dependent, multiscale, and multiplex networks. Science 328, 876–878 (2010).
10. Didier, G., Brun, C., Baudot, A. & Gomez, S. Identifying communities from multiplex biological networks. PeerJ 3, e1525 (2015).
11. Choobdar, S. et al. Assessment of network module identification across complex diseases. Nat. Methods 16, 843–852 (2019).
12. Li, Y. & Patra, J. C. Genome-wide inferring gene-phenotype relationship by walking on the heterogeneous network. Bioinformatics 26, 1219–1224 (2010).
13. De Domenico, M., Solé-Ribalta, A., Gómez, S. & Arenas, A. Navigability of interconnected networks under random failures. Proc. Natl Acad. Sci. 111, 8351–8356 (2014).
14. Cho, H., Berger, B. & Peng, J. Compact integration of multi-network topology for functional analysis of genes. Cell Syst. 3, 540–548.e5 (2016).
15. Valdeolivas, A. et al. Random walk with restart on multiplex and heterogeneous biological networks. Bioinformatics 35, 497–505 (2018).
16. Lovász, L. Random walks on graphs: a survey. Combinatorics, Paul Erdős is Eighty 2, 1–46 (1993).
17. Brin, S. & Page, L. The anatomy of a large-scale hypertextual web search engine. Computer Netw. ISDN Syst. 30, 107–117 (1998). Proceedings of the Seventh International World Wide Web Conference.
18. Langville, A. N. & Meyer, C. D. Google’s PageRank and Beyond: The Science of Search Engine Rankings. (Princeton University Press, USA, 2006).
19. Pan, J.-Y., Yang, H.-J., Faloutsos, C. & Duygulu, P. Automatic multimedia cross-modal correlation discovery. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’04, 653–658 (Association for Computing Machinery, New York, NY, USA, 2004). https://doi.org/10.1145/1014052.1014135.
20. Gómez, S. et al. Diffusion dynamics on multiplex networks. Phys. Rev. Lett. 110, 028701 (2013).
21. Köhler, S., Bauer, S., Horn, D. & Robinson, P. N. Walking the interactome for prioritization of candidate disease genes. Am. J. Hum. Genet. 82, 949–958 (2008).
22. Pio-Lopez, L., Valdeolivas, A., Tichit, L., Remy, E. & Baudot, A. Multiverse: a multiplex and multiplex-heterogeneous network embedding approach. Sci. Rep. 11, 8794 (2021).
23. Meyer, C. D. Matrix Analysis and Applied Linear Algebra. (Society for Industrial and Applied Mathematics, USA, 2000).
24. Mordelet, F. & Vert, J.-P. Prodige: prioritization of disease genes with multitask machine learning from positive and unlabeled examples. BMC Bioinforma. 12, 389 (2011).
25. Zhou, M., Zheng, C. & Xu, R. Combining phenome-driven drug-target interaction prediction with patients’ electronic health records-based clinical corroboration toward drug discovery. Bioinformatics 36, i436–i444 (2020).
26. Boccaletti, S. et al. The structure and dynamics of multilayer networks. Phys. Rep. 544, 1–122 (2014).
27. Huang, X., Chen, D., Ren, T. & Wang, D. A survey of community detection methods in multilayer networks. Data Min. Knowl. Discov. 35, 1–45 (2021).
28. Ghorbani, M., Baghshah, M. S. & Rabiee, H. R. Mgcn: Semi-supervised classification in multi-layer graphs with graph convolutional networks. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM ’19, 208–211 (Association for Computing Machinery, New York, NY, USA, 2019). https://doi.org/10.1145/3341161.3342942.
29. Shanthamallu, U. S., Thiagarajan, J. J., Song, H. & Spanias, A. Gramme: Semisupervised learning using multilayered graph attention models. IEEE Trans. Neural Netw. Learn. Syst. 31, 3977–3988 (2020).
30. Zhang, X., Zou, Q., Rodríguez-Patón, A. & Zeng, X. Meta-path methods for prioritizing candidate disease mirnas. IEEE/ACM Trans. Computational Biol. Bioinforma. 16, 283–291 (2019).
31. Hamilton, W. L., Ying, R. & Leskovec, J. Representation learning on graphs: Methods and applications (v3). https://arxiv.org/abs/1709.05584 (2018).
32. Nelson, W. et al. To embed or not: Network embedding as a paradigm in computational biology. Front. Genet. 10, 381–381 (2019).
33. Shin, K., Jung, J., Lee, S. & Kang, U. Bear: Block elimination approach for random walk with restart on large graphs. In Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, SIGMOD ’15, 1571–1585 (Association for Computing Machinery, New York, NY, USA, 2015). https://doi.org/10.1145/2723372.2723716.
34. Jung, J., Park, N., Lee, S. & Kang, U. Bepi: Fast and memory-efficient method for billion-scale random walk with restart. In Proceedings of the 2017 ACM International Conference on Management of Data, SIGMOD ’17, 789–804 (Association for Computing Machinery, New York, NY, USA, 2017). https://doi.org/10.1145/3035918.3035950.

Acknowledgements

The project leading to this preprint has received funding from the « Investissements d’Avenir » French Government program managed by the French National Research Agency (ANR-16-CONV-0001), from Excellence Initiative of Aix-Marseille University - A*MIDEX and from the Inserm Cross-Cutting Project GOLD.

Author information

Authors and Affiliations

Aix-Marseille Univ, INSERM, MMG, Turing Center for Living Systems, CNRS, Marseille, France
Anthony Baptista & Anaïs Baudot
Aix-Marseille Univ, INSERM, TAGC, Turing Center for Living Systems, Marseille, France
Anthony Baptista & Aitor Gonzalez
Barcelona Supercomputing Center, Barcelona, Spain
Anaïs Baudot

Contributions

A.Bap. and A.Bau. designed research; A.Bap. performed research; A.Bap. analyzed data; A.Bap. and A.G. contributed to packaged code; A.Bap. and A.Bau. wrote the paper.

Corresponding authors

Correspondence to Anthony Baptista or Anaïs Baudot.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Communications Physics thanks Albert Solé-Ribalta, Joao Gama Oliveira and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Baptista, A., Gonzalez, A. & Baudot, A. Universal multilayer network exploration by random walk with restart. Commun Phys 5, 170 (2022). https://doi.org/10.1038/s42005-022-00937-9

Received: 13 September 2021
Accepted: 08 June 2022
Published: 01 July 2022
DOI: https://doi.org/10.1038/s42005-022-00937-9
