Abstract
Networks are ubiquitous in biology where they encode connectivity patterns at all scales of organization, from molecular to the biome. However, biological networks are noisy due to the limitations of measurement technology and inherent natural variation, which can hamper discovery of network patterns and dynamics. We propose Network Enhancement (NE), a method for improving the signaltonoise ratio of undirected, weighted networks. NE uses a doubly stochastic matrix operator that induces sparsity and provides a closedform solution that increases spectral eigengap of the input network. As a result, NE removes weak edges, enhances real connections, and leads to better downstream performance. Experiments show that NE improves gene–function prediction by denoising tissuespecific interaction networks, alleviates interpretation of noisy HiC contact maps from the human genome, and boosts finegrained identification accuracy of species. Our results indicate that NE is widely applicable for denoising biological networks.
Introduction
Networks provide an elegant abstraction for expressing finegrained connectivity and dynamics of interactions in complex biological systems^{1}. In this representation, the nodes indicate the components of the system. These nodes are often connected by nonnegative, (weighted)edges, which indicate the similarity between two components. For example, in protein–protein interaction (PPI) networks, weighted edges capture the strength of physical interactions between proteins and can be leveraged to detect functional modules^{2}. However, accurate experimental quantification of interaction strength is challenging^{3,4}. Technical and biological noise can lead to superficially strong edges, implying spurious interactions; conversely, dubiously weak edges can hide real, biologically important connections^{4,5,6}. Furthermore, corruption of experimentally derived networks by noise can alter the entire structure of the network by modifying the strength of edges within and amongst underlying biological pathways. These modifications adversely impact the performance of downstream analysis^{7}. The challenge of noisy interaction measurements is not unique to PPI networks and plagues many different types of biological networks, such as HiC^{8} and cell–cell interaction networks^{9}.
To overcome this challenge, computational approaches have been proposed for denoising networks. These methods operate by replacing the original edge weights with weights obtained based on a diffusion defined on the network^{10,11}. However, these methods are often not tested on different types of networks^{11}, rely on heuristics without providing explanations for why these approaches work, and lack mathematical understanding of the properties of the denoised networks^{10,11}. Consequently, these methods may not be effective on new applications derived from emerging experimental biotechnology.
Here, we introduce network enhancement (NE), a diffusionbased algorithm for network denoising that does not require supervision or prior knowledge. NE takes as input a noisy, undirected, weighted network, and outputs a network on the same set of nodes but with a new set of edge weights (Fig. 1). The main crux of NE is the observation that nodes connected through paths with highweight edges are more likely to have a direct, highweight edge between them^{12,13}. Following this intuition, we define a diffusion process that uses random walks of length three or less and a form of regularized information flow to denoise the input network (Fig. 1a and Methods). Intuitively, this diffusion generates a network in which nodes with strong similarity/interactions are connected by highweight edges while nodes with weak similarity/interactions are connected by lowweight edges (Fig. 1b). Mathematically, this means that eigenvectors associated with the input network are preserved while the eigengap is increased. In particular, NE denoises the input by downweighting small eigenvalues more aggressively than large eigenvalues. This reweighting is advantageous when the noise is spread in the eigendirections corresponding to small eigenvalues^{14}. Furthermore, the increased eigengap of the enhanced network is a highly appealing property as it leads to accurate detection of modules/clusters^{15,16} and allows for higherorder network analysis^{12}. Moreover, NE has an efficient and easy to implement closedform solution for the diffusion process, and provides mathematical guarantees for this converged solution. (Fig. 1b and Methods).
Results
Methods for network denoising
We have applied NE to three challenging yet important problems in network biology. In each experiment, we evaluate the network denoised by NE against the same network denoised by alternative methods: network deconvolution (ND)^{10} and diffusion state distance (DSD)^{11}. For completeness, we also compare our results to a network reconstructed from features learned by Mashup (MU)^{17}. All three of these methods use a diffusion process as a fundamental step in their algorithms and have a closedform solution at convergence. ND solves an inverse diffusion process to remove the transitive edges, and DSD uses a diffusionbased distance to transform the network. While ND and DSD are denoising algorithms, MU is a feature learning algorithm that learns lowdimensional representations for nodes based on their steadystate topological positions in the network. This representation can be used as input to any subsequent prediction model. In particular, a denoised network can be constructed by computing a similarity measure using MU’s output features^{17}.
NE improves human tissue networks for gene–function prediction
Networks play a critical role in capturing molecular aspects of precision medicine, particularly those related to gene–function and functional implications of gene mutation^{18,19}. We test the utility of our denoising algorithm in improving gene interaction networks from 22 human tissues assembled by Greene et al.^{20}. These networks capture gene interactions that are specific to human tissues and cell lineages ranging from B lymphocyte to skeletal muscle and the whole brain^{20,21}. We predict the cellular functions of genes specialized in different tissues based on the networks obtained from different denoising algorithms.
Given a tissue and the associated tissuespecific gene interaction network, we first denoise the network and then use a networkbased algorithm on the denoised edge weights to predict gene functions in that tissue. We use standard weighted random walks with restarts to propagate gene–function associations from training nodes to the rest of the network^{22}. We define a weighted random walk starting from nodes representing known genes associated with a given function. At each time step, the walk moves from the current node to a neighboring node selected with a probability that depends on the edge weights and has a small probability of returning to the initial nodes^{22}. The algorithm scores each gene according to its visitation probability by the random walk. Node scores returned by the algorithm are then used to predict gene–function associations for genes in the test set. Predictions are evaluated against experimentally validated gene–function associations using a leaveoneout crossvalidation strategy.
When averaged over the four denoising algorithms and the 22 human tissues, the gene–function prediction improved by 12.0% after denoising. Furthermore, we observed that all denoising algorithms improved the average prediction performance (Fig. 2a and Supplementary Note 1). These findings motivate the use of denoised networks over original (raw) biological networks for downstream predictive analytics. We further observed that gene–function prediction performed consistently better in combination with networks revised by NE than in combination with networks revised by other algorithms. On average, NE outperformed networks reconstructed by ND, DSD, and MU by 12.3%. In particular, NE resulted in an average of a 5.1% performance gain over the second bestperforming denoised network (constructed by MU). Following Greene et al.^{20}, we further validated our NE approach by examining each enhanced tissue network in turn and evaluating how well relevant tissuespecific gene functions are connected in the network. The expectation is that functionassociated genes tend to interact more frequently in tissues in which the function is active than in other nonrelevant tissues^{20}. As a result, relevant functions are expected to be more tightly connected in the tissue network than functions specific to other tissues. For each NEenhanced tissue network, we ranked all functions by the edge density of functionassociated tissue subnetworks and examined topranked functions. In the NEenhanced blood plasma network, we found that functions with the highest edge density were blood coagulation, fibrin clot formation, and negative regulation of verylowdensity lipoprotein particle remodeling, all these functions are specific to blood plasma tissue (Fig. 2b). This finding suggests that tissue subnetworks associated with relevant functions tend to be more connected in the tissue network than subnetworks of nontissuespecific functions. The most connected functions in the NEenhanced brain network were brain morphogenesis and forebrain regionalization, which are both specific to brain tissue (Fig. 2b). Examining edge densitybased rankings of gene functions across 22 tissue networks, we found relevant functions consistently placed at or near the top of the rankings, further indicating that NE can improve the signaltonoise ratio of tissue networks.
NE improves HiC networks for domain identification
The recent discovery of numerous cisregulatory elements away from their target genes emphasizes the deep impact of 3D structure of DNA on cell regulation and reproduction^{23,24,25}. Chromosome conformation capture (3C)based technologies^{25} provide experimental approaches for understanding the chromatin interactions within DNA. HiC is a 3Cbased technology that allows measurement of pairwise chromatin interaction frequencies within a cell population^{8,25}. The HiC reads are grouped into bins based on the genetic region they map to. The bin size determines the measurement resolution.
HiC read data can be thought of as a network where genomic regions are nodes and the normalized count of reads mapping to two regions are the weighted edges. Network community detection algorithms can be used on this HiC derived network to identify clusters of regions that are close in 3D genomic structure^{26}. The detected megabasescale communities correspond to regions known as topologicalassociating domains (TADs) and represent chromatin interaction neighborhoods^{26}. TADs tend to be enriched for regulatory features^{27,28} and are hypothesized to specify elementary regulatory microenvironments. Therefore, detection of these domains can be important for analysis and interpretation of HiC data. The limited number of HiC reads, hierarchical structure of TADs and other technological challenges lead to noisy HiC networks, and hamper accurate detection of TADs^{25}.
To investigate the ability of NE in improving TAD detection, we apply NE to a HiC dataset and analyze the performance of a standard domain identification pipeline with and without a network denoising step. For this experiment, we used 1 and 5 kb resolution HiC data from all autosomes of the GM12878 cell line^{8}. Since true goldstandards for TAD regions are lacking, a synthetic dataset was created by stitching together nonoverlapping clusters detected in the original work^{8}. As a result, the clusters stitched together can be used as a good proxy for the true clusters (more details in Supplementary Note 1). Figure 3a shows a heatmap of the raw HiC data for a portion of chromosome 16.
We applied two, offtheshelf, community detection methods (Louvian^{29} and MSCD^{30}) to each HiC network and compared the quality of the detected TADs with or without network denoising. Visual inspection of the HiC contact matrix before and after the HiC network is denoised using NE reveals an enhancement of edges within each community and sharper boundaries between communities (Fig. 3a). This improvement is particularly clear for the 5 kb resolution data, where communities that were visually undetectable in the raw data become clear after denoising with NE. To quantify this enhancement, the communities obtained from raw networks and networks enhanced by NE or other denoising methods were compared to the true cluster assignments. We used normalized mutual information (NMI, Supplementary Note 2) as a measure of shared information between the detected communities and the true clusters. NMI ranges between 0 to 1, where a higher value indicates higher concordance and 1 indicates an exact match between the detected communities and the true clusters. The results across 22 autosomes indicate that while denoising can improve the detection of communities, not all denoising algorithms succeed in this task (Fig. 3b). For both resolutions considered, NE performs the best with an average NMI of 0.92 for 1 kb resolution and 0.94 for 5 kb resolution, MU (the second bestperforming method) achieves an average NMI of 0.85 and 0.84, respectively, while ND and DSD achieve lower average NMI than the raw data which has NMI of 0.81 and 0.67, respectively. Furthermore, we note that the performance of NE and MU remains high as the resolution decreases from 1 to 5 kb, in contrast the ability of the other pipelines in detecting the correct communities diminishes. While MU maintains a good average performance at 5 kb resolution, the standard deviation of NMI values after denoising with MU increases from 0.037 in 1 kb data to 0.054 in 5 kb data due to relatively poor performances on a few chromosomes. On the other hand, the NMI values for data denoised with NE maintain a similar spread at both resolutions (standard deviation 0.033 and 0.031, respectively). The better average NMI and smaller spread indicates that NE can reliably enhance the network and improve TAD detection.
NE improves finegrained species identification
Finegrained species identification from images concerns querying objects within the same subordinate category. Traditional image retrieval works on highlevel categories (e.g., finding all butterflies instead of cats in a database given a query of a butterfly), while finegrained image retrieval aims to distinguish categories with subtle differences (e.g., monarch butterfly versus peacock butterfly). One major obstacle in finegrained species identification is the high similarity between subordinate categories. On one hand, two subordinate categories share similar shapes and carry subtle color difference in a small region; on the other hand, two subordinate categories of close colors can only be well separated by texture. Furthermore, viewpoint, scale variation, and occlusions among objects all contribute to the difficulties in this task^{31}. Due to these challenges, similarity networks, which represent pairwise affinity between images, can be very noisy and ineffective in retrieval of a sample from the correct species for any query.
We test our method on the Leeds butterfly finegrained species image dataset^{32}. Leeds Butterfly dataset contains 832 butterflies in 10 different classes with each class containing between 55 and 100 images^{32}. We use two different common encoding methods (Fisher Vector (FV) and Vector of Linearly Aggregated Descriptors (VLAD) with dense SIFT; Supplementary Note 1) to generate two different vectorizations of each image. These two encoding methods describe the content of the images differently and therefore capture different information about the images. Each method can generate a similarity network in which nodes represent images and edge weights indicate similarity between pairs of images. The inner product of these two similarity networks is used as a single input network to a network denoising algorithm.
Visual inspection indicates that NE is able to greatly improve the overall similarity network for finegrain identification (Fig. 4a). While both encodings partially separate the species, before applying NE, all the images are tangled together without a clear clustering. On the other hand, the resulting similarity network after applying NE clearly shows 10 clusters corresponding to different butterfly species (Fig. 4a). More specifically, given a query, the original input networks fail to capture the true affinities between the query butterfly and its most similar retrievals, while NE is able to correct the affinities and more reliably output the correct retrievals (Fig. 4b).
To quantify the improvements due to NE in the task of species identification, we use identification accuracy, a standard metric which quantifies the average numbers of correct retrievals given any query of interests (Supplementary Note 2). A detailed comparison between NE and other alternatives by examining identification accuracy of the final network with respect to different number of top retrievals demonstrates NE’s ability in improving the original noisy networks (Fig. 4b). For example, when considering top 40 retrievals, NE can improve the raw network by 18.6% (more than 10% better than other alternatives). Further, NE generates the most significant improvement in performance (41% over the raw network and more than 25% over the second best alternative), when examining the top 80 retrieved images.
Current denoising methods suffer from high sensitivity to the hyperparameters when constructing the input similarity networks, e.g., the variance used in Gaussian kernel (Supplementary Note 1). However, our model is more robust to the choice of hyperparameters (Supplementary Fig. 3). This robustness is due to the strict structure enforced by the preservation of symmetry and DSM structure during the diffusion process (see Supplementary Note 3).
Discussion
We proposed NE as a general method to denoiseweighted undirected networks. NE implements a dynamic diffusion process that uses both local and global network structures to construct a denoised network from its noisy version. The core of our approach is a symmetric, positive semidefinite, doubly stochastic matrix, which is a theoretically justified replacement for the commonly used rownormalized transition matrix^{33}. We showed that NE’s diffusion model preserves the eigenvectors and increases the eigengap of this matrix for large eigenvalues. This finding provides insight into the mechanism of NE’s diffusion and explains its ability to improve network quality^{15,16}. In addition to increasing the eigengap, NE disproportionately trims small eigenvalues. This property can be contrasted with the principal component analysis (PCA) where the eigenspectrum is truncated at a particular threshold. Through extensive experimentation, we show that NE can flexibly fit into important network analytic pipelines in biology and that its theoretical properties enable substantial improvements in the performance of downstream network analyses.
We see many opportunities to improve upon the foundational concept of NE in future work. First, in some cases, a small subset of high confidence nodes may be available. For example, genomic regions in the HiC contact maps can be augmented using data obtained from 3C technology or a small number of species can be identified by a domain expert and used together with network data as input to a denoising methodology. Extending NE to take advantage of the small amount of accurately labeled data might further extend our ability to denoise networks. Second, although we showed the utility of NE for denoising several types of weighted networks, there are other network types worth exploring, such as multimodal networks involving multiomic measurements of cancer patients. Finally, incorporating NE’s diffusion process into other network analytic pipelines can potentially improve performance. For example, MU^{17} learns vector representations for nodes based on a steady state of a traditional random walk with restart, and replacing MU’s diffusion process with the rescaled steady state of NE might be a promising future direction.
Methods
Problem definition and doubly stochastic matrix property
Let G = (E, V, W) be a weighted network where V denotes the set of nodes in the network (with \(\left V \right\) = n), E represents the edges of G, and W contains the weights on the edges. The goal of NE is to generate a network G^{*} = (E^{*}, V, W^{*}) that provides a better representation of the underlying module membership than the original network G. For the analysis below, we let W represent a symmetric, nonnegative matrix.
Diffusionbased models often rely on the rownormalized transition probability matrix P = D^{−1}W, where D is a diagonal matrix whose entries are D_{i,i} = \(\mathop {\sum}\nolimits_{{{j}} = 1}^{{n}} {\kern 1pt} W_{{{i,j}}}\). However, transition probability matrix P defined in this way is generally asymmetric and does not induce a directly usable nodenode similarity metric. Additionally, most diffusionbased models lack spectral analysis of the denoised model. To construct our diffusion process and provide a theoretical analysis of our model, we propose to use a symmetric, doubly stochastic matrix (DSM). Given a matrix \(M \in {\Bbb R}^{n \times n}\), M is said to be DSM if:

1.
\(M_{{\rm{i,j}}} \ge 0\quad i,j \in \{ 1,2, \ldots ,n\}\),

2.
\(\mathop {\sum}\nolimits_{{i}} {\kern 1pt} M_{{{i,j}}} = \mathop {\sum}\nolimits_{{j}} {\kern 1pt} M_{{{i,j}}} = 1.\)
The second condition above is equivalent to 1 = (1, 1, …, 1)^{T} and 1^{T} being a right and left eigenvector of M with eigenvalue 1. In fact, 1 is the greatest eigenvalue for all DSM matrices (see the remark following the definition of DSM in the Supplementary Notes). Overall, the DSM property imposes a strict constraint on the scale of the node similarities and provides a scalefree matrix that is wellsuited for subsequent analyses.
Network enhancement
Given a matrix of edge weights W representing the pairwise weights between all the nodes, we construct another localized network \({\cal T} \in {\Bbb R}^{n \times n}\) on the same set of nodes to capture local structures of the input network. Denote the set of Knearest neighbors (KNN) of the ith node (including the node i) as \({\cal N}_i\). We use these nearest neighbors to measure local affinity. Then the corresponding localized network \({\cal T}\) can be constructed from the original weighted network using the following two steps:
where \({\Bbb I}_{\{ \cdot \} }\) is the indicator function. We can verify that \({\cal T}\) is a symmetric DSM by directly checking the conditions of the definition (Supplementary Note 3). \({\cal T}\) encodes the local structures of the original network with the intuition that local neighbors (highly similar pairs of nodes) are more reliable than remote ones, and local structures can be propagated to nonlocal nodes through a diffusion process on the network. Motivated by the updates introduced in Zhou et al.^{34}, we define our diffusion process using \({\cal T}\) as follows:
where α is a regularization parameter and t represents the iteration step. The value of W_{0} can be initialized to be the input matrix W. Equation (2) shows that diffusion process in NE is defined by random walks of length three or less and a form of regularized information flow. There are three main reasons for restricting the influence of random walks to at most thirdorder neighbors in the network: (1) for most nodes thirdorder neighborhood spans the extent of almost the entire biological network, making neighborhoods of order beyond three not very informative of individual nodes^{35,36}, (2) currently there is little information about the extent of influence of a node (i.e., a biological entity, such as gene) on the activity (e.g., expression level) of its neighbor that is more than three hops away^{37}, and (3) recent studies have empirically demonstrated that network features extracted based on threehop neighborhoods contain the most useful information for predictive modeling^{38,39}.
To further explore Eq. (2) we can write the update rule for each entry:
It can be seen from Eq. (3) that the updated network comes from similarity/interaction flow only through the neighbors of each data point. The parameter α adds strengths to selfsimilarities, i.e., a node is always most similar to itself. One key property that differentiates our method from typical diffusion methods is that in the proposed diffusion process defined in Eq. (2), for each iteration t, W_{t} remains a symmetric DSM. Furthermore, W_{t} converges to a nontrivial equilibrium network which is a symmetric DSM as well (Supplementary Note 3). Therefore, NE constructs an undirected network that preserves the symmetry and DSM property of the original network. Through extensive experimentation we show that NE improves the similarity between related nodes and the performance of downstream methods such as community detection algorithms.
The main theoretical insight into the operation of NE is that the proposed diffusion process does not change eigenvectors of the initial DSM while mapping eigenvalues via a nonlinear function (Supplementary Note 3). Let eigenpair (λ_{0}, v_{0}) denote the eigenpair of the initial symmetric DSM, \({\cal T}_0\). Then, the diffusion process defined in Eq. (2) does not change the eigenvectors, and the final converged graph has eigenpair (f_{α}(λ_{0}), v_{0}), where f_{α}(x) = \(\frac{{(1  \alpha )x}}{{1  \alpha x^2}}\). This property shows that, the diffusion process using a symmetric, DSM is a nonlinear operator on the spectrum of the eigenvalues of the original network. This results has a number of consequences. Practically, it provides us with a closedform expression for the converged network. Theoretically, it hints at how this diffusion process effects the eigenspectrum and improves the network for subsequent analyses. (1) If the original eigenvalue is either 0 or 1, the diffusion process preserves this eigenvalue. This implies that, like other diffusion processes, NE does not connect disconnected components. (2) NE increases the gap between large eigenvalues of the original network and reduces the gap between small eigenvalues of this matrix. Larger eigengap is associated with better network community detection and higherorder network analysis^{12,15,16}. (3) The diffusion process always decreases the eigenvalues, which follows from: \((1  \alpha )\lambda _0{\mathrm{/}}\left( {1  \alpha \lambda _0^2} \right)\) ≤ λ_{0}, where smaller eigenvalues get reduced at a higher rate. This observation can be interpreted in relation to PCA where the eigenspectrum below a user determined threshold value is ignored. PCA has many attractive theoretical properties, especially for dimensionality reduction. In fact, MU^{17}, a feature learning method whose output is also a denoised version of the original network, can be fit by computing the PCA decomposition on the stationary state of the network. MU aims to learn a lowdimensional representation of nodes in the network which makes PCA a natural choice. However, a smoothedout version of the PCA is more attractive for network denoising because denoising is typically used as a preprocessing step for downstream prediction tasks, and thus robustness to selection of a threshold value for the eigenspectrum is desirable.
These findings shed light on why the proposed algorithm (NE) enhances the robustness of the diffused network compared to the input network (Supplementary Note 3). In some contexts, we may need the output to remain a network of the same scale as the input network. This requirement can be satisfied by first recording the degree matrix of the input network and eventually mapping the denoised output of the algorithm back to the original scale by a symmetric matrix multiplication. We summarize our NE algorithm along with this optional degreemapping step in Supplementary Note 3.
Code availability
The project website can be found at: http://snap.stanford.edu/ne. Source code of the NE method is available for download from the project website.
Data availability
All relevant data are public and available from the authors of the original publications. The project website can be found at: http://snap.stanford.edu/ne. The website contains preprocessed data used in the paper together with raw and enhanced networks.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
 1.
Gao, J., Barzel, B. & Barabási, A.L. Universal resilience patterns in complex networks. Nature 530, 307–312 (2016).
 2.
Zhong, Q. et al. An interspecies protein–protein interaction network across vast evolutionary distance. Mol. Syst. Biol. 12, 865 (2016).
 3.
Rolland, T. et al. A proteomescale map of the human interactome network. Cell 159, 1212–1226 (2014).
 4.
Costanzo, M. et al. A global genetic interaction network maps a wiring diagram of cellular function. Science 353, aaf1420 (2016).
 5.
Ji, J., Zhang, A., Liu, C., Quan, X. & Liu, Z. Survey: functional module detection from proteinprotein interaction networks. IEEE Trans. Knowl. Data Eng. 26, 261–277 (2014).
 6.
Wang, B. et al. Similarity network fusion for aggregating data types on a genomic scale. Nat. Methods 11, 333–337 (2014).
 7.
Menche, J. et al. Uncovering diseasedisease relationships through the incomplete interactome. Science 347, 1257601 (2015).
 8.
Rao, S. S. et al. A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 1665–1680 (2014).
 9.
Stegle, O., Teichmann, S. A. & Marioni, J. C. Computational and analytical challenges in singlecell transcriptomics. Nat. Rev. Genet. 16, 133 (2015).
 10.
Feizi, S., Marbach, D., Médard, M. & Kellis, M. Network deconvolution as a general method to distinguish direct dependencies in networks. Nat. Biotechnol. 31, 726–733 (2013).
 11.
Cao, M. et al. Going the distance for protein function prediction: a new distance metric for protein interaction networks. PLoS ONE 8, e76339 (2013).
 12.
Benson, A. R., Gleich, D. F. & Leskovec, J. Higherorder organization of complex networks. Science 353, 163–166 (2016).
 13.
Wang, B., Zhu, J., Pierson, E., Ramazzotti, D. & Batzoglou, S. Visualization and analysis of singlecell rnaseq data by kernelbased similarity learning. Nat. Methods 14, 414–416 (2017).
 14.
Rosipal, R. & Trejo, L. J. Kernel partial least squares regression in reproducing kernel hilbert space. J. Mach. Learn. Res. 2, 97–123 (2001).
 15.
Spielman, D. A. Spectral graph theory and its applications. In 48th Annual IEEE Symposium on Foundations of Computer Science 29–38 (IEEE, Providence, RI, USA, 2007).
 16.
Verma, D. & Meila, M. Comparison of spectral clustering methods. Adv. Neural Inf. Process. Syst. 15, 38 (2003).
 17.
Cho, H., Berger, B. & Peng, J. Compact integration of multinetwork topology for functional analysis of genes. Cell Syst. 3, 540–548 (2016).
 18.
Radivojac, P. et al. A largescale evaluation of computational protein function prediction. Nat. Methods 10, 221–227 (2013).
 19.
Zitnik, M. & Zupan, B. Matrix factorizationbased data fusion for gene function prediction in baker’s yeast and slime mold. Pac. Symp. Biocomput. 19, 400–411 (2014).
 20.
Greene, C. S. et al. Understanding multicellular function and disease with human tissuespecific networks. Nat. Genet. 47, 569–576 (2015).
 21.
Zitnik, M. & Leskovec, J. Predicting multicellular function through multilayer tissue networks. Bioinformatics 33, 190–198 (2017).
 22.
Köhler, S., Bauer, S., Horn, D. & Robinson, P. N. Walking the interactome for prioritization of candidate disease genes. Am. J. Human. Genet. 82, 949–958 (2008).
 23.
Bickmore, W. A. & van Steensel, B. Genome architecture: domain organization of interphase chromosomes. Cell 152, 1270–1284 (2013).
 24.
De Laat, W. & Duboule, D. Topology of mammalian developmental enhancers and their regulatory landscapes. Nature 502, 499–506 (2013).
 25.
Schmitt, A. D., Hu, M. & Ren, B. Genomewide mapping and analysis of chromosome architecture. Nat. Rev. Mol. Cell Biol. 17, 743 (2016).
 26.
Cabreros, I., Abbe, E. & Tsirigos, A. Detecting community structures in HiC genomic data. In Annual Conference on Information Science and Systems 584–589 (IEEE, NJ, USA, 2016).
 27.
Dixon, J. R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376–380 (2012).
 28.
Nora, E. P. et al. Spatial partitioning of the regulatory landscape of the Xinactivation centre. Nature 485, 381–385 (2012).
 29.
Blondel, V. D., Guillaume, J.L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. J. Stat. Mech.Theory Exp. 10, 10008 (2008).
 30.
Le Martelot, E. & Hankin, C. Fast multiscale community detection based on local criteria within a multithreaded algorithm. Preprint at https://arxiv.org/abs/1301.0955 (2013).
 31.
Gavves, E., Fernando, B., Snoek, C. G., Smeulders, A. W., and Tuytelaars, T. Finegrained categorization by alignments. In 2013 IEEE International Conference on Computer Vision 1713–1720 (IEEE Computer Society, Washington, DC, 2013).
 32.
Wang, J., Markert, K. & Everingham, M. Learning models for object recognition from natural language descriptions. In Proc. British Machine Vision Conference 1–11 (British Machine Vision Association, London, 2009).
 33.
Wang, B., Jiang, J., Wang, W., Zhou, Z.H. & Tu, Z. Unsupervised metric fusion by cross diffusion. In 2012 IEEE Conference on Computer Vision and Pattern Recognition 2997–3004 (IEEE, Rhode Island, USA, 2012).
 34.
Zhou, D., Bousquet, O., Lal, T. N., Weston, J. & Schölkopf, B. Learning with local and global consistency. In Advances in Neural Information Processing Systems. Proc. of the First 12 Conferences (eds Jordan, M. I., LeCun, Y. & Solla, S. A.) 321328 (Max Planck Institute for Biological Cybernetics, Tuebingen, Germany, 2001).
 35.
Pržulj, N. Biological network comparison using graphlet degree distribution. Bioinformatics 23, e177–e183 (2007).
 36.
Davis, D., Yaveroğlu, Ö. N., MalodDognin, N., Stojmirovic, A. & Pržulj, N. Topologyfunction conservation in protein–protein interaction networks. Bioinformatics 31, 1632–1639 (2015).
 37.
Cowen, L., Ideker, T., Raphael, B. J. & Sharan, R. Network propagation: a universal amplifier of genetic associations. Nat. Rev. Genet. 18, 551 (2017).
 38.
Goldenberg, A., Mostafavi, S., Quon, G., Boutros, P. C. & Morris, Q. D. Unsupervised detection of genes of influence in lung cancer using biological networks. Bioinformatics 27, 3166–3172 (2011).
 39.
Mostafavi, S., Goldenberg, A., Morris, Q. & Ravasi, T. Labeling nodes using three degrees of propagation. PLoS ONE 7, e51947 (2012).
Acknowledgements
M.Z. and J.L. were supported by NSF, NIH BD2K, DARPA SIMPLEX, Stanford Data Science Initiative, and Chan Zuckerberg Biohub. J.Z. was supported by the Stanford Graduate Fellowship, NSF DMS 1712800 Grant and the Stanford Discovery Innovation Fund. A.P. and C.D.B. were supported by National Institutes of Health/National Human Genome Research Institute T32 HG000044, Chan Zuckerberg Initiative and Grant Number U01FD004979 from the FDA, which supports the UCSFStanford Center of Excellence in Regulatory Sciences and Innovation.
Author information
Author notes
 Serafim Batzoglou
Present address: Illumina Inc, 499 Illinois Street, San Francisco, 94158, CA, USA
These authors contributed equally: Bo Wang, Armin Pourshafeie, Marinka Zitnik.
Affiliations
Department of Computer Science, Stanford University, 353 Serra Mall, Stanford, 94305, CA, USA
 Bo Wang
 , Marinka Zitnik
 , Serafim Batzoglou
 & Jure Leskovec
Department of Physics, Stanford University, 382 Via Pueblo Mall, Stanford, 94305, CA, USA
 Armin Pourshafeie
Department of Electrical Engineering, Stanford University, 350 Serra Mall, Stanford, 94305, CA, USA
 Junjie Zhu
Department of Biomedical Data Science, Stanford University, 1265 Welch Road, Stanford, 94305, CA, USA
 Carlos D. Bustamante
Chan Zuckerberg Biohub, 499 Illinois St, San Francisco, 94158, CA, USA
 Carlos D. Bustamante
 & Jure Leskovec
Authors
Search for Bo Wang in:
Search for Armin Pourshafeie in:
Search for Marinka Zitnik in:
Search for Junjie Zhu in:
Search for Carlos D. Bustamante in:
Search for Serafim Batzoglou in:
Search for Jure Leskovec in:
Contributions
B.W., A.P., M.Z., and J.Z. designed and performed research, contributed new analytic tools, analyzed data, and wrote the manuscript. C.D.B., S.B., and J.L. designed and supervised the research and contributed to the manuscript.
Competing interests
The authors declare no competing interests.
Corresponding authors
Correspondence to Serafim Batzoglou or Jure Leskovec.
Electronic supplementary material
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.