Biological networks are powerful resources for the discovery of genes and genetic modules that drive disease. Fundamental to network analysis is the concept that genes underlying the same phenotype tend to interact; this principle can be used to combine and to amplify signals from individual genes. Recently, numerous bioinformatic techniques have been proposed for genetic analysis using networks, based on random walks, information diffusion and electrical resistance. These approaches have been applied successfully to identify disease genes, genetic modules and drug targets. In fact, all these approaches are variations of a unifying mathematical machinery — network propagation — suggesting that it is a powerful data transformation method of broad utility in genetic research.
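As a minimal sketch of the idea, the following implements one common formulation of network propagation, a random walk with restart, on a toy undirected network. The function name `propagate`, the restart probability `alpha`, the gene labels and the four-node path graph are all illustrative assumptions, not taken from any of the published methods. A single seed gene carries the initial signal, and the steady-state scores show how that signal spreads to network neighbours and decays with distance.

```python
def propagate(adjacency, prior, alpha=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected network (illustrative sketch).

    adjacency: dict mapping each node to the set of its neighbours (symmetric).
    prior:     dict mapping nodes to non-negative initial scores (seed genes).
    alpha:     restart probability (hypothetical default, not a recommended value).
    Returns a dict of steady-state scores that sum to 1.
    """
    nodes = sorted(adjacency)
    total = sum(prior.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("prior must assign positive score to at least one node")
    # Normalise the prior so that it is a probability distribution.
    p0 = {n: prior.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart step: a fraction alpha of the mass returns to the seeds.
        nxt = {n: alpha * p0[n] for n in nodes}
        # Walk step: the remaining mass is split evenly among neighbours.
        for u in nodes:
            degree = len(adjacency[u])
            if degree == 0:
                nxt[u] += (1 - alpha) * p[u]  # isolated node keeps its mass
                continue
            share = (1 - alpha) * p[u] / degree
            for v in adjacency[u]:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Toy path network A - B - C - D with a single seed gene "A" (made-up labels).
toy_network = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C"},
}
scores = propagate(toy_network, {"A": 1.0})
for gene in sorted(scores, key=scores.get, reverse=True):
    print(gene, round(scores[gene], 4))
```

In this toy case the seed keeps the highest score and each further step along the path receives less, which is the amplification-by-proximity behaviour the abstract describes. Real analyses would use a weighted interaction network and a sparse linear-algebra solver rather than this explicit loop.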