Abstract
Molecular networks represent the backbone of molecular activity within the cell. Recent studies have taken a comparative approach toward interpreting these networks, contrasting networks of different species and molecular types, and under varying conditions. In this review, we survey the field of comparative biological network analysis and describe its applications to elucidate cellular machinery and to predict protein function and interaction. We highlight open problems in the field and propose some initial mathematical formulations for addressing them. Many of the methodological and conceptual advances that were important for sequence comparison will likely also be important at the network level, including improved search algorithms, techniques for multiple alignment, evolutionary models for similarity scoring and better integration with public databases.
Acknowledgements
R.S. is supported by an Alon Fellowship; T.I., by the David and Lucille Packard Foundation. This work was also supported by the National Center for Research Resources (RR018627) and the National Science Foundation (NSF 0425926).
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Cite this article
Sharan, R., Ideker, T. Modeling cellular machinery through biological network comparison. Nat Biotechnol 24, 427–433 (2006). https://doi.org/10.1038/nbt1196
DOI: https://doi.org/10.1038/nbt1196