Who's your neighbor? New computational approaches for functional genomics

Abstract

Several recently developed computational approaches in comparative genomics go beyond sequence comparison. By analyzing phylogenetic profiles of protein families, domain fusions, gene adjacency in genomes, and expression patterns, these methods predict many functional interactions between proteins and help deduce specific functions for numerous proteins. Although some of the resultant predictions may not be highly specific, these developments herald a new era in genomics in which the benefits of comparative analysis of the rapidly growing collection of complete genomes will become increasingly obvious.
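
As a rough illustration of the phylogenetic-profile idea named above, the sketch below represents each protein family as a presence/absence vector across genomes and flags families with matching or near-matching profiles as candidates for a functional link. The family names, the profile values, and the `max_distance` threshold are hypothetical and chosen only for this example; this is a minimal sketch, not the procedure used by any of the methods reviewed in the article.

```python
from itertools import combinations

# Hypothetical presence (1) / absence (0) of four protein families
# across five genomes; names and values are invented for illustration.
PROFILES = {
    "famA": (1, 0, 1, 1, 0),
    "famB": (1, 0, 1, 1, 0),
    "famC": (0, 1, 1, 0, 1),
    "famD": (1, 0, 1, 0, 0),
}


def hamming(p, q):
    """Count the genomes in which two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))


def linked_pairs(profiles, max_distance=0):
    """Return family pairs whose profiles differ in at most max_distance genomes."""
    return sorted(
        (a, b)
        for a, b in combinations(sorted(profiles), 2)
        if hamming(profiles[a], profiles[b]) <= max_distance
    )


if __name__ == "__main__":
    # Identical profiles only
    print(linked_pairs(PROFILES))
    # Allow one mismatching genome
    print(linked_pairs(PROFILES, max_distance=1))
```

Requiring identical profiles is the strictest setting; allowing a small number of mismatches admits more candidate pairs, and such looser matches would be expected to yield less specific predictions.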

Figure 1: Context-based approaches in comparative genomics.
Figure 2: Phylogenetic patterns, domain fusion, and gene clustering help predict functional pathways.

Acknowledgements

We thank L. Aravind, Arcady Mushegian, and Yuri Wolf for numerous helpful discussions of the issues considered in this article, and Martijn Huynen and Peer Bork for sending us preprints of their publications.

Author information

Correspondence to Eugene V. Koonin.

About this article

Cite this article

Galperin, M., Koonin, E. Who's your neighbor? New computational approaches for functional genomics. Nat Biotechnol 18, 609–613 (2000). https://doi.org/10.1038/76443
