  • Analysis
Functional genome annotation through phylogenomic mapping

Abstract

Accurate determination of functional interactions among proteins at the genome level remains a challenge for genomic research. Here we introduce phylogenomic mapping, a genome-scale approach to functional protein annotation that requires only sequence data, can be applied equally well to finished and unfinished genomes, and can be extended beyond single genomes to annotate multiple genomes simultaneously. We developed the approach and applied it to more than 200 sequenced bacterial genomes. Proteins with similar evolutionary histories were grouped together, placed on a three-dimensional map and visualized as a topographical landscape. The resulting phylogenomic maps display thousands of proteins clustered in mountains on the basis of coinheritance, a strong indicator of shared function. In addition to systematic computational validation, we have experimentally confirmed the ability of phylogenomic maps to predict both mutant phenotype and gene function in the delta proteobacterium Myxococcus xanthus.
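The grouping step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the similarity matrix they actually used is described in Supplementary Note 1, which is not reproduced here. The sketch assumes each protein is summarized by a hypothetical binary presence/absence profile across reference genomes, uses Pearson correlation as the coinheritance measure, and links proteins by single linkage above an arbitrary threshold. The names `pearson`, `group_by_coinheritance`, `protA` to `protD` and the threshold values are all illustrative assumptions.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no coinheritance signal
    return cov / sqrt(var_x * var_y)


def group_by_coinheritance(profiles, threshold=0.9):
    """Single-linkage grouping of proteins whose profiles correlate >= threshold.

    `profiles` maps a protein name to its profile across genomes.
    This is an assumed stand-in for the paper's clustering step.
    """
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name):
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(profiles[a], profiles[b]) >= threshold:
                parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical profiles: 1 = homolog detected in that genome, 0 = absent.
profiles = {
    "protA": [1, 1, 0, 0, 1, 0],
    "protB": [1, 1, 0, 0, 1, 0],
    "protC": [0, 0, 1, 1, 0, 1],
    "protD": [0, 0, 1, 1, 0, 0],
}
groups = group_by_coinheritance(profiles, threshold=0.9)
```

In this toy input, protA and protB share an identical profile and fall into one group, while protC and protD differ in one genome and join only when the threshold is relaxed. A real analysis would more plausibly use graded similarity scores (for example from BLAST searches) and a layout algorithm to place the groups on a map; both are outside the scope of this sketch.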

Figure 1: Phylogenomic mapping.
Figure 2: The phylogenomic map of M. xanthus.
Figure 3: Experimental validation of phylogenomic map predictions in M. xanthus.

Acknowledgements

We thank Harley McAdams, Lucy Shapiro, William Nierman and Dale Kaiser for helpful discussions. We thank the Monsanto Corporation and The Institute for Genomic Research for providing access to the genome sequence of M. xanthus DK1622. This work was supported in part by National Science Foundation (NSF) Grant MCB-0444154 to A.G.G. B.S.S. was supported by a Department of Defense National Defense Science and Engineering Graduate Fellowship through the Army Research Office. Sequencing of M. xanthus DK1622 was accomplished with support from the NSF.

Author information

Corresponding author

Correspondence to Roy D Welch.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Table 1

Global computational validation of phylogenomic mapping (PDF 153 kb)

Supplementary Note 1

Similarity matrix generation (PDF 91 kb)

Supplementary Note 2

Motility assays and plasmid insertion (PDF 54 kb)

Supplementary Note 3

Gene ontology analysis (PDF 61 kb)

About this article

Cite this article

Srinivasan, B., Caberoy, N., Suen, G. et al. Functional genome annotation through phylogenomic mapping. Nat Biotechnol 23, 691–698 (2005). https://doi.org/10.1038/nbt1098

  • DOI: https://doi.org/10.1038/nbt1098
