Letter | Published: 10 February 2005

A universal trend of amino acid gain and loss in protein evolution

Nature volume 433, pages 633–638 (10 February 2005)

  • A Corrigendum to this article was published on 26 May 2005

Abstract

Amino acid composition of proteins varies substantially between taxa and, thus, can evolve. For example, proteins from organisms with (G + C)-rich (or (A + T)-rich) genomes contain more (or fewer) amino acids encoded by (G + C)-rich codons [1–4]. However, no universal trends in ongoing changes of amino acid frequencies have been reported. We compared sets of orthologous proteins encoded by triplets of closely related genomes from 15 taxa representing all three domains of life (Bacteria, Archaea and Eukaryota), and used phylogenies to polarize amino acid substitutions. Cys, Met, His, Ser and Phe accrue in at least 14 taxa, whereas Pro, Ala, Glu and Gly are consistently lost. The same nine amino acids are currently accrued or lost in human proteins, as shown by analysis of non-synonymous single-nucleotide polymorphisms. All amino acids with declining frequencies are thought to be among the first incorporated into the genetic code; conversely, all amino acids with increasing frequencies, except Ser, were probably recruited late [5–7]. Thus, expansion of initially under-represented amino acids, which began over 3,400 million years ago [8,9], apparently continues to this day.
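The idea of polarizing substitutions with a phylogeny can be illustrated with a minimal sketch. It assumes a simple parsimony rule on one pre-aligned triplet (two sister sequences plus an outgroup): where the sisters differ and the outgroup matches one of them, the matching residue is taken as ancestral and the other as derived. The function names, the toy sequences and the rule itself are illustrative assumptions, not the authors' pipeline, which also corrects for possible multiple substitutions at a site (see Supplementary Methods).

```python
from collections import Counter


def polarize_triplet(sister_a, sister_b, outgroup):
    """Count amino acid gains and losses from one aligned triplet.

    Hypothetical helper: assumes the three sequences are already aligned
    (equal length, '-' for gaps) and applies plain parsimony.
    """
    if not (len(sister_a) == len(sister_b) == len(outgroup)):
        raise ValueError("aligned sequences must have equal length")

    gains, losses = Counter(), Counter()
    for a, b, o in zip(sister_a, sister_b, outgroup):
        if "-" in (a, b, o) or a == b:
            continue  # skip gapped sites and sites where the sisters agree
        if o == a:
            ancestral, derived = a, b  # substitution on the lineage of sister_b
        elif o == b:
            ancestral, derived = b, a  # substitution on the lineage of sister_a
        else:
            continue  # all three differ: direction cannot be inferred
        losses[ancestral] += 1
        gains[derived] += 1
    return gains, losses


def net_change(gains, losses):
    """Net gain (positive) or loss (negative) per amino acid."""
    return {aa: gains[aa] - losses[aa] for aa in set(gains) | set(losses)}


# Toy sequences, invented for illustration only.
gains, losses = polarize_triplet("MPAEGK-", "MSACGR-", "MPGEGQA")
for aa, delta in sorted(net_change(gains, losses).items()):
    print(aa, delta)
```

Summing such counts over many orthologous triplets within a taxon gives a per-residue tally of gains against losses, which is the kind of quantity the abstract compares across the 15 taxa.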


References

  1. Correlation between base composition of deoxyribonucleic acid and amino acid composition of protein. Proc. Natl Acad. Sci. USA 47, 1141–1149 (1961)

  2. Directional mutational pressure affects the amino acid composition and hydrophobicity of proteins in bacteria. Genetica 103, 383–391 (1998)

  3. A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes. Genome Biol. 4, Research0010.1 (2001)

  4. Mutational bias affects protein evolution in flowering plants. Mol. Biol. Evol. 21, 90–96 (2004)

  5. The triplet code from first principles. J. Biomol. Struct. Dyn. 22, 1–11 (2004)

  6. Which organic compounds could have occurred on the prebiotic earth? Cold Spring Harb. Symp. Quant. Biol. 52, 17–27 (1987)

  7. Amino acids in meteorites. Adv. Space Res. 3, 5–18 (1983)

  8. Evolution of amino acid frequencies in proteins over deep time: Inferred order of introduction of amino acids into the genetic code. Mol. Biol. Evol. 19, 1645–1655 (2002)

  9. Increased frequency of cysteine, tyrosine, and phenylalanine residues since the last universal ancestor. Mol. Cell. Proteom. 1, 125–131 (2002)

  10. Modeling amino acid replacement. J. Comp. Biol. 7, 761–776 (2000)

  11. A novel use of equilibrium frequencies in models of sequence evolution. Mol. Biol. Evol. 19, 1821–1831 (2002)

  12. A transition probability model for amino acid substitutions from blocks. J. Comp. Biol. 10, 997–1010 (2003)

  13. Amino acid substitution matrices. Adv. Prot. Chem. 54, 73–97 (2000)

  14. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70 (1979)

  15. Progressive alignment of amino acid sequences and construction of phylogenetic trees from them. Methods Enzymol. 266, 368–382 (1996)

  16. Problems with parsimony in sequences of biased base composition. J. Mol. Evol. 47, 686–690 (1998)

  17. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001)

  18. Photosynthetic microbial mats in the 3,416-Myr-old ocean. Nature 431, 549–552 (2004)

  19. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428, 493–535 (2004)

  20. et al. Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science 298, 149–159 (2002)

  21. Synonymous and nonsynonymous substitution rates in diatoms: A comparison between chloroplast and nuclear genes. J. Mol. Evol. 48, 209–212 (1999)

  22. Evolution in bacteria—evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26, 74–86 (1987)

  23. Sequence evolution in bacterial endosymbionts having extreme base compositions. Mol. Biol. Evol. 16, 1586–1598 (1999)

  24. Adaptive protein evolution in Drosophila. Nature 415, 1022–1024 (2002)

  25. An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem. Genet. 4, 579–593 (1970)

  26. Dobzhansky-Muller incompatibilities in protein evolution. Proc. Natl Acad. Sci. USA 99, 14878–14883 (2002)

  27. et al. Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science 302, 1960–1963 (2003)

  28. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423, 233–234 (2003)

  29. A genomic perspective on protein families. Science 278, 631–637 (1997)

  30. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994)

Acknowledgements

S.S. and I.A.A. were supported by the Genome Canada Foundation.

Author information

Affiliations

  1. National Center for Biotechnology Information, NIH, Bethesda, Maryland 20894, USA

    • I. King Jordan
    • Yuri I. Wolf
    • Eugene V. Koonin
    • Alexey S. Kondrashov
  2. Section of Evolution and Ecology, University of California at Davis, Davis, California 95616, USA

    • Fyodor A. Kondrashov
  3. Division of Genetics, Department of Medicine, Brigham & Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA

    • Ivan A. Adzhubei
    • Shamil Sunyaev

Authors

  1. I. King Jordan

  2. Fyodor A. Kondrashov

  3. Ivan A. Adzhubei

  4. Yuri I. Wolf

  5. Eugene V. Koonin

  6. Alexey S. Kondrashov

  7. Shamil Sunyaev

Competing interests

The authors declare that they have no competing financial interests.

Corresponding authors

Correspondence to Alexey S. Kondrashov or Shamil Sunyaev.

Supplementary information

Word documents

  1. Supplementary Tables

    Contains Supplementary Tables 1–3. Supplementary Table 1 shows the changes of frequencies of all amino acids in the 15 taxa. Supplementary Table 2 shows the long-term rates of amino acid gain and loss. Supplementary Table 3 shows the order of the recruitment of amino acids into the genetic code.

  2. Supplementary Methods

    This section describes the method used to correct for possible multiple substitutions at aligned amino acid sites.

About this article

DOI

https://doi.org/10.1038/nature03306
