A universal trend of amino acid gain and loss in protein evolution

A Corrigendum to this article was published on 26 May 2005


Amino acid composition of proteins varies substantially between taxa and, thus, can evolve. For example, proteins from organisms with (G + C)-rich (or (A + T)-rich) genomes contain more (or fewer) amino acids encoded by (G + C)-rich codons [1–4]. However, no universal trends in ongoing changes of amino acid frequencies have been reported. We compared sets of orthologous proteins encoded by triplets of closely related genomes from 15 taxa representing all three domains of life (Bacteria, Archaea and Eukaryota), and used phylogenies to polarize amino acid substitutions. Cys, Met, His, Ser and Phe accrue in at least 14 taxa, whereas Pro, Ala, Glu and Gly are consistently lost. The same nine amino acids are currently accrued or lost in human proteins, as shown by analysis of non-synonymous single-nucleotide polymorphisms. All amino acids with declining frequencies are thought to be among the first incorporated into the genetic code; conversely, all amino acids with increasing frequencies, except Ser, were probably recruited late [5–7]. Thus, expansion of initially under-represented amino acids, which began over 3,400 million years ago [8,9], apparently continues to this day.
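The polarization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes simple parsimony on a pre-aligned triplet (two sister sequences plus an outgroup): a substitution is assigned to a sister lineage only when that sister differs from the outgroup while the other sister matches it. The function names and toy sequences are hypothetical and not taken from the paper, and the sketch omits the correction for multiple substitutions described in the Supplementary Methods.

```python
from collections import Counter


def polarize_triplet(sister_a, sister_b, outgroup):
    """Count amino acid gains and losses on the two sister lineages.

    All three arguments are aligned protein sequences of equal length,
    with '-' marking a gap. A site is used only when exactly one sister
    differs from the outgroup; the outgroup residue is then taken as
    ancestral (lost) and the differing sister's residue as derived (gained).
    """
    if not (len(sister_a) == len(sister_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    gains, losses = Counter(), Counter()
    for a, b, o in zip(sister_a, sister_b, outgroup):
        if "-" in (a, b, o):
            continue  # skip gapped columns
        if a == b:
            continue  # sisters agree: no change can be placed on either lineage
        if a == o:
            derived = b  # change occurred on the lineage leading to sister_b
        elif b == o:
            derived = a  # change occurred on the lineage leading to sister_a
        else:
            continue  # all three residues differ: direction is ambiguous
        losses[o] += 1
        gains[derived] += 1
    return gains, losses


def net_change(gains, losses):
    """Return gains minus losses for every amino acid seen in either count."""
    return {aa: gains[aa] - losses[aa] for aa in set(gains) | set(losses)}


# Toy alignment (made-up sequences, for illustration only).
gains, losses = polarize_triplet("MPAC", "MSAC", "MPAG")
print(sorted(net_change(gains, losses).items()))  # [('P', -1), ('S', 1)]
```

In this toy case only the second column is informative: the outgroup and one sister carry Pro while the other sister carries Ser, so one Pro loss and one Ser gain are recorded. The fourth column is skipped because both sisters agree with each other and the change cannot be placed on either sister lineage.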



References

  1. Sueoka, N. Correlation between base composition of deoxyribonucleic acid and amino acid composition of protein. Proc. Natl Acad. Sci. USA 47, 1141–1149 (1961)

  2. Gu, X., Hewett-Emmett, D. & Li, W.-H. Directional mutational pressure affects the amino acid composition and hydrophobicity of proteins in bacteria. Genetica 103, 383–391 (1998)

  3. Knight, R. D., Freeland, S. J. & Landweber, L. F. A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes. Genome Biol. 4, Research0010.1 (2001)

  4. Wang, H.-C., Singer, G. A. C. & Hickey, D. A. Mutational bias affects protein evolution in flowering plants. Mol. Biol. Evol. 21, 90–96 (2004)

  5. Trifonov, E. N. The triplet code from first principles. J. Biomol. Struct. Dyn. 22, 1–11 (2004)

  6. Miller, S. L. Which organic compounds could have occurred on the prebiotic earth? Cold Spring Harb. Symp. Quant. Biol. 52, 17–27 (1987)

  7. Cronin, J. R. & Pizzarello, S. Amino acids in meteorites. Adv. Space Res. 3, 5–18 (1983)

  8. Brooks, D. J., Fresco, J. R., Lesk, A. M. & Singh, M. Evolution of amino acid frequencies in proteins over deep time: Inferred order of introduction of amino acids into the genetic code. Mol. Biol. Evol. 19, 1645–1655 (2002)

  9. Brooks, D. J. & Fresco, J. R. Increased frequency of cysteine, tyrosine, and phenylalanine residues since the last universal ancestor. Mol. Cell. Proteom. 1, 125–131 (2002)

  10. Muller, T. & Vingron, M. Modeling amino acid replacement. J. Comp. Biol. 7, 761–776 (2000)

  11. Goldman, N. & Whelan, S. A novel use of equilibrium frequencies in models of sequence evolution. Mol. Biol. Evol. 19, 1821–1831 (2002)

  12. Veerassamy, S., Smith, A. & Tillier, E. R. M. A transition probability model for amino acid substitutions from blocks. J. Comp. Biol. 10, 997–1010 (2003)

  13. Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices. Adv. Prot. Chem. 54, 73–97 (2000)

  14. Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70 (1979)

  15. Feng, D. F. & Doolittle, R. F. Progressive alignment of amino acid sequences and construction of phylogenetic trees from them. Methods Enzymol. 266, 368–382 (1996)

  16. Eyre-Walker, A. Problems with parsimony in sequences of biased base composition. J. Mol. Evol. 47, 686–690 (1998)

  17. Sunyaev, S. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001)

  18. Tice, M. M. & Lowe, D. R. Photosynthetic microbial mats in the 3,416-Myr-old ocean. Nature 431, 549–552 (2004)

  19. Rat Genome Sequencing Consortium. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428, 493–535 (2004)

  20. Zdobnov, E. M. et al. Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science 298, 149–159 (2002)

  21. Sorhannus, U. & Fox, M. Synonymous and nonsynonymous substitution rates in diatoms: A comparison between chloroplast and nuclear genes. J. Mol. Evol. 48, 209–212 (1999)

  22. Ochman, H. & Wilson, A. C. Evolution in bacteria: evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26, 74–86 (1987)

  23. Clark, M. A., Moran, N. A. & Baumann, P. Sequence evolution in bacterial endosymbionts having extreme base compositions. Mol. Biol. Evol. 16, 1586–1598 (1999)

  24. Smith, N. G. C. & Eyre-Walker, A. Adaptive protein evolution in Drosophila. Nature 415, 1022–1024 (2002)

  25. Fitch, W. M. & Markowitz, E. An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem. Genet. 4, 579–593 (1970)

  26. Kondrashov, A. S., Sunyaev, S. & Kondrashov, F. A. Dobzhansky-Muller incompatibilities in protein evolution. Proc. Natl Acad. Sci. USA 99, 14878–14883 (2002)

  27. Clark, A. G. et al. Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science 302, 1960–1963 (2003)

  28. Kellis, M., Patterson, N., Endrizzi, M., Birren, B. & Lander, E. S. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423, 233–234 (2003)

  29. Tatusov, R. L., Koonin, E. V. & Lipman, D. J. A genomic perspective on protein families. Science 278, 631–637 (1997)

  30. Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994)



S.S. and I.A.A. were supported by the Genome Canada Foundation.

Author information



Corresponding authors

Correspondence to Alexey S. Kondrashov or Shamil Sunyaev.

Ethics declarations

Competing interests

The authors declare that they have no competing financial interests.

Supplementary information

Supplementary Tables

Contains Supplementary Tables 1-3. Supplementary Table 1 shows the changes of frequencies of all amino acids in the 15 taxa. Supplementary Table 2 shows the long-term rates of amino acid gain and loss. Supplementary Table 3 shows the order of the recruitment of amino acids into the genetic code. (DOC 243 kb)

Supplementary Methods

This section describes the method used to correct for possible multiple substitutions at aligned amino acid sites. (DOC 47 kb)


About this article

Cite this article

Jordan, I., Kondrashov, F., Adzhubei, I. et al. A universal trend of amino acid gain and loss in protein evolution. Nature 433, 633–638 (2005). https://doi.org/10.1038/nature03306
