A universal trend of amino acid gain and loss in protein evolution

A Corrigendum to this article was published on 26 May 2005


Amino acid composition of proteins varies substantially between taxa and, thus, can evolve. For example, proteins from organisms with (G + C)-rich (or (A + T)-rich) genomes contain more (or fewer) amino acids encoded by (G + C)-rich codons [1–4]. However, no universal trends in ongoing changes of amino acid frequencies have been reported. We compared sets of orthologous proteins encoded by triplets of closely related genomes from 15 taxa representing all three domains of life (Bacteria, Archaea and Eukaryota), and used phylogenies to polarize amino acid substitutions. Cys, Met, His, Ser and Phe accrue in at least 14 taxa, whereas Pro, Ala, Glu and Gly are consistently lost. The same nine amino acids are currently accrued or lost in human proteins, as shown by analysis of non-synonymous single-nucleotide polymorphisms. All amino acids with declining frequencies are thought to be among the first incorporated into the genetic code; conversely, all amino acids with increasing frequencies, except Ser, were probably recruited late [5–7]. Thus, expansion of initially under-represented amino acids, which began over 3,400 million years ago [8,9], apparently continues to this day.
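
The polarization step described in the abstract can be illustrated with a minimal sketch. It assumes simple parsimony on an aligned triplet of two sister sequences and one outgroup: where the sisters differ and the outgroup matches one of them, the outgroup state is treated as ancestral. This is not the authors' pipeline, and it omits their correction for multiple substitutions; the function names and toy sequences are hypothetical.

```python
from collections import Counter


def polarize_triplet(sister_a, sister_b, outgroup):
    """Count amino acid gains and losses across one aligned triplet.

    Assumption (parsimony): a site is informative when the two sister
    sequences differ and the outgroup matches exactly one of them; the
    outgroup state is then taken as ancestral.
    """
    if not (len(sister_a) == len(sister_b) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    gains, losses = Counter(), Counter()
    for a, b, o in zip(sister_a, sister_b, outgroup):
        if "-" in (a, b, o) or a == b:
            continue  # skip gapped sites and sites where the sisters agree
        if o == a:
            ancestral, derived = a, b
        elif o == b:
            ancestral, derived = b, a
        else:
            continue  # outgroup matches neither sister: direction unknown
        losses[ancestral] += 1  # the ancestral residue was replaced
        gains[derived] += 1     # the derived residue was acquired
    return gains, losses


def net_change(gains, losses):
    """Return gains minus losses for every amino acid observed."""
    return {aa: gains[aa] - losses[aa] for aa in set(gains) | set(losses)}


if __name__ == "__main__":
    # Toy alignment (hypothetical): only the second site is informative,
    # because the sisters differ there (P vs S) and the outgroup has P.
    g, l = polarize_triplet("MPAC", "MSAC", "MPGC")
    print(net_change(g, l))
```

In this toy case the outgroup supports Pro as ancestral at the second site, so the substitution counts as one Pro loss and one Ser gain. Summing such counts over many orthologous proteins gives per-residue net gains and losses of the kind the abstract reports.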

Figure 1


  1. Sueoka, N. Correlation between base composition of deoxyribonucleic acid and amino acid composition of protein. Proc. Natl Acad. Sci. USA 47, 1141–1149 (1961)

  2. Gu, X., Hewett-Emmett, D. & Li, W.-H. Directional mutational pressure affects the amino acid composition and hydrophobicity of proteins in bacteria. Genetica 103, 383–391 (1998)

  3. Knight, R. D., Freeland, S. J. & Landweber, L. F. A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes. Genome Biol. 2, Research0010.1 (2001)

  4. Wang, H.-C., Singer, G. A. C. & Hickey, D. A. Mutational bias affects protein evolution in flowering plants. Mol. Biol. Evol. 21, 90–96 (2004)

  5. Trifonov, E. N. The triplet code from first principles. J. Biomol. Struct. Dyn. 22, 1–11 (2004)

  6. Miller, S. L. Which organic compounds could have occurred on the prebiotic earth? Cold Spring Harb. Symp. Quant. Biol. 52, 17–27 (1987)

  7. Cronin, J. R. & Pizzarello, S. Amino acids in meteorites. Adv. Space Res. 3, 5–18 (1983)

  8. Brooks, D. J., Fresco, J. R., Lesk, A. M. & Singh, M. Evolution of amino acid frequencies in proteins over deep time: Inferred order of introduction of amino acids into the genetic code. Mol. Biol. Evol. 19, 1645–1655 (2002)

  9. Brooks, D. J. & Fresco, J. R. Increased frequency of cysteine, tyrosine, and phenylalanine residues since the last universal ancestor. Mol. Cell. Proteom. 1, 125–131 (2002)

  10. Muller, T. & Vingron, M. Modeling amino acid replacement. J. Comp. Biol. 7, 761–776 (2000)

  11. Goldman, N. & Whelan, S. A novel use of equilibrium frequencies in models of sequence evolution. Mol. Biol. Evol. 19, 1821–1831 (2002)

  12. Veerassamy, S., Smith, A. & Tillier, E. R. M. A transition probability model for amino acid substitutions from blocks. J. Comp. Biol. 10, 997–1010 (2003)

  13. Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices. Adv. Prot. Chem. 54, 73–97 (2000)

  14. Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70 (1979)

  15. Feng, D. F. & Doolittle, R. F. Progressive alignment of amino acid sequences and construction of phylogenetic trees from them. Methods Enzymol. 266, 368–382 (1996)

  16. Eyre-Walker, A. Problems with parsimony in sequences of biased base composition. J. Mol. Evol. 47, 686–690 (1998)

  17. Sunyaev, S. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001)

  18. Tice, M. M. & Lowe, D. R. Photosynthetic microbial mats in the 3,416-Myr-old ocean. Nature 431, 549–552 (2004)

  19. Rat Genome Sequencing Consortium. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428, 493–521 (2004)

  20. Zdobnov, E. M. et al. Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science 298, 149–159 (2002)

  21. Sorhannus, U. & Fox, M. Synonymous and nonsynonymous substitution rates in diatoms: A comparison between chloroplast and nuclear genes. J. Mol. Evol. 48, 209–212 (1999)

  22. Ochman, H. & Wilson, A. C. Evolution in bacteria—evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26, 74–86 (1987)

  23. Clark, M. A., Moran, N. A. & Baumann, P. Sequence evolution in bacterial endosymbionts having extreme base compositions. Mol. Biol. Evol. 16, 1586–1598 (1999)

  24. Smith, N. G. C. & Eyre-Walker, A. Adaptive protein evolution in Drosophila. Nature 415, 1022–1024 (2002)

  25. Fitch, W. M. & Markowitz, E. An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem. Genet. 4, 579–593 (1970)

  26. Kondrashov, A. S., Sunyaev, S. & Kondrashov, F. A. Dobzhansky-Muller incompatibilities in protein evolution. Proc. Natl Acad. Sci. USA 99, 14878–14883 (2002)

  27. Clark, A. G. et al. Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science 302, 1960–1963 (2003)

  28. Kellis, M., Patterson, N., Endrizzi, M., Birren, B. & Lander, E. S. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423, 241–254 (2003)

  29. Tatusov, R. L., Koonin, E. V. & Lipman, D. J. A genomic perspective on protein families. Science 278, 631–637 (1997)

  30. Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994)

S.S. and I.A.A. were supported by the Genome Canada Foundation.

Author information

Authors and Affiliations


Corresponding authors

Correspondence to Alexey S. Kondrashov or Shamil Sunyaev.

Ethics declarations

Competing interests

The authors declare that they have no competing financial interests.

Supplementary information

Supplementary Tables

Contains Supplementary Tables 1-3. Supplementary Table 1 shows the changes of frequencies of all amino acids in the 15 taxa. Supplementary Table 2 shows the long-term rates of amino acid gain and loss. Supplementary Table 3 shows the order of the recruitment of amino acids into the genetic code. (DOC 243 kb)

Supplementary Methods

This section describes the method used to correct for possible multiple substitutions at aligned amino acid sites. (DOC 47 kb)

About this article

Cite this article

Jordan, I., Kondrashov, F., Adzhubei, I. et al. A universal trend of amino acid gain and loss in protein evolution. Nature 433, 633–638 (2005).
