Amino acid composition of proteins varies substantially between taxa and, thus, can evolve. For example, proteins from organisms with (G + C)-rich (or (A + T)-rich) genomes contain more (or fewer) amino acids encoded by (G + C)-rich codons [1-4]. However, no universal trends in ongoing changes of amino acid frequencies have been reported. We compared sets of orthologous proteins encoded by triplets of closely related genomes from 15 taxa representing all three domains of life (Bacteria, Archaea and Eukaryota), and used phylogenies to polarize amino acid substitutions. Cys, Met, His, Ser and Phe accrue in at least 14 taxa, whereas Pro, Ala, Glu and Gly are consistently lost. The same nine amino acids are currently accrued or lost in human proteins, as shown by analysis of non-synonymous single-nucleotide polymorphisms. All amino acids with declining frequencies are thought to be among the first incorporated into the genetic code; conversely, all amino acids with increasing frequencies, except Ser, were probably recruited late [5-7]. Thus, expansion of initially under-represented amino acids, which began over 3,400 million years ago [8,9], apparently continues to this day.
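The polarization step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple parsimony rule on a triplet of aligned orthologous sequences: where the two sister species differ and the outgroup matches one of them, the outgroup state is taken as ancestral and the other state as derived. The function names (`polarize_triplet`, `net_change`), the toy sequences and the handling of gaps are illustrative assumptions, not the authors' pipeline, and the sketch does not include the correction for multiple substitutions described in the supplementary methods.

```python
from collections import Counter


def polarize_triplet(sister_a, sister_b, outgroup):
    """Count amino acid gains and losses in two sister lineages.

    Assumes three aligned sequences of equal length ('-' marks a gap).
    A site is informative only if the sisters differ and the outgroup
    matches exactly one of them (a plain parsimony assumption).
    """
    gains, losses = Counter(), Counter()
    for a, b, o in zip(sister_a, sister_b, outgroup):
        if "-" in (a, b, o) or a == b:
            continue  # gap, or no difference between the sister species
        if o == a:
            derived = b  # substitution inferred on the lineage leading to B
        elif o == b:
            derived = a  # substitution inferred on the lineage leading to A
        else:
            continue  # all three states differ: direction cannot be inferred
        losses[o] += 1       # ancestral residue was replaced
        gains[derived] += 1  # derived residue was acquired
    return gains, losses


def net_change(gains, losses):
    """Net gain (positive) or loss (negative) per amino acid."""
    return {aa: gains[aa] - losses[aa] for aa in set(gains) | set(losses)}


# Toy alignment (hypothetical data): two polarizable sites, one site where
# the sisters agree, and one site where all three sequences differ.
gains, losses = polarize_triplet("MAPGEK", "MSPCER", "MAPGQH")
print(net_change(gains, losses))
```

In this toy alignment, Ser and Cys are each gained once and Ala and Gly are each lost once. The site where only the outgroup differs and the site where all three residues differ are skipped.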
Sueoka, N. Correlation between base composition of deoxyribonucleic acid and amino acid composition of protein. Proc. Natl Acad. Sci. USA 47, 1141–1149 (1961)
Gu, X., Hewett-Emmett, D. & Li, W.-H. Directional mutational pressure affects the amino acid composition and hydrophobicity of proteins in bacteria. Genetica 103, 383–391 (1998)
Knight, R. D., Freeland, S. J. & Landweber, L. F. A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes. Genome Biol. 2, research0010.1 (2001)
Wang, H.-C., Singer, G. A. C. & Hickey, D. A. Mutational bias affects protein evolution in flowering plants. Mol. Biol. Evol. 21, 90–96 (2004)
Trifonov, E. N. The triplet code from first principles. J. Biomol. Struct. Dyn. 22, 1–11 (2004)
Miller, S. L. Which organic compounds could have occurred on the prebiotic earth? Cold Spring Harb. Symp. Quant. Biol. 52, 17–27 (1987)
Cronin, J. R. & Pizzarello, S. Amino acids in meteorites. Adv. Space Res. 3, 5–18 (1983)
Brooks, D. J., Fresco, J. R., Lesk, A. M. & Singh, M. Evolution of amino acid frequencies in proteins over deep time: Inferred order of introduction of amino acids into the genetic code. Mol. Biol. Evol. 19, 1645–1655 (2002)
Brooks, D. J. & Fresco, J. R. Increased frequency of cysteine, tyrosine, and phenylalanine residues since the last universal ancestor. Mol. Cell. Proteom. 1, 125–131 (2002)
Muller, T. & Vingron, M. Modeling amino acid replacement. J. Comp. Biol. 7, 761–776 (2000)
Goldman, N. & Whelan, S. A novel use of equilibrium frequencies in models of sequence evolution. Mol. Biol. Evol. 19, 1821–1831 (2002)
Veerassamy, S., Smith, A. & Tillier, E. R. M. A transition probability model for amino acid substitutions from blocks. J. Comp. Biol. 10, 997–1010 (2003)
Henikoff, S. & Henikoff, J. G. Amino acid substitution matrices. Adv. Prot. Chem. 54, 73–97 (2000)
Holm, S. A simple sequentially rejective multiple test procedure. Scand. J. Stat. 6, 65–70 (1979)
Feng, D. F. & Doolittle, R. F. Progressive alignment of amino acid sequences and construction of phylogenetic trees from them. Methods Enzymol. 266, 368–382 (1996)
Eyre-Walker, A. Problems with parsimony in sequences of biased base composition. J. Mol. Evol. 47, 686–690 (1998)
Sunyaev, S. et al. Prediction of deleterious human alleles. Hum. Mol. Genet. 10, 591–597 (2001)
Tice, M. M. & Lowe, D. R. Photosynthetic microbial mats in the 3,416-Myr-old ocean. Nature 431, 549–552 (2004)
Rat Genome Sequencing Consortium. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature 428, 493–535 (2004)
Zdobnov, E. M. et al. Comparative genome and proteome analysis of Anopheles gambiae and Drosophila melanogaster. Science 298, 149–159 (2002)
Sorhannus, U. & Fox, M. Synonymous and nonsynonymous substitution rates in diatoms: A comparison between chloroplast and nuclear genes. J. Mol. Evol. 48, 209–212 (1999)
Ochman, H. & Wilson, A. C. Evolution in bacteria—evidence for a universal substitution rate in cellular genomes. J. Mol. Evol. 26, 74–86 (1987)
Clark, M. A., Moran, N. A. & Baumann, P. Sequence evolution in bacterial endosymbionts having extreme base compositions. Mol. Biol. Evol. 16, 1586–1598 (1999)
Smith, N. G. C. & Eyre-Walker, A. Adaptive protein evolution in Drosophila. Nature 415, 1022–1024 (2002)
Fitch, W. M. & Markowitz, E. An improved method for determining codon variability in a gene and its application to the rate of fixation of mutations in evolution. Biochem. Genet. 4, 579–593 (1970)
Kondrashov, A. S., Sunyaev, S. & Kondrashov, F. A. Dobzhansky-Muller incompatibilities in protein evolution. Proc. Natl Acad. Sci. USA 99, 14878–14883 (2002)
Clark, A. G. et al. Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science 302, 1960–1963 (2003)
Kellis, M., Patterson, N., Endrizzi, M., Birren, B. & Lander, E. S. Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature 423, 233–234 (2003)
Tatusov, R. L., Koonin, E. V. & Lipman, D. J. A genomic perspective on protein families. Science 278, 631–637 (1997)
Thompson, J. D., Higgins, D. G. & Gibson, T. J. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680 (1994)
S.S. and I.A.A. were supported by the Genome Canada Foundation.
The authors declare that they have no competing financial interests.
Contains Supplementary Tables 1-3. Supplementary Table 1 shows the changes of frequencies of all amino acids in the 15 taxa. Supplementary Table 2 shows the long-term rates of amino acid gain and loss. Supplementary Table 3 shows the order of the recruitment of amino acids into the genetic code. (DOC 243 kb)
This section describes the method used to correct for possible multiple substitutions at aligned amino acid sites. (DOC 47 kb)
Cite this article
Jordan, I., Kondrashov, F., Adzhubei, I. et al. A universal trend of amino acid gain and loss in protein evolution. Nature 433, 633–638 (2005). https://doi.org/10.1038/nature03306