Abstract
We list, without thinking, the four base types that make up DNA as adenine, guanine, cytosine and thymine. But why are there four? This question is now all the more relevant as organic chemists have synthesized new base pairs that can be incorporated into nucleic acids. Here, I argue that there are theoretical, experimental and computational reasons to believe that having four base types is a frozen relic from the RNA world, when RNA was genetic as well as enzymatic material.
Acknowledgements
I thank the biologists at the Wissenschaftskolleg zu Berlin for vivid discussions, and B. Papp and V. Müller, who kindly read the manuscript before submission.
Ethics declarations
Competing interests
The authors declare that they have no competing financial interests.
Glossary
- AMINO-A
An adenine molecule with a second amino (-NH2) group attached to the carbon at position 2, which acts as an extra hydrogen-bond donor.
- DEAMINATION
The reaction of a water molecule with the amino group at position 4 of the pyrimidine ring of cytosine, which results in the conversion of cytosine to uracil.
- DIRECTIONAL SELECTION
Natural selection that acts to promote the fixation (an increase in frequency in the population to 100%) of a particular allele.
- EPIMERIZATION
The spontaneous change of configuration of chemical groups that are attached to a so-called asymmetric carbon atom. Such isomers are not mirror images of each other.
- ERROR-CODING THEORY
A theory that was developed by Hamming to analyse the detection and correction of errors in messages consisting of 'zeros' and 'ones'.
- KLENOW FRAGMENT
The Escherichia coli DNA polymerase, without the exonuclease subunit.
- MALTHUSIAN GROWTH RATE
The per capita rate of growth of a population modelled in continuous time.
- MUTATION–SELECTION EQUILIBRIUM
The equilibrium at which selection that decreases the frequency of an unfavourable allele exactly balances mutations that increase its frequency.
- ORTHOGONALITY
Features of natural and/or artificial bases that, in a given set (alphabet), reduce the incorporation of non-cognate base pairs.
- PROCESSIVITY
The ability of polymerases to repeatedly add bases to the primer, extending even a new type of base.
- RIBO-ORGANISM
A cell in the RNA world.
- RNA WORLD
A hypothetical, but widely believed, era in early evolution when RNA-like molecules were not only genetic but also enzymatic material.
- SIMULATED PROTOCELL MODEL
An in silico implementation of a ribo-organism.
- STABILIZING SELECTION
Selection for the mean or intermediate phenotype; consequently, peripheral variants are eliminated, which maintains an existing state of adaptation in a stable environment.
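The error-coding theory entry in the glossary can be made concrete with a small sketch. The code below illustrates two of Hamming's basic notions on generic bit strings: the Hamming distance between two words and a single even-parity check bit. It is not the nucleotide encoding used in the error-coding literature on alphabet composition; the function names and the 2-bit example messages are assumptions chosen only for illustration.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def add_even_parity(word: str) -> str:
    """Append one check bit so that the total number of '1's is even."""
    return word + str(word.count("1") % 2)


def parity_ok(codeword: str) -> bool:
    """Return True if the codeword contains an even number of '1's."""
    return codeword.count("1") % 2 == 0


# Hypothetical 2-bit messages (illustrative only, not a nucleotide code).
messages = ["00", "01", "10", "11"]
codewords = [add_even_parity(m) for m in messages]

# Smallest Hamming distance between any two distinct codewords.
min_distance = min(
    hamming_distance(c1, c2)
    for i, c1 in enumerate(codewords)
    for c2 in codewords[i + 1:]
)

# Flip the first bit of the first codeword to simulate a single error.
original = codewords[0]
corrupted = ("1" if original[0] == "0" else "0") + original[1:]

print(codewords)             # the four parity-extended codewords
print(min_distance)          # minimum pairwise distance of this small code
print(parity_ok(corrupted))  # whether the corrupted word passes the check
```

In this toy code every pair of codewords differs in at least two positions, so any single flipped bit yields a word that fails the parity check: the error is detected, although it cannot be located or corrected with one check bit alone.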
About this article
Cite this article
Szathmáry, E. Why are there four letters in the genetic alphabet? Nat Rev Genet 4, 995–1001 (2003). https://doi.org/10.1038/nrg1231
DOI: https://doi.org/10.1038/nrg1231