Read, copy, edit and repeat

    A detailed picture of how DNA is copied and modified comes from a molecular-level understanding of DNA and the enzymes that process it. Why is DNA not always copied correctly, and what happens when its bases are modified?

    Nature does a good job of storing and copying information. A typical duplex of human DNA, assuming that each of the 3.2 billion base pairs is a Watson–Crick match between A and T, or G and C, contains about 760 MB of information. Copying this data verbatim, base for base, is no mean feat because it is entropically favourable for randomness to find its way into the replication process. This randomness is largely overcome because the formation of a Watson–Crick match, on account of strong hydrogen bonding, is more exergonic than that of a mismatch. Yet, these energetic differences alone cannot account for the extraordinarily high fidelities of certain DNA polymerases, which incorporate matches with very high selectivity. In their Comment article, John Petruska and Myron Goodman remind us that only by considering the hydrogen bonding of the aqueous medium surrounding the base pairs can we explain how genetic information can be copied with only a single error for every 10 million bases that are written.
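
    The storage estimate above can be reproduced with a short calculation. The sketch below is illustrative only: it assumes a genome of roughly 3.2 × 10^9 base pairs (the length consistent with the quoted figure) and that every position is one of four equally likely Watson–Crick pairs, so that each base pair carries two bits. The constant and function names are made up for this example.

```python
import math

# Assumed inputs (illustrative, not taken from the cited articles).
BASE_PAIRS = 3.2e9   # approximate number of base pairs in one human DNA duplex
PAIR_STATES = 4      # A-T, T-A, G-C or C-G at each position


def duplex_information_bytes(base_pairs: float, pair_states: int = PAIR_STATES) -> float:
    """Return the information content in bytes, at log2(pair_states) bits per base pair."""
    bits_per_pair = math.log2(pair_states)  # 2 bits for four equally likely states
    return base_pairs * bits_per_pair / 8   # 8 bits per byte


total_bytes = duplex_information_bytes(BASE_PAIRS)
decimal_mb = total_bytes / 1e6     # megabytes, 10^6 bytes each
binary_mib = total_bytes / 2**20   # mebibytes, 2^20 bytes each

print(f"{decimal_mb:.0f} MB (decimal) or {binary_mib:.0f} MiB (binary)")
```

    Under these assumptions the result is 800 MB in decimal units, or about 763 in binary units, which is in line with the figure of about 760 MB quoted above.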

    Despite the strong thermodynamic basis for accurate replication, errors do creep into DNA replication. Some DNA polymerases are more prone to making ‘spelling’ mistakes, with some of these low-fidelity enzymes making an error every second base pair. The vast range that these fidelities span results from subtle structural differences in DNA polymerase active sites, as described in a Review penned by Wen-Jin Wu, Wei Yang and Ming-Daw Tsai. High-fidelity DNA polymerases arrange matched substrates in perfect geometries for in-line nucleophilic substitution, whereas mismatched substrates are accommodated in less optimal geometries such that they rarely form the requisite transition states. The discrimination between matched and mismatched substrates is greatly affected by the identity of the metal ions at the DNA polymerase active site (for example, Mg²⁺ is more selective than Mn²⁺) as well as the presence or absence of specific hydrogen-bonding interactions. The size of the substrate binding pocket also has implications for error rates, with higher-fidelity DNA polymerases often having smaller pockets that are very selective for matched substrates. Nevertheless, high- and low-fidelity enzymes should not be viewed as being ‘good’ and ‘bad’, respectively. For example, less discriminatory enzymes introduce random mutations that allow our bodies to synthesize a diverse library of B cell receptors. This process is essential to our immune system and eventually affords antibodies to combat the wide variety of pathogens that we encounter.
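
    To get a feel for how wide this range of fidelities is, an error frequency can be converted into an effective free-energy gap between matched and mismatched incorporation. The sketch below is a simplification and not the treatment used in the cited articles: it assumes that the ratio of correct to incorrect incorporations follows a simple Boltzmann factor, exp(ΔΔG/RT), at an assumed temperature of 310 K. All names are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol K)
TEMPERATURE_K = 310.0        # assumed physiological temperature


def selectivity(error_frequency: float) -> float:
    """Ratio of correct to incorrect incorporations for a given error frequency."""
    return (1.0 - error_frequency) / error_frequency


def effective_gap_kcal(error_frequency: float, temperature_k: float = TEMPERATURE_K) -> float:
    """Effective free-energy gap (kcal/mol), assuming selectivity = exp(gap / RT)."""
    return R_KCAL_PER_MOL_K * temperature_k * math.log(selectivity(error_frequency))


# The two extremes mentioned in the text.
examples = {
    "one error per 10 million bases": 1e-7,
    "one error every second base pair": 0.5,
}

for label, frequency in examples.items():
    gap = effective_gap_kcal(frequency)
    print(f"{label}: selectivity {selectivity(frequency):.3g}, gap {gap:.2f} kcal/mol")
```

    Within this simple model, an enzyme that errs at every second base pair shows no effective discrimination at all, whereas one error in 10 million corresponds to a gap of close to 10 kcal/mol.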

    The genetic alphabet that spells out DNA contains more than just the nucleobases A, T, G and C. Enzymes modify these bases during and/or after replication, with the A-, T-, G- or C-derived products bearing substituents such as methyl groups or elaborate sugar-containing moieties. In their Review, Eun-Ang Raiber, Robyn Hardisty, Pieter van Delft and Shankar Balasubramanian argue that each modification, although it hardly perturbs base pairing, may change the local nature of the DNA major groove. These modifications cause proteins to see DNA in a different light, to an extent that can affect gene expression. Understanding how these downstream consequences arise from the presence of unusual and often rare bases requires sensitive analytical instrumentation, and developments on this front have enabled the discovery of new bases. As more data come to hand, it is becoming increasingly apparent that the occurrence of these sites is not random, such that the original base sequence may indeed have epigenetic consequences.

    The articles described above only scratch the surface of nucleic acid chemistry, a subject that has benefited greatly from ideas contributed by creative minds from diverse scientific backgrounds. Whether your research is biologically inclined or you are simply fascinated by the chemistry at play in nature, we hope that you enjoy this contemporary discussion, more of which will certainly grace future pages of Nature Reviews Chemistry.

    Read, copy, edit and repeat. Nat Rev Chem 1, 0075 (2017).
