DNA polymerases were named for their function of catalysing DNA replication, a process that is necessary for growth and propagation of life. DNA involving Watson–Crick base-pairing can be synthesized with high fidelity, the structural and mechanistic origins of which have been investigated for many decades. Despite this, new chemical insights continue to be uncovered, including recent findings that may explain newly discovered functions for many DNA polymerases in DNA repair and mutation. Some of these reactions involve non-Watson–Crick base-pairing. In addition, certain DNA polymerases have been engineered for a wide variety of applications in biotechnology and biomedicine. This Review describes the molecular basis for the diverse and contrasting functions of different DNA polymerases, providing an up-to-date understanding of how these tasks are accomplished and the means by which we can benefit from them.
The first known DNA polymerase, DNA polymerase I (Pol I), was isolated from Escherichia coli1,2 and was shown to faithfully copy template DNA sequences3. This discovery, made by Arthur Kornberg and co-workers2 in 1958, stimulated decades of intensive research in identifying new polymerases and studying their chemical mechanisms and biological functions. Scientists have comprehensively catalogued DNA polymerases from all three domains of life, with the enzymes being classified into six major families (A, B, C, D, X and Y) according to their sequence homology4.
DNA polymerases are known to most chemists for their role in catalysing DNA replication with high accuracy. Less well known is the function that these enzymes have in DNA repair, with some polymerases, perhaps counter-intuitively, even effecting DNA mutations, some of which are important for life processes5,6. As a consequence, the fidelity (Box 1) of deoxyribonucleoside triphosphate (dNTP) incorporation is highly dependent on the polymerase catalyst and varies from being very high (10⁷–10⁸ for replicative polymerases)7 to very low (close to 2 for mutagenic polymerases)8 (Fig. 1a). For example, incorporation of a correct Watson–Crick (W–C) base pair (hereafter referred to as a ‘match’) mediated by Pol β is governed by Kd,app and kpol (refer to Box 1 for definitions), the values for which fall in the ranges of 10–100 μM and 10–25 s⁻¹, respectively. For incorporation of an incorrect base pair (hereafter referred to as ‘mismatch’), Kd,app values increase by a factor of ∼20 (weaker binding) and kpol values decrease by a factor of ∼10³–10⁴ (slower reaction). Taken together, these thermodynamic and kinetic parameters for the match and mismatch situations result in Pol β fidelity on the order of 10³–10⁵ (Ref. 9).
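The arithmetic behind the Pol β estimate can be sketched in a few lines of Python. This is an illustrative sketch only: Box 1 is not reproduced here, so the definition of fidelity used below, (efficiency of match + efficiency of mismatch) / efficiency of mismatch with efficiency = kpol/Kd,app, is an assumption based on the convention commonly used in pre-steady-state kinetics, and the function names are hypothetical. The fold changes are the rough factors quoted above (∼20 for Kd,app and ∼10³–10⁴ for kpol).

```python
def catalytic_efficiency(kpol, kd_app):
    """Catalytic efficiency kpol/Kd,app (s^-1 per micromolar)."""
    return kpol / kd_app


def fidelity(eff_match, eff_mismatch):
    """Assumed definition: (eff_match + eff_mismatch) / eff_mismatch."""
    return (eff_match + eff_mismatch) / eff_mismatch


def fidelity_from_penalties(kd_fold_increase, kpol_fold_decrease,
                            kpol_match=10.0, kd_match=10.0):
    """Fidelity implied by the mismatch penalties on Kd,app and kpol.

    kpol_match (s^-1) and kd_match (micromolar) are placeholder values taken
    from the quoted ranges; they cancel out of the final ratio.
    """
    eff_match = catalytic_efficiency(kpol_match, kd_match)
    eff_mismatch = catalytic_efficiency(kpol_match / kpol_fold_decrease,
                                        kd_match * kd_fold_increase)
    return fidelity(eff_match, eff_mismatch)


# Rough bounds using the factors quoted in the text.
lower = fidelity_from_penalties(kd_fold_increase=20, kpol_fold_decrease=1e3)
upper = fidelity_from_penalties(kd_fold_increase=20, kpol_fold_decrease=1e4)
print(f"estimated fidelity: {lower:.3g} to {upper:.3g}")
```

With these round numbers the estimate comes out at roughly 2 × 10⁴ to 2 × 10⁵, which sits at the upper end of the reported 10³–10⁵ range; the spread in the measured values reflects the fact that the penalties differ from one mismatch to another.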
It has been known for some time that there is a degree of structural variation between polymerases10 (Box 2). In this Review, we delineate how these small differences in polymerase structure enable the enzymes to perform diverse biological functions, each of which may require a very different fidelity. This variation is perhaps surprising given that polymerases operate through very similar chemical mechanisms.
Early studies focused on high-fidelity replicative polymerases from bacteria or yeast; however, the mammalian Pol β, a main player in DNA repair, is the best-characterized polymerase. Although Pol β is not a replicative polymerase, its fidelity is comparable to that of replicative polymerases without exonuclease proofreading activity (Fig. 1a), and its kinetic and structural properties are very similar to those of replicative polymerases. A useful understanding of DNA chemistry can be gained by studying the catalytic cycle (Fig. 1b) and conformational changes (Fig. 1c) of Pol β11, a system that also serves as a point of comparison for other enzymes. When in its apo form (that is, free of substrate), Pol β adopts an extended conformation (I), which undergoes significant structural change on binding to DNA, after which it assumes an open conformation (II). A ternary complex (III) is formed once the enzyme binds to metal-bound dNTP (MdNTP), with conformational closure of the N subdomain of Pol β resulting in the third distinct arrangement, the closed conformation (IV). It is here that the enzyme effects dNTP incorporation with concomitant generation of metal-bound pyrophosphate (MPPi). This product state (V) undergoes conformational change (V → VI) to release MPPi (VI → VII), after which the DNA substrate undergoes translocation and further elongation.
Structural data reported for Pol β in its I12,13, II14,15, III16, IV14,17,
This Review first summarizes the chemistry that is involved in polymerase-catalysed match incorporation, the mechanism of which is likely to be common to most polymerases. This discussion is followed by a description of the mismatch incorporation mechanism, a process that may proceed in a different manner for each polymerase. With these two aspects covered, we then address how the conserved structural and mechanistic features of polymerases facilitate different biological functions, including repair and bypass of damaged DNA. These specific functions lend themselves to various applications, and we conclude by describing how engineering polymerases allows these to be realized.
Incorporation of Watson–Crick pairs
Direct monitoring of phosphodiester bond formation. The formation and cleavage of chemical bonds can be very rapid and dynamic processes. As a consequence, they are difficult to observe directly, although a recent report on femtosecond X-ray scattering may pave the way for progress in this area24. For enzymatic reactions, including the nucleotidyl transfer reaction catalysed by polymerases, the active site geometry in the TS can only be inferred from biochemical and computational analyses25,
Three metal ions are essential for nucleotidyl transfer. Analysis of the crystal structures of numerous polymerases led many to believe that each of these enzymes operates by a catalytic mechanism that involves two metal ions10,31 (Fig. 2d). In this mechanism, one site (metal B) binds to all three phosphate groups of dNTP, two of the three active site carboxylate groups, as well as a H2O molecule or a backbone carbonyl (Met14 in the case of Pol η)32. Thus, metal B has a dual role: to correctly position the triphosphate and to stabilize the negative charge that accrues on the oxygen atoms of the PPi leaving group. Another divalent cation (metal A) binds to all three carboxylate groups, the α-phosphate and another H2O ligand, activating the 3′-OH moiety of the upstream primer. Deprotonation of this now acidic 3′-OH group (probably by water27,30 or by the Asp256 carboxylate in Pol β33) affords a nucleophilic alkoxide that attacks Pα when the two atoms are collinear with the oxygen atom that bridges the Pα and Pβ atoms (in accordance with an SN2-type in-line mechanism)10,11,14,18,19,31,
Despite the general acceptance of the above mechanism, recent work has shown that a third metal ion (metal C) must enter the active site to induce phosphodiester bond formation (as highlighted in Fig. 2a). The binding site for metal C can be accurately located by replacing Mg2+ with Mn2+, a metal ion that also catalyses DNA synthesis and is readily detected by X-ray diffraction even at low occupancy (Fig. 2b). Phosphoryl transfer takes place only when the enzyme–substrate complex captures the third divalent cation through thermal activation35. Considering the structures of the reaction intermediates (Fig. 2c), the conversion of conformation 2 to 3 involves an Arg61 side chain rotating upwards and away from the third metal binding site, such that metal C can enter the active site. This metal cation binds to the PPi group and one of the oxygen atoms of the departing α-phosphate, thereby stabilizing the reaction intermediate. These findings form the basis of a newly proposed three-metal-ion mechanism for Pol η, in which the third metal ion initiates the reaction by breaking the existing phosphodiester bond in dNTP and driving the nucleotidyl transfer35,36 (Fig. 2e). The native 3′-OH must be well aligned with the substrate and the three metal ions for deprotonation to occur. The catalytic role of a third metal ion is also supported by quantum mechanics/molecular mechanics (QM/MM) calculations, which confirm that metal C can stabilize the negative charge of the PPi product during DNA replication catalysed by Pol η37.
A third metal ion has also been identified in time-lapse studies of match incorporation by Pol β, although no snapshot yet exists of the three metal sites being occupied at the same time as the phosphodiester bond forms. Instead, the third metal was observed only in the product state (hence, it is referred to as the product metal) in which the Mg2+ ion at site A had already been replaced by Na+ (Ref. 19). As is discussed later, metal C has been observed concomitant with phosphodiester bond formation in translesion syntheses mediated by Pol β38,39. Thus, it is likely that the third divalent metal ion is also present in the Pol β mechanism for match incorporation. These experimental findings have been further probed and extended through QM/MM calculations that focused on the specific functions of the third metal40,41.
Of the four structures proposed to be involved in the key nucleotidyl transfer reaction (Fig. 2c), structure 3 may represent the most important TS or near-TS form. Structures 1 and 2 correspond to substrate complex IV in Fig. 1b, whereas structure 4 corresponds to the product complex V. The capture of the intermediate with metal B alone (structure 1) lends support to the early structural and kinetic studies with Cr(III)dNTP in the absence of Mg2+; these works showed that binding of Cr(III)dNTP to Pol β is sufficient to induce the closure of the N subdomain of the polymerase20,42. In addition, the Kd value of the Mg2+ ion (0.5–1 mM) was substantially higher than that of MgdNTP (30–50 μM)43,
The most common metal ion involved in catalysis mediated by polymerases is Mg2+, and it is likely that Mg2+ is the preferred metal at each binding site. However, Mn2+ has also been suggested as a natural metal ion that may be important for some polymerase functions. The Mn2+ ion has been observed for Pol λ46,47 and Pol μ48,
A common intramolecular hydrogen bond in the incoming dNTP. The structures and conformations of nucleobases can also play important roles in the catalysis of DNA polymerases (Fig. 3). For example, the intra- and intermolecular hydrogen-bonding motifs involving nucleobases can provide clues regarding the fidelities of the polymerase enzymes that house these substrates. Analysis of crystal structures of DNA polymerase–DNA–dNTP and RNA polymerase–RNA–rNTP ternary complexes (where rNTP denotes a ribonucleoside triphosphate) revealed the existence of a common intramolecular hydrogen bond between the 3′-OH and the β-phosphate of the incoming dNTP or rNTP37 (Fig. 3a). This interaction has been suggested to promote deprotonation of the primer 3′-OH by facilitating PPi departure. 2′,3′-Dideoxy-NTP (ddNTP) can also be incorporated into DNA by polymerases and is commonly used as a chain terminator in DNA sequencing. That ddNTP is unable to form this intramolecular hydrogen bond may contribute to the catalytic efficiency of the enzyme being reduced by a factor of >100 when processing this substrate54. This result further underscores the importance of the hydrogen bond with 3′-OH, which has been found to form stereospecifically with the pro-S oxygen atom of the β-phosphate in dNTP or rNTP. In addition, this hydrogen bond is also present in the structure of the Pol β–DNA–Cr(III)dTMPPCP (where dTMPPCP is 2′-deoxythymidine 5′-(β,γ-methylene)triphosphate) ternary complex bearing metal B alone20 (Fig. 3b). By contrast, the intramolecular hydrogen bond is not observed in the complexes that feature a dNTP with L-stereochemistry, such as the unnatural Pol λ–DNA–L-dCTP complex55.
Watson–Crick base-pairing alone is insufficient to ensure match incorporation. With the exception of Pol ν56,
Incorporation of non-Watson–Crick pairs
Spontaneous errors in Watson–Crick pairing. Watson and Crick82,83 observed that deviation from W–C pairing can arise when tautomeric forms of the bases are present. For example, a dA–dCTP mismatch, in which A undergoes keto–enol-like tautomerization to a structure with an exocyclic imine, mimics the shape of a W–C base pair (Fig. 3c). Indeed, this mismatch can be incorporated in DNA by the high-fidelity Bacillus stearothermophilus Pol I large fragment84. The incorporation of an ionized dG–dTTP mismatch has been observed in the reaction catalysed by avian myeloblastosis virus reverse transcriptase; the efficiency of dTTP misincorporation increased as the pH was increased from 6.5 to 9.5 (Ref. 85). The crystal structure of a human Pol λ variant bound to a DNA substrate with dGTP opposite to a template T (dT–dGTP) has been solved, confirming that the bases, despite being mismatched, are nevertheless arranged in a W–C-like geometry86 (Fig. 3d). The pH dependence of the misincorporation is consistent with the presence of an ionized base pair. Relaxation dispersion NMR spectroscopy enabled observation, in free duplex DNA, of transient Hoogsteen pairings87,
It is important to note that the above deviations from W–C-like pairing can arise owing to additional factors: use of the mutator metal ion Mn2+ for dA–dCTP incorporation and engineering of a polymerase active site for dT–dGTP. As we describe below, both approaches have been used in conjunction with other polymerases to facilitate mismatch formation.
Mn2+ facilitates mismatched ternary complex formation of Pol β. The presence of Mn2+ at a polymerase active site has been observed to enhance the efficiency of DNA synthesis, although this is often at the expense of fidelity. Thus, Mn2+ is a mutator metal ion for some polymerases19,91,
Enlarging the dNTP binding pocket converts the high-fidelity RB69 polymerase to a low-fidelity polymerase. One may expect a decrease in fidelity if the constraints on a polymerase active site are made less stringent. In this regard, the nascent dNTP binding pocket of a high-fidelity replicative polymerase from bacteriophage RB69 was modified by replacing four bulky amino acid residues with smaller ones to afford the quadruple mutant L415A/L561A/S565G/Y567A100. Pre-steady-state kinetic analyses of the mutant-catalysed reaction indicated that its fidelity is lowered by a factor of 10³–10⁶ (Ref. 101). Consequently, this mutant can form stable ground-state ternary complexes with all 12 mismatches in the presence of the (catalytically inactive) Ca2+ ion, with the resulting structures featuring distorted base-pairing at the active site (the structures of mismatched dG–dGTP and matched dC–dGTP can be compared in Fig. 3e).
Some low-fidelity polymerases use an enzyme side chain to select a specific dNTP. The African swine fever virus (ASFV) Pol X, at 174 residues, is a very small polymerase that does not feature the lyase domain and duplex DNA binding subdomain that are present in its mammalian homologue Pol β, which is twice as large102. In terms of mismatch incorporation, ASFV Pol X is perhaps the most extreme case as it catalyses the formation of a dG–dGTP (G–G) mismatch in addition to the four W–C matches8. Structures of the free protein have been determined using solution NMR spectroscopy103,104, as have those of the ASFV Pol X–MgdGTP binary complex and ASFV Pol X–MgdGTP–DNA ternary complex, which indicate that ASFV Pol X can use either gapped DNA or MgdNTP as the first substrate105,106. ASFV Pol X can bind to MgdGTP in a syn configuration in the absence of DNA and form a dG–dGTP mismatch with an anti–syn Hoogsteen base-pair conformation (Fig. 3f). The His115 residue is key to the catalytic incorporation of dG–dGTP mismatches (Fig. 5a), which has recently been confirmed independently by crystallography107. A double hairpin DNA with two GAA stem loops was used in the NMR studies to acquire high quality NMR spectra106, whereas a natural one-nucleotide gap DNA was used in the crystallographic study. The latter analysis located a unique binding pocket for the 5′-phosphate group of the downstream primer107, which affects dG–dGTP mismatch formation107,108. In the case of Pol β11 and Pol λ109, this binding is strengthened by the presence, in the lyase domain, of three cationic residues that are missing in Pol X.
The role of an enzyme side chain in selecting a specific dNTP has also been confirmed in the case of Rev1, a Y-family polymerase that repairs human DNA. Indeed, the Rev1–DNA–MgdCTP ternary complex has been characterized, and the enzyme, initially identified as a deoxycytidyl transferase110, mediates incorporation of dCTP opposite to template G with very high specificity. To a lesser extent, Rev1 can also install dCTP opposite to an abasic site or an O6-methylguanine during translesion DNA synthesis110,111. The origin of the very high specificity that Rev1 shows for dCTP became evident after the crystal structure of yeast Rev1 bound with template G and dCTP112 was determined. The dCTP substrate does not form a base pair with the template G, but instead hydrogen bonds to the Arg324 side chain of Rev1. Furthermore, Rev1 sets aside the template base and uses Arg324 as a template to form an Arg–dCTP pair that mimics a DNA base pair (Fig. 5b). Thus, Rev1 uses its protein side chain to dictate the identity of not only the incoming dNTP, but also the template base112.
Understanding fidelity attenuation with the medium-fidelity Pol λ. Among the X-family polymerases, Pol λ exhibits a fidelity (calculated from the reported error frequencies to be 30–9,100)113 between those of Pol β (1,700–93,000)114 and ASFV Pol X (1.9–7,700)8. Thorough structure–function studies have been conducted on Pol λ109, and a recent report indicates that, similar to ASFV Pol X, Pol λ possesses high MgdNTP affinity in the absence of DNA52. Analogous to the stabilizing interaction provided by His115 in ASFV Pol X, Pol λ makes use of its Tyr505 side chain to bind to the dNTP substrate through π–π interactions; mutation of Tyr505 into a smaller Ala reduced the affinity significantly. Structural analysis suggested that Pol λ maintains its medium fidelity by binding the substrate in a well-defined hydrophobic pocket that features Leu431, Ile492, Tyr505 and Phe506 residues. In support, it was predicted and subsequently demonstrated that the L431A mutation enhances MgdNTP pre-binding (Fig. 5c) and lowers fidelity.
The mechanism of fidelity. Having established the structures and mechanisms that are involved in match and mismatch incorporations, we now address the main factors that are responsible for the high fidelity of certain DNA polymerases. The most extensively studied effect is the MdNTP-induced conformational change (closure of the N subdomain or thumb subdomain for Pol β but the fingers subdomain for other high-fidelity polymerases10). The closed conformation has been well studied in many intermediate structures of polymerase–DNA–MdNTP complexes. Work on the R61A mutant of Pol η also indicated that the closed conformation is a prerequisite for aligning the primer and dNTP such that a third metal ion can bind30,35,36. A major point of interest is the role of conformational closure in differentiating correct and incorrect dNTP. This conformational change was once believed to be the rate-limiting step and thus the main fidelity-controlling step115. However, as mentioned above, it is the chemical step that is now recognized as being rate limiting. Nevertheless, recent studies have revealed that mismatched MgdNTP can induce only partial conformational closure or none at all116,
Further studies suggested that the conformational closure differentiates dNTP by a thermodynamic rather than a kinetic effect. As becomes evident on considering the free energy profile for the Pol β catalytic cycle (Fig. 4c), mismatch incorporation induces conformational closure at a rate comparable to that induced by match incorporation (III → IV, Fig. 4c). However, the correct dNTP can better stabilize the closed form IV119; this has been supported by recent single-molecule studies120,121 and is logical given that the N subdomain (represented by the characteristic αN helix) is closed in the near-TS intermediate structures in reactions that lead to both match (Fig. 4a) and mismatch incorporation (Fig. 4b). Taken together, the results indicate that both match and mismatch incorporations (if the latter does occur) proceed through analogous conformational trajectories that involve closure of the N subdomain119. Consistent with this conclusion, some lower-fidelity polymerases, including Pol μ51, Pol λ52 and terminal deoxynucleotidyl transferase (TdT)122,123, exist in a closed conformation even in their substrate-free forms. The DinB homologue (Dbh) polymerase from S. solfataricus, an error-prone enzyme from the Y-family, is also closed in its apo form, and human Pol η is closed when in the Pol η–DNA binary complex124. Furthermore, as described above, several lower-fidelity polymerases are able to bind with MdNTP before binding DNA, which led to the proposal of a partially random sequential mechanism (Fig. 5d) for some polymerases52,106. Different polymerases may operate through a combination of pathways A (in which DNA is bound first) and B (in which MdNTP is bound first) to achieve their specific functions.
As shown in Fig. 4c, the main influence on fidelity, based mainly on the studies of Pol β, is the nature of the TS of the nucleotidyl transfer reaction70. This is supported by kinetic analyses119, as well as by consideration of the near-TS structures19. In catalysis mediated by Pol η, binding of metal C is the rate-limiting sub-step of the chemical step35,36. The near-TS structures of Pol β complexes of matched and mismatched substrates differ in many ways, including the interactions involving active site residues, metal ions, W–C pairing and DNA binding (Fig. 4a,b). These differences are reflected in the higher relative energies of the TS and the intermediates that are close to the TS in the case of mismatch incorporation (Fig. 4c). Indeed, small structural perturbations can affect fidelity, and mismatches may occur more often if the enzyme relaxes its stringency for correct dNTP incorporation (for example, by enlarging the active site) or is inherently selective for binding a specific dNTP (as described above for Rev1 and dCTP). In addition, binding of the A-site M2+ is a key step in the discrimination against an incorrect incoming dNTP, and Mn2+ is less selective than is Mg2+ in this regard34. Similar reasoning can explain why the mutagenic I260Q variant of Pol β has a lower fidelity than the native Pol β125, as kinetic119, small angle X-ray scattering117 and recent crystallographic analyses13 all suggest that the main cause for the lower fidelity of I260Q is its ability to form a relatively stable mismatched ternary complex (Fig. 4c). Further elaboration of the mechanisms is provided in Box 3.
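To give a feel for the energy scale implied by these fidelity values, the following Python sketch converts a fidelity into an apparent free energy difference between the mismatch and match pathways. It rests on assumptions not stated in the text: that fidelity approximates the ratio of catalytic efficiencies (kpol/Kd,app) for match versus mismatch, that this ratio obeys the standard transition-state-theory relation ΔΔG‡ = RT ln(ratio), and that the temperature is 37 °C. The function name is hypothetical.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def ddg_from_fidelity(fidelity_ratio, temperature_k=310.15):
    """Apparent free energy penalty (kcal/mol) for mismatch incorporation.

    Assumes fidelity_ratio ~ (kpol/Kd)_match / (kpol/Kd)_mismatch and the
    relation ddG = R * T * ln(ratio); 310.15 K (37 degrees C) is an assumed
    default, not a value taken from the text.
    """
    if fidelity_ratio <= 0:
        raise ValueError("fidelity ratio must be positive")
    return R_KCAL * temperature_k * math.log(fidelity_ratio)


# Pol beta fidelity is quoted as being on the order of 10^3 to 10^5.
for f in (1e3, 1e5):
    print(f"fidelity {f:.0e}: ddG ~ {ddg_from_fidelity(f):.1f} kcal/mol")
```

Under these assumptions, the 10³–10⁵ range for Pol β corresponds to a penalty of roughly 4–7 kcal/mol, which illustrates why modest structural perturbations near the TS (an enlarged pocket, a different metal ion, a single side-chain substitution) can shift fidelity by orders of magnitude.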
DNA repair, damage bypass and mutation
Roles of DNA polymerases in DNA damage responses. DNA can be damaged by exogenous agents such as reactive oxygen species (ROS), alkylating agents or UV light. More than 50,000–70,000 DNA sites can be damaged per cell per day126,
When DNA damage occurs, cells usually initiate specific repair mechanisms before the synthesis phase (S phase) of the cell cycle, in which DNA is replicated. These mechanisms, often in conjunction with activation of checkpoint proteins, are generally referred to as DNA damage response. Common repair mechanisms include, but are not limited to, BER, nucleotide excision repair (NER), ribonucleotide excision repair (RER) and mismatch repair128. The roles of polymerases in these repair pathways are usually to catalyse ‘re-synthesis’ after the damaged nucleobases have been excised through multiple steps. BER usually involves the use of Pol β to fill a single-nucleotide gap, whereas NER involves Pol δ and possibly Pol κ133, and RER involves mainly Pol δ to fill longer gaps131. Another role of polymerases in response to DNA damage is translesion synthesis, which usually occurs either before the lesion can be repaired by one of the mechanisms mentioned above or after the lesion escapes the repair mechanisms. The aim is to try to insert a correct base to avoid mutation, and, if this is not possible, a mismatched dNTP is incorporated and then fixed in the post-replication repair.
New functions of human DNA polymerases. The 17 human DNA polymerases that have been identified to date belong to the A-, B-, X- and Y-families, with none being in the C- or D-families. Of the A-family polymerases (mainly replicative), Pol γ functions in high-fidelity DNA synthesis in mitochondria134,135, the nuclear Pol θ (which features polymerase and helicase domains) also participates in double-strand break repair (DSBR)61,136,
An exciting newly discovered polymerase is PrimPol153,154, which belongs to the archaeal and eukaryotic primase superfamily, and is so named because it can function as both a primase and a polymerase. Uniquely, PrimPol can use both dNTP and rNTP substrates to initiate DNA and RNA synthesis, respectively, which is in contrast to primase, for which only rNTP can be used. PrimPol can also function in translesion synthesis153,154 but is highly error-prone155.
Approaches used by DNA polymerases to deal with the 8-oxo-dG lesion and mutation. 8-Oxo-7,8-dihydro-2′-deoxyguanosine (8-oxo-dG) is an abundant mutagenic oxidative DNA lesion, occurring nearly 2,800 times per human cell per day128. In addition, dGTP can be oxidized to 8-oxo-dGTP and misincorporated into DNA by many polymerases, such as Pol α156, mitochondrial Pol γ157, Pol β38,156,158,159, Pol λ160, HIV-1 reverse transcriptase156, Bacillus stearothermophilus Pol I large fragment161 and even telomerase162. Once incorporated, most of the lesions can be repaired by the BER mechanism, but the sheer abundance of the lesions means that some will persist into the S phase. As 8-oxo-dG can use its Hoogsteen face to pair with an incorrect dATP (syn–anti, Fig. 6a) in addition to the W–C-like match with dCTP (anti–anti)163, it can generate dG to dT transversions161 and lead to cancers. Recent studies have elucidated the reasons for the variation in the response of different polymerases to the 8-oxo-dG lesion.
Pol β has been shown to insert 8-oxo-dGTP opposite to a template dA in preference to dC158. The structures of the intermediates in the Pol β-catalysed reaction for 8-oxo-dGTP insertion opposite to a template dA or dC have been determined using time-lapse crystallography38. The dA–8-oxo-dGMP mismatch forms a good anti–syn Hoogsteen base pair (Fig. 6b), which induces structural changes in the active site such that the mismatch, although compromising the subsequent DNA ligation process159, cannot be identified as a damage site by Pol β. Time-lapse crystallography has also been used to show that the 8-oxo-dG–dAMP mismatch also exists in a syn–anti (Hoogsteen) conformation39 (Fig. 6c). The corresponding dC–8-oxo-dGMP and 8-oxo-dG–dCMP complexes from the two studies both exist in an anti–anti conformation, as is usually adopted by matched structures. Importantly, the N subdomain was closed39,159 and the third metal ion was also observed in the complexes involving 8-oxo-dG or 8-oxo-dGTP described above, although it was missing in the mismatch complexes of undamaged dNTP19.
In contrast to Pol β, the Y-family enzyme Pol ι preferentially incorporates a correct dCTP opposite to an 8-oxo-dG lesion, a reaction that defines the unique biological role of Pol ι in protection against oxidative stress164. In comparing the four structures of Pol ι with the 8-oxo-dG–dCTP match and the dATP, dGTP, or dTTP mismatch163, it was found that the exceptionally narrow active site of Pol ι forces the purine bases of template 8-oxo-dG and incoming dGTP or dATP to each adopt a syn conformation. This stereochemistry and an extra hydrogen bond favour the smaller 8-oxo-dG–dCTP (syn–anti Hoogsteen) base pair over the 8-oxo-dG–dATP (syn–syn) base pair (Fig. 6d) and normal W–C (anti–anti) base pairs, leading to correct dCTP incorporation opposite to the template 8-oxo-dG.
The origin of Pol λ fidelity in promoting error-free bypass of 8-oxo-dG has recently been reported165. Seven novel crystal structures and kinetic data point to Pol λ having a flexible active site that can tolerate 8-oxo-dG in either the anti- or syn-conformation, with discrimination against the pro-mutagenic syn-conformation occurring at the extension step.
Lesion bypass across O6-methylguanine. The O6-methylguanine (O6Me-dG) moiety is a methylated DNA lesion that is produced by various alkylating agents. When left unrepaired, O6Me-dG causes G to A mutations, owing to Pol β pairing the methylated base with an incorrect dTTP much more frequently (∼30-fold) than with a correct dCTP166. By using the mutator Mn2+ to promote formation of an O6Me-dG–dTTP mismatched ternary Pol β complex, it has been shown that Pol β adopts a catalytically competent closed conformation, with the O6Me-dG–dTTP pair recognized as a pseudo W–C base pair97. By contrast, the enzyme adopts an open conformation in the O6Me-dG–dCTP ternary complex97,167. These results provide the structural basis for the carcinogenic O6Me-dG lesion.
Lesion bypass across an abasic site. Abasic DNA sites are estimated to occur approximately 10,000 times per day in each human cell168. These are referred to as being apurinic or apyrimidinic (AP) owing to the absence of purine or pyrimidine bases, respectively. Abasic sites may arise spontaneously or can be induced by chemotherapeutics. The chemistry behind the deleterious effects of AP sites has been reviewed169. Referred to as the ‘A rule’, polymerases from families A and B are most likely to install a dATP substrate opposite to an abasic site170,171, which leads to the transversion mutations that are found in cancer cells172. It has been shown that the A-family polymerase KlenTaq (a Klenow-fragment analogue of Taq polymerase) follows the A rule by using the side chain of Tyr671 to bind incoming dATP173. By contrast, Pol β from the X-family uses its lyase subdomain to remove the abasic site from DNA following incision of its 5′-phosphate, and an irreversible inhibitor of a 2-phosphato-1,4-dioxobutane derivative that mimics an AP DNA lesion has been shown to shut down the lyase activity of Pol β (with a half-maximal inhibitory concentration, IC50, of ∼21 μM)174.
The enlarged active site of Pol η enables bypass of the bulky T–T dimer. DNA synthesis often stalls when a replicative polymerase encounters a bulky adduct such as a cyclobutane–pyrimidine dimer (CPD)175, which forms by UV-induced [2+2] cycloadditions between pyrimidine bases. In many species, including bacteria, fungi, plants and some mammals (marsupials and the species below), a T–T dimer can be repaired by photoactivation involving photolyase176,
Misincorporation of rNTP. It has long been recognized that DNA polymerases can make the mistake of processing rNTPs instead of dNTPs183,184, which is unsurprising given that the former are present, on average, at 30–200-fold higher cellular concentrations185,186. Most polymerases use a steric gate183,184,187 that would clash with the 2′-OH group of an rNTP188. However, it has recently been demonstrated that even high-fidelity replicative polymerases, such as Pol α, Pol δ and Pol ε, incorporate more than 10,000 rNTPs into the yeast (S. cerevisiae) nuclear genome in each round of replication186, with the estimated value being greater in the human genome131. Furthermore, the X-family enzymes Pol β and, to a lesser extent, Pol λ can also incorporate rNTPs opposite to normal bases or 8-oxo-dG189. Incorporation of rNTP into DNA, if not repaired, can cause genomic instability and serious diseases. Like other forms of damage, living systems have multiple pathways to repair DNA, with RER being the primary one in this case131.
DNA polymerases involved in mutagenic functions. As described above, replicative polymerases minimize mutations and achieve very high fidelity, whereas repair polymerases are expressed to fix and bypass life-threatening DNA damage and mutation. There are also some polymerases that are involved in mutagenesis, a normal life process. For example, the low-fidelity ASFV Pol X has been implicated in a mutagenic BER pathway8,71,104,190, which could be a survival mechanism for the virus under stress.
Mutagenesis is an important part of our immune response. In order to produce specific antibodies to fight against diverse pathogens or other foreign subjects, B cells must first generate a diverse repertoire of B cell receptors, which involves a key randomization step called V(D)J recombination. Furthermore, activation of B cell receptors by a foreign antigen triggers somatic hypermutation, a process by which the immune system adapts to combat pathogens. Recent studies indicate that three of the X-family polymerases, Pol λ, Pol μ and TdT, participate in V(D)J recombination, whereas two Y-family polymerases, Pol η and Rev1 (and possibly additional polymerases), are involved in somatic hypermutation. Although the detailed biochemical mechanisms of the processes that involve these polymerases are still subjects of intensive study6,191, it is intuitively clear that these polymerases have characteristically low fidelities, as described in the preceding section, because their likely role is to synthesize mismatched DNA.
Applications of DNA polymerases
Polymerases have diverse and essential biological roles and, more than any other class of enzymes, have also found varied applications in biotechnology and health-related industries. The most obvious applications are in DNA amplification and manipulation, including the polymerase chain reaction (PCR) and its error-prone variant. Polymerases are also useful in DNA and RNA sequencing, as well as in other emerging applications that are described below. For brevity, we introduce only the most sophisticated applications, placing emphasis on recent developments that may either enhance current technologies or spawn new ones.
DNA amplification and manipulation. Many polymerases have been widely used in applications such as the amplification of DNA by PCR192,193, site-directed mutagenesis, error-prone PCR, DNA sequencing, diagnosis by detection of DNA or RNA with methylation (for example, 5′-methylated cytidine) and aptamer selection by systematic enrichment of ligands by exponential amplification194. In particular, the bacteriophage T7 polymerase has found use in the cloning of genes. Some of these many applications involve polymerases incorporating modified dNTPs into growing DNA. For example, the A-family KlenTaq as well as B-family KOD (Thermococcus kodakarensis polymerase) and 9°N (Thermococcus sp. 9°N-7 polymerase) can process dNTP analogues with very bulky groups; the structural basis for their surprisingly wide substrate scope is now well known194. The high temperatures required to denature double-stranded DNA demand that PCR use only very thermally stable polymerases195. In this regard, both high-fidelity and error-prone archaeal DNA polymerases with high thermostability have been developed. The use of archaeal polymerases in biotechnology, particularly in different types of PCR, has recently been reviewed195.
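The practical difference between high-fidelity and error-prone PCR can be illustrated with a short calculation. The Python sketch below assumes an idealized model: each strand is copied with a fixed probability (the efficiency) in every cycle, each copy inherits the errors of its template, and new errors arise at a constant per-nucleotide rate. Under this model the mean number of synthesis events in the lineage of a product strand after n cycles is n × e/(1 + e), where e is the efficiency. The function names and all numerical values are hypothetical and serve only to contrast the two kinds of enzyme.

```python
def copies_after(cycles: int, start_copies: float = 1.0, efficiency: float = 1.0) -> float:
    """Number of strands after `cycles` rounds; efficiency=1.0 is perfect doubling."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    return start_copies * (1.0 + efficiency) ** cycles


def expected_errors_per_strand(error_rate: float, length_nt: int,
                               cycles: int, efficiency: float = 1.0) -> float:
    """Mean number of polymerase errors carried by one product strand.

    error_rate -- errors per nucleotide per synthesis event (hypothetical)
    length_nt  -- amplicon length in nucleotides
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    # Mean number of copying events in a strand's lineage under the model above.
    mean_syntheses = cycles * efficiency / (1.0 + efficiency)
    return error_rate * length_nt * mean_syntheses


if __name__ == "__main__":
    cycles, length_nt = 30, 1000
    print(f"copies from one strand after {cycles} cycles: {copies_after(cycles):.3g}")
    for label, rate in (("high-fidelity", 1e-6), ("error-prone", 1e-3)):
        errors = expected_errors_per_strand(rate, length_nt, cycles)
        print(f"{label}: about {errors:.3g} errors per {length_nt}-nt strand")
```

With these illustrative numbers, most strands produced by the high-fidelity enzyme are error-free, whereas every strand from the error-prone enzyme is expected to carry several mutations, which is the property exploited when error-prone PCR is used to generate sequence diversity.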
Recent developments in the use of polymerases in DNA sequencing. Certain polymerases can faithfully copy a DNA strand; thus, it makes sense to use these enzymes to sequence genes of interest. The approach most widely adopted is the chain terminator method, which was introduced by Sanger and co-workers196,197 but has now been revised and improved in many ways. One strategy for improving signal detection and throughput makes use of reversible terminator dNTP analogues that feature fluorophore labels198, and this technology is now being put to commercial use, for example, by Illumina Cambridge Ltd. However, a dNTP bearing a bulky label may be difficult for a polymerase to process; this problem has been addressed by the development of mutants such as the 9°N penta-mutant D141A/E143A/L408S/Y409A/P410V, an enzyme engineered by New England Biolabs and termed Therminator III. In this 3′-5′-exo− polymerase variant, the bulky amino acids are replaced by smaller ones, such that bulky dNTP analogues can be incorporated efficiently. The reversible terminator approach is now widely used in second-generation sequencing, but although these methods constitute tremendous improvements over the Sanger approach, they are limited in the lengths of DNA that can be read in each step. Some of these problems can be overcome using the single-molecule real-time sequencing method developed by PacBio199,200. This method, unlike second-generation sequencing, does not require a pause between read steps to deprotect the 3′-OR group or to remove the fluorophore from the base. This new method is now classified as a third-generation sequencing tool.
A very different strategy is key to fourth-generation DNA sequencing technologies such as the Oxford Nanopore Technologies Nanopore sequencers. These devices feature a nanopore, formed by the bacteriophage φ29 DNA packaging motor or other materials201, through which a single strand of DNA passes. Changes in electrical current are measured and can be related back to the sequence of bases that were drawn through the nanopore. Recently, the nanopore technology has been combined with sequencing-by-synthesis, with the resulting nanopore-sequencing-by-synthesis methodology reportedly having a false positive background detection rate below 1.2%202. Over the past decade, the next-generation DNA sequencing technology200 has reduced the cost of sequencing genomes by five orders of magnitude203,204; a person's whole genome can now be sequenced for as little as US$999 (Ref. 205).
Engineering polymerases to synthesize DNA with unnatural base pairs. Attempts to expand the genetic alphabet involve the development of unnatural base pairs, which include nucleobase shape mimics65, hydrophobic pairs66,67,206,
A directed evolution approach can be used to prepare polymerase variants with desirable and novel functions, some of which we now mention221,222. Applications in mRNA diagnosis, such as pathogen detection or gene expression analysis, motivated the conversion of a DNA-template polymerase into an RNA-reading polymerase, which was achieved by screening thermostable KlenTaq variants in which amino acids in immediate proximity to the 2′-O of the RNA template base paired with the incoming dNTP are mutated223,
The study of DNA polymerases, the enzymes most essential for the existence and understanding of life, brings chemists and biologists together. This Review describes the molecular basis for the function of these enzymes, emphasizing that high-fidelity polymerases have well-aligned TSs for match incorporation but distorted TSs for mismatches. Fidelity is lowered when mismatched substrates are able to form well-aligned intermediates, as can occur in rationally designed or naturally evolved enzymes. In this way, most low-fidelity or mutagenic polymerases achieve specific biological functions, including translesion DNA synthesis and mutagenesis. These diverse properties make DNA polymerases amenable to various applications, which further motivates the study of these remarkable enzymes.
How to cite this article
Wu, W.-J., Yang, W. & Tsai, M.-D. How DNA polymerases catalyse replication and repair with contrasting fidelity. Nat. Rev. Chem. 1, 0068 (2017).
RCSB Protein Data Bank: http://www.rcsb.org/pdb/home/home.do
The authors acknowledge financial support from the Ministry of Science and Technology (Grant Nos MOST103-2113-M-001-016-MY3, MOST105-0210-01-12-01 and MOST106-0210-01-15-04) to M.-D.T. and a US National Institutes of Health intramural grant (DK036146-08) to W.Y.