Proton transfer during DNA strand separation as a source of mutagenic guanine-cytosine tautomers

Proton transfer between the DNA bases can lead to mutagenic Guanine-Cytosine tautomers. Over the past several decades, a heated debate has emerged over the biological impact of tautomeric forms. Here, we determine that the energy required for generating tautomers radically changes during the separation of double-stranded DNA. Density Functional Theory calculations indicate that the double proton transfer in Guanine-Cytosine follows a sequential, step-like mechanism where the reaction barrier increases quasi-linearly with strand separation. These results point to increased stability of the tautomer when the DNA strands unzip as they enter the helicase, effectively trapping the tautomer population. In addition, molecular dynamics simulations indicate that the relevant strand separation time is two orders of magnitude quicker than previously thought. Our results demonstrate that the unwinding of DNA by the helicase could simultaneously slow the formation but significantly enhance the stability of tautomeric base pairs and provide a feasible pathway for spontaneous DNA mutations.


CONTENTS
Reading from left to right, each plot in a row of Fig. 2 shows successive images along the reaction path. We use 15 images along the reaction pathway, the first corresponding to the canonical form and the last to the tautomeric form. In each row, the seventh image is the transition state. Each row corresponds to the reaction path for a given separation distance. Thus, reading down, we observe how the proton transfer landscape changes as the bases separate. Fig. 2 shows the splitting of the canonical and tautomeric forms of G-C while constrained at the R-group mentioned before. Initially, there are no visible changes other than the elongation of the hydrogen bonds holding the bases together. As the separation distance increases, there is a slight internal rotation of the bases relative to each other. The rotation minimises the length of the hydrogen bonds. There is a clear preference for the top hydrogen bond to maintain its equilibrium length while the base rotates. The rotation occurs only because we fix the R-group, where the base joins the sugar and the rest of the DNA backbone. The R-group is where the base is pulled from, since this is the only covalent link between the base and the rest of the DNA.
In comparison, the rotation of the base differs somewhat between the canonical and the tautomeric forms (see the bottom panel of Fig. 2). Here, the O-H bond of the tautomeric G offers a much wider hydrogen-bonding range because it sits on the outer edge of the molecule, in comparison to the standard form of C. As a result, the O-H bond of the tautomeric G remains in a hydrogen bond for much longer as the separation distance increases.
In reality, the splitting reaction coordinate is likely not a straight line, because interactions with the DNA backbone and the local water environment restrict the movement of the bases. Consequently, we expect to see a range of opening angles, as observed in molecular dynamics calculations.
Supplementary Figure 2: The double proton transfer reaction pathway. Each row shows the reaction path at one separation distance, and the separation distance increases down each column. R labels the reactant, TS the transition state, and P the product.
Supplementary Figure 3: The reaction pathway at large separation distances. Each row is the proton transfer reaction pathway, and each column shows the mechanism at increasing separation distances.

SUPPLEMENTARY NOTE 2: PROTON TRANSFER AT LARGER SEPARATION DISTANCES
We explore the double proton transfer mechanism at separation distances greater than 2.0Å. Fig. 3 highlights the reaction pathways at two separation distances, 2.5Å and 5.0Å. In both pathways, the rotation is diminished: it is no longer favourable to rotate to minimise the bond length, but rather to maintain the tail end of the hydrogen bonds so that the residual bonding is maximised. In the 2.5Å case, the pathway comprises two separate movements of the protons. First the middle hydrogen transfers, followed by the top proton, in a mechanism similar to that observed before. For 5.0Å, on the other hand, the middle hydrogen begins to transfer before meeting the other halfway between the bases. This switch to a concerted asynchronous mechanism is likely due to the solvent's increasing role when the bases are far apart, since at this distance a water molecule could begin to creep in between the bases.
In Fig. 4 the energies of the reaction paths are shown. At a separation of 2.5Å, two distinct large barriers are observed. The first barrier corresponds to the middle hydrogen moving. The single proton transfer intermediate sits in a deep well because of the high barriers on the reaction coordinate in both directions. In addition, it has a considerable asymmetry of 0.668 eV relative to the canonical form. The second barrier corresponds to the top proton transferring. At this separation distance, this is the minimum energy path, but not the most energy-efficient mechanism. Instead, we expect the intra-base transfer scheme, which is not considered in the reaction path calculation, to compete with this mechanism, since it has a much lower and narrower reaction barrier. At a separation of 5.0Å, by contrast, the reaction barrier has doubled in height compared with that at 2.5Å, and the two barriers have merged. We expect both the classical and quantum contributions to the rate to be low, resulting in a slow reaction that is unlikely to form biological products in either of these cases. Consequently, we expect proton transfer to be unlikely to occur above 2.0Å.
In our quantum mechanical calculations investigating the effect of stacking interactions between DNA base pairs on the double proton transfer, the base pair below the base pair of interest was assumed to be frozen during the separation of the base pair above it. The resulting reaction profile is shown in Fig. 5. This assumption was not taken lightly, and it is backed up by our molecular dynamics simulations, in which we see a clear separation of timescales between the opening of one base pair and the next. This can be deduced from Fig. 6, where the separation of base pair 2 (plotted on the y-axis) remains near zero during separation events in base pair 1 (x-axis).
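The barrier heights and well depths discussed for the energy profiles of Fig. 4 can be read off a discretised reaction path in a few lines. The following is a minimal sketch, not the analysis code used in this work. The function name `analyse_profile` and the 15-point profile are hypothetical placeholders. Only the 0.668 eV asymmetry of the single proton transfer intermediate is taken from the text; every other energy is invented for illustration.

```python
import numpy as np


def analyse_profile(energies_ev):
    """Locate interior barriers and wells of a 1D energy profile (in eV).

    Energies are shifted so that the first image (canonical form) is zero.
    """
    e = np.asarray(energies_ev, dtype=float)
    e = e - e[0]
    interior = np.arange(1, len(e) - 1)
    mid, left, right = e[1:-1], e[:-2], e[2:]
    maxima = interior[(mid > left) & (mid > right)]  # barrier tops
    minima = interior[(mid < left) & (mid < right)]  # intermediate wells
    return e, maxima, minima


# Placeholder profile (eV) with 15 images. Only the 0.668 eV well is from
# the text; all other values are made up to give a two-barrier shape.
profile = [0.00, 0.20, 0.55, 0.95, 1.10, 0.90, 0.70, 0.668,
           0.80, 1.05, 1.30, 1.20, 1.00, 0.85, 0.80]

e, maxima, minima = analyse_profile(profile)
well = minima[0]                          # single proton transfer intermediate
first_ts = maxima[maxima < well].max()    # barrier on the canonical side
second_ts = maxima[maxima > well].min()   # barrier on the tautomer side

asymmetry = e[well]                       # relative to the canonical form
barrier_back = e[first_ts] - e[well]      # intermediate -> canonical
barrier_forward = e[second_ts] - e[well]  # intermediate -> tautomer

print(f"asymmetry        = {asymmetry:.3f} eV")
print(f"barrier back     = {barrier_back:.3f} eV")
print(f"barrier forward  = {barrier_forward:.3f} eV")
```

A well is "deep" in the sense used above when both `barrier_back` and `barrier_forward` are large, so the intermediate cannot easily relax in either direction.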

SUPPLEMENTARY NOTE 4: FURTHER NOTES ON PROTON TRANSFER ASYNCHRONICITY
We determine the asynchronicity to analyse further how the proton transfer mechanism changes during the base dissociation process. Conceptually, asynchronicity describes a slight separation between the two steps of the double proton transfer, i.e. one proton transfers, the other heavy ions rearrange, and then the second proton transfers.
To evaluate the derivatives shown in the methods section "Proton Transfer Asynchronicity", we first pass the Cartesian coordinate vectors through a Savitzky-Golay filter to suppress spurious noise introduced by the uncertainty in the path, which is inherent to the machine-learning approach to finding the reaction path. The filtered Cartesian coordinate vectors and the reaction path are then interpolated using cubic splines.
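The smoothing and interpolation steps can be sketched with SciPy as follows. This is an illustration under stated assumptions, not the code used in this work. The function name `smooth_and_differentiate`, the window length, the polynomial order, and the synthetic test path are all hypothetical. We also assume the images are roughly evenly spaced along the path (which the Savitzky-Golay filter requires) and that the quantity needed is the derivative of each Cartesian coordinate with respect to the path coordinate.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import savgol_filter


def smooth_and_differentiate(s, x, window_length=5, polyorder=3, n_fine=200):
    """Smooth coordinates along a reaction path and differentiate them.

    s : (n_images,) path coordinate, strictly increasing
    x : (n_images, n_coords) Cartesian coordinates at each image
    Returns the fine path grid, the interpolated coordinates, and dx/ds.
    """
    # Savitzky-Golay smoothing applied along the image axis
    x_smooth = savgol_filter(x, window_length=window_length,
                             polyorder=polyorder, axis=0)
    # Cubic-spline interpolation of the smoothed coordinates
    spline = CubicSpline(s, x_smooth, axis=0)
    s_fine = np.linspace(s[0], s[-1], n_fine)
    # spline(s_fine, 1) evaluates the first derivative of the spline
    return s_fine, spline(s_fine), spline(s_fine, 1)


# Synthetic example: one step-like coordinate with a little added noise
rng = np.random.default_rng(0)
s = np.linspace(0.0, 1.0, 15)
clean = np.tanh(6.0 * (s - 0.5))
x = (clean + 0.02 * rng.normal(size=s.size))[:, None]

s_fine, x_fine, dx_ds = smooth_and_differentiate(s, x)
print("largest |dx/ds| at s =", s_fine[np.argmax(np.abs(dx_ds[:, 0]))])
```

The filter window and polynomial order set how aggressively the noise is suppressed. With only 15 images, a short window is needed to avoid flattening the transfer step itself.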
In the top panel of Fig. 7, the first peak shows B2 transferring, before the second peak, which corresponds to B1. The distance between the peaks is used to determine the asynchronicity. In the second panel, two prominent peaks are again observed, along with some additional oscillations as the atoms rearrange. However, here the reaction path is much longer, and there is a clear separation between the two transfer peaks. The separation between the peaks demonstrates the large asynchronicity of the transfer process.
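Measuring the peak-to-peak distance can be sketched as below. This is an illustration only: the function name `peak_separation`, the prominence threshold, and the synthetic two-peak signal are hypothetical, and the precise asynchronicity measure is the one defined in the methods section. The sketch assumes that the two most prominent peaks are the two proton transfers and that smaller oscillations from rearranging atoms should be ignored.

```python
import numpy as np
from scipy.signal import find_peaks


def peak_separation(s, signal, prominence=0.1):
    """Distance along the path between the two most prominent peaks.

    s          : (n,) path coordinate
    signal     : (n,) transfer signal, e.g. a derivative magnitude
    prominence : minimum prominence for a peak to be considered
    """
    peaks, props = find_peaks(signal, prominence=prominence)
    if len(peaks) < 2:
        raise ValueError("fewer than two prominent peaks found")
    # Keep the two most prominent peaks, ordered along the path
    top_two = np.sort(peaks[np.argsort(props["prominences"])[-2:]])
    return s[top_two[1]] - s[top_two[0]], s[top_two]


# Synthetic signal: two transfer peaks plus small rearrangement oscillations
s = np.linspace(0.0, 10.0, 1001)
signal = (np.exp(-((s - 3.0) ** 2) / (2 * 0.3 ** 2))
          + 0.8 * np.exp(-((s - 7.0) ** 2) / (2 * 0.3 ** 2))
          + 0.02 * np.sin(8.0 * s))

separation, positions = peak_separation(s, signal)
print("peak positions:", positions, "separation:", separation)
```

A prominence threshold, rather than a plain height threshold, keeps the small oscillations described for the second panel from being counted as transfer events.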