A Base-Independent Repair Mechanism for DNA Glycosylase—No Discrimination Within the Active Site

The ubiquitous occurrence of DNA damage makes the repair machinery crucial for genomic stability and the survival of living organisms. Deficiencies in DNA repair can lead to carcinogenesis, Alzheimer's disease, or type 2 diabetes, where increased amounts of oxidized DNA bases have been found in patients. Although it has the highest mutation frequency among oxidized DNA bases, the base-excision repair process of the oxidized and ring-opened guanine FapydG (2,6-diamino-4-hydroxy-5-formamidopyrimidine) has remained unclear, since it is difficult to study experimentally. We use newly developed linear-scaling quantum-chemical (QM) methods, which allow us to include up to 700 QM atoms and to achieve size convergence. Instead of the widely assumed base-protonated pathway, we find a ribose-protonated repair mechanism that explains experimental observations and provides strong evidence for a base-independent repair process. Our results also imply that discrimination must occur during recognition, prior to binding within the active site.


SI-2 Discussion of different X-ray-structures of Fpg
As only one structure exists for FapydG, X-ray structures including 8OG need to be used for comparison. Two different structures of the educt state have been obtained via X-ray crystallography: one by an E2Q mutation (PDB code: 1R2Y [1]), the other by replacing 8OG with the carbocyclic analogue c8OG (PDB code: 4CIS [2]). All three structures trap the educt state, but only the crystal structures in which the damaged nucleotide was substituted by a carbocyclic compound [2,3] contain a water molecule in the active site (X-WAT). In contrast to FapydG, cFapydG cannot be cleaved by the enzyme. cFapydG blocks the reaction because it lacks the O4' atom, which is replaced by a carbon atom. In addition, hydrogen bonds to the ribose can no longer be formed. Because the modification removes the interactions between Fpg and O4', the interaction pattern in the active site is expected to be significantly changed and disturbed. This indicates that the interaction between the active site (especially E2) and O4' is crucial for the reaction.
The presence of a water molecule in the active site in some of the structures raises the question whether it is part of the excision mechanism in vivo or only an artifact of the crystallization conditions.

SI-3.1 Behavior of different systems in FF-MD
We employed FF-MD, a standard method for investigating the overall dynamics of a system by sampling a large number of snapshots. The method is applicable here because no chemical reaction takes place. To gain deeper insight into the behavior of the active site, multiple FF-MD simulations were performed for all four systems: FapydG with (I) and without (II) X-WAT, and cFapydG with (III) and without (IV) X-WAT. To obtain a statistically meaningful analysis, at least 5 simulations were performed for each system. The average runtime over all systems was 110 ns; the simulation time for each system is listed in Tab. SI-1. During the FF-MD simulations of FapydG without X-WAT (system II), no water molecule was observed to enter the active site, and there were multiple interaction events between protonated E2 and O4' (see SI-2.3). We also performed FF-MD simulations of the DNA-enzyme complex containing cFapydG instead of FapydG. In that case, X-WAT leaves the active site within the first ns in most runs (system III). In system IV, no X-WAT is present at the beginning of the simulations; for this system we observed that a water molecule can enter the active site, with low probability, after about 30 ns.
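One simple way to quantify interaction events such as those between E2 and O4' is the fraction of saved snapshots in which the two atoms lie within a distance cutoff. The following is only a minimal sketch under stated assumptions: the function name `contact_fraction`, the 3.5 Å cutoff, and the distance values are hypothetical and are not taken from the analysis reported here.

```python
def contact_fraction(distances, cutoff=3.5):
    """Fraction of snapshots with a donor-acceptor distance <= cutoff (in Å).

    The 3.5 Å default is an assumed, commonly used hydrogen-bond distance
    criterion; it is not a value stated in this document.
    """
    if not distances:
        raise ValueError("at least one snapshot distance is required")
    in_contact = sum(1 for d in distances if d <= cutoff)
    return in_contact / len(distances)


# Hypothetical E2...O4' distances (Å) from eight snapshots of one trajectory
example_distances = [2.8, 3.1, 4.2, 5.0, 2.9, 3.4, 6.1, 3.0]
print(contact_fraction(example_distances))  # 5 of 8 snapshots -> 0.625
```

Averaging this fraction over the independent simulations of each system would give a per-system estimate that can be compared between systems with and without X-WAT.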

SI-3.2 Structural analysis of FF-MD
As a stability check of the systems during the FF-MD simulations, the RMSD of the protein backbone, the DNA, and X-WAT is plotted in Fig. SI-3. The minimized and equilibrated X-ray structure serves as reference for the structural analysis. The DNA selection consists of the backbone atoms of FapydG and its pairing nucleotide as well as those of the preceding and following nucleotide pairs; hydrogen atoms were excluded.

Fig. SI-3: RMSD plots for systems I (left) and II (right) of the protein backbone, DNA, and X-WAT. The system is more stable without X-WAT, due to the direct interactions between O4' and E2.
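For orientation, the RMSD of a selection with respect to a reference structure can be computed as in the following minimal sketch. It assumes that each frame has already been superimposed on the reference (no fitting is performed), and the three-atom coordinates are purely illustrative; it is not the analysis script used for Fig. SI-3.

```python
import math


def rmsd(coords, reference):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    Both structures must contain the same atoms in the same order and are
    assumed to be superimposed already; no alignment is done here.
    """
    if len(coords) != len(reference) or not coords:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared_sum = sum(
        (a - b) ** 2
        for atom, ref_atom in zip(coords, reference)
        for a, b in zip(atom, ref_atom)
    )
    return math.sqrt(squared_sum / len(coords))


# Hypothetical selection of three atoms (coordinates in Å)
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
print(rmsd(frame, reference))  # every atom shifted by 1 Å -> 1.0
```

Applying the function to each saved frame, separately for the protein backbone, the DNA selection, and X-WAT, yields one RMSD trace per selection.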

SI-3.3 Interaction between ribose and E2
Tab. SI-2 shows that the crucial interaction between O4' and E2 is much more probable without [...]

The calculated structure is shown in atomic colors; the structure obtained by X-ray crystallography is shown in orange. In the X-ray structure the cleaved base is not resolved.

Based on this estimation, we conclude that the exact energies of the two structures would not change the overall picture of our reaction profile. It should be noted that, although single-point (sp) calculations are in general sufficient to estimate the size of the QM region required for size-converged energies, geometry optimizations are necessary to calculate the influence of this QM region on the energy.

The following steps have been performed: The systems were energy minimized (NVT ensemble) in 3 steps, relaxing different degrees of freedom, using the conjugate gradient algorithm: (1) only hydrogen atoms (2000 steps), (2) only solvent (3000 steps), (3) all atoms (5000 steps). The system was then heated to 300 K within 10 ps using Langevin dynamics, with a positional restraint of 1 kcal/mol/Å² applied to non-water atoms. In the subsequent equilibration step we switched to the NPT ensemble, employing the Langevin piston Nosé-Hoover method [7,8]. At this stage the restraints on non-water atoms were reduced stepwise to zero (in decrements of 0.2 kcal/mol/Å² every 20 ps). Equilibration was then performed for 300 ps, during which the coordinates were saved every 0.2 ps. Production runs were performed for at least 10 ns with a timestep of 2 fs using the SHAKE algorithm [9]. Coordinates were saved every 6 ps.
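The restraint release and the output intervals described above can be made explicit with a short calculation. This is a minimal sketch, not the simulation input itself: the function names are hypothetical, and it assumes that the first reduction (from 1.0 to 0.8 kcal/mol/Å²) takes effect at the start of the NPT stage.

```python
def restraint_schedule(start=1.0, step=0.2, window_ps=20):
    """Return (start_time_ps, force_constant) pairs down to zero restraint.

    Assumption: the first reduction is applied at t = 0 of the NPT stage,
    and each force constant is held for `window_ps` picoseconds.
    """
    n_steps = round(start / step)
    schedule = []
    for i in range(1, n_steps + 1):
        # round() removes floating-point noise such as 0.3999999999999999
        k = max(round(start - i * step, 10), 0.0)
        schedule.append(((i - 1) * window_ps, k))
    return schedule


def saved_frames(duration_ps, interval_ps):
    """Number of saved coordinate sets, computed in integer femtoseconds."""
    return round(duration_ps * 1000) // round(interval_ps * 1000)


for t_ps, k in restraint_schedule():
    print(f"from {t_ps:3d} ps: k = {k:.1f} kcal/mol/Å²")

print(saved_frames(300, 0.2))    # 300 ps equilibration, saved every 0.2 ps
print(saved_frames(10_000, 6))   # 10 ns production, saved every 6 ps
```

Under these assumptions the restraints reach zero after five 20 ps windows, the 300 ps equilibration yields 1500 saved frames, and a 10 ns production run yields 1666 frames.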

SI-6.2 Details for QM/MM calculations
The system for these calculations consists of the whole equilibrated structure (54412 atoms), with a relaxed region of 15 Å around N9 of FapydG and 87 QM atoms. For the QM size-convergence study, the total system was reduced to protein, DNA, ions, water within 15 Å of N9 of FapydG, and a 3 Å layer of water around both protein and DNA (12031 atoms in total).
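A distance-based selection such as the 15 Å region around N9 can be sketched as follows. The atom labels and coordinates are hypothetical, and the sketch selects individual atoms; whether the actual setup selects by atom, by residue, or by whole water molecule is not specified here.

```python
import math


def within_cutoff(atoms, center, cutoff):
    """Return the labels of all atoms within `cutoff` Å of `center`.

    `atoms` is a list of (label, (x, y, z)) pairs; selection is per atom,
    which is an assumption of this sketch.
    """
    return [label for label, xyz in atoms if math.dist(xyz, center) <= cutoff]


# Hypothetical coordinates in Å; the N9 position serves as the center
n9 = (0.0, 0.0, 0.0)
atoms = [
    ("FapydG:N9", (0.0, 0.0, 0.0)),
    ("E2:OE1", (3.0, 4.0, 0.0)),       # 5 Å from the center
    ("WAT101:O", (9.0, 12.0, 0.0)),    # exactly 15 Å from the center
    ("WAT202:O", (12.0, 16.0, 0.0)),   # 20 Å from the center
]
print(within_cutoff(atoms, n9, 15.0))
```

In this example the first three atoms are selected and the water at 20 Å is excluded.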