Bulky Lesion Bypass Requires Dpo4 Binding in Distinct Conformations

Translesion DNA synthesis is an essential process that helps resume DNA replication at forks stalled near bulky adducts on the DNA. Benzo[a]pyrene (B[a]P) is a polycyclic aromatic hydrocarbon (PAH) that can be metabolically activated to benzo[a]pyrene diol epoxide (BPDE), which can then react with DNA to form carcinogenic DNA adducts. Here, we have used single-molecule fluorescence resonance energy transfer (smFRET) experiments, classical molecular dynamics simulations, and nucleotide incorporation assays to investigate the mechanism by which the model Y-family polymerase, Dpo4, bypasses a (+)-cis-B[a]P-N2-dG adduct in DNA. Our data show that when (+)-cis-B[a]P-N2-dG is the templating base, the B[a]P moiety is in a non-solvent-exposed conformation stacked within the DNA helix, where it effectively blocks nucleotide incorporation across the adduct by Dpo4. However, when the medium contains a small amount of dimethyl sulfoxide (DMSO), the adduct is able to move to a solvent-exposed conformation, which enables error-prone DNA replication past the adduct. When the primer terminates across from the adduct position, the addition of DMSO leads to the formation of an insertion complex capable of accurate nucleotide incorporation.


Supplementary Methods
Principal Component Analysis. Principal component analysis (PCA) is a useful tool for isolating and visualizing large-scale protein motions. PCA was performed on 33,000 snapshots from each trajectory. A total of 100 modes were generated, and the cross-correlation plots for those modes were subtracted in the same way as described above for the distance correlation plots (Fig. S9). These plots show strong differences; notably, there are again fewer differences between B[a]P-dG:dC in water and B[a]P-dG:dG in water than between B[a]P-dG:dC in water and B[a]P-dG:dC in DMSO. Given the experimental results showing a return to function for Dpo4 when the adduct is in a solvent-exposed conformation, the substantial differences in the large-scale motions of the protein support the concept that not only the structure of the protein but also its overall movement has changed drastically. A representative depiction of the first (and largest) principal component (PC) for B[a]P-dG:dC in water and B[a]P-dG:dG in DMSO can be seen in Fig. S8.a and Fig. S8.b, respectively. The two modes not only contain very different movements, but the first PC of B[a]P-dG:dC in water also shows substantially less movement overall. This is consistent with the experimental result that when the adduct is stacked within the major groove of the DNA helix, the protein is blocked from functioning properly. That said, it is important to recognize that the first PC represents only one large-scale motion of the protein. Although they are more difficult to interpret visually, the overall correlation differences are more telling, since they show substantial differences in many of the PCs rather than just one.
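As an illustration of what the first PC captures, the following is a minimal sketch in plain Python, not the ProDy workflow used in this work. It builds a small synthetic "trajectory" (the 500 snapshots, three coordinates, and dominant direction (2, 1, 0) are all assumptions chosen for the example), forms the covariance matrix of the centered coordinates, and extracts the largest mode by power iteration. The function names are hypothetical.

```python
import math
import random

def center(snapshots):
    """Subtract the per-coordinate mean from every snapshot."""
    n, dim = len(snapshots), len(snapshots[0])
    mean = [sum(s[j] for s in snapshots) / n for j in range(dim)]
    return [[s[j] - mean[j] for j in range(dim)] for s in snapshots]

def covariance(centered):
    """Sample covariance matrix (dim x dim) of centered snapshots."""
    n, dim = len(centered), len(centered[0])
    cov = [[0.0] * dim for _ in range(dim)]
    for row in centered:
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += row[i] * row[j] / (n - 1)
    return cov

def matvec(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]

def first_pc(cov, iterations=200, seed=0):
    """Largest eigenvalue and unit eigenvector of cov via power iteration."""
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in cov]  # random start vector
    for _ in range(iterations):
        nxt = matvec(cov, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    eigval = sum(a * b for a, b in zip(vec, matvec(cov, vec)))
    return eigval, vec

# Synthetic data (assumption): motion mostly along (2, 1, 0) plus small noise.
rng = random.Random(42)
snapshots = []
for _ in range(500):
    t = rng.gauss(0.0, 1.0)
    snapshots.append([2 * t + rng.gauss(0, 0.05),
                      t + rng.gauss(0, 0.05),
                      rng.gauss(0, 0.05)])

cov = covariance(center(snapshots))
eigval, pc1 = first_pc(cov)
# Fraction of the total variance (trace of cov) carried by the first PC.
explained = eigval / sum(cov[i][i] for i in range(3))
print("PC1 direction:", [round(x, 3) for x in pc1])
print("fraction of variance in PC1:", round(explained, 4))
```

In a real trajectory each snapshot would hold the aligned Cartesian coordinates of the selected atoms, and a full eigendecomposition would be used to obtain all 100 modes rather than only the first.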

MD simulations.
The simulations in pure water comprise two binary insertion complexes and one binary pre-insertion complex. All of the insertion binary complexes in water have the 3'-B[a]P adduct stacked within the DNA helix and the modified G nucleobase stacked within the minor groove 1. The two insertion binary complexes differ in the base at the terminal 5' position opposite the adduct, which was changed from C to G in the second system. In the third system, the corresponding nucleobase was deleted completely to simulate the pre-insertion complex. These systems were compared in order to investigate the effect of different nucleobases on the stability of the complex. Two additional simulations were run in 10% DMSO with the adduct flipped out into solvent: one with the original experimental primer given above and one with the final nucleobase at the end of the 5' strand (across from the adducted base location) deleted. No further changes were made for the DMSO systems, since the flipped-out conformation of the B[a]P-cis-G adducted base does not interact substantially with the corresponding nucleobase. Additionally, the TYR-274 residue in the protein chain was changed to the primary Dunbrack rotamer (53.1377%) in order to eliminate steric clashes with the flipped-out DNA base in the 10% DMSO systems 2.
Partial charges for the B[a]P-cis-G adduct were calculated using RESP fitting on the RED development servers 3,4 and the standard procedure for creating modified nucleotides for proper linkages between nucleobases. An additional angle parameter was taken from Mocquet 2007, and the partial charges generated in this work compared favorably with the partial charges from those results (all calculated charges and other parameters are reported in the supporting information) 5. The guanine geometry parameters were taken from the OL15 DNA force field 6, and the remaining geometry parameters for the adduct were generated from the GAFF force field 7. Parameters for the Ca2+ ions present in the crystal structure 8 and parameters for the DMSO solvent box 9 (where appropriate) were taken from the literature.
Molecular dynamics simulations on all systems were performed with the pmemd.cuda program from AMBER16, using the ff14SB force field for all of the protein parameters 10-12, the OL15 force field for the DNA parameters 6, and a 1 fs time step. Long-range electrostatics were treated with smooth particle mesh Ewald (sPME), with an 8 Å cutoff for all nonbonded interactions 13. All simulations were performed in the NVT ensemble with the Berendsen thermostat and barostat 14, and SHAKE was applied to all bonds involving hydrogen atoms.
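The settings above map onto standard options of the AMBER &cntrl namelist. The following Python sketch renders such an input file; it is an illustration only and not the input used in this work. The target temperature (temp0) and run length (nstlim) are not stated in this section and are placeholder assumptions, and the helper name render_mdin is hypothetical. Only a constant-volume stage is sketched.

```python
def render_mdin(title, settings):
    """Render a dict of settings as an AMBER &cntrl namelist (one flag per line)."""
    body = ",\n".join(f"  {key} = {value}" for key, value in settings.items())
    return f"{title}\n &cntrl\n{body}\n /\n"

production = {
    "imin": 0,          # run dynamics rather than minimization
    "dt": 0.001,        # 1 fs time step (AMBER expresses dt in ps)
    "cut": 8.0,         # 8 Angstrom nonbonded cutoff
    "ntc": 2,           # SHAKE on bonds involving hydrogen
    "ntf": 2,           # omit force evaluation for the constrained bonds
    "ntb": 1,           # periodic boundaries at constant volume
    "ntt": 1,           # weak-coupling (Berendsen) thermostat
    "temp0": 300.0,     # ASSUMPTION: target temperature is not given in the text
    "nstlim": 1000000,  # ASSUMPTION: number of steps is not given in the text
}

mdin_text = render_mdin("Constant-volume production sketch", production)
# Simulated time implied by the placeholder step count: steps * ps per step -> ns.
simulated_ns = production["nstlim"] * production["dt"] / 1000.0
print(mdin_text)
print(f"simulated time: {simulated_ns:.1f} ns")
```

With a 1 fs step, every nanosecond of simulation requires one million integration steps, which is the main cost of choosing the shorter step over the 2 fs step often paired with SHAKE.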
Prior to solvation, all systems were neutralized to a net charge of 0 with Na+ ions in AMBER16's tleap program. The water systems were solvated in a rectangular box of TIP3P water using a 12 Å pad between the surface of the protein and the edge of the box 15. For the systems with 10% DMSO, in order to ensure an even distribution of DMSO molecules within the box, the AMBER16 program AddToBox was used to generate the 10% DMSO/90% TIP3P water solvent boxes, with box dimensions chosen so that the distance between the surface of the protein and the edge of the box met or exceeded 12 Å. RMSD, RMSF, correlation analysis, and solvation shell tracking were all calculated with AMBER16's cpptraj program. PCA and normal mode analysis (NMA) were performed with the ProDy module in VMD 16,17. For each trajectory, 33,000 snapshots were analyzed with ProDy's PCA algorithm, with 100 PC modes generated.

Figure S1. Dpo4 running start primer extension assay with unmodified template. The 16mer primer is fully extended to 26mer product in one minute. The Dpo4 concentration is 50 nM. The sequence used for the extension assay is shown on the top of the gel and in Table S1.
Figure S2. Dwell time distributions for Dpo4-DNA binary complex. The dissociation rate constants (k_off) were calculated by fitting data to single exponential decays. The corresponding smFRET experiments were carried out as a function of DMSO concentration shown in Fig. 3c.
Figure S3. Single nucleotide incorporation assay for unmodified template. (a) Dpo4 primarily incorporates the next correct nucleotide in the unmodified DNA construct. The gel shows dNTP incorporation opposite the templating dG base in the 20mer/26mer primer-template. The unmodified DNA sequence used for this assay is shown on the top. Dpo4 mainly incorporates dC both in the absence and in the presence of DMSO. Lanes 1 and 7 are control experiments without dNTPs (labeled as -). Lanes 2 and 8 represent experiments with all 4 dNTPs (labeled as 4). Other lanes contain only the designated dNTP. The Dpo4 concentration is 10 nM and the incubation time is 1 min. (b) The same reaction as described in (a), but with an incubation time of 10 min.
Figure S4. Single nucleotide incorporation assays on 21mer/26mer unmodified DNA primer-template (Table S1). The Cy5-labeled DNA primer shows the extended products on the gel. Dpo4 incorporates the next correct dNTP, dT, in the DNA construct shown on top of the gel. In this single nucleotide incorporation assay, lanes 1 and 7 (dashed line) correspond to control experiments without dNTPs. Lanes 2 and 8 contain all four dNTPs (labeled as 4). Other lanes contain the dNTP designated below the lane. The primer extension reaction was quenched after 1 min.
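The single-exponential treatment of dwell times mentioned for Figure S2 can be illustrated with a short Python sketch. It does not reproduce the fitting procedure used for the figure: instead of fitting a histogram, it uses the maximum-likelihood estimate for exponentially distributed dwell times (the reciprocal of the mean dwell time), which targets the same rate for an ideal single-exponential process. The dwell times are synthetic, and the rate of 0.5 per second is a hypothetical value chosen for the example.

```python
import math
import random

def estimate_k_off(dwell_times):
    """Maximum-likelihood rate for exponential dwell times: 1 / mean dwell time."""
    return len(dwell_times) / sum(dwell_times)

def survival(t, k_off):
    """Single-exponential model: fraction of complexes still bound at time t."""
    return math.exp(-k_off * t)

# Synthetic dwell times in seconds (assumption: hypothetical rate of 0.5 per second).
rng = random.Random(1)
true_k_off = 0.5
dwell_times = [rng.expovariate(true_k_off) for _ in range(5000)]

k_est = estimate_k_off(dwell_times)
tau = 1.0 / k_est  # mean bound lifetime implied by the estimate
# Compare the observed fraction of dwell times longer than tau with the model value.
observed = sum(1 for d in dwell_times if d > tau) / len(dwell_times)
print(f"estimated k_off = {k_est:.3f} per s")
print(f"bound past tau: observed {observed:.3f}, model {survival(tau, k_est):.3f}")
```

A clear mismatch between the observed and modeled fractions would suggest that a single exponential does not describe the dwell times well, for example when more than one bound state contributes.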