Introduction

The APOBEC3 family of enzymes (A3A-A3H) forms part of the innate immune response against viruses and transposons1,2,3. These enzymes deaminate cytosine to uracil in single-stranded (ss)DNA4,5,6,7 and, for a subset of family members, also in RNA8,9,10. Two family members, A3A and A3B, have also been implicated in deaminating genomic DNA cytosines, which can result in mutations that fuel tumor development and contribute to poor disease outcomes including drug resistance and metastases11,12,13,14. Recent studies have also shown that A3A contributes to the overall constellation of APOBEC signature mutations in human cancer cell lines15,16 and, importantly, is capable of causing carcinogenesis in mice17,18. A3 inhibitors have been proposed as a therapeutic strategy to prevent A3-mediated evolution of primary tumors into lethal metastatic and drug-resistant secondary growths19,20.

A3A exhibits an intrinsic preference for deamination of cytosine bases within 5′-YTCD motifs, where Y denotes C or T and D is A, G, or T15,16,17,21,22,23,24,25,26. The strong preference for TC dinucleotide motifs is explained by crystal and NMR structures of linear ssDNA bound to an inactive mutant of A3A, which revealed the atomic contacts (3 hydrogen bonds) with thymine (T−1) as well as the binding pocket for the target cytosine27,28,29,30,31,32,33. However, preferences of A3A for nucleotides flanking the TC-sequence are as-yet-unexplained. Crystallographic studies also revealed that, upon binding to A3A, linear ssDNA substrates adopt a distinctive U-shape that projects cytosine into the A3A active site27,28. Consistent with this observation, DNA hairpins with loops of 3- and 4-nucleotides (nt) have been shown to be more potent substrates of A3A in comparison to linear analogs16,34,35,36,37,38. Importantly, DNA hairpins are both physiologically and pathologically relevant. Such structures are ubiquitous in nature, especially at inverted repeats, which can cause stalling of DNA replication and transcription complexes, genomic instability, and a general predisposition to mutagenesis38,39.
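The motif definition above can be made concrete with a short script. The following is a minimal sketch, not part of the original analysis, that scans a DNA sequence for 5′-YTCD sites (Y = C or T; D = A, G, or T) and reports the index of each target cytosine. The function name and the example sequences are hypothetical.

```python
import re

# 5'-YTCD: Y = C or T at position -2, T at -1, target C at 0, D = A, G or T at +1.
# A lookahead is used so that overlapping motifs are all reported.
YTCD = re.compile(r"(?=[CT]TC[AGT])")


def find_ytcd_targets(seq: str) -> list[int]:
    """Return 0-based indices of target cytosines (position 0) in 5'-YTCD motifs."""
    seq = seq.upper()
    # The match starts at position -2, so the target cytosine is two bases downstream.
    return [m.start() + 2 for m in YTCD.finditer(seq)]


if __name__ == "__main__":
    for example in ("ATTCGCTCCA", "CTCTCA", "ATCA"):
        print(example, find_ytcd_targets(example))
```

A purine at −2 (as in ATCA) or a cytosine at +1 (as in TTCC) excludes a site under this definition, which mirrors the flanking preferences discussed in the text.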

Prior studies have shown that linear ssDNA substrates with 2′-deoxyzebularine (dZ) and 5-fluoro-dZ (FdZ) in place of the target cytidine (C) are weak inhibitors of A3A40,41,42, and that these cytidine analogs as free nucleosides are non-inhibitory42. Additional work by our group and others has shown greater inhibition of A3A in vitro with U-shaped oligonucleotides containing dZ, FdZ, or 5-methyl-2′-deoxyzebularine transition-state trapping molecules43,44,45, which agrees well with biochemical studies comparing linear and hairpin substrates and systematically varying hairpin stems and loops16,26,34,35. Here, we use X-ray crystallography to determine high-resolution structures of wildtype A3A/hairpin-inhibited complexes and demonstrate the underlying mechanism of inhibition. Importantly, these structures explain why 3-nt hairpin loops with TTC (or TTFdZ) are preferred A3A substrates (and inhibitors). Larger 4-nt loops extrude one nucleotide to optimally fit around the A3A active site. Moreover, here we also demonstrate that an FdZ-hairpin, TTFdZ-hairpin, is not only a potent nanomolar inhibitor in vitro in biochemical assays but is also capable of inhibiting wildtype A3A-catalyzed chromosomal DNA editing in living cells.

Results

Deamination and inhibition mechanisms are conserved

The mechanism for cytosine deamination by cytidine deaminase (CDA) was proposed based on crystal structures in which zebularine and 5-fluorozebularine accept a Zn2+-bound water molecule across the N3-C4 double bond forming a tetrahedral intermediate in complex with the enzyme46,47,48 (Fig. 1a). Accordingly, our prior work with A3A has indicated that, of these two cytosine analogs, the latter fluoro-containing nucleobase is a more potent inhibitor in the context of ssDNA41. Therefore, to address whether A3A utilizes the same deamination mechanism, we crystallized wildtype A3A complexed with an inhibitory hairpin (TTFdZ-hairpin: 5′-T(GC)2TTFdZ(GC)2T; Fig. 1b, c). Two different crystal forms were obtained and resolved to 2.80 and 2.94 Å, with slightly different packing in the unit cell of the crystallographically independent pairs of molecules, but close superposition of each molecule of the pair (superpositions are provided in Supplementary Fig. S1a, b). A3 enzymes in general prefer ssDNA over RNA as substrates and, accordingly, all deoxyribose moieties bound by A3A adopt the standard DNA C2′-endo conformation in this crystal structure and in other structures described below, which helps to determine the positioning of T−1 and C0 (or FdZ0) into the −1 and target nucleobase binding pockets, respectively. The loop region, anchored by the target cytosine C0 and thymine T−1, is bound to wildtype A3A in a U-shaped conformation as reported for linear ssDNA bound to a Glu72-to-Ala catalytic mutant27,28, but with several important differences discussed below in subsequent sections.
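The hairpin shorthand 5′-T(GC)2TTFdZ(GC)2T and the position numbering used throughout (target nucleotide at 0, 5′ neighbours negative, 3′ neighbours positive) can be unpacked programmatically. The sketch below is illustrative only: the single letter X is a placeholder chosen here for FdZ, and the stem length of four base pairs is inferred from the written sequence rather than stated explicitly in the text.

```python
import re

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def expand(shorthand: str) -> str:
    """Expand repeat shorthand such as '(GC)2' into 'GCGC'."""
    return re.sub(
        r"\(([A-Z]+)\)(\d+)",
        lambda m: m.group(1) * int(m.group(2)),
        shorthand,
    )


def label_positions(seq: str, target: str = "X") -> dict[int, str]:
    """Map positions relative to the target nucleotide (position 0) to bases."""
    zero = seq.index(target)
    return {i - zero: base for i, base in enumerate(seq)}


def stem_is_complementary(seq: str, loop_start: int, loop_len: int, stem_len: int) -> bool:
    """Check Watson-Crick pairing of `stem_len` bases on either side of the loop."""
    five_prime = seq[loop_start - stem_len:loop_start]
    three_prime = seq[loop_start + loop_len:loop_start + loop_len + stem_len]
    # The base closest to the loop on the 5' side pairs with the base closest
    # to the loop on the 3' side, and so on outwards.
    return all(
        COMPLEMENT.get(a) == b for a, b in zip(reversed(five_prime), three_prime)
    )


if __name__ == "__main__":
    hairpin = expand("T(GC)2TTX(GC)2T")  # X stands in for FdZ (placeholder)
    positions = label_positions(hairpin)
    print(hairpin)
    print({k: positions[k] for k in (-3, -2, -1, 0, 1)})
    print(stem_is_complementary(hairpin, loop_start=5, loop_len=3, stem_len=4))
```

With this numbering the 3-nt loop occupies positions −2 to 0, and the closing stem pair is formed by C at −3 and G at +1, consistent with the description of the structures.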

Fig. 1: Inhibitor 5-fluoro-2′-deoxyzebularine (FdZ) embedded in hairpin DNA is hydrolyzed to form a tetrahedral species coordinated to Zn2+ of wildtype A3A.

a Schematic of reaction of FdZ with water activated by Zn2+, analogous to that of cytidine/cytosine deaminases, showing critical role of general acid-base Glu72. b Schematic of TTFdZ- and TTC-hairpins. c Structure of wildtype A3A (yellow) in complex with TTFdZ-hairpin inhibitor (orange). Carbon atoms of nucleotides of the TTFdZ loop are shown in salmon-pink, ligands to the Zn2+ center and key protein residues interacting with TTFdZ-hairpin in cyan. d Zoomed-in structure of wildtype A3A with TTFdZ-hairpin inhibitor highlighting the hydrolyzed FdZ0 coordinated to the active-site Zn2+ and hydrogen bonding to Glu72 (cyan).

Mass-spectrometric characterization of DNA hairpins and X-ray structural information of this and other hairpin structures in complex with A3A are provided in Supplementary Table S1 and S2. Supplementary Fig. S1c highlights similarities and differences of the binding of C0 and FdZ0 into the active site. Evidence for hairpin structure in solution is shown with representative CD and NMR spectra in Supplementary Fig. S1d.

Zooming into the target nucleotide, FdZ0, in both structures, the N3-C4 double bond of FdZ is hydrolyzed so that the hydroxy group at C4 represents the tetrahedral intermediate in the deamination reaction (Fig. 1a, c, d; additional view in Supplementary Fig. S1c). The FdZ0 exhibits R-stereochemistry, 4-(R)-hydroxy-3,4-dihydro-2’-deoxy-5-fluorozebularine. The hydroxyl group at C4 is coordinated to the Zn2+ center and is derived from the water/hydroxide ion bound to the Zn2+ in the substrate-free state. Tetrahedral coordination at the Zn2+ is completed by His70, Cys101, and Cys106. The catalytic glutamic acid residue, Glu72, which functions as a general acid/base, hydrogen bonds to C4-OH and N3-H of hydrated FdZ. We expect for such interactions that Glu72 is present in the carboxylate form in the crystal structure. The 5-fluoro group of FdZ is accommodated comfortably, abutting Tyr130 (illustrated in Fig. 1d). These results with wildtype A3A, together with prior work indicating hydration of dZ and its derivatives in structures with CDA and A3G, combine to demonstrate a universal mechanism of target nucleobase engagement, deamination, and inhibition27,28,48,49,50,51.

A3A binds similarly to 3- and 4-nt DNA hairpin loops

To ascertain generality of the interactions observed between hairpin substrates and A3A, we determined crystal structures of A3A as its inactive E72A mutant in complex with three distinct DNA hairpins with loop regions of either 3- or 4-nt (depictions in Fig. 1b and Fig. 2a). All structures exhibit a similar crystal packing (Supplementary Fig. S2) and a common tertiary structure for the protein (with minor changes on absence of Zn2+; Supplementary Fig. S3a, b). Surprisingly, they also share very similar conformation of the hairpin loop region’s TT(C/FdZ)G moiety (Fig. 2b, c). The superpositions of A3A-E72A in complex with the 3-nt loop, TTC-hairpin (at 2.22 Å resolution), and the 4-nucleotide loops, ATTC-hairpin (at 1.91 Å resolution) and CTTC-hairpin (at 3.15 Å resolution), are mostly similar with an average RMSD of 0.43 Å (ribbon schematics in Fig. 2b and Supplementary Fig. S3c). However, an unexpected difference emerged between 3- and 4-nt hairpin structures with the clear flipping-out of the nucleotide at position −3 of the 4-nt loop, with the next nt, C−4, maintaining hydrogen bonding with G+1, as in structures with a three-nt loop (Fig. 2c).

Fig. 2: Stereochemistry of interactions of wildtype A3A and A3A-E72A with hairpin DNA featuring 3- and 4-nucleotide loops is conserved.

a Sequences of hairpins with 4-nt loops: ATTC- and CTTC-hairpin. b Superposition of structures of A3A-E72A in complex with 3-nt loop TTC-hairpin (cyan) and 4-nt loop ATTC-hairpin (green) and CTTC-hairpin (magenta), showing conservation of protein tertiary structure and loop conformation. c Zoom into the active-site region of A3A-E72A in complex with TTC-hairpin (cyan) and ATTC-hairpin (green), showing overall binding conservation (structure and hairpin stem orientation) and extrusion of A at position −3 for the 4-nucleotide loop. Red ellipses highlight His29 and Arg28, which have key roles in determining the conformation of the loop that makes hairpins better substrates for wildtype A3A than corresponding linear ssDNA. d Superposition of wildtype A3A (yellow) with TTFdZ-hairpin (grey-blue) and A3A-E72A (magenta) with TTC-hairpin (cyan). e Zoom-in (with slight reorientation) of the superposition of wildtype A3A (yellow) with TTFdZ-hairpin (grey-blue) onto A3A-E72A (magenta) with TTC-hairpin (cyan). The water (faded red sphere) and chloride ion (faded green sphere) of A3A-E72A superimpose approximately onto the carboxylate oxygen atoms of Glu72. Key hydrogen bonds are shown with black dashed lines.

Apart from extrusion of the 5′ nucleotide of 4-nt loops, the structures of wildtype A3A (two near-isoforms) and A3A-E72A, along with their respective hairpin DNA substrates, all superimpose very closely (Fig. 2d, e; Supplementary Fig. S1c). This close superposition indicates that the catalytic Glu72-to-Ala substitution has negligible effect on tertiary structure. This conclusion is supported by wildtype A3A and its catalytic mutant derivative E72A exhibiting similar thermostabilities52. In all instances, the C0 and T−1 of TTC-hairpin bind to inactive A3A-E72A identically to the binding of FdZ0 and T−1 in TTFdZ-hairpin to wildtype active A3A (Fig. 2d, e; Supplementary Fig. S1c). For both FdZ0 and dC0 the carbonyl oxygen at C2 hydrogen bonds to the NH of Ala71 and the nucleobase base stacks with Zn2+ ligand His70 and makes an edge-to-face π interaction with Tyr130. The NH2 substituent on C0 hydrogen bonds to the peptide carbonyl of Ser99 and to a water molecule. This water is located similarly to the Glu72 carboxylate oxygen of wildtype A3A that hydrogen bonds to the OH at C4 of hydrolysed FdZ0 (Fig. 2e; Supplementary Fig. S1c).

Structural explanation for hairpin loop preference of A3A

Thymine T−1 and cytosine C0 of the TTC-hairpin substrate bind identically to that observed crystallographically for binding of linear ssDNA to A3A-E72A27,28,29,30 (Fig. 3a, b). As reported for other structures of A3A-E72A in complex with ssDNA27,28, specificity for thymine at position −1 is defined by hydrogen bonding to the peptide NH and carboxylate group of Asp131 (Fig. 3c). However, the binding of T−1 and C0 does not explain the fact that hairpin loop cytosines are preferred substrates for A3A compared to linear substrates. In contrast to existing structures of linear ssDNA with mutant A3A constructs where nucleotides other than T−1, C0 and W+1 (W = A, T) are absent or poorly defined in electron-density maps27,28,32,33, our structures reveal this information and provide a molecular basis for higher reactivity of hairpin DNA substrates and the corresponding enhanced potency of hairpin-based inhibitors (Fig. 3).

Fig. 3: Key interactions of hairpin and linear DNA with A3A-E72A that define recognition of the TC motif and U-shaped conformation and preferences at −2 and +1 positions.

a Superposition of a representative A3A-hairpin complex (E72A-TTC hairpin; cyan) and A3A-E72A-Cys171A with 15-nt linear ssDNA (5keg; grey), A3A-E72A with 15-nt linear ssDNA (5sww; magenta)27, and A3Bctd-E255A-AL1swap-QM-∆L3 with 7-nt linear ssDNA (5td5; salmon-pink). Only underlined nucleobases are visible in these ssDNA complexes: 5′-T7TCTT5 (5keg), 5′-A6ATCGGGA3 (5sww), and 5′-T2 TTCAT (5td5). Zn2+ is shown as grey spheres and Cl as green spheres. b Zoom-in showing details of shared interactions of TCX (X = G, A, T) of hairpin and linear DNA. DNA passes over loop 1 (bearing Arg28 and His29) and under loop 7 (bearing Tyr130, Asp131, and Tyr132). Asp131 largely specifies binding of T at position −1; Tyr130 makes an n-π* interaction with the phosphate linking nucleotides −1 and 0. Loop 3 contacts DNA at metal-ligand His70 and at Lys60. G for TTC-hairpin and A for 5′-T2 TTCAT (at position +1) overlap poorly, as do T at −2 (observed only for TTC-hairpin and 5′-T2 TTCAT (5td5)). Zn2+ is shown as grey spheres and Cl as green spheres. c Details of interactions of T−1 with Asp131 and the cluster of aromatic residues. d Details of interactions of Arg28 and His29 of loop 1 of A3A-E72A with the ATTC-hairpin. Key hydrogen-bonding interactions are shown by black dashes. Cation-π interactions between Arg28 and T−2, π-π interactions between T−2 and C−4 and between His29 and G+1, and dispersion interactions are shown as magenta dashes. Water molecules are shown as red spheres. e Details of interactions that mediate the tight turn between C0 and G+1. Peptide NH and amino side chain of Lys60 make key contacts with ATTC-hairpin. f Thymine modelled at position +1 (in place of G) for TTC-hairpin bound to A3A-E72A. In addition to the hydrogen bond of the terminal amino group of Lys30 to the carbonyl at C4 (overlapping blue and red spheres), there are van der Waals contacts made by the methyl group of T+1 with the methyl group of Ala59 and the methylene groups Cγ and Cε of Lys30 modelled in a highly preferred rotamer.

First, His29 base-stacks with G+1, and T−2 base-stacks with a pyrimidine in the stem (C−3 for the 3-residue loop or C−4 for the 4-residue loop) (Fig. 2c, e; Fig. 3b, d, e). Second, the positively charged guanidinium moiety of Arg28 forms a cation-π interaction with the nucleobase T−2, thereby further stabilizing T−2 placement for interaction with His29 (Fig. 3d). Arg28 also forms a hydrogen bond to an oxygen atom of the phosphate linking A−3/C−3 to T−2, and the cytosine at position −4 is in register to hydrogen bond with guanine at position +1 as a part of hairpin’s stem (Fig. 3d). Third, the tight turn between the nucleotide at position +1 and the target cytosine at position 0 is stabilized by a bifurcated hydrogen bond between Nδ1 (amine tautomer) of His29 and O4′ of 2-deoxyribose at +1 and an oxygen atom of the phosphate group that links nucleotides C0 and T−1 (Fig. 3e). This tight turn to project C0 into the active site is accomplished with non-standard torsional angles for the phosphate groups, relative to expected values for A- or B-form DNA, as detailed in Supplementary analysis and discussion. Last, the peptide NH group of Lys60 hydrogen bonds to the phosphate linking nucleotides at positions 0 and +1, and the -NH3+ moiety of its side chain forms a salt bridge to the phosphate group linking nts +1 and +2 (Fig. 3e). Taken together, the hairpin stem bestows restricted conformational flexibility for the A3A-binding loop region of hairpin DNA compared to linear ssDNA, thereby enabling enhanced interactions with A3A loop 1 residues Arg28 and His29. Supplementary Fig. S4 shows a space-filled representation that highlights the tight packing of the TTC-hairpin into the A3A active-site cavity and against loop 3, along with a complete depiction of the base-pairing of the stem of TTC-hairpin and the atoms in hydrogen bonding and van der Waals contact.

The close superposition of the stems, which include in several structures two AT pairs, along with GC pairs, and the lack of specific interaction of most of the DNA stem with the protein (Fig. 2b, c; Supplementary Fig. S4a), suggests that reactivity of dC-containing hairpins and inhibition by FdZ- (or dZ- or 5-methyl-dZ-) hairpins is unrelated to stem composition of DNA hairpins, where substrate or inhibitor moiety is located at the 3′ end of the loop. This is consistent with the recent observation that variation of stem composition had negligible effect on inhibition44.

Structural explanation for A3A’s −2 pyrimidine preference

The composite mutation spectra of human A3A in multiple model systems have revealed a marked bias for a pyrimidine (C or T) at nucleotide position −2 (refs. 16,17,24,25). In addition to roles of His29 and Asp131 in positioning cytosine in the substrate-binding pocket and dictating the strong preference for thymine at position −1, as detailed in the previous section, His29 also helps to determine the preference of A3A for a pyrimidine (T or C) at position −2. By virtue of chemical structures, purines (adenine and guanine) are larger than pyrimidines, and modelling indicates that they cannot be accommodated in the −2 binding site in either a syn conformation of the glycosidic bond (where there is repulsive interaction of imine N3 with the phosphate group linking nucleotides −2 and −3) or an anti conformation of the glycosidic bond (Supplementary Fig. S5). In the anti conformation, purines lack the C=O moiety at C2 of T or C to form the crucial non-classical hydrogen bond with His29 (Fig. 2c). In addition, the water molecule that bridges the carbonyl O2 of T−2 (and potentially also of C at −2) to the peptide backbone NH of His29 cannot be accommodated for purine at −2 (Fig. 3d, Supplementary Fig. S5b, d). Therefore, this quasi-base pair between pyrimidine at position −2 and His29 is able to stack on top of the first CG base pair (at positions −3 or −4 and +1) at the head of the stem (Fig. 3d).

The importance of Arg28 and His29 to recognition of T−2 is also shown by the following results. Mutation of Arg28 to alanine in A3A was reported to diminish activity towards linear ssDNA27. Moreover, mutation of His29 to arginine in A3A (H29R) caused a 10-fold diminution of activity against a linear ssDNA compared to wildtype A3A53,54. Based on co-crystal structures here, we predict that the longer side chain of Arg in the H29R mutant will place the polar head group in a suboptimal position beyond the reach of O2 of T−2 and with poor π-stacking with the nucleobase at +1. Interestingly, the nearly identical (>90%) A3B catalytic domain lacks a corresponding histidine in its loop 1 region (it naturally has an arginine), and A3B shows no preference for pyrimidines in the −2 position of linear ssDNA substrates with target C at position 0 (refs. 16,55). As yet, there is no structural information on wildtype A3BCTD with loop 1 in an open conformation with substrate or inhibitor bound to test this supposition.

Hairpin structures also help explain +1 preference for D (D ≠ C)

Cytosine at +1 is rarely observed in the A3A-induced mutation spectra in model systems16,17,24,25 and it is also strongly underrepresented in the overall A3 mutation signature in tumors56,57. At least for hairpin structures, π-π stacking of His29 with the six-membered ring of G+1 (or alternatively A+1) and van der Waals interaction of the CH2 group (Cβ) of His29 with the five-membered ring of G+1 (or alternatively A+1) explains the preference for purines over pyrimidines in the +1 position (Fig. 3d, e). The preference for thymine over cytosine is more subtle. In part, cytosine lacks the electron-donating methyl group of thymine, which leads to less favorable π-π interactions with His29. In addition, modelling T at position +1 and adjusting the Lys30 side chain to a favoured conformation reveals a small hydrophobic pocket that brings the methyl groups of T+1 and Ala59 and the methylene groups Cγ and Cε of Lys30 into van der Waals contact. Moreover, the terminal amino group of Lys30 hydrogen-bonds with the carbonyl moiety at C4 of T+1 (Fig. 3f); cytosine lacks this carbonyl, instead having an amino group. In this context, we also note that in vitro A3A prefers to deaminate the highlighted C of a suboptimal linear 5′-ATTCCCAATT substrate, whereas A3BCTD attacks the 5′-most C40,54. Cytosine lacks this methyl group and carbonyl group and thus substrates presenting YTTCD (D = A, G, T) to A3A are favoured over those presenting YTTCC.

Hairpin optimization for cellular experiments

The thermodynamic properties of the TTC-hairpin were assessed in solution by CD and NMR, which confirmed that the hairpin is stable (Supplementary Fig. S1d). Isothermal titration calorimetry (ITC) measurements established that binding is largely enthalpically driven (Supplementary Table S3; Supplementary Fig. S6a, b). Real-time C-to-U deamination experiments using NMR spectroscopy demonstrated that the TTC-hairpin is a preferred substrate of wildtype A3A in comparison to a linear TTC substrate, which agrees with prior reports using a fluorescence-based assay16,35,38 (Fig. 4a, b). The kinetic parameters derived from NMR experiments using the integrated Michaelis-Menten equation and Lambert’s W function58 also confirmed that a TTFdZ-hairpin is 21-fold more potent than a TTFdZ linear ssDNA in blocking wildtype A3A activity (Fig. 4a, b; Supplementary Fig. S6c). A dramatic change in Km from 3.0 ± 0.9 mM for linear ssDNA to 31 ± 6 μM for TTC-hairpin is the major contributor to a 42-times more efficient deamination of TTC-hairpin if kcat/Km are compared at pH 7.4 (Supplementary Table S4a). Moreover, the nanomolar potency (Ki) of TTFdZ-hairpin inhibitor was not significantly altered by replacing the hairpin phosphate groups with phosphorothioate (PS) linkages (Fig. 4b). Nuclease resistance was confirmed by treating with snake venom phosphodiesterase, which is a strong 3′-exonuclease commonly used to evaluate the stability of oligonucleotides with therapeutic potential59 (Fig. 4c). Nuclease resistance and inhibitory activity were also shown using A3A-expressing 293T cell lysates, where the PS-TTFdZ hairpin exhibited stronger inhibitory activity in comparison to a linear PS-TTFdZ ssDNA or a PS-TTT-hairpin negative control (Fig. 4d).
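For readers unfamiliar with the integrated Michaelis-Menten treatment, the closed-form progress curve can be written as S(t) = Km·W[(S0/Km)·exp((S0 − Vmax·t)/Km)], where W is Lambert’s W function. The sketch below evaluates this expression with a small Newton solver so that it needs only the standard library. It is an illustration under stated assumptions, not the fitting procedure used for the reported parameters: competitive inhibition (Km,app = Km(1 + [I]/Ki)) is assumed, the Vmax and Ki values are hypothetical placeholders, and only S0 = 500 μM, Km = 31 μM and the inhibitor concentrations of 0.5 and 1.5 μM are taken from the text and Fig. 4a.

```python
import math


def lambert_w_from_log(log_x: float) -> float:
    """Principal-branch W(x) for x > 0, given ln(x).

    Solves exp(u) + u = ln(x) for u = ln(W) by Newton's method, which avoids
    overflow when x itself would be too large to represent as a float.
    """
    # Start at or to the right of the root; Newton's method on this convex,
    # increasing function then converges monotonically.
    u = math.log(log_x) if log_x >= 1.0 else min(log_x, 0.0)
    for _ in range(100):
        eu = math.exp(u)
        step = (eu + u - log_x) / (eu + 1.0)
        u -= step
        if abs(step) < 1e-12:
            break
    return math.exp(u)


def product_formed(t, s0, km, vmax, inhibitor=0.0, ki=math.inf):
    """Product concentration at time t from the integrated Michaelis-Menten equation.

    Competitive inhibition is ASSUMED here: Km_app = Km * (1 + [I]/Ki).
    Concentrations share one unit (e.g. uM); vmax is in that unit per time unit.
    """
    km_app = km * (1.0 + inhibitor / ki)
    log_x = math.log(s0 / km_app) + (s0 - vmax * t) / km_app
    substrate = km_app * lambert_w_from_log(log_x)
    return s0 - substrate


if __name__ == "__main__":
    S0, KM = 500.0, 31.0          # uM, from the text
    VMAX, KI = 0.05, 0.1          # uM/s and uM: hypothetical placeholder values
    for inhibitor in (0.0, 0.5, 1.5):  # uM, as in Fig. 4a
        p = product_formed(3600.0, S0, KM, VMAX, inhibitor=inhibitor, ki=KI)
        print(f"[I] = {inhibitor} uM: product after 1 h = {p:.1f} uM")
```

Fitting whole progress curves with this expression uses every time point rather than only the initial linear phase, which is the reason given in the Fig. 4 legend for preferring it over initial-rate analysis.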

Fig. 4: TTFdZ-hairpin and its nuclease-resistant phosphorothioated derivative are potent inhibitors of A3A compared to linear ssDNA with FdZ.

a Plot of product concentration versus time in the absence and presence of TTFdZ-hairpin at 500 and 1500 nM; concentration of A3A 140 nM; concentration of TTC-hairpin substrate 500 μM. b Table shows derived inhibition constants, Ki, for linear FdZ ssDNA (A2T2FdZA4), TTFdZ-hairpin and its phosphorothioated analog PS-TTFdZ-hairpin. The kinetic parameters were derived from the integrated Michaelis-Menten equation by means of Lambert’s W function58, which provides more robust estimates for kinetic parameters Km and Vmax than analysis of initial rates only. c Percentage of intact hairpins (15 μM) over time upon treatment with snake venom phosphodiesterase (phosphodiesterase I, Sigma, 32 mU/mL) in 50 mM Tris-HCl buffer, 10 mM MgCl2, pH 8.0 for the indicated times at 37 °C. Data are presented as mean values of two independent experiments on each sample; error bars are estimated instrumental error of ± 5%. Full experimental details are available in Supplementary Information. d Concentration-dependent inhibition of A3A in cell lysates by phosphorothioated TTFdZ-hairpin. Bands from SDS-PAGE (Supplementary Fig. S10b) were quantified, normalized to the PS-TTT-hairpin (denoted PS-dT oligo) and used to determine the IC50 for linear PS-TTFdZ (denoted PS-FdZ (L)) and PS-TTFdZ-hairpin (denoted PS-FdZ (HP)). This experiment was repeated once more (n = 2 biologically independent replicates), and data are from one representative gel. The inset panel is an immunoblot showing expression of A3A-HA from cell lysate. A3A-HA was detected by an anti-HA antibody while anti-tubulin was used as a loading control. EV denotes the empty vector control. Raw immunoblot images are located in Supplementary Fig. S10a.

PS-TTFdZ-hairpin inhibits A3A editing in cellulo

To assess stability and localization in living cells, the modified PS-hairpins possessing dZ or FdZ were fluorescently labelled at the 3′-end (6-FAM). The MCF-7 breast cancer cell line was transfected with these hairpins using the Xtreme GENE™ HP transfection agent. After 18 h these FAM-labelled PS-hairpins were found to localize to the nucleus in a concentration-dependent manner (Fig. 5a; Supplementary Fig. S7). Moreover, the metabolic activity of MCF-7 and another breast cancer cell line MDA-MB-453 was reduced less than 2-fold (Supplementary Fig. S8a).

Fig. 5: Fluorescently tagged PS-TT(F)dZ-hairpin-FAM localizes to the cell nucleus where in cellulo A3A-editing activity is inhibited by phosphorothioated TTFdZ-hairpin.

a Representative images of asynchronously-grown MCF-7 cells transfected using Xtreme GENE™ HP with either no hairpin (top panel) or 1.25 μM of fluorescently-tagged (6-FAM) PS-TTFdZ-hairpin. MCF-7 cells were incubated for 16 h with hairpin DNA and Xtreme GENE™ HP. Images have the pseudo-coloured panels overlaid: nucleus (magenta) and PS-TTFdZ-hairpin-FAM (green). Variability in hairpin uptake is attributed to cells being at different stages of their cycle. Scale bars, 20 μm. Additional images may be found as Supplementary Fig. S7. b PS-TTFdZ-hairpin (denoted PS-FdZ (HP)) shows concentration-dependent inhibition of A3A-editing activity in comparison with PS-FdZ (L), a linear, fully phosphorothioated oligonucleotide AT3FdZAT3. Biological replicates, establishing reproducibility, are shown in Supplementary Fig. S9a, b. Variability in plasmid transfection prevents averaging of results. The transfection reagent (TransIT-LT1), administered at constant concentrations across all in cellulo experiments in presence or absence of DNA species, appears to have a slight activating effect on A3A editing activity at low concentrations of both hairpin and linear DNA. c Maximum A3A-catalyzed editing rate at various concentrations of PS-FdZ (L) and PS-FdZ (HP). Data points are the rolling slope for each concentration as described in Methods, and representative of three biologically independent experiments (n = 3). Replicates are shown in Supplementary Fig. S9a, b. Error bars represent standard deviation from the mean (mean ± SD).

In cellulo inhibition of A3A activity was quantified using a base-editing system reported previously60,61, except that here A3A is detached from the Cas9 nickase complex (Methods). Briefly, A3A-catalyzed editing of a single target cytosine nucleobase within a ssDNA R-loop created by a nuclease-deficient Cas9-guide RNA complex in an eGFP reporter construct results in a dose- and time-dependent restoration of eGFP fluorescence (Methods; Supplementary Fig. S8b). This reporter was stably integrated into the chromosomal DNA of 293T cells to mimic an R-loop environment that A3A might encounter in cancer cells. Upstream of the mutated eGFP codon lies a linked wildtype mCherry gene for calculating the efficiency of base editing (eGFP+/mCherry+). With PS-TTT-hairpin as an A3A-non-binding control, little change in generation of eGFP fluorescence over time was observed as a function of concentration (Supplementary Fig. S9). However, in the presence of increasing concentrations of PS-TTFdZ-hairpin inhibitor, there was marked suppression of the generation of fluorescence (Fig. 5b). The maximum rate of editing by A3A was 2.4-fold lower in the presence of 7.5 μM inhibitor than in the absence of inhibitor, suggesting an upper limit on IC50 of ~5 μM (Fig. 5c). The difference in cell viability between PS-TTT and PS-TTFdZ hairpins was minimal in comparison with A3A inhibition data for these oligos at the same concentrations in 293T cells (Supplementary Fig. S8a).
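Two of the quantities in this paragraph lend themselves to a short worked example: a maximum editing rate taken as the steepest rolling slope of the eGFP+/mCherry+ time course, and the IC50 implied by a 2.4-fold rate reduction at 7.5 μM. The sketch below is a hedged illustration only. The window size and the toy time course are hypothetical, the exact rolling-slope procedure is the one given in Methods and is not reproduced here, and the IC50 estimate assumes a simple hyperbolic dose-response (v0/v = 1 + [I]/IC50), which is one way, though not necessarily the authors’ way, of arriving at a value near 5 μM.

```python
def rolling_max_slope(times, values, window=3):
    """Largest least-squares slope over any run of `window` consecutive points."""
    if len(times) != len(values) or window < 2 or len(times) < window:
        raise ValueError("need equal-length series with at least `window` points")
    best = float("-inf")
    for i in range(len(times) - window + 1):
        t = times[i:i + window]
        y = values[i:i + window]
        t_mean = sum(t) / window
        y_mean = sum(y) / window
        # Ordinary least-squares slope within the window.
        num = sum((a - t_mean) * (b - y_mean) for a, b in zip(t, y))
        den = sum((a - t_mean) ** 2 for a in t)
        best = max(best, num / den)
    return best


def ic50_from_fold_reduction(inhibitor_conc, fold_reduction):
    """IC50 implied by v0/v = 1 + [I]/IC50 (simple hyperbolic inhibition; assumed)."""
    if fold_reduction <= 1.0:
        raise ValueError("fold reduction must exceed 1 to imply an IC50")
    return inhibitor_conc / (fold_reduction - 1.0)


if __name__ == "__main__":
    # Hypothetical time course (hours, % eGFP+/mCherry+ cells), for illustration only.
    hours = [0, 1, 2, 3, 4]
    editing = [0.0, 1.0, 3.0, 6.0, 7.0]
    print("max rolling slope:", rolling_max_slope(hours, editing))
    print("implied IC50 (uM):", round(ic50_from_fold_reduction(7.5, 2.4), 2))
```

Because a 2.4-fold reduction means the rate has already fallen below half of its uninhibited value, the IC50 must lie below 7.5 μM regardless of the dose-response model; the hyperbolic assumption only sharpens that bound.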

Discussion

Our structural studies establish that the stem-loop preconfigures the primary TC recognition motif in an optimal position for binding to A3A, such that hairpin DNAs are more reactive substrates and, as their 2′-deoxy-5-fluorozebularine derivatives, more potent inhibitors of A3A than linear ssDNA. Both 3- and 4-nt loops present the TC or TTFdZ motif in an identical configuration to A3A. Although hinted at in earlier structures with ssDNA24, a crucial role for A3A His29 (and to a lesser extent Arg28) in substrate binding is demonstrated here and, importantly, also shown to help explain its preference for deamination of YTCD motifs (Y = C, T; D = A, G, T).

Although we have not specifically tested the hairpin inhibitors described here against related human enzymes, the most similar TC-preferring A3 family member (A3B) is not known to prefer hairpin substrates16,55,62 and other TC-preferring A3s have yet to be tested systematically with hairpins. A3G and AID, with non-TC preferences, are unlikely to be inhibited by the hairpins described here. However, when hairpin or other A3A inhibitors get closer to clinical development, these and other off-target possibilities should be examined in dedicated biochemical and cellular experiments.

A structural understanding of the hairpin preference of A3A helped inform the design of substrate-mimicking FdZ inhibitors. Importantly, phosphorothioated derivatives are resistant to nuclease degradation and can be directed to the nucleus with the aid of commonly used transfection reagents. Moreover, we have obtained an important proof of concept here through the inhibition of the mutagenic activity of A3A in living cells with a PS-FdZ hairpin. Further optimization of such inhibitors may lead to small molecules that can be used in a therapeutic setting to slow rates of tumor evolution and improve clinical outcomes for patients with A3A-driven tumors.

Methods

Oligodeoxynucleotide synthesis and purification

The general strategy for the synthesis of dZ and FdZ and their incorporation into DNA oligomers has been described by us elsewhere43. 3-[(Dimethylaminomethylidene)amino]-3H-1,2,4-dithiazole-3-thione (DDTT, Sulfurizing Reagent II from GlenResearch, USA) was used for sulfurization of oligos. The sulfurization step (2–4 min) was conducted before capping as a replacement of the standard oxidation step. FAM is located at the 3′-end of the oligo and was obtained by synthesizing an oligo on a controlled pore glass loaded with 4-[6-[(2S,4R)-4-hydroxy-2-(DMT-O-methyl)pyrrolidin-1-yl]-6-oxohexyl]carbamoylfluorescein purchased from PrimeTech (Cat. number: 008a-500, Minsk, Belorussia). dZ and FdZ phosphoramidites were synthesized as previously described in refs. 41,42,43.

The final detritylated dZ-containing oligos were cleaved from the solid support and deprotected at room temperature using conc. NH4OH overnight. FdZ-containing oligos were deprotected on the solid support by a two-step procedure with 10% Et2NH in CH3CN for 5 min, followed by incubation of the support in ethylenediamine/toluene mixture (1/1, v/v) for 2 h at room temperature63. The support was washed with toluene (3 × 1 mL), dried in vacuo and the deprotected FdZ-containing oligo was released in H2O (1 mL).

The deprotected oligos in solution were freeze-dried and dry pellets were dissolved in milli-Q water (1 mL) and purified and isolated by i) reverse-phase HPLC on 250/4.6 mm, 5 μm, 300 Å C18 column (Thermo Fisher Scientific) in a gradient of CH3CN (0 → 20% for 20 min, 1.3 mL/min) in 0.1 M TEAA buffer (pH 7.0) with a detection at 260 nm or ii) ion-exchange (IE) HPLC using TSKgel Super Q-5PW column from TSK in buffer A [25 mM Tris·HCl, 20% CH3CN, 10 mM NaClO4, pH 7.4] and buffer B [25 mM Tris·HCl, 20% CH3CN, 600 mM NaClO4, pH 7.4]. Gradients: 3.7 min 100% buffer A, convex curve gradient to 30% B in 11.1 min, linear gradient to 50% B in 18.5 min, concave gradient to 100% B in 7.4 min, keep 100% B for 7.4 min and then 100% A in 7.3 min. Flow rate: 0.8 mL/min with a detection at 260 nm.

Oligonucleotides were freeze-dried, pellets were dissolved in milli-Q water (1.5 mL) and desalted by reverse-phase HPLC on a 100/10 mm, 5 μm, 300 Å C18 column (Phenomenex) in a gradient of CH3CN (0 → 80% for 15 min, 5 mL/min) in milli-Q water with detection at 260 nm. Pure products were quantified by measuring absorbance at 260 nm, analyzed by ESI-MS and concentrated by freeze-drying (Supplementary Table S1).

Expression and purification of A3A constructs

A3A-E72A was expressed and purified as described in ref. 32. A3A-E72A was used for structural studies and ITC experiments with the substrate. Wildtype A3A, which was recombinantly expressed with the His6 tag at the C-terminal end in E. coli and purified as described in ref. 43, was used for kinetic and structural studies with the inhibitor. The yield of wildtype A3A from 1 to 10 L expression was usually not enough to justify size-exclusion chromatography purification. The protein, both A3A-E72A and wildtype A3A, was transferred from high-salt buffer used for purification (50 mM phosphate buffer pH 6.5, 300 mM Na acetate, 300 mM choline chloride, 1 mM TCEP) into low-salt buffer (1 mM phytic acid pH 7.0, 1 mM NaF, 1 mM NaCl, 1 mM TCEP) for biophysical characterization and crystallization by “washing” 3 times using centrifugal filtration with 10 kDa cut-off.

Co-crystallization of A3A constructs with hairpin substrates

A3A-E72A (1–4 mM in low-salt buffer) was mixed with oligonucleotides (10 mM in TE buffer: 10 mM Tris/HCl pH 7.9, 1 mM EDTA) in a 1:2 molar ratio (protein:ligand) and diluted to 0.75 mM with the low-salt protein buffer described above. The mixture was added to crystallization solution in a 1:1 ratio, pipetted onto siliconized glass disks and sealed on top of a reservoir of crystallization solution for hanging-drop crystallization at 12 °C. The crystallization solution had the following composition: 100 mM bicine at pH 6.6, 200 mM NaCl, 20 mM putrescine, 1 mM TCEP, 1 mM inositol hexaphosphate (phytic acid) and 45% pentaerythritol propoxylate (5/4 PO/OH). The Zn2+-free crystals of A3A-E72A with ssDNA were grown using A3A-E72A that had been purified in the presence of 1 mM EDTA.

His6-tagged wildtype A3A was mixed with inhibitor oligonucleotide in a 1:2 molar ratio (protein:inhibitor) and the protein concentration was adjusted to 0.85 mM in low-salt buffer and crystallization proceeded as described for substrates with A3A-E72A.

X-ray crystallography

Notwithstanding two distinct crystal habits (tiny flattened needles and thin plates), all structures are approximately isomorphous with space group P21 (Z′ = 2) and unit cells of dimensions a ≈ 52 Å, b ≈ 57 Å, c ≈ 92 Å and β ≈ 105°. Data were processed on-site at the Australian Synchrotron using XDS64,65. Each structure was solved independently by molecular replacement (MolRep66) using the A3A structure PDB ID 5keg28,67 (space group I222) from which metal ions, ssDNA, chloride ions and waters had been stripped. After rigid body refinement with REFMAC568 of the CCP4 suite69, initial electron density maps, visualized with COOT70, clearly showed the presence, or in one structure the absence, of Zn2+, along with well-defined electron density for the ssDNA hairpin. Structure elucidation proceeded with rounds of building with COOT and refinement with REFMAC5. In all structures, active-site Loop-3 was not well defined, nor was Loop-2, which is remote from the active site. In several structures, phytic acid (inositol hexaphosphate) was ill-defined, with only three phosphate groups being well-defined. Supplementary Table S2 presents a summary of crystallographic data, data collection and structure refinement. Supplementary Figs. S1–S4 illustrate crystal packing, molecular structures and superpositions.

Circular dichroism (CD) spectroscopy of hairpin DNA

CD spectra were recorded using a Chirascan CD spectrophotometer (150 W Xe arc) from Applied Photophysics with a Quantum Northwest TC125 temperature controller. CD spectra (average of at least 3 scans) were recorded between 200 and 350 nm with 1 nm intervals, 120 nm/min scan rate and 10 mm path length followed by subtraction of a background spectrum (buffer only). CD spectra were recorded at 10 µM DNA concentration in 50 mM Na+/K+ phosphate buffer, pH 7.0 supplemented with 100 mM NaCl, 1 mM TCEP, 100 µM DSS and 10 % D2O.

Circular dichroism spectra showed a shift of positive ellipticity from 274 nm for unstructured DNA (T4CAT) to 286 nm for dC-hairpin which was also accompanied by increase of molar ellipticity (Supplementary Fig. S1d, top panel). Presence of G-C base-pairs in the duplex part of dC-hairpin is evident from four singlets of four imino protons at 13–13.2 ppm in 1H NMR spectrum (Supplementary Fig. S1d, bottom panel). These data confirm that dC-hairpin is folded in solution and can be used as a scaffold to design A3A inhibitors by using nucleoside-based inhibitors of CDA instead of dC in the loop of dC-hairpin.

Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry of hairpin DNA

1H, 13C and 31P NMR spectra were recorded on Bruker 500- and 700-MHz spectrometers, the latter with a dual-channel cryoprobe. A representative spectrum in the imino region is shown in Supplementary Fig. S1d (bottom panel). NMR spectra were processed in TopSpin. High-resolution electrospray mass spectra were recorded on a Thermo Fisher Scientific Q Exactive Focus Hybrid Quadrupole-Orbitrap mass spectrometer. Ions generated by ESI were detected in positive ion mode for small molecules and negative ion mode for oligonucleotides. Total ion count (TIC) was recorded in centroid mode over the m/z range of 100–3000 and analyzed using Thermo Fisher Xcalibur Qual Browser. Mass-spectrometric data on hairpin DNA are presented in Supplementary Table S1.

Isothermal titration calorimetry (ITC) of interaction of A3A with hairpin DNA

Desalted unmodified TTC-hairpin oligo was purchased (Integrated DNA Technologies) at 1 μmol synthesis scale and dissolved in TE buffer (10 mM Tris/HCl pH 7.9, 1 mM EDTA) to give a 10 mM solution. ITC experiments were conducted at 25 °C using a MicroCal ITC200 (now Malvern Instruments) isothermal titration calorimeter. Protein A3A-E72A, which is a catalytically inactive variant, was dialyzed and diluted with ITC buffer to concentrations of about 50 μM (ITC buffer: 50 mM Na+/K+ phosphate, pH 6.0, 50 mM NaCl, 50 mM choline acetate, 2.5 mM TCEP, 200 μM EDTA with 30 mg/mL bovine serum albumin; after preparation, this buffer was frozen and defrosted before the experiments) and titrated with dC oligonucleotides dialyzed against the above ITC buffer. The concentration ratio of oligonucleotide in the syringe to protein in the cell was generally 10:1 (for 1:1 binding). Supplementary Table S3 presents the full analysis of ITC results; Supplementary Fig. S6a, b show the titration curves and derived plots of enthalpy changes versus stoichiometry ratio A3A-E72A:hairpin DNA.

Enzymology of A3A with hairpin substrates and inhibitors

Hairpins as substrates and inhibitors were analyzed as previously described in ref. 43. In short, wildtype A3A was used to compare linear DNA (A2T2CA4) and dC-hairpin (T(GC)2TTC(GC)2T, bold C is deaminated) at 500 μM in the NMR-based assay (20 °C, pH 7.4, 50 mM Na+/K+ phosphate buffer, supplemented with 100 mM NaCl, 1 mM TCEP, 100 µM sodium trimethylsilylpropanesulfonate (DSS) and 10% D2O; enzyme concentration in assay: 140 nM, from dilution of wildtype A3A (>200 μM) in low-salt buffer). The NMR-based assay yields the initial velocity of deamination of various ssDNA substrates, including the modified ones40, in the presence of A3 enzymes. Consequently, the Michaelis–Menten kinetic model was used to characterize substrates and inhibitors of A3. Moreover, use of the dC-containing hairpin as a substrate of A3A allowed us to use a global regression analysis of the kinetic data over the entire time course of the reaction using Lambert’s W function (integrated form of the Michaelis-Menten equation).

The course of the reaction was followed by 1H NMR until the substrate was consumed (up to 28 h, depending on the experiment performed). Subsequently the amount of substrate or product at each time point was calculated by integrating the decreasing substrate peak at 7.752 ppm (singlet) or the increasing product peak at 5.726 ppm (doublet) and calibrated by the area of DSS standard peak at 0.0 ppm. Using the known concentration of the standard, the peak was converted to a corresponding substrate concentration. The time at which each spectrum was recorded as a difference to the first spectrum was used as the time passed. The product or substrate concentration versus the time of reaction was plotted and fitted using the integrated form of the Michaelis-Menten equation:

$$[S]_{t}=K_{m}\,W\left(\frac{[S]_{0}}{K_{m}}\exp\left(\frac{[S]_{0}-V_{\max}t}{K_{m}}\right)\right)$$
(1)

where W is Lambert’s W function, [S]t is the substrate concentration at time t, [S]0 is the initial substrate concentration, and Vmax and Km are the Michaelis-Menten parameters. Four quantities were fitted using Lambert’s W function in Gnuplot: kcat (with Vmax = kcat[E]0, where [E]0 is the enzyme concentration), Km, the initial substrate concentration, and an offset that corrects for the integration baseline in the NMR spectra.
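As an illustration of Eq. (1), the following minimal Python sketch evaluates the integrated Michaelis-Menten model. It is not the Gnuplot fitting script used in this work: the function names, the Newton iteration for Lambert’s W function and the numerical values are assumptions made for the example, and the baseline offset is omitted.

```python
import math


def lambert_w0(x, tol=1e-12, max_iter=100):
    """Principal branch of Lambert's W for x >= 0, solved by Newton iteration."""
    if x < 0:
        raise ValueError("this sketch only handles x >= 0")
    # W(x) <= ln(1 + x) for x >= 0, so this start lies on the side of the
    # root from which Newton's method converges monotonically.
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w


def substrate_conc(t, s0, km, vmax):
    """Substrate concentration at time t from the integrated Michaelis-Menten equation.

    s0, km and the result share one concentration unit; vmax is in that unit
    per unit of t. Note that exp() can overflow when s0 / km is very large.
    """
    x = (s0 / km) * math.exp((s0 - vmax * t) / km)
    return km * lambert_w0(x)


# Hypothetical parameters (uM and h), chosen only to show the shape of the curve.
if __name__ == "__main__":
    for t in (0, 2, 5, 10, 20):
        print(t, round(substrate_conc(t, s0=500.0, km=100.0, vmax=50.0), 2))
```

In a fit, `substrate_conc` would serve as the model function, with Vmax replaced by kcat multiplied by the known enzyme concentration.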

By varying the concentration of an inhibitor, the plots of observed Km versus inhibitor concentration were obtained, fitted with a linear function (f(x) = a + b x) and Ki values were calculated as a/b, with error propagation as described in ref. 40 (Supplementary Fig. S6c).
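The relation Ki = a/b is what a competitive-inhibition model gives, where the observed Km equals Km(1 + [I]/Ki): the intercept is a = Km and the slope is b = Km/Ki. The sketch below assumes that model and uses a plain least-squares line; the function names and data are hypothetical, and the error propagation of ref. 40 is not reproduced.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("inhibitor concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def ki_from_observed_km(inhibitor_concs, observed_kms):
    """Ki = a / b from the line observed Km = a + b*[I] (competitive model assumed)."""
    a, b = linear_fit(inhibitor_concs, observed_kms)
    if b == 0:
        raise ValueError("zero slope: observed Km does not depend on [I]")
    return a / b


# Hypothetical data: Km = 200 uM and Ki = 5 uM, without noise.
if __name__ == "__main__":
    concs = [0.0, 2.0, 5.0, 10.0]
    kms = [200.0 * (1.0 + c / 5.0) for c in concs]
    print(ki_from_observed_km(concs, kms))
```

Ki is returned in the same concentration unit as the inhibitor concentrations.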

Evaluation of stability of PS-hairpins against enzymatic digestion

Separately, TTC-hairpin, PS-TTFdZ-hairpin, and 3′-fluorescein-labelled PS-TTdZ-hairpin-FAM and PS-TTdZ-hairpin-FAM (each 15 μM in 50 mM Tris-HCl buffer, 10 mM MgCl2, pH 8.0, 37 °C) were treated with snake venom phosphodiesterase (phosphodiesterase I, Sigma, 32 mU/mL). The percent degradation over time (0–360 min) at 37 °C was monitored by anion-exchange chromatography.

DNA deaminase activity assays

HEK 293T (RRID: CVCL_0063) cells (ATCC, USA) were maintained in RPMI-1640 (#SH30027.01, Cytiva, USA) supplemented with 10% fetal bovine serum (#10437028, Gibco, ThermoFisher, USA) at 37 °C with 5% CO2 in a humidified atmosphere. The ssDNA deaminase activity assay was performed as described in ref. 71. Whole cell lysates were prepared, placed on ice, and immediately used for the deaminase assay. Inhibitor oligos (and controls) were heated to 80 °C for 5 min, and then cooled to RT to induce hairpin formation. They were then prepared at varying concentrations in 5 µL and combined with 10 µL of cell lysate at 37 °C for 15 min to promote binding of oligos to A3A. To this reaction, 5 µL of a mastermix containing 0.25 µL RNase A, 800 nM fluorescent ssDNA substrate, 10x UDG buffer (NEB #M0280), and 0.25 µL UDG (NEB #M0280) was added to each sample for a total volume of 20 µL, and the samples were incubated at 37 °C for 1 h. The fluorescently-labeled oligo has the following sequence: (5′-(ATT)3ATTCGAATGG(ATTT)6-fluorescein-3′). Reactions were fractionated on a 15% Urea-TBE acrylamide gel, imaged with a Typhoon FLA-7000 imager (GE Healthcare), and then quantified using ImageQuant (Cytiva, USA).

Cellular uptake and localization of DNA oligomers

MCF7 (RRID: CVCL_0031) cells (ATCC, USA) were maintained in DMEM (Gibco, ThermoFisher Scientific, USA) supplemented with 1% penicillin/streptomycin (Gibco), and 10% fetal bovine serum (Gibco) at 37 °C with 5% CO2 in a humidified atmosphere.

Briefly, MCF7 cells were transfected with FAM-labelled hairpins using X-tremeGENE™ HP DNA Transfection Reagent (Roche, USA; 1.0 µL). After 16 h, the cells were washed twice with phosphate-buffered saline (PBS) containing MgCl2 and CaCl2, fixed in 4% paraformaldehyde/PBS for 15 min at room temperature (RT), and then washed with PBS. Cells were then stained with Hoechst 33342 before imaging on a Zeiss LSM 900 Scanning Confocal Microscope using an oil-immersion 63× objective lens (NA 1.4). Laser excitation wavelengths and collection ranges appropriate to the fluorophores of each sample were used to detect the emission spectra of the specific combination of Hoechst 33342 (excitation at 405 nm, emission monitored at 410–530 nm) and FAM 555 (excitation at 496 nm, emission monitored at 511–579 nm). All images were digitally processed for presentation with ImageJ (Rasband, W. 2014. ImageJ. U.S. National Institutes of Health, Bethesda, MD). Confocal microscopy images of cellular uptake of inhibitors are provided in Fig. 5a and Supplementary Fig. S7.

MTT cell viability assay

MCF-7 or MDA-MB-453 cells were seeded in 96-well plates at a density of 1.8 × 10⁴ and 2 × 10⁴ cells/well, respectively, in 90 µL complete DMEM and incubated for 24 h to adhere under the culture conditions described above. Then, 10 µL of transfection mixture containing Opti-MEM™ I Reduced-Serum Medium (Gibco, Thermo Fisher), the respective DNA hairpins and 0.2 µL of X-tremeGENE™ HP DNA Transfection Reagent (Roche) was added to the cells, which were incubated for 24 h or 48 h. Cell viability in the absence or presence of hairpins was assessed by MTT assay. Briefly, 10 µL MTT solution (Biotium) was incubated with the cells for 3 h. Formazan crystals formed in living cells were solubilized with 200 µL/well of DMSO. The absorbance was read at the test and reference wavelengths (λ) of 570 and 620 nm, respectively, on a POLARstar Omega plate reader (BMG Labtech). The percentage of living cells was calculated as follows:

$$\mathrm{Viability}\,(\%)=\frac{\mathrm{OD}_{570\mathrm{exp}}-\mathrm{OD}_{620\mathrm{exp}}}{\mathrm{OD}_{570\mathrm{cont}}-\mathrm{OD}_{620\mathrm{cont}}}\times 100,$$
(2)

where OD570exp and OD570cont correspond to optical density in experimental and control wells, respectively, at λ = 570 nm, and OD620exp and OD620cont correspond to optical density in experimental and control wells, respectively, at λ = 620 nm. Results are shown in Supplementary Fig. S8a.
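Eq. (2) can be written as a short function. The sketch below is only an illustration with made-up optical densities; the function name and the guard against a zero control signal are assumptions, not part of the plate-reader workflow.

```python
def viability_percent(od570_exp, od620_exp, od570_cont, od620_cont):
    """Percentage of living cells from reference-corrected optical densities (Eq. 2)."""
    control_signal = od570_cont - od620_cont
    if control_signal == 0:
        raise ValueError("control wells give zero corrected absorbance")
    return (od570_exp - od620_exp) / control_signal * 100.0


# Hypothetical readings: treated wells show half the corrected control signal.
if __name__ == "__main__":
    print(viability_percent(0.55, 0.05, 1.05, 0.05))
```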

Inhibition of A3A activity in cellulo

As summarized in Supplementary Fig. S8b, HEK 293T cells stably transduced with the live-cell deaminase reporter have been reported60,61. Semi-confluent cells in a 24-well TPP plate (#Z707791, Millipore Sigma, Merck, DE) were transfected with the following: pcDNA3.1-A3A-HA (20 ng), LTR-gRNA-Cas9n-UGI-NLS-Puro-LTR (400 ng), and varying concentrations of inhibitor (or control) oligos using TransIT-LT1 (Mirus Bio) as the transfection reagent. The plate was imaged using an Incucyte (Sartorius, USA) over the course of 68 h. Live-cell images of orange, green, and phase image channels were captured with a 4× objective every four hours after the initial transfection, with five images per well in a fixed grid. mCherry- and GFP-positive cells were identified with internal cellular analysis software. The rolling slope for each concentration between 30 and 50 h was calculated using GraphPad Prism (Dotmatics) software (first derivative between two time points) and represented as an average in Fig. 5c. After 68 h, cells were collected and prepared for immunoblots. Replicate data are provided in Supplementary Fig. S9.
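The rolling-slope calculation can be sketched as follows. This is one plausible reading of "first derivative between two time points", not the GraphPad Prism procedure itself: slopes are taken between consecutive time points that both lie within the 30–50 h window and are then averaged. The function name and the signal values are hypothetical.

```python
def mean_rolling_slope(times_h, values, start_h=30.0, end_h=50.0):
    """Average slope between consecutive time points lying inside [start_h, end_h]."""
    if len(times_h) != len(values):
        raise ValueError("times and values must have the same length")
    slopes = []
    for i in range(len(times_h) - 1):
        t0, t1 = times_h[i], times_h[i + 1]
        # Keep an interval only if both of its end points fall in the window.
        if t0 >= start_h and t1 <= end_h:
            slopes.append((values[i + 1] - values[i]) / (t1 - t0))
    if not slopes:
        raise ValueError("no pair of consecutive time points inside the window")
    return sum(slopes) / len(slopes)


# Hypothetical reporter signal sampled every 4 h over 68 h, rising by 2 units per hour.
if __name__ == "__main__":
    times = list(range(0, 72, 4))
    signal = [2.0 * t + 1.0 for t in times]
    print(mean_rolling_slope(times, signal))
```

With imaging every four hours, the window contains the time points 32, 36, 40, 44 and 48 h, so four intervals contribute to the average.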

Immunoblots

Immunoblots were prepared as described in ref. 61. Samples were separated by either a 4–20% Criterion TGX Precast gel (#5671095, Bio-Rad, USA) or a 4–15% Mini-PROTEAN TGX gel (#4561084, Bio-Rad, USA), and then transferred to nitrocellulose membranes (#1620112, Bio-Rad, USA). Primary antibodies included mouse anti-Tubulin (#T5168, 1:10000, Sigma-Aldrich), rabbit anti-HA (#3724S, 1:2500, Cell Signaling), and rabbit anti-Cas9 (#ab189380, 1:5000, Abcam). Secondary antibodies used were goat anti-rabbit IRdye800 (LI-COR, #925-32211, 1:10000) and goat anti-mouse IRdye680 (LI-COR, #926-69020, 1:10000). Raw immunoblots are included as Supplementary Fig. S10. The ladder used to mark the molecular weight was the PageRuler Prestained Protein Ladder (10 to 180 kDa, #26616 ThermoFisher Scientific).

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.