Structural basis of sequence-specific cytosine deamination by double-stranded DNA deaminase toxin DddA

The interbacterial deaminase toxin DddA catalyzes cytosine-to-uracil conversion in double-stranded (ds) DNA and enables CRISPR-free mitochondrial base editing, but the molecular mechanisms underlying its unique substrate selectivity have remained elusive. Here, we report crystal structures of DddA bound to a dsDNA substrate containing the 5′-TC target motif. These structures show that DddA binds to the minor groove of a sharply bent dsDNA and engages the target cytosine extruded from the double helix. DddA Phe1375 intercalates in dsDNA and displaces the 5′ (−1) thymine, which in turn replaces the target (0) cytosine and forms a noncanonical T–G base pair with the juxtaposed guanine. This tandem displacement mechanism allows DddA to locate a target cytosine without flipping it into the active site. Biochemical experiments demonstrate that DNA base mismatches enhance the DddA deaminase activity and relax its sequence selectivity. On the basis of the structural information, we further identified DddA mutants that exhibit attenuated activity or altered substrate preference. Our studies may help design new tools useful in genome editing or other applications.

Enzymatic deamination of cytosines in DNA plays key roles in various important biological processes, including innate immune responses against viruses and transposons, antibody diversification in adaptive immunity and the accumulation of somatic mutations in various human cancers [1][2][3][4]. The activity of APOBEC family single-stranded (ss) DNA cytosine deaminases has also been harnessed in base-editing technologies, where an engineered Cas9-guide RNA complex directs APOBECs for site-specific C-to-T base substitutions in genomic DNA without making double-strand breaks 5. Cytosine deamination by the APOBEC enzymes is sequence selective; for instance, human APOBEC3A (A3A) and APOBEC3B (A3B) only deaminate cytosines in a 5′-TC sequence context (the deaminated C is the 3′ base of the motif), which is responsible for the characteristic 'APOBEC signature' mutations found widely in cancer genomes 6,7. Structural studies have shown that A3A and A3B bind ssDNA substrates in a U-shaped conformation, with the thymine base 5′ (−1) to the target cytosine flipped out and making specific contacts with the protein 8. A similar mode of hairpin-shaped substrate engagement was observed for a distantly related bacterial transfer RNA adenosine deaminase, TadA, which served as the template for an evolved DNA adenine deaminase capable of A-to-G conversion in base editing 9,10.
Recent studies have identified a dsDNA deaminase from Burkholderia cenocepacia, DddA, an interbacterial toxin that is delivered to contacting cells by the type VI secretion system and mediates antagonism between Gram-negative bacteria 11,12. Interestingly, DddA shares a strong preference for the 5′-TC target sequence with A3A, A3B and several other APOBEC family members 12. However, unlike APOBECs that only deaminate ssDNA, DddA selectively deaminates cytosines in dsDNA. The unique activity of DddA allowed Mok et al. to develop CRISPR-free DddA-derived cytosine base editors, which enable C-to-T base editing in mitochondrial, chloroplast and nuclear DNA [12][13][14][15][16][17][18][19][20]. Furthermore, Cho et al. showed that a catalytically inactive DddA mutant (E1347A) fused to the TadA-derived DNA adenine deaminase mediates targeted A-to-G editing in human mitochondrial DNA, where DddA may assist in unwinding/melting of the dsDNA substrate 21. In addition, DddA has been adapted by Gallagher et al. for genome-wide protein-DNA interaction site mapping in bacteria 22. However, despite its useful applications, molecular mechanisms underlying the biochemical activities of DddA have remained unknown. Here we report crystal structures of DddA in complex with dsDNA and corroborating biochemical data, which together reveal a unique mechanism of substrate DNA recognition by DddA.

Overall structure of DddA-dsDNA complex
To understand how DddA interacts with dsDNA substrates, we crystallized the toxin domain (Gly1290 to Pro1422) of B. cenocepacia DddA in complex with a 14-base pair (bp) dsDNA substrate containing the 5′-TC target sequence (Fig. 1a,b). DddA with a substitution of the catalytically essential glutamic acid residue (E1347A) was used to capture the enzyme-substrate complex. The structure of the DddA-dsDNA complex was determined in two different crystal forms and refined to 2.39 and 2.62 Å resolution, respectively (Table 1). The crystal structures show that DddA engages the minor groove of a sharply bent dsDNA (Fig. 1c,d). The structures obtained in the two crystal forms are very similar overall, with a root mean square deviation (r.m.s.d.) of 1.37 Å for all protein and DNA atoms, and of 0.45 Å for the protein backbone atoms, although they differ in the conformation of the target (0) 2′-deoxycytidine nucleotide. In the first structure (PDB 8E5E), the target cytosine base is completely flipped out of the DNA double helix and captured in the active site pocket, where it interacts with the Zn ion (Fig. 2a,c and Extended Data Figs. 1 and 2). In the second structure (PDB 8E5D), the target cytosine is parked in the major groove via a T-shaped stacking on the edge of the adjacent (+1) cytosine base, and the active site pocket is occupied by a phosphate ion (Fig. 2b,c and Extended Data Fig. 1). In both structures, the dsDNA substrate bound by DddA is bent away from the protein by ~80°, which leads to a substantially widened minor groove (groove width up to 15 Å, in comparison to 6 Å in the B-form DNA; calculated using CURVES+) 23, allowing for direct base contacts by the protein. Correspondingly, several nucleotides surrounding the 5′-T(−1)C(0) motif, including G (−2) and C (+1) of the deaminated strand and A (−1) of the complementary strand (unpaired due to the shift of −1 T; see below), show the A-form-like C3′-endo sugar pucker in both structures.
The structure of the Zn-dependent deaminase fold of DddA in complex with DNA shows minimal changes from that in complex with the immunity protein DddI (PDB 6U08) 12, with an overall backbone r.m.s.d. of 0.50 and 0.62 Å, respectively, for the two DNA-bound structures. A structural comparison highlights DNA mimicry by DddI in blocking the active site of DddA (Extended Data Fig. 3). Besides the active site zinc ion, in both our DddA-dsDNA structures we observed electron density for a putative metal ion octahedrally coordinated by the backbone carbonyl oxygen of Glu1381, Thr1382, Leu1384 and Asn1417, and both the backbone and side chain oxygen atoms of Asn1415. This density was modeled as a magnesium ion, which appears to stabilize the DddA residues important for DNA binding (Extended Data Fig. 4). Biochemical experiments showed that, although the bound magnesium ion is not essential, it enhances DddA deaminase activity (Extended Data Fig. 5), which is consistent with its structural role.

Mechanism of TC motif recognition
The minor groove interaction by DddA is centered on Phe1375, which intercalates in dsDNA and displaces the thymine at the −1 position (5′ to the target cytosine) (Fig. 1c,d and Extended Data Fig. 1). The displaced thymine in turn replaces the target (0) cytosine extruded from the double helix (Fig. 3a). This unique arrangement is stabilized by bifurcated hydrogen bonds donated to the thymine O4 atom from the juxtaposed guanine base N1 and N2 atoms (Fig. 3b). His1345, which is one of the Zn-coordinating residues, also donates a hydrogen bond to the thymine O2 atom. Thus, the strong 5′-TC preference of DddA appears to reflect the favorable interaction made by the −1 T base in replacing the target cytosine in the double helix. The noncanonical T-G interaction, which is distinct from the G•T wobble pair commonly observed in RNA secondary structures, is further stabilized by van der Waals contacts made by Ala1341 and a hydrogen bond between the carbonyl oxygen of Pro1338 and the guanine base N2 atom (Extended Data Fig. 6). Met1379 complements Phe1375 and Ala1341 to form a cluster of hydrophobic side chains inserted into the minor groove, interacting with the orphan (unpaired) adenine at the −1 position and stabilizing unstacked bases of dsDNA in the distorted conformation (Figs. 1d and 3a).
Upstream of the 5′-TC motif, Asn1378 and Arg1403 are inserted into the DNA minor groove and interact with guanine at position −2 of the deaminated strand and thymine at −4 of the complementary strand, respectively, which may modestly contribute to sequence preferences (Extended Data Fig. 6). Binding of DddA to the bent DNA is also supported by interaction with the backbone phosphate groups from both strands, involving residues Ser1331, Asn1339, Tyr1340, Lys1402 and Lys1420 (Extended Data Fig. 6).
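
As a rough illustration of the 5′-TC sequence context, the following Python sketch lists candidate target cytosines (those directly preceded by a 5′ thymine) on both strands of a duplex. It is not part of the published analysis: the function names are hypothetical, only the primary sequence is considered (no structural or energetic terms), and the example sequence is the 14-mer oligonucleotide described for the DddA activity assay.

```python
from __future__ import annotations

# Translation table for Watson-Crick complements of unmodified DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_tc_sites(seq: str) -> list[int]:
    """Return 0-based indices of cytosines directly preceded by a 5' thymine."""
    seq = seq.upper()
    return [i for i in range(1, len(seq)) if seq[i - 1 : i + 1] == "TC"]


def scan_duplex(top: str) -> dict[str, list[int]]:
    """Scan a strand and its complement; indices refer to each strand read 5'->3'."""
    return {
        "top": find_tc_sites(top),
        "bottom": find_tc_sites(reverse_complement(top)),
    }


if __name__ == "__main__":
    substrate = "GCAACGTCCGGTAC"  # 14-mer from the DddA activity assay
    print(scan_duplex(substrate))
```

For this oligonucleotide the scan reports a single 5′-TC site on the labeled strand and none on the complementary strand, which carries the opposing 5′-GA.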

Base mismatches promote DddA activity
On the basis of the highly distorted conformation of the dsDNA bound to DddA, we reasoned that base mismatches at the target (0) or 5′ (−1) position would destabilize the double helical structure of the substrate and facilitate DNA deamination by DddA. Thus, we compared DddA activity on fully base-paired, singly mismatched (at position 0), and doubly mismatched (at positions 0 and −1) 14-bp dsDNA substrates (Fig. 4a,b). DddA deaminates cytosine in the 5′-TC motif in the fully base-paired substrate, in which the complementary strand has opposing 5′-GA (Fig. 4b, lane 6). Using a complementary strand with a single mismatch (5′-TA) led to enhanced activity, confirming our hypothesis (Fig. 4b, lane 4). The deamination reaction was even more efficient with a complementary strand with double mismatches (5′-TT), consistent with our structural observation that substrate engagement by DddA requires disruption of base pairs at both positions 0 and −1 (Fig. 4b, lane 5). Next, we further hypothesized that base mismatches may relax the 5′-TC requirement of DddA and examined whether DddA can deaminate cytosines preceded by different −1 bases (5′-GC, 5′-CC, 5′-AC) when paired with mismatched complementary strands (Fig. 4c). For the original complementary strand with 5′-GA, which would generate mismatches at the −1 position, we observed DddA-mediated deamination on all three substrates to a varying extent; the activity was highest on 5′-AC and poor on 5′-GC (Fig. 4c, lanes 9-11). For the complementary strand with opposing 5′-TT, we also observed deamination on all three substrates but the preferences were reversed; the activity was highest on 5′-GC and modest on 5′-AC, which forms a single mismatch at position 0 (Fig. 4c, lanes 6-8). With opposing 5′-TA, the activity was high on all three doubly mismatched substrates (Fig. 4c, lanes 3-5). Of note, the 5′-CC target was deaminated at both (−1 and 0) cytosines (Extended Data Fig. 7).
These results show that base mismatches at either position 0 or −1 eliminate the 5′-TC requirement of DddA, although the sequence context matters in some cases.
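
The pairing logic behind these substrate designs can be sketched in a few lines of Python. This is an illustrative helper, not code from the study; the function name is hypothetical, and it only checks Watson-Crick complementarity between the target dinucleotide (positions −1 and 0, written 5′ to 3′) and the opposing dinucleotide on the complementary strand (also written 5′ to 3′, so its first base faces position 0).

```python
from __future__ import annotations

# Canonical Watson-Crick pairs (target-strand base, opposing base).
WATSON_CRICK = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def mismatch_positions(target: str, opposing: str) -> list[int]:
    """Return the positions (-1 and/or 0) that are NOT Watson-Crick paired.

    target   -- dinucleotide on the deaminated strand, 5'-(-1)(0)-3', e.g. "TC"
    opposing -- dinucleotide on the complementary strand, 5'->3', e.g. "GA"
    """
    if len(target) != 2 or len(opposing) != 2:
        raise ValueError("both arguments must be dinucleotides")
    minus1, zero = target.upper()
    # Strands are antiparallel: the 5' base of `opposing` faces position 0.
    opp_zero, opp_minus1 = opposing.upper()
    mismatched = []
    if (minus1, opp_minus1) not in WATSON_CRICK:
        mismatched.append(-1)
    if (zero, opp_zero) not in WATSON_CRICK:
        mismatched.append(0)
    return mismatched


if __name__ == "__main__":
    for target in ("TC", "GC", "CC", "AC"):
        for opposing in ("GA", "TA", "TT"):
            print(f"5'-{target} / 5'-{opposing}: {mismatch_positions(target, opposing)}")
```

Under these assumptions, 5′-TC opposite 5′-GA is fully paired, 5′-TC opposite 5′-TA is mismatched only at position 0, and 5′-AC opposite 5′-TT is likewise a single mismatch at position 0, matching the substrate descriptions in the text.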

DddA mutants
To dissect structure-function relationships, we explored amino acid substitutions for key DNA-interacting residues of DddA (Fig. 5a and Supplementary Fig. 1). As mentioned above, a triad of hydrophobic residues, Ala1341, Phe1375 and Met1379, supports unstacked bases of DddA-bound dsDNA in the minor groove (Figs. 1 and 3a,b). For Ala1341, which abuts against the noncanonical T-G base pair, we tested substitutions of Ser, Thr, Glu, Tyr and Pro. Of these mutants, only DddA A1341P retained activity on the canonical substrate (5′-TC/GA), and it showed the 5′-TC preference (Fig. 5a,b). Interestingly, although the activity of DddA A1341P on the fully base-paired substrate was weaker than that of the wild type, DddA A1341P showed higher activities than the wild type on all mismatch-containing substrates (Fig. 5c,d, compare with Fig. 4b,c). The hydrophobic proline side chain inserted more deeply (than alanine) into the minor groove may interact favorably with unpaired DNA bases. For the DNA-intercalating residue Phe1375, either Ala (F1375A) or Arg (F1375R) substitution led to a complete loss of the deaminase activity, while a variant with Tyr substitution (F1375Y) showed residual activity, which highlights the importance of the π-stacking interaction (Fig. 5a). DddA F1375Y also showed activities on mismatched substrates (Supplementary Fig. 1). For Met1379, either Ala (M1379A) or Arg (M1379R) substitution abolished the deaminase activity (Fig. 5a). These results show the importance of the hydrophobic patch of DddA in DNA substrate engagement and that structural perturbation of this region affects the target preference.
One of the DddA residues positioned near the DNA backbone is Glu1370, which forms a part of the rim of the deep active site pocket along with Tyr1307 (Fig. 2c). In the structure with the cytosine base parked in the DNA major groove, the Glu1370 side chain points away from the DNA (Fig. 2b). When the cytosine base is engaged in the active site pocket, Glu1370 appears to be oriented toward DNA with ~3.7 Å between the carboxyl and phosphate groups, although weak electron density suggests high flexibility of this side chain (Fig. 2a and Extended Data Fig. 1a). Substitution of either Lys (E1370K) or Arg (E1370R) for Glu1370, which installs a positive charge to interact favorably with the DNA backbone phosphate, made DddA less active than the wild type (Fig. 5a and Supplementary Fig. 1). It is possible that the dynamics of this residue play a role in flipping the target cytosine base into the active site. Lastly, replacing His1345 with Cys, an alternative Zn-coordinating residue as found in some cytidine deaminases 24, abolished the DddA activity (Fig. 5a).
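
For readers who wish to tabulate these variants, the substitution shorthand used here (wild-type residue, position, replacement residue, as in E1347A) can be parsed with a short Python sketch. The helper names are hypothetical, and the activity labels are informal paraphrases of the qualitative results stated in the text for the canonical 5′-TC/GA substrate, not quantitative data.

```python
from __future__ import annotations

import re
from typing import NamedTuple

# One-letter to three-letter codes for the residues mentioned in this section.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "C": "Cys", "E": "Glu", "F": "Phe", "H": "His",
    "K": "Lys", "M": "Met", "P": "Pro", "S": "Ser", "T": "Thr", "Y": "Tyr",
}


class Substitution(NamedTuple):
    wild_type: str
    position: int
    mutant: str


def parse_substitution(code: str) -> Substitution:
    """Parse shorthand such as 'F1375Y' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code.strip())
    if match is None:
        raise ValueError(f"not a single-residue substitution: {code!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


def describe(code: str) -> str:
    """Spell out a substitution, e.g. 'F1375Y' -> 'Phe1375 -> Tyr'."""
    sub = parse_substitution(code)
    return f"{THREE_LETTER[sub.wild_type]}{sub.position} -> {THREE_LETTER[sub.mutant]}"


# Qualitative outcomes paraphrased from the text (labels are informal).
REPORTED_ACTIVITY = {
    "A1341P": "active, weaker than wild type on fully paired substrate",
    "F1375A": "inactive",
    "F1375R": "inactive",
    "F1375Y": "residual activity",
    "M1379A": "inactive",
    "M1379R": "inactive",
    "E1370K": "less active than wild type",
    "E1370R": "less active than wild type",
    "H1345C": "inactive",
}

if __name__ == "__main__":
    for code, outcome in REPORTED_ACTIVITY.items():
        print(f"{code:7s} {describe(code):15s} {outcome}")
```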

Discussion
Our structural studies show that the DddA active site captures the target cytosine base that has completely swung out of the DNA double helix (Figs. 1 and 2a,c). Similar base-flipping mechanisms have been observed for various nucleic acid repair or modifying enzymes, including DNA glycosylase, cytosine methyltransferase, dsRNA adenosine deaminase and lesion-specific endonuclease [25][26][27][28][29][30][31][32]. A hallmark feature of these enzymes is the intercalation of an amino acid side chain into DNA/RNA base stacks to fill a void in the double helix 33. Another frequently observed feature is a sharp kink in the dsDNA substrate with unstacked bases, which also facilitates base flipping 25,29,31,34. DddA uses both these strategies: the dsDNA bound by DddA is sharply bent at the base step 5′ to the 5′-TC motif, and Phe1375 inserts deeply into the minor groove. However, the mechanism of base flipping by DddA is distinct in that the intercalated phenylalanine replaces the adjacent (−1) thymine rather than the target (0) cytosine base itself (Fig. 3). This unique arrangement causes tandem displacement and a shift in the register of base pairing, with the target cytosine base extruded from the double helix. The DddA-dsDNA structure trapped with the target cytosine parked in the major groove (Fig. 2b and Extended Data Fig. 1b) suggests that DddA can locate 5′-TC motifs in double-stranded DNA without engaging the cytosine base in the active site. It may represent an intermediate conformation that allows DddA to scan through a DNA sequence to locate target cytosines.
The mechanism of 5′-TC target recognition by DddA is distinct from that of APOBEC family ssDNA deaminases (Extended Data Fig. 8). We showed previously that ssDNA substrates bound to A3A and A3B take a U-shaped conformation with the −1 thymine base bound in a groove on the enzyme surface, where it forms hydrogen bonds with a key Asp side chain 8. In contrast, the −1 thymine in dsDNA bound to DddA remains intrahelical and is paired with a guanine base, where it makes both DNA base (guanine) and protein side chain (His1345) contacts (Fig. 3b). The hydrogen bonding to a DNA base in the widened minor groove by Zn-coordinating His1345 of DddA is distinct from the shape readout mechanism through histidine insertion into a compressed minor groove used by various DNA-binding proteins 35. [...] by the dramatically relaxed target sequence selectivity of DddA on mismatch-containing dsDNA substrates. Residual sequence dependence observed for the mismatched substrates (for example, Fig. 4c, lane 9 versus lane 11) may reflect how efficiently the −1 base replaces the target (0) cytosine by interacting with its juxtaposed base and the surrounding protein residues, including His1345, in the distorted dsDNA conformation.
While most amino acid substitutions that affect the key DNA minor groove interaction of DddA led to a loss of the enzymatic activity, several mutant enzymes retained DNA deaminase activity (Fig. 5 and Supplementary Fig. 1). These attenuated DddA variants could be useful in reducing off-target mutations or alleviating cytotoxicity in base editing, as shown in recent studies 16,19. In addition, the enhanced activity of DddA A1341P toward mismatch-containing substrates (Fig. 4c compared with Fig. 5d) suggests that it might be possible to engineer DddA to expand its targets. In this context, it is notable that recent directed evolution experiments have identified DddA11, a DddA variant containing A1341V and E1370K amino acid substitutions, which can edit non-TC targets in both mitochondrial and nuclear DNA 18. Our studies reported here will be instrumental in further structure-based engineering of DddA for base editing or other new applications, either as the deaminase catalytic component or a vehicle for other DNA-modifying enzymes.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41594-023-01034-3.

Methods
The DddA-dsDNA crystals were cryo-protected by brief soaking in the respective reservoir solution supplemented with 20% ethylene glycol and flash cooled by plunging in liquid nitrogen. X-ray diffraction data were collected at the NE-CAT beamline 24-ID-C of the Advanced Photon Source (Lemont, IL). The 8E5E dataset was processed using DIALS (https://dials.github.io). The 8E5D dataset exhibited anisotropic diffraction, and the dataset was processed with autoPROC 37, which implements XDS 38 for integration, followed by three other programs from CCP4 Suite 39: POINTLESS 40, AIMLESS 41 and TRUNCATE 42 for reduction, scaling and structure factor calculation, respectively. Anisotropic diffraction analysis and truncation were done with STARANISO (https://staraniso.globalphasing.org/). The structures were determined by molecular replacement with PHASER 43 using the previously reported inhibitor (DddI)-bound DddA structure 12 (PDB 6U08) as the search model. Iterative model building and refinement were conducted using Coot 44 and PHENIX 45. The final resolution cutoffs for both crystal structures were determined by paired refinement 46 (Extended Data Fig. 9). A summary of crystallographic data statistics is shown in Table 1. Figures were generated using PyMOL (https://pymol.org/2/).

DddA activity assay
To reconstitute the active enzyme, DddA(1290-1396) was mixed with 10× molar excess of a chemically synthesized and HPLC-purified (purity >90%, BIOMATIK) C-terminal peptide corresponding to the residues 1397-1422 (GAIPVKRGATGETKVFTGNSNSPKSP). The deaminase assay was conducted with a 5′-fluorescein-labeled 14-mer DNA oligonucleotide (5′-GCAACGTCCGGTAC-3′) or its variants with different −1 bases [...] 29. The reactions were stopped by the addition of formamide to 65% and heating to 95 °C for 10 min. The products were separated by gel electrophoresis on a 15% polyacrylamide TBE-urea denaturing gel and visualized by scanning on a Typhoon FLA 9500 imager. For every experiment, the activity of pfuEndoQ was verified on a control DNA oligonucleotide with dU (2′-deoxyuridine) in place of the target C. Specifically in the experiment shown in Extended Data Fig. 7, 3′-fluorescein-labeled DNA substrates were used. All oligonucleotides were obtained from Integrated DNA Technologies.
For investigating metal ion dependency, DddA(1290-1396) purified in the presence of 1 mM EDTA was first dialyzed overnight against the SEC buffer containing 0.5 mM TCEP and no EDTA in a Slide-A-Lyzer MINI Dialysis Device, 3.5 kDa MWCO. The dialyzed protein was quantitated by measuring UV absorbance and subjected to the deaminase assay as above in four modified buffer conditions, including (1) no added metal ions, (2) 20 μM ZnCl2, (3) 1.0 mM MgCl2, (4) 20 μM ZnCl2 and 1.0 mM MgCl2, with molar ratios between DddA(1290-1396) and DddA(1397-1422) of 1:1 and 1:10 (Extended Data Fig. 5). The Mg2+-free reactions were supplemented with 1 mM MgCl2 upon the addition of pfuEndoQ and heating to 60 °C.
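
The metal ion dependency experiment combines four buffer conditions with two enzyme-to-peptide molar ratios. A minimal Python sketch of that reaction matrix is given below for bookkeeping purposes only; the dictionary keys and function name are hypothetical, and concentrations not stated in the text (for example, enzyme or DNA concentrations) are deliberately omitted.

```python
from __future__ import annotations

from itertools import product

# Four buffer conditions from the metal ion dependency assay.
METAL_CONDITIONS = {
    "no added metal ions": {"ZnCl2_uM": 0.0, "MgCl2_mM": 0.0},
    "Zn only": {"ZnCl2_uM": 20.0, "MgCl2_mM": 0.0},
    "Mg only": {"ZnCl2_uM": 0.0, "MgCl2_mM": 1.0},
    "Zn and Mg": {"ZnCl2_uM": 20.0, "MgCl2_mM": 1.0},
}

# Molar equivalents of DddA(1397-1422) peptide per DddA(1290-1396).
PEPTIDE_EQUIVALENTS = (1, 10)


def reaction_grid() -> list[dict]:
    """Enumerate every buffer condition at every peptide ratio."""
    grid = []
    for (label, metals), ratio in product(METAL_CONDITIONS.items(), PEPTIDE_EQUIVALENTS):
        grid.append({"condition": label, **metals, "peptide_equivalents": ratio})
    return grid


if __name__ == "__main__":
    for reaction in reaction_grid():
        print(reaction)
```

Enumerating the grid this way gives eight reactions (four buffer conditions at each of the two molar ratios).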

Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
Atomic coordinates and structure factors have been deposited in the Protein Data Bank (PDB) under accession codes 8E5E and 8E5D. Source data are provided with this paper. All other data are available from the authors upon request.