Dear Editor,

DNA phosphorothioate (PT) modification, originally developed as an artificial tool to stabilize oligodeoxynucleotides against nuclease degradation1, was recently found to occur as a novel physiological modification in which sulfur is incorporated into the DNA backbone by the products of the five-gene dnd cluster (dndA-dndE) in a sequence- and stereo-specific manner2. This PT modification causes the DNA degradation (Dnd) phenotype and is widespread and quantized in bacterial genomes, working as part of a restriction-modification system3,4,5. This modification can be specifically cleaved in vitro by a type IV restriction endonuclease6. DndA works as a cysteine desulfurase and assembles DndC as a 4Fe-4S cluster protein7. DndC possesses ATP pyrophosphatase activity8,9 and is predicted to have 3′-phosphoadenosine-5′-phosphosulfate (PAPS) reductase activity, whereas DndB has homology to a group of transcriptional regulators10,11. DndD, known as SpfD in Pseudomonas fluorescens Pf0-1, has ATPase activity possibly related to DNA structure alteration or nicking during PT incorporation5,12. Sequence identity (46%) and similarity (61%) to phosphoribosylaminoimidazole carboxylase (NCAIR synthetase) from Anabaena variabilis11 suggest that DndE could be an NCAIR synthase analogue13. However, DndE may also act as a sulfotransferase, owing to a specific PAPS-binding sequence, AAVGK-TLLIHLHR, in the C-terminus of DndE from Streptomyces lividans (DndEstrep)10. Therefore, the exact function of DndE remains unknown.

To address this, we over-expressed two DndE proteins, from Escherichia coli B7A (DndEB7A-full) and Salmonella enterica (DndESalm-full) (Supplementary information, Data S1), both having isoelectric point values close to 9.5. Sequence alignment indicates that they have high similarity except for several residues in their C-termini (Figure 1A). In both strains, PT modification occurs selectively at a d(GPSA) or d(GPST) dinucleotide with an RP PT bond5. DndEB7A-full is unstable, so we used its C-terminus-truncated form (aa 1-110, DndEB7A-N110) for crystallization. The aggregation states of the DndE proteins were determined by dynamic light scattering (DLS) assay (Supplementary information, Data S1), which indicated the formation of tetramers at a concentration of 200 μM. Two-dimensional NMR 1H-15N HSQC spectra of 15N-labeled DndEB7A-N110 suggest that the DndEB7A-N110 fragment in Tris buffer is more homogeneous (Supplementary information, Figure S1). The structure was determined using the single-wavelength anomalous dispersion technique, and refined to 2.5 Å resolution with an Rfree factor of 24.2% and an R factor of 20.1% (Supplementary information, Table S1).

Figure 1

DndE structure analysis and fluorescence polarization assay for DNA-binding measurement. (A) Sequence alignment of DndEB7A-full and DndESalm-full with structural elements labeled at the top. (B) Ribbon diagram of the DndEB7A-N110 tetramer; monomer A is shown in green, E in purple, C in royal blue and M in yellow. (C, D) Interfaces between A-M monomers and between A-E monomers. (E) Cleft formed in the DndEB7A-N110 tetramer. (F) The view rotated by 90° relative to the x axis from E. (G) Electrostatic surface representation of the DndEB7A-N110 tetramer in the same orientation as in B. (H) The view rotated by 90° relative to the x axis from G. (I-K) Fluorescence polarization assay for DNA interaction with (I) DndESalm-full, (J) DndEB7A-N110 and (K) DndEB7A-N110 mutants, respectively. At the bottom, the sequences of the DNA substrates are indicated, where the blue star represents phosphate.

The DndEB7A-N110 structure reveals a tetramer conformer (Figure 1B), consistent with the results from the DLS assay. The tetramer adopts a quadrate fold, resembling a four-leaf clover of 62 Å in both length and width and 32 Å in height. It contains two kinds of dimers (A-E and A-M) that differ in stability because of the different numbers of H-bonds in their dimer interfaces (Figure 1C and 1D). Two clefts (∼60° and ∼90°) are formed by the dimers (Figure 1E and 1F). Electrostatic surface analysis indicates that the cleft-containing side is more hydrophobic than its opposite side (Figure 1G and 1H). The K20 side-chain in each monomer extends into the tetramer center, forming hydrogen bonds with G24 and/or G21 in the next monomer, producing a large positively charged hole (Figure 1B-1D and 1G). The four monomer structures are almost identical and can be superimposed with a root-mean-square deviation of 0.33 Å over 59 backbone Cα atoms in the secondary structural region. Each monomer comprises five helices: H1 (aa 10-22), H2 (aa 27-39), H3 (aa 62-66), H4 (aa 70-77) and H5 (aa 86-102) (Figure 1A). Together with the H2 helix, a long flexible loop (aa 42-60) between the H2 and H3 helices encloses the H5 helix, where the residues R49, D50, S51 and K52 are not observable in some monomers.

Dali search indicates that DndEB7A-N110 is a novel DNA-binding protein (Supplementary information, Figure S2). Thus, we performed DndEB7A-N110 binding studies using different DNA substrates in vitro, including dsDNA with or without a specific sequence (5′-GAAC-3′) for PT modification as found in E. coli B7A and Salmonella (specific dsDNA and random dsDNA in Figure 1). Since DndD is an ATPase with putative DNA-nicking activity5,12, we assumed that, before PT modification, DndD might hydrolyze the P-O bond between the G-A bases in the PT modification sequence. Thus, two nicked dsDNA species were also tested (in one, the nick is located between the G-A bases in a specific sequence for PT modification, named nicked SdsDNA; in the other, the nick is placed between the T-C bases in a random sequence, denoted nicked RdsDNA, as shown in Figure 1), as well as a sequence-specific single-stranded DNA (ssDNA in Figure 1). Among the different DNA substrates, DndESalm-full has the strongest binding affinity for nicked dsDNA (for both nicked SdsDNA and nicked RdsDNA, KD ∼20 μM), indicating that DndE might be a nicked dsDNA binding protein (Figure 1I), and that the specific –GAAC– sequence might not be relevant for nicked DNA binding. DndEB7A-N110 has a higher binding affinity for nicked SdsDNA (KD ∼30 μM) than for nicked RdsDNA (KD ∼60 μM), specific dsDNA (KD ∼90 μM), random dsDNA (KD ∼146.5 μM), and specific ssDNA (binding is non-detectable) (Figure 1J, Supplementary information, Table S2), suggesting that DndEB7A-N110 also slightly prefers to bind nicked dsDNA. Among the mutants in which positively charged residues on the DndEB7A-N110 surface, including K17, K18, K20, K53, K87 and K91, are replaced by alanine, the K20A and K18A mutants showed significant decreases in binding affinity for specific dsDNA (K20A, KD ∼690 μM; K18A, non-detectable). All these variants showed much weaker (K87A, KD 323 μM, decreased by 10-fold) or non-detectable binding to nicked SdsDNA (Figure 1K), further supporting that all these positively charged residues might be important for interaction with nicked dsDNA.
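To make the reported affinities easier to compare, the following minimal Python sketch converts the KD values quoted above into fold changes relative to nicked SdsDNA and into an expected fraction of DNA bound. It assumes a simple one-site binding isotherm with negligible ligand depletion; the letter does not state which model was used to fit the fluorescence polarization data, so this model and the helper names (fraction_bound, fold_weaker) are illustrative assumptions, not the authors' procedure.

```python
# Apparent KD values (in uM) quoted in the text for DndEB7A-N110.
KD_UM = {
    "nicked SdsDNA": 30.0,
    "nicked RdsDNA": 60.0,
    "specific dsDNA": 90.0,
    "random dsDNA": 146.5,
}

# KD (in uM) quoted in the text for the K87A mutant with nicked SdsDNA.
KD_K87A_NICKED_SDSDNA_UM = 323.0


def fraction_bound(protein_um, kd_um):
    """Fraction of DNA bound under an ASSUMED one-site model.

    Assumes the free protein concentration is approximately the total
    protein concentration (labeled DNA far below KD).
    """
    return protein_um / (kd_um + protein_um)


def fold_weaker(kd_um, reference_kd_um):
    """How many times larger (weaker) a KD is than a reference KD."""
    return kd_um / reference_kd_um


reference = KD_UM["nicked SdsDNA"]
for substrate, kd in KD_UM.items():
    print(
        f"{substrate}: KD = {kd} uM, "
        f"{fold_weaker(kd, reference):.1f}-fold vs nicked SdsDNA, "
        f"fraction bound at 100 uM protein = {fraction_bound(100.0, kd):.2f}"
    )

k87a_fold = fold_weaker(KD_K87A_NICKED_SDSDNA_UM, reference)
print(f"K87A vs wild type on nicked SdsDNA: {k87a_fold:.1f}-fold weaker")
```

Under this assumed model, the K87A value of 323 μM against the wild-type value of ∼30 μM corresponds to a ratio of roughly 10.8, in line with the approximately 10-fold decrease stated in the text.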

In addition, we measured the interaction of DndEB7A-N110 with PAPS, ATP or ADP using the isothermal titration calorimetry assay. In contrast to the earlier predictions10,11,13, DndEB7A-N110 showed no detectable binding to any of them (Supplementary information, Figure S3), consistent with the fact that the DndEB7A-N110 structure includes neither a 5′-PSB motif nor a 3′-PB motif for PAPS and PAP binding14, nor a glycine-rich P loop for ATP or ADP phosphate binding as seen in the NCAIR synthetase15,16.

Taken together, the crystal structure of DndEB7A-N110 reveals that DndE is neither a sulfotransferase nor an NCAIR synthase analogue, but a possible nicked dsDNA binding protein with a previously unrecognized fold. Thus, DNA nicking (probably by DndD5,12) and nicked DNA binding by DndE might be essential for DNA PT modification.