Introduction

In prokaryotes, control of DNA replication and related processes for cell division are aided by a family of DNA-binding proteins known as nucleoid-associated proteins (NAPs)1. GapR is an essential NAP in Caulobacter crescentus involved in DNA replication, chromosome segregation and cell division2,3,4. It has been shown to bind DNA both in vivo and in vitro3. A preference for AT-rich DNA has been reported, though the strength of this preference differs between reports2,4. GapR has dynamic chromosome-binding patterns and changing GapR distributions during the cell cycle. In C. crescentus “swarmer” cells before chromosome replication, fluorescently tagged GapR shows a concentration gradient across the whole chromosome from high GapR at the origin of replication region to low GapR towards the terminus region3,4. Upon starting chromosome replication, a GapR-depleted region of the chromosome is created that expands while tracking the mitotic-like movements of the newly duplicated DNA3 and the moving replication forks4. Chromatin immunoprecipitation (ChIP) experiments showed that GapR preferentially binds to the 3′ ends of actively transcribed genes, and that blocking transcription with antibiotics quickly (within minutes) dissipates GapR peaks and redistributes GapR towards the 5′ ends of these genes3,5. Arias-Cartin et al. proposed that replication forks encounter and dislodge tightly bound (presumed static) GapR molecules and that GapR molecules released into the cytoplasm rebind the chromosome at random places4.

GapR does not resemble other DNA binding proteins. It is a small protein of ~90 amino acids, with mainly α-helical secondary structure, and until very recently its mode of DNA binding was unknown. Last year, while this work was underway, a highly informative study which included crystal structures of GapR in the presence and absence of DNA was published5. In the reported structure of GapR determined in absence of DNA, GapR is seen to be a dimer, with each monomer folded into two extended α helices. In the structure of GapR determined in presence of DNA, GapR is seen to be a tetramer, with the C-terminal α helix of each monomer reorganized into two shorter helices, which allows GapR to encircle the DNA. The DNA in the published structure is 100% AT, over-twisted and slightly narrower than B-form DNA. In the accompanying experiments, GapR is proposed to associate with positively supercoiled chromosomal DNA5, which is often found ahead of the replication fork and RNA polymerase6. GapR was found to stimulate in vitro the topoisomerases gyrase and topo IV, and removing supercoils is required for DNA replication to proceed in vivo5.

In this paper we present three crystal structures of GapR from C. crescentus, crystallised in the presence of double-stranded DNA with a sequence from the origin of chromosome replication. In all three structures, GapR adopts the same overall tetramer ring form as seen by Guo et al., albeit with small but important increases in the diameter of the central channel. DNA could not be visualized in any of our three structures, but DNA could be detected in gel electrophoresis, indicating that DNA is present in the crystals. Furthermore, these crystals do not form without DNA in the crystallization protocol. Notably, double stranded B-form DNA can be docked into the channel of our structure without any clashes. We identified residues that our structural analyses suggested should be involved in DNA binding and showed that mutation of these residues abolishes DNA binding while retaining tetramer oligomerization. We propose that our crystals capture a physiologically-relevant GapR structure distinct from and complementary to the physiologically-relevant structures captured by Guo et al. Finally, we combine these structural data and present an updated model for DNA binding and translocation which more fully accounts for the observed dynamics of GapR binding.

Results

Structure determination of GapR

At the time of undertaking these experiments, no structural information on GapR was known. We set out to crystallize GapR alone and in the presence of DNA to structurally characterize the protein and protein-DNA complex. Initial attempts to crystallize GapR in the absence of added DNA yielded abundant microcrystals. Despite extensive efforts, conditions producing these microcrystals could not be optimized to yield diffraction quality crystals. We next tried crystallizing GapR in the presence of double stranded DNA (dsDNA). For this, we used both full length GapR and GapRΔ1-11 (a construct in which we removed the first eleven residues, that were predicted to be disordered) and a 19-bp dsDNA oligomer with a sequence from the C. crescentus origin of replication (Table 1), which GapR binds with a nanomolar dissociation constant3. GapR-DNA complexes were subjected to sparse matrix crystallization, and multiple crystallization conditions were identified. A single condition was optimized to yield three sets of morphologically distinct diffraction quality crystals, which were used to determine three independent structures of GapR (Table 2).

The best diffracting crystals were of full-length GapR plus DNA. A diffraction dataset was collected which included reflections to 1.85 Å resolution and was of the space group I4132. At the time, there were no available structures of GapR, and the protein shared no substantial identity with any other DNA-binding protein structures. Attempts to use DNA as a search model in molecular replacement were unsuccessful. GapR was categorized as containing the domain of unknown function 23122,4 which had previously been subjected to ab initio modelling and theoretical screening for nucleic acid binding potential7. We submitted the sequence of GapR to the Roβetta server for protein structure prediction8, and the top five models produced were used as search models in molecular replacement. One of these models gave a possible solution, with residues 35–65 well placed in the electron density maps generated by Phaser9. A subsequent search model using only these residues gave a definite solution and the remaining residues other than the first 11 could be manually built into the electron density to give a final model at 1.85 Å with an Rfree of 0.223 (Fig. 1).

Figure 1
figure 1

Crystal structures of GapR and packing in the crystal lattice. (a) GapR modelled in to a 2Fo-Fc electron density map, shown contoured at 1σ. (b) The GapR monomer from the asymmetric unit of the crystals of space group I4132, colored blue through to red from the N to C termini. (c) Four symmetry related monomers assemble into the physiologically relevant tetramer. The left side of panel C shows the central channel. (d) Overlay of the biological tetramer of GapR from crystals of space group I4132 (green), P2221 (cyan) and P4322 (purple) used in this study. (e) Packing in crystals of space group I4132, with different tetramers shown in red, blue, green, yellow and orange. The right side of panel e orients the channel of the green tetramer toward to viewer. (f) Packing in crystals of space group P2221. The left side of panel f orients the channels of all tetramers toward the viewer. (g) Packing in crystals of space group P4322. The channel formed by a tetramer is highlighted by a black rectangle in the right side of the panel.

Overall structure of GapR

In this structure, there is a single GapR monomer in the asymmetric unit. It has an extended N-terminal α helix followed by two shorter α helices (Fig. 1a). The three helices are relatively open and not bundled together, suggesting that this conformation would not be stabilized in the absence of a binding partner. Indeed, applying crystallographic symmetry assembles the monomers into repeating tetramer units (Fig. 1c) which have a large central channel. The tetramer is very similar, but not identical to the published tetramer5.

The same crystallization condition also produced crystals with a second morphology, which were of the space group P2221. The diffraction from these crystals displayed severe anisotropy and had resolution limits between 2.0 and 2.7 Å. Following ellipsoidal truncation and anisotropic correction using the Diffraction Anisotropy Server10,11, the structure was solved using our first GapR structure as a molecular replacement search model and refined to a final Rfree of 0.279. In this crystal, GapR is a dimer in the asymmetric unit, with the tetramer again being formed by crystallographic symmetry. The tetramer could be overlaid onto the I4132 tetramer with a root-mean-square deviation (rmsd) of 1.5 Å (Fig. 1d).

We were also able to solve the structure of GapRΔ1-11 from crystals of space group P4322. Diffraction from these crystals also showed high anisotropy, with resolution limits between 3.3 Å and 3.8 Å. Following ellipsoidal truncation and anisotropic correction using the Diffraction Anisotropy Server10 the structure was refined to an Rfree of 0.322. There are six monomers in the asymmetric unit, arranged as a tetramer and a dimer. The tetramer could be overlaid onto the I4132 tetramer with an rmsd of 1.9 Å (Fig. 1d). Crystallographic symmetry again showed that the dimer of the asymmetric unit paired with symmetry-related molecules to form tetramers. The main difference between GapR in the three crystal forms arises from slight changes in the orientation of helix three, which results in minor variations in the size of the central channel (Fig. 1d).

The central channel seen in the GapR tetramers is an obvious binding site for DNA. Interestingly, the channel is quite short, at around ~30 Å long. This is much shorter than the length of the 19-bp oligonucleotide, which would be ~65 Å long. In each crystal, GapR forms a complete and tight lattice (Fig. 1e–g), suggesting that although the presence of the DNA was required for crystal growth, it did not participate in crystal packing, and that there is not a 1:1 stoichiometry between 19 bp DNA and GapR tetramer. Furthermore, in the I4132 and P4322 structures, a 19 bp DNA oligio would have to adopt substantial bends to thread through the central channel of adjacent tetramers, because channels formed by adjacent GapR tetramers do not line up.

An electrostatic surface representation of the GapR tetramer showed that the lining of the channel is highly positively charged (Fig. 2a). We manually placed model B-DNA (PDB ID: 1BNA)12 into the channel and found that it could be fit without clashing with the GapR tetramer (Fig. 2b). Closer analysis of the channel identified six highly conserved lysine residues per monomer (Fig. 2c,d) that seemed likely to contribute to DNA binding by tetrameric GapR.

Figure 2
figure 2

Identification and analysis of a DNA binding channel. (a) Electrostatic surface representation of GapR with electropositive and electronegative regions colored blue and red respectively. (b) The GapR tetramer (green cartoon with transparent surface representation overlaid) channel of the I4132 crystal form can accommodate B-DNA (shown as sticks overlaid with transparent surface representation). B-DNA (PDBID: 1BNA)12 was manually placed into the GapR channel using PyMOL (The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC) and then subject to a round of energy-minimization in Phenix24. (c) Six highly conserved channel-lining lysine residues were identified and targeted for mutation. (d) Weblogo326 showing sequence conservation in the 250 proteins in the NCBI protein database which share highest percentage identity with GapR from C. crescentus (shown beneath). The lysine residues drawn as sticks in panel C are marked with an asterisk.

GapR is a tetramer in solution that binds DNA through conserved lysine residues

We used size exclusion chromatogram – multiangle light scattering (SEC-MALS) to investigate the oligomeric state of GapR in solution (Fig. 3a). SEC-MALS of GapR in the absence of DNA provided an estimate of its solution molecular weight of ~47.2 kDa, most consistent with a non-compact GapR tetramer. Analytical size exclusion chromatogram (SEC) of GapR in the presence and absence of the 19 bp DNA used for crystallization demonstrated that DNA binding does not alter the oligomeric state of GapR (Fig. 3b). Thus the analytical SEC and SEC-MALS show that GapR can exist as a tetramer in solution, even in the absence of DNA.

Figure 3
figure 3

Mutation of channel lysines abolishes DNA binding but does not disrupt tetramer formation. (a) SEC-MALS of apo GapR. Absorbance at 280 nm (blue) and the calculated scattering mass across the peak (black) are shown. The expected mass of tetrameric GapR is 40.9 kDa. (b) Analytical size exclusion chromatography of GapR (6.25 μM tetramer, ε280 = 2980 M−1 cm−1) in the (i) absence and (ii) presence of 19 bp DNA (20 μM, ε260 = 309,776 M−1 cm−1) using a Superdex 200 Increase 10 300 GL column. (c) Overlay of size exclusion chromatograms for GapR (blue) and GapR6KtoE (purple) GapR. Samples of 6.25 μM (if calculated using the tetrameric molecular weights) protein was applied to a HiLoad 16/600 Superdex 200 pg column. Arrows indicate the void volume of the column and elution positions of proteins standards of known molecular weight (66 kDa – bovine serum albumin, 29 kDa – carbonic anhydrase, 12.4 kDa – cytochrome c. (d) EMSA experiment demonstrating that DNA binding by GapR is mediated by conserved lysine residues lining the tetrameric channel. Wild type GapR or GapR6KtoE (0.425 μM tetramer) was incubated in the presence or absence of 19 bp oligonucleotide (1.7 μM) and used for non-denaturing gel electrophoresis. The gel was stained with SYBR Gold to detect DNA (left side of panel), then with InstantBlue to detect protein (right side of panel). Note that GapR has an overall negative charge. Uncropped gels are presented in Supplementary Figure 1.

We next tested if the conserved lysines (Fig. 2) did contribute to DNA binding. We mutated all six of these lysines to aspartic acids, and purified the resulting protein (GapR6KtoE). GapR6KtoE showed low solubility, which has also been reported for wild type GapR in the absence of DNA4, and it could not be concentrated above 0.4 mg/ml. To ensure the mutations did not disrupt tetramerization, we compared the oligomeric state of the mutant and wild type proteins by migration in size exclusion chromatography. GapR6KtoE migrated slightly slower than wild type GapR (Fig. 3c), but comparison with proteins of known molecular weights show this small difference is not indicative of a change in oligomeric state. To assess DNA binding of wild type and mutant proteins, we performed an electrophoretic mobility shift assay (EMSA) (Fig. 3d). DNA binding was observed for wild type GapR, but not for GapR6KtoE. Like wild type protein, GapR6KtoE migrated through the native gel as a single band (Fig. 3d) but slightly faster than wild type GapR. The small differences in migration is likely caused by the difference in charge introduced by the six lysine to aspartic acid mutations per monomer (24 per tetramer). The inability of GapR6KtoE to bind DNA supports the conclusion that GapR binds DNA through its central pore5.

The GapR channel can accommodate B-form DNA

Modelling of DNA into our GapR structures revealed that our original 19-bp oligonucleotide was not ideal for structural studies, as GapR would be able to bind at various locations along the 19-mer. To determine the optimal length of DNA required for GapR binding we performed EMSA experiments. We reasoned that when GapR is in excess over DNA, multiple copies of the GapR tetramer should bind to DNA if its length is sufficient. By using increasing lengths of oligonucleotides in EMSA experiments it should be possible to measure the transition that accommodates extra GapR binding as the threshold length is crossed. Accordingly, we performed EMSA with excess GapR and 14, 16 and 18-bp DNA oligonucleotides. GapR bound all DNA oligomers, but a second shift was observed only with the 18-bp DNA (Fig. 4a). Thus, a segment of 18 base pairs of DNA is just sufficient to bind two GapR tetramers, suggesting that a single GapR binds to ~9 base pairs of DNA.

Figure 4
figure 4

GapR DNA-binding analysis. (a) EMSA analysis of GapR binding to DNA oligonucleotides of different lengths. 0.4 μM DNA oligonucleotide (Table 1) was incubated with GapR (of concentrations 0.4, 1.2, 3.6 and 10.8 μM, if calculated using monomeric molecular weight or 0.1, 0.3, 0.9 and 2.7 μM if tetramer) prior to non-denaturing gel electrophoresis. GapR (2.7 μM tetramer) without oligonucleotide was analysed in the rightmost lane. The uncropped gel is presented in Supplementary Figure 1. (b) Crystals of GapR contain DNA. Crystals of space group I4132 crystals, grown from a sample of GapR with 8 bp oligonucleotide were subject to gel electrophoresis and stained for the presence of DNA (left side of panel) and protein (right side of panel). It is likely the free DNA observed here results from disassociation of the oligo after harvesting & washing the crystal and when loading onto the gel, when it is no longer in excess. The uncropped gel is presented in Supplementary Figure 1. (c) Overlay of the crystal structure determined from crystals of space group I4132 (green) with the GapR-DNA structure of Guo et al.5 (blue with DNA shown in orange, PDB ID: 6CG8). The Guo et al. structure features a marginally narrower channel.

We next repeated our crystallization protocols with GapR in the presence of DNA oligos of 8, 9 and 10 base pairs. Crystals with identical morphology to those grown with 19-bp DNA were obtained for the 8-bp DNA, but despite the inclusion of excess DNA oligonucleotide during cryoprotection, no density for DNA could be seen in the electron density maps of the resulting structure. A native gel of washed, dissolved crystals demonstrated that DNA was in fact present (Fig. 4b). Thus, the absence of DNA in the crystal structures suggests that the DNA binds in a continuum of positions within the channel. Indeed, the lysine residues are fairly evenly spread out in the channel, and a mechanism to keep the DNA at any one particular position within the channel is not evident.

Discussion

During the course of this work, a structure of GapR in complex with DNA was published5. GapR forms essentially the same tetramer seen in our structures, with DNA bound in the central channel, as we anticipated. The six lysines that we identified, plus an additional arginine, all make contacts with the DNA backbone. In marked contrast to our study, which used DNA with an AT content ranging from 40 to 53%, the DNA used by Guo et al. was 100% AT. This DNA adopted an over-twisted conformation, with a wider major groove and narrower minor groove than seen in B-DNA. The length of this DNA oligomer is 11 base pairs and it forms head-to-tail contacts with the adjacent symmetry-related DNA molecule to form extended noncovalent DNA polymers. The length and AT content of the DNA were likely both important for crystallization and the ordering that allowed the DNA to be visualized.

Over-twisting of DNA slightly narrows its diameter relative to B-DNA and the DNA-bound GapR tetramer5 is constricted relative to our structure, with a narrower DNA channel (Fig. 4c). Guo et al. report being unable to fit B-DNA into their GapR tetramer without clashes5, although these appear to be minor clashes with mobile side chains. In contrast, our analysis of our structure indicates the tetramer we observe could comfortably accommodate B-DNA (Fig. 2b). In addition to crystallizing tetrameric GapR bound to DNA, Guo et al. were also able to crystallize GapR from the closely related Bosea sp. Root381, in the presence and absence of DNA. In the presence of DNA, which was visible in electron density maps but could not be reliably modelled, GapR was tetrameric. In the absence of DNA, GapR crystallized as a dimer, with α helices 2 and 3 rearranging to form a single extended helix5. On the basis of these structures, the authors proposed a model for DNA binding whereby dimeric GapR uses its two extended helices to track along DNA, monitoring the size of the major and minor grooves. Upon encountering over-twisted DNA, the extended α-helix re-organizes into two shorter helices and forms the stable tetramer with a co-translocating dimer. In our studies with GapR from C. crescentus, we saw no evidence of the dimer form in the presence or absence of DNA (Fig. 3). However, we did note some subtle difference in position of helix three in each of our crystal forms (Fig. 1d), suggesting that plasticity of this region could possibly be important for formation and size of the central channel.

That the current structures show GapR tetramers can accommodate B-DNA in the channel, and that we observe tetrameric GapR in absence of DNA, has important implications for the model of DNA binding (Fig. 5). GapR could first associate with B-DNA as a dimer, with the region of helices 2 and 3 extended into a single long helix, like in the Guo model, or as a tetramer, which would open to allow DNA binding by splaying of two interacting copies of helices 2 and 3. In both proposed pathways, the tetramer would close around DNA by rearrangement into the α helices 2 and 3 into the conformation observed in our structures, with the wider GapR tetramer bound to B-DNA. GapR with this wider channel would scan along B-DNA until it encounters over-twisted, AT-rich DNA. The over-twisted DNA would induce the transition to the slightly constricted tetramer, allowing it to bind the over-twisted, AT-rich DNA with higher affinity5, thus localizing GapR to this site.

Figure 5
figure 5

Proposed models for GapR binding to DNA. GapR first associates with DNA via one of two possible pathways. In the pathway shown to the left, GapR dimers5 (PDB ID: 6CFY) associate with DNA weakly until encountering a partner. The two dimers then assemble into a GapR tetramer by rearrangements in the C-terminal α helices and encirclement of the DNA. In the pathway shown on the right, a pre-formed GapR tetramer encircles DNA through rearrangement of two C termini on one side of the tetramer. If the tetramer has formed around B-form DNA, it adopts a more open conformation and can freely diffuse along the DNA. Upon encountering over-twisted DNA, GapR constricts slightly (as in PDB ID: 6CG85), narrowing the channel around the DNA and leading to a higher-affinity complex.

A key point of the updated model is the scanning of DNA by the GapR tetramer. GapR encircling and scanning B-DNA could be more processive and efficient than scanning as a dimer, as it could proceed by one-dimensional diffusion rather than repeated association and disassociation. The model also allows GapR tetramers to form on B-form DNA in the bacterial genome, and not just at over-twisted DNA, which obviates the less probable event of two GapR dimers being coincidentally attracted to the same over-twisted DNA. It is also not clear how a GapR dimer would be preferentially attracted to over-twisted DNA. Scanning of B-DNA by tetrameric GapR remains to be directly shown, but passive movement along double stranded DNA has ample precedence in proteins which encircle DNA such as sliding clamps, MutS-ATP, Ku70/80 and proliferating cell nuclear antigen13,14,15,16,17,18. It is also possible that interactions with other proteins in the cell could affect the size of the channel and the sliding rate along the DNA.

The GapR binding and scanning model also help explain previous observations that seemed significant for biological function yet lacked a mechanistic basis. Having GapR tetramers sliding across a predominantly CG-rich genome and occasionally encountering AT-rich patches where presumably they pause with a tighter conformation and a narrower channel could help reconcile the differential reporting of a strong2 or weak preference for AT-rich DNA in vivo3,4,5. It is possible that the FLAG-tagged GapR used by Ricci et al. somehow favored the tighter conformation, accentuating the preference for AT-rich DNA.

More importantly, the model has implications before and during chromosome replication. The GapR gradient in pre-replication C. crescentus “swarmer” cells is easier to explain if GapR tetramers can bind relaxed DNA. Also, C. crescentus GapR is required at the start of chromosome replication and during the first stage of chromosome separation that coincides with early DNA duplication3. ChIP experiments in synchronized cells showed distinct high and low patterns of GapR binding at each of the “left”, “middle” and “right” positions within a short (~700 bp) span of the C. crescentus chromosome origin of replication. Such localized and DNA-specific binding patterns probably result from GapR encounters with DNA sequence-specific replication proteins as well as RNA and DNA polymerases. GapR binding and sliding would be important for reaching targets inside the origin of replication and similarly important to minimize interference with essential replication proteins on a crowded DNA platform19.

Materials and Methods

Oligonucleotides

Single stranded DNA oligonucleotides used in this study are listed in Table 1. To anneal them into double stranded DNA, oligonucleotides were resuspended in annealing buffer (10 mM Tris pH 7.5, 50 mM NaCl, 1 mM EDTA) at 100 μM. Complementary strands were mixed at a 1:1 ratio, incubated at 95 °C for five minutes and then slow-cooled to room temperature.

Table 1 DNA oligonucleotides used in this study.
Table 2 Data collection and refinement statistics.

Cloning, expression and purification of GapR constructs

Full length GapR was amplified from plasmid pJT1603 using primers GapR_Fwd and GapR_Rev (Table 1). The PCR product was digested with NcoI and EcoRI (New England Biosciences) and ligated into a similarly digested pJ411-derived vector containing an N-terminal TEV cleavable octa-histidine tag. GapRΔ1-11 was generated by site-directed mutagenesis by deleting the first eleven codons of gapR with primers GapRNtDelF and GapRNtDelR. GapR6KtoE was synthesised by ATUM (Newark, California) and cloned into pUC57 between the NcoI and NotI restriction sites. This region was then subcloned into the NcoI/NotI digested pJ411-derived vector described above.

Expression of GapR, GapRΔ1-11 and GapR6KtoE was induced in Escherichia coli ΒL21(DE3) cells grown at 30 °C in LB media to an OD600 of ~0.6, before inducing with 1 mM isopropyl β-D-1-thiogalactopyranoside. Cell cultures were grown for five more hours at 30° before harvesting. Cell pellets were stored at −80 °C until required.

Pellets were resuspended in IMAC binding buffer (20 mM HEPES pH 7.4, 200 mM NaCl, 40 mM imidazole, 10% glycerol, 2 mM β-mercaptoethanol (βME)) supplemented with 1 mM PMSF and several crystals of DNase I. Cells were lysed by sonication (5 minutes total pulse at 50% amplitude, 10 s on, 20 s off) and the lysate cleared by centrifugation at 18 000 g. Cleared lysates were applied to a 5 ml HisTrap IMAC FF column (GE Healthcare) and washed with heparin elution buffer (20 mM HEPES pH 7.4, 2 M NaCl, 10% glycerol, 2 mM βME). After re-equilibration with IMAC binding buffer, bound protein was eluted with IMAC elution buffer (as binding buffer but 800 mM imidazole). Fractions containing GapR were pooled and applied to a 5 ml HiTrap Heparin HP column (GE Healthcare) and bound protein eluted on a 100 ml gradient to 60% heparin elution buffer. Protein-containing fractions (GapR6KtoE failed to bind the column and protein was recovered from the flow through) were pooled and the His tag cleaved during overnight dialysis against 1 L of IMAC binding buffer in the presence of TEV protease20 at 4 °C. Samples were reapplied to the HisTrap column and the flow-through collected. For formation of GapR-DNA and GapRΔ1-11-DNA complexes for crystallization, excess annealed oligonucleotides (Table 1) were added and the resultant complex purified on a HiLoad 16/600 Superdex 200 pg column (GE Healthcare) in SEC buffer (20 mM HEPES pH 7.4, 200 mM NaCl, 0.2 mM TCEP) to remove unbound DNA.

Crystallography

GapR-DNA and GapRΔ1-11-DNA complexes were concentrated to 3 mg/ml, as determined by the Bio-Rad Protein Assay, (Bio-Rad Laboratories) using 3 kDa MWCO Amicon Ultra centrifugation devices (EMD Millipore) and subjected to sparse-matrix crystallization against commercially available screens (Qiagen) using sitting drop vapor diffusion at room temperature. Crystals were obtained in multiple conditions, with final conditions optimized in 24-well sitting drop plates with 500 μl reservoir and 2 μl protein sample plus 2 μl reservoir solution in the drop. The final crystallization conditions for the I4132 and P2221 crystal forms of GapR-DNA were 0.16 M ammonium sulfate, 12% PEG 3350 with 1% 1,2-butanediol or 10 mM cadmium chloride respectively. Crystals with morphologies indicative of both crystal forms were seen in both conditions. GapRΔ1-11-DNA crystallized in space group P4322 in 4% PEG 3350, 0.16 M ammonium sulfate, 0.6% 1,2-butanediol. Crystals were transferred to the reservoir solution used for crystallization that additionally contained all the components of SEC buffer and 20–30% glycerol and then cryo cooled in liquid nitrogen. Datasets were collected on beamline 08ID-1 of the CMCF at the Canadian Light Source, using light of 0.979 Å wavelength in Saskatoon, SK, Canada.

Datasets were indexed with the program iMosflm21 and scaled with the program AIMLES22. The P2221 and P4322 datasets displayed high levels of anisotropy and were re-indexed and re-scaled in XDS23 prior to submission to the diffraction anisotropy server10,11. Ellipsoidal resolution boundaries of 2.6, 2.0 and 2.3 Å for the P2221 dataset and 3.8, 3.8 and 3.3 Å for the P4322 dataset were applied along the a*, b* and c* axes respectively and anisotropically scaled using the webserver. Data collection statistics for the pre- and post-anisotropically corrected datasets are presented in Table 1.

Structure determination of GapR in the I4132 space group was performed by molecular replacement in the Phaser module of PHENIX24 using models generated by the Roβetta server8 from the GapR sequence. This produced a partial solution for residues 35–66. After using these residues alone as a molecular replacement model, clear density for the remaining residues was visible in the resulting maps. GapR was manually built into the maps in the program Coot25 followed by refinement in PHENIX to produce the final model (Table 1). This structure was then used as a search model to determine the structure of GapR in space groups P2221 and P4322 by molecular replacement using Phaser followed by iterative rounds of model building and refinement in Coot and PHENIX. Figures were generated using PyMOL (The PyMOL Molecular Graphics System, Version 2.0 Schrödinger, LLC).

SEC-MALS

A sample of 50 μl of GapR at 0.5 mM concentration (when calculated using tetrameric molecular weight) was applied to a Superdex-200 Increase 10 300 GL column (GE Healthcare) equilibrated in SEC buffer, attached to in-line miniDAWN TREOS (Wyatt Technologies) and Optilab rEX (Wyatt Technologies) SEC-MALS instruments. Molecular mass from SEC-MALS data was calculated with the ASTRA 5.3.4.20 (Wyatt Technologies) software. Lower concentrations of GapR eluted at the same volume but gave noisier scattering. Bovine serum albumin was used as a standard.

Gel electrophoresis

For the EMSA experiments, GapR was incubated with DNA in annealing buffer and subjected to electrophoresis in an 8%, 0.5 X TBE gel at 100 V for 60–80 minutes in 0.5 X TBE buffer pH 8.3 at 4 °C. DNA was visualized with SYBR Gold (Thermo Fisher Scientific) and imaged with an AlphaDigiDoc gel documentation system (Alpha Innotech). Protein complexes were stained with InstantBlue (Sigma).

For gel electrophoresis analysis of GapR-DNA crystals, 10 crystals were each looped into 1 μl stabilization buffer (20 mM HEPES pH 7.4, 200 mM NaCl, 0.2 mM TCEP, 4% PEG 3350, 0.16 M ammonium sulfate, 1% 1,2 butanediol), then looped into 20 μl of annealing buffer prior to electrophoresis as described above for EMSA.