Introduction

In all domains of life, second messenger signaling is essential to modulate the intracellular response to external stimuli. In bacteria, purine nucleotide second messengers, such as guanosine tetra- and pentaphosphate, collectively referred to as (p)ppGpp, and bis-(3´-5´)-cyclic dimeric guanosine monophosphate, c-di-GMP, are involved in the global control of physiological responses to environmental change1,2. (p)ppGpp is the primary regulator of bacterial growth and development in response to stress and nutrient limitation also known as the stringent response3,4,5. It modulates cellular reprogramming via multiple target proteins including RNA polymerase, translational GTPases, and metabolic enzymes6,7, thereby controlling bacterial transcription, translation8, cell cycle progression9,10, stress resistance, and virulence11,12. In most bacteria, c-di-GMP controls the transition between motile and sessile lifestyles. Low c-di-GMP levels are associated with motility, while its accumulation promotes adhesion and biofilm formation13,14,15,16. However, an increasing number of studies indicate that c-di-GMP has an impact on diverse aspects of bacterial physiology including cell cycle progression, metabolism and stress resistance2,17,18,19,20,21,22.

The pleiotropic effects of (p)ppGpp and c-di-GMP arise from the diversity of their effectors, represented mainly by nucleotide-binding proteins and riboswitches15,23,24. In particular, the structural diversity of the cyclic dinucleotide, which ranges from an extended monomeric form to a stacked dimer, explains the variety of c-di-GMP-binding motifs25,26,27. The canonical c-di-GMP binding sites are represented by the RxxxR and [DN]xSxxG motifs in PilZ domains, the RxxD motif in the degenerate GGDEF I-sites of diguanylate cyclases (DGCs), and the ExLxR motif in the EAL domains of phosphodiesterases (PDEs). Moreover, several proteins with a non-canonical c-di-GMP binding motif have recently been characterized as high-affinity receptors, suggesting a widespread function of c-di-GMP in bacteria25,28.

The development of biochemical methods to identify second messenger effectors has greatly expanded our knowledge of novel c-di-GMP and/or (p)ppGpp binding proteins and their interaction networks28,29,30. Recently, we identified the first common target of c-di-GMP and ppGpp, the SmbA protein from C. crescentus31. In its active state, SmbA stimulates Caulobacter growth on glucose and prevents surface attachment; this activity is repressed by binding of a c-di-GMP dimer (Fig. 1). The two ligands inversely regulate protein activity, presumably by affecting its conformation. The major conformational changes promoting SmbA’s functional switch affect the C-terminal helix 9 and the flexible loop 7, which contains the c-di-GMP subsite residues R211 and D214 of the RxxD motif (Fig. 1). In the c-di-GMP-bound state, C-terminal helix 9 is stabilized by a salt bridge between D218 (from loop 7) and R289 (from helix 9), while in the ppGpp-bound state, loop 7 is disordered and helix 9 is in the open conformation (Fig. 1). Mutation of R211 to alanine leads to a prolonged adaptation phase and reduced growth, suggesting the involvement of loop 7 and potentially helix 9 in downstream signaling31.

Figure 1

(Adapted from Shyp et al.31).

Second messenger mediated regulation of SmbA. Binding of a c-di-GMP dimer (blue sphere) inactivates SmbA (“OFF state”, grey), while its dissociation or displacement by a ppGpp monomer (an orange half-sphere) activates the protein (“ON state”, light orange). Loop 7 is shown in green, the C-terminal α9 helix is represented by a magenta cylinder. Amino acid residues essential for salt bridge formation between α9 helix and loop 7 are indicated. Key residues of the RxxD motif in loop 7 are shown in the red box. The physiological functions of activated SmbA are indicated with red dashed lines

To date, however, our structural knowledge of SmbA is restricted to the wild-type protein in the presence of ligands. To understand how the flexible loop 7 influences the overall SmbA structure and its ligand binding, we present here the high-resolution structure of a loop 7 deletion mutant (deletion of fragment 198–215, hereafter SmbA∆loop). We observe that the mutant retains the TIM-barrel fold but accommodates only a monomer of c-di-GMP, in a unique extended/open conformation. Importantly, in the SmbA∆loop mutant, the C-terminal helix 9 adopts an outward orientation similar to that found in the ppGpp-bound active state of the protein. Moreover, the change in c-di-GMP binding stoichiometry in the SmbA∆loop mutant, similar to that of the loop 7 single mutant R211A, suggests a potential mechanism for, and an essential role of, loop 7 in c-di-GMP dimerization and SmbA functional regulation.

Results and discussion

SmbA∆loop forms a crystallographic dimer mediated by monomeric c-di-GMP

Ligand-induced conformational changes may be critical for SmbA physiological function, in particular for interaction with its yet-to-be-discovered downstream targets. Based on the fact that loop 7 is disordered in the apo-state but becomes ordered upon binding of a c-di-GMP dimer, and that mutation of the interacting arginine residue 211 from this loop renders SmbA inactive in signaling31, we hypothesize that loop 7 is a central component of the physiological switch.

To explore the structural changes promoted by c-di-GMP via loop 7, we tried to crystallize the apo form of SmbA as well as SmbAR211A and a SmbA∆loop mutant with a partial loop deletion (fragment 198–215 deleted) in complex with c-di-GMP. We obtained suitable crystals only for SmbA∆loop (Supplementary Fig. S1a); these diffracted to 1.4 Å resolution and belong to space group P43212 with one molecule in the asymmetric unit. The structure was determined by molecular replacement, using the structure of wild-type SmbA (PDB: 6GS8)31 with c-di-GMP removed as the search model, followed by iterative refinement. The data collection and refinement statistics are summarized in Table 1.

Table 1 Crystallographic data collection and refinement statistics.

The crystal structure shows that SmbA∆loop forms a crystallographic dimer stabilized by a monomeric c-di-GMP molecule (Fig. 2a). The ligand is found in a fully extended conformation and makes isologous interactions with the two protomers of the protein dimer (Fig. 2b). The guanine bases of c-di-GMP interact extensively, via both polar and nonpolar contacts, with monomers A and B of the dimer. As in the wild-type complex, they form cation–π interactions with the guanidinium groups of R143 from both protomers (Fig. 2b). The interactions are discussed in detail further below.

Figure 2

Crystal structures of SmbAΔloop with c-di-GMP bound across the crystallographic dyad and of apo SmbAΔloop. (a) The two monomers are depicted as surfaces (negatively charged atoms in red, positively charged atoms in blue and carbon atoms in green) with monomer A (gray) in standard orientation and monomer B (symmetry mate) in cyan. c-di-GMP in the dimer interface is shown as a ball-and-stick model. (b) Stereoview down the twofold axis (indicated as a small orange ellipsoid), showing c-di-GMP forming isologous interactions with the two SmbAΔloop protomers. Relevant residues are shown as color-coded sticks (oxygen, red; nitrogen, blue; carbon, green or cyan; waters as red and cyan spheres) and labeled. Residues and waters of the symmetry-mate monomer are marked with an asterisk. Hydrogen bonds between subunits and c-di-GMP are indicated as yellow dotted lines. (c) Crystal packing of apo SmbAΔloop shown in surface representation. The four molecules in the asymmetric unit form two local dimers (A and D, B and C) with 2-fold symmetry.

In solution, the c-di-GMP-to-protein stoichiometry determined by ITC was 1:1 (Supplementary Fig. S1b) and not 1:2 as would have been expected from the crystal structure, indicating that SmbA∆loop dimer formation probably occurs only at the very high concentrations used for crystallization or during crystal formation.

In addition to the SmbA∆loop/c-di-GMP complex, we also determined a crystal structure of the protein in the absence of c-di-GMP. Overall, apo SmbA∆loop shows virtually the same structure as in complex with c-di-GMP, with an RMSD of 0.49 Å for 225 Cα atoms (Fig. 3d). The crystal contains four molecules in the asymmetric unit. Given the relatively small interface and loose packing in the crystal lattice, we consider the inter-molecular interactions to be crystallographic artifacts (Fig. 2c).

Figure 3

Detailed crystal structures of SmbAΔloop in the presence and absence of c-di-GMP and structural comparison with wild-type SmbA ligands. (a) Crystal structure of SmbAΔloop with the backbone drawn as a grey cartoon and monomeric c-di-GMP shown as sticks. Residues of SmbAΔloop important for interaction with the c-di-GMP molecule are drawn in stick representation. Carbon atoms are shown in green, nitrogen in blue and oxygen in red. (b) 2Fo-Fc omit map of c-di-GMP contoured at 1.2 σ and full structural details of the interacting residues. H-bonds (length < 3.5 Å) are indicated by gray lines and water molecules by red spheres. (c) View of c-di-GMP (green) as bound to SmbAΔloop, and of the proximal c-di-GMP molecule (blue) of dimeric c-di-GMP and ppGpp (orange) as bound to wild-type SmbA. The proximal guanyl of monomeric c-di-GMP (G1), the guanyl of ppGpp (G) and G4 of dimeric c-di-GMP overlap closely, while the other guanyl (G2) of the monomeric ligand has moved out considerably to form isologous interactions with the second SmbAΔloop molecule (not shown). (d) Structural superposition of SmbAΔloop/c-di-GMP (gray) with apo SmbAΔloop (chocolate), yielding an RMSD of 0.49 Å.

As measured directly by sedimentation velocity analytical ultracentrifugation (AUC-SV), apo SmbA∆loop is monomeric with a sedimentation coefficient of 1.73 S (Fig. 4). Addition of c-di-GMP does not change the sedimentation coefficient significantly, although a small secondary peak appears at 2.3 S, which may indicate some dimer formation. In contrast, SmbAwt experiences a substantial shift in the sedimentation coefficient upon c-di-GMP addition, probably due to the larger mass of the dimeric ligand and the change in protein shape induced by loop 7 ordering. As shown in Fig. 4b, a single species was observed in all cases, with estimated masses of about 38 and 32 kDa for SmbAwt and SmbA∆loop, respectively. No significant difference in S and f/f0 upon addition of c-di-GMP was observed for either protein (Fig. 4b). This result is consistent with our previous report that SmbAwt does not change its oligomeric state upon c-di-GMP binding, as derived from MALS data31. These results further support the conclusion that, in solution, a single c-di-GMP molecule does not cause SmbA∆loop to dimerize.

Figure 4

Analytical ultracentrifugation (AUC) analysis of SmbAwt and SmbAΔloop. (a) SV-AUC absorbance c(s) distributions of SmbAwt, SmbAwt/(c-di-GMP)2, SmbAΔloop and SmbAΔloop/c-di-GMP. (b) Mass estimation and s and f/f0 values of SmbAwt, SmbAwt/(c-di-GMP)2, SmbAΔloop and SmbAΔloop/c-di-GMP.

C-di-GMP-mediated dimer stabilization has been observed previously, involving dimeric and tetrameric c-di-GMP in the case of VpsT32 and BldD19, respectively (for a review see ref. 27). Furthermore, c-di-GMP accommodation in a rigid dimer interface has been described for the STING protein33. Notably, structure comparison shows that VpsT, STING, and SmbA involve symmetric stacking interactions (with W131, Y167, and R143, respectively) which cap the two guanine bases of c-di-GMP from both sides at the dimer interface (Supplementary Fig. S2). We anticipate that protein dimerization mediated by c-di-GMP through isologous interactions may be operational in more, hitherto unrecognized, targets.

Apo and c-di-GMP bound SmbA∆loop structures and comparison with SmbAwt structures

Overall, the SmbA∆loop mutant retains the TIM-barrel fold with eight α-helices on the outside and eight parallel β-strands on the inside, with an extra helix 9 (Fig. 3a). The occupancy of the c-di-GMP ligand was set to 50% to account for its binding across the crystallographic dyad (half of the c-di-GMP molecule belongs to the symmetry mate). The ligand fitted the electron density very well after considerable conformational adjustment of both guanine bases (Fig. 3b and Supplementary Fig. S3). Thus, the mutant can accommodate only monomeric c-di-GMP, likely due to the absence of R211 and D214 of the RxxD motif of loop 7, which are essential for c-di-GMP dimer coordination (Fig. 3b). As discussed in the previous section, the monomeric c-di-GMP ligand forms isologous interactions with the two protomers of the dimer (Fig. 2). The interactions of each guanyl with the protein are the same as those observed for the proximal guanyl moiety (G4) of dimeric c-di-GMP and for the G of ppGpp interacting with wild-type SmbA31 (Fig. 3c). R143 is stacked upon the guanyl to form a cation–π interaction, R78 forms an H-bond with O6, and E188 forms an H-bond with N1 of the guanyl base (Fig. 3b and Supplementary Fig. S3). Compared to the wild-type complex, the phosphate has moved towards the protein and forms an H-bond with main-chain amide 80 (Fig. 3c). Three well-defined water molecules make hydrogen bonds with R78, E188, and R143 (Fig. 3c).

Structural superimposition of SmbA∆loop/c-di-GMP with SmbAwt/(c-di-GMP)2 (PDB code 6GS8) and SmbAwt/ppGpp (PDB code 6GTM) gives RMS deviations of 0.39 Å (for 214 Cα atoms) and 0.46 Å (for 230 Cα atoms), respectively (Fig. 5a and b). These values indicate virtually identical structures, but there are some notable local deviations. In particular, in the SmbA∆loop/c-di-GMP complex, the C-terminal part of loop 7 forms a short helix α7* (Fig. 5a). In addition, significant changes are observed in the C-terminal helix 9, which, in the wild-type protein, is stabilized by loop 7, which in turn is immobilized by dimeric c-di-GMP. Thereby, the G1 and G2 guanyl bases interact with the RxxD motif of loop 731. In the SmbA∆loop/c-di-GMP complex, the monomeric ligand adopts an outward-open conformation similar to that found in the SmbAwt/ppGpp complex (Fig. 5b). However, its guanyl is in the same position as the G of ppGpp and G4 of c-di-GMP, all forming interactions with R78 and R143 (Figs. 3c and 5). At the same time, the phosphate moieties of monomeric c-di-GMP bound to SmbA∆loop do not superimpose with those of dimeric c-di-GMP or ppGpp as bound to wild-type SmbA (Fig. 3c).

Figure 5

Structural comparison of SmbAΔloop/c-di-GMP with SmbAwt/(c-di-GMP)2, SmbAwt/ppGpp and the AlphaFold model of SmbAwt. (a) Superposition of SmbAΔloop/c-di-GMP (gray) with SmbAwt/(c-di-GMP)2 (cyan) with an RMSD of 0.4 Å. Relevant secondary-structure elements are labeled. Dimeric c-di-GMP (cyan) and monomeric c-di-GMP (thick) are shown as ball-and-stick models. (b) Superposition of SmbAΔloop/c-di-GMP (gray) with SmbA/ppGpp (magenta) with an RMSD of 0.5 Å. Relevant secondary-structure elements are labeled. ppGpp (magenta) and monomeric c-di-GMP (thick, in gray) are shown as ball-and-stick models. The disordered part of loop 7 is marked by broken lines. (c) AlphaFold2-predicted model of SmbAwt (yellow) with loop 7 shown in green. (d) Superposition of SmbAwt/(c-di-GMP)2 (green) with the AlphaFold2 model of SmbAwt (yellow). Loop 7 from SmbAwt/(c-di-GMP)2 and from the AlphaFold model of apo SmbAwt are shown in red and green, respectively.

Because the apo structure of SmbAwt is not known, we turned to a model of apo wild-type SmbA generated by AlphaFold234 (AF2), as deposited in UniProt (Q9A5E6), to predict the protein conformation, and more specifically that of loop 7, in the unliganded state. The AF2 model of SmbAwt agrees very well with our X-ray structure (Fig. 5d). Indeed, the core of the TIM-barrel fold shows very high confidence (pLDDT > 90) and represents the most stable region of the SmbA structure. Interestingly, loop 7 and helix 9 have low (70 > pLDDT > 50) and very low (pLDDT < 50) scores. It has been shown that AF2 pLDDT scores correlate with the root mean square fluctuations (RMSF) calculated from molecular dynamics (MD) simulations35. Thus, the low AF2 scores of SmbAwt suggest flexibility of loop 7 in the absence of c-di-GMP (Supplementary Fig. S5a); the loop is most likely open in the unliganded state, in contrast to closed in the SmbAwt/(c-di-GMP)2 structure (Fig. 5c and d). This prediction is in line with the functional model of SmbA action31, which posits that, in response to c-di-GMP binding, the protein switches from an on- to an off-state accompanied by structural changes in the flexible loop 7 and helix 9, which ultimately control the interaction with an unknown downstream partner, possibly via heterodimerization (Supplementary Fig. S5b).
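To make the confidence bands used in this paragraph explicit, the following is a minimal Python sketch that bins per-residue pLDDT scores and reports contiguous stretches below a cutoff. It assumes the commonly used AlphaFold bands (very high > 90, confident 70–90, low 50–70, very low < 50); the treatment of the exact boundary values, the function names and the example scores are assumptions for illustration and are not values taken from the SmbA model.

```python
from typing import List, Sequence, Tuple


def plddt_band(score: float) -> str:
    """Map a pLDDT score (0-100) to a confidence band.

    Boundary handling (e.g. whether exactly 70 counts as 'confident')
    is an assumption of this sketch.
    """
    if not 0.0 <= score <= 100.0:
        raise ValueError("pLDDT must lie between 0 and 100")
    if score > 90.0:
        return "very high"
    if score >= 70.0:
        return "confident"
    if score >= 50.0:
        return "low"
    return "very low"


def flexible_segments(scores: Sequence[float],
                      cutoff: float = 70.0) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) residue ranges with pLDDT below cutoff."""
    segments: List[Tuple[int, int]] = []
    start = None
    for i, s in enumerate(scores, start=1):
        if s < cutoff:
            if start is None:
                start = i          # open a new low-confidence stretch
        elif start is not None:
            segments.append((start, i - 1))  # close the current stretch
            start = None
    if start is not None:
        segments.append((start, len(scores)))
    return segments


if __name__ == "__main__":
    # Hypothetical per-residue scores, for illustration only.
    toy_scores = [95, 92, 65, 48, 60, 91, 45]
    print([plddt_band(s) for s in toy_scores])
    print(flexible_segments(toy_scores))
```

Applied to the per-residue scores of a deposited model, such a scan would flag regions like loop 7 and helix 9 as candidates for flexibility; it does not by itself distinguish genuine mobility from low prediction confidence.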

Conformation of the monomeric c-di-GMP bound to SmbA∆loop

As discussed above, upon deletion of loop 7, which contains the RxxD motif, SmbA loses its ability to bind an intercalated dimeric c-di-GMP molecule but can still hold one c-di-GMP. The monomeric ligand is two-fold symmetric: the sugar pucker is C3′-endo, and both glycosidic torsion angles have a value of −126° (Fig. 6a and b), which is significantly different from the trans conformation of G4 as part of dimeric c-di-GMP bound to wild-type SmbA. Superposition of c-di-GMP from the SmbA∆loop and SmbAwt complex structures shows that this difference is the reason for the elongated shape of monomeric c-di-GMP, while the macrocycle including the sugar superimposes closely (Fig. 6c).
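For readers who wish to reproduce a torsion angle such as the glycosidic angle quoted above from deposited coordinates, the following is a minimal Python sketch of a signed dihedral calculation. It assumes the usual purine convention, in which the glycosidic torsion χ is defined by the atoms O4′, C1′, N9 and C4; the source does not state which definition was used, and the coordinates in the example are hypothetical test points, not atoms from the SmbA structures.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a: Sequence[float], b: Sequence[float]) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3) -> float:
    """Signed torsion angle (degrees, -180..180) about the p1-p2 bond.

    For a purine glycosidic angle the four points would be
    O4', C1', N9 and C4 (assumed convention).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)           # unit vector along the central bond
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical points: a cis (eclipsed) and a trans arrangement.
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
    print(dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

Because a torsion is a function of four atom positions only, the same routine can be used for any backbone or sugar torsion once the relevant atoms have been read from a coordinate file.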

Figure 6

Observed c-di-GMP conformations in SmbAΔloop and comparison with SmbAwt, PdeL and LapD. (a) and (b) The partially open, twisted form of monomeric c-di-GMP with C3′-endo sugar pucker observed in the SmbA mutant. Guanine distances are shown as black dotted lines. (c) Superimposition of c-di-GMP from SmbAΔloop, SmbAwt and LapDEAL. The GMP moieties from both structures show the same conformation, the C3′-endo sugar pucker; however, there are considerable differences in the G1 and G2 base orientations (indicated by the gray arrow). (d) Superimposition of c-di-GMP from SmbAΔloop with monomeric c-di-GMP as observed when bound to the phosphodiesterase PdeL and the degenerate phosphodiesterase LapD. The distinct sugar pucker of the base at the right (G2) appears responsible for the fully elongated form of c-di-GMP when bound to PdeL or LapD. In contrast, all bases at the left (G1) show the same sugar pucker, i.e. C3′-endo, as also observed for SmbAΔloop in this study. (e) Superimposition of the crystal structure of c-di-GMP/Mg2+36 and dimeric c-di-GMP from SmbAwt. Guanine distances are shown as red and green dotted lines for c-di-GMP/Mg2+ and dimeric c-di-GMP from SmbAwt, respectively.

Next, we compared the conformation of monomeric c-di-GMP as bound to SmbA∆loop with that in other effectors that bind the ligand in monomeric form, such as the phosphodiesterase domain PdeLEAL37 and the degenerate LapDEAL domain38. A superposition of the three complexes is shown in Fig. 6d. While the macrocycles retain a conformation similar, but not identical, to that seen in SmbA∆loop, the ligands bound to PdeLEAL and LapDEAL are in a more open conformation, apparently due to the C2′-endo sugar puckering of one of the guanosines (at the right side in Fig. 6d). These results show that c-di-GMP can adopt yet another unique conformation, different from the stacked dimeric conformation in complex with SmbAwt and from the extended form in the PdeLEAL (PDB code 4LJ3) or degenerate LapDEAL (PDB code 3PJT) domains.

The monomeric c-di-GMP conformation observed in the SmbA∆loop complex structure is different from that of dimeric c-di-GMP. From this comparison, one can see that one guanyl (G1) is always bound in the same way in the three complexes (Fig. 3c). Due to the conformational changes, the other GMP moiety has moved out considerably to form an isologous interaction with the second SmbA∆loop molecule (interacting residues from monomer B are not shown) (Fig. 3c). This indicates that, depending on its binding partner, c-di-GMP is flexible enough to adopt various conformations via only minor changes in torsion angles.

The SmbAΔloop/c-di-GMP structure may represent the first step of consecutive c-di-GMP binding to form an intercalated dimer

At very high (> 1 mM) concentrations, c-di-GMP can form dimers or even higher oligomers, such as tetramers or octamers. However, Gentner et al.39 clearly showed by NMR that c-di-GMP is monomeric at physiological concentrations. It cannot be ruled out, although it is unlikely, that other factors (metal ions, molecular crowding and aromatic compounds) favor higher oligomers in the cellular environment. Intercalated c-di-GMP dimers have been observed in several protein complexes, such as when bound to the I-site of diguanylate cyclases, or in response regulators, PilZ receptors, and SmbAwt. Based on the data shown here, we propose that at physiological concentrations c-di-GMP dimerization occurs only on the protein, by consecutive binding of c-di-GMP monomers to form the intercalated dimer (Fig. 6e).

This obviously implies the presence of a well-formed, high-affinity protein binding site for the first c-di-GMP molecule. Here, by deleting the loop, we have captured for the first time a potential binding pose of the first c-di-GMP binding event to SmbA. Indeed, all interactions required to bind this first monomer (involving R143, E188, R78) are present in SmbA∆loop (Fig. 3c), and the affinity turned out to be in the low μM range (Fig. S1b). For the second binding event, in addition to a bound c-di-GMP molecule providing guanyl stacking sites, loop 7 providing the R211xxD214 motif would then be required (Supplementary Fig. S1e).

In line with the structural considerations and the proposed binding mechanism, the Kd of c-di-GMP for SmbA∆loop is low (1.8 μM) (Fig. S1b) and, in fact, comparable to the apparent Kd (0.3 μM) of the compound for the wild-type protein (Fig. S1c). For completeness, the affinity of ppGpp for the SmbA mutant was also measured (Fig. S1d) and was found to be virtually identical to its affinity for SmbAwt31, indicating that loop 7, as expected, does not contribute to ppGpp binding. In summary, the hypothesis of consecutive c-di-GMP binding to form an intercalated dimer on the protein is strongly supported by the results on the SmbA loop deletion mutant.
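To put the two dissociation constants on a common energetic scale, the following minimal Python sketch converts a Kd into a standard binding free energy using ΔG° = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The temperature of 25 °C is an assumption (the ITC experiments were run at 25 °C or 10 °C, and it is not stated which applies to each titration), and the function name is hypothetical. Note also that the wild-type value is an apparent Kd for a two-ligand binding process, so the comparison is only indicative.

```python
import math

R_GAS = 8.314462618  # molar gas constant, J/(mol*K)


def binding_free_energy_kj(kd_molar: float, temp_k: float = 298.15) -> float:
    """Standard binding free energy in kJ/mol from a dissociation constant.

    Uses dG = R*T*ln(Kd / 1 M); more negative values mean tighter binding.
    The default temperature (25 degC) is an assumption of this sketch.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_GAS * temp_k * math.log(kd_molar) / 1000.0


if __name__ == "__main__":
    dg_loop = binding_free_energy_kj(1.8e-6)  # SmbA-delta-loop, Kd from the text
    dg_wt = binding_free_energy_kj(0.3e-6)    # wild type, apparent Kd from the text
    print(f"delta-loop: {dg_loop:.1f} kJ/mol")
    print(f"wild type:  {dg_wt:.1f} kJ/mol")
    print(f"difference: {dg_loop - dg_wt:.1f} kJ/mol")
```

Under these assumptions the sixfold difference in Kd corresponds to roughly 4.4 kJ/mol, which is small compared with the total binding energy of more than 30 kJ/mol and is consistent with describing the two affinities as comparable.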

Phylogenetic analysis and exploring SmbA homologs

To understand the evolutionary significance of the flexible loop of the SmbA switch protein, we extended our primary sequence analysis of SmbA and its homologs described in Shyp et al.31. We identified SmbA orthologs based on reciprocal best BLAST hits across species, concordance of the protein sequence distance tree with a species phylogeny based on 16S rRNA markers40, and syntenic conservation41 (Fig. 7a and b). Interestingly, the c-di-GMP-binding RxxD motif is strictly conserved only within the Caulobacter genus, with either Asn or Glu substitutions among the Caulobacterales (Fig. 7c). There is considerable variability around this loop region, including several insertion and deletion events. This may suggest alternative binding modes and/or substrates within the Caulobacterales order. Similarly, the sites interacting with ppGpp (R78, N111, Q114, R143, E188) are not strictly conserved within the Caulobacterales order. The C-terminal helix 9 is highly conserved among SmbA orthologs (Fig. 7c). This is consistent with the proposal that it adopts a different conformation in the c-di-GMP-bound state than in the apo and ppGpp-bound states and is thus necessary for the ligand-mediated SmbA switch. A similar mechanism may apply to other SmbA orthologs via the interplay of unknown ligands. The strictly conserved N-terminal motif (MRYRP[FL]G) is also found in otherwise unrelated proteins from the Actinomycetales order (Frankia, Streptomyces).

Figure 7

Sequence alignment and distance of SmbA homologs. (a) Pairwise Needleman-Wunsch global alignment scores of SmbA and CckA reciprocal best BLAST hits (BBH) for species sampled from prosthecate Caulobacterales (PC), non-prosthecate Caulobacterales (NPC), and other bacterial groups (OG). Alignment scores are reported relative to self-alignment of SmbA (Q9A5E6) and CckA (H7C7G9) from Caulobacter crescentus. For the null models, CckA BBH was scored against SmbA and vice versa. The latter BBH was identified using BLASTp against the NCBI-NR database using the BLOSUM45 scoring matrix. (b) A phylogenetic tree of 24 SmbA orthologs inferred using the Maximum Likelihood method based on the JTT model as implemented in MEGA7. Branch lengths indicate the number of substitutions per site. The tree with the highest log likelihood (− 9134.38) is shown, with bootstrap support from 100 replicates indicated at branches. (c) Sequence alignment and logo of SmbA orthologs. The sequence logo was generated using the WebLogo server from the global alignment of SmbA orthologs used to build the distance tree.
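The relative alignment scores described in panel (a) can be illustrated with a minimal Python sketch of Needleman-Wunsch global alignment, in which the score of a query against a hit is divided by the score of the query against itself. The scoring scheme used here (match +1, mismatch −1, linear gap −2) is a simplifying assumption; the substitution matrix and gap penalties actually used for the figure are not stated in the text, and the function names are hypothetical. The toy sequences are the two variants of the conserved N-terminal motif MRYRP[FL]G.

```python
def nw_score(a: str, b: str, match: int = 1, mismatch: int = -1,
             gap: int = -2) -> int:
    """Optimal Needleman-Wunsch global alignment score of a against b.

    Only the score is computed (no traceback), using two rows of the
    dynamic-programming matrix. The scoring parameters are assumptions.
    """
    prev = [j * gap for j in range(len(b) + 1)]   # row for the empty prefix of a
    for i in range(1, len(a) + 1):
        curr = [i * gap]                          # column for the empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap                    # gap in b
            left = curr[j - 1] + gap              # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


def relative_score(query: str, hit: str, **scoring) -> float:
    """Alignment score of query vs hit, relative to query self-alignment."""
    return nw_score(query, hit, **scoring) / nw_score(query, query, **scoring)


if __name__ == "__main__":
    print(nw_score("MRYRPFG", "MRYRPLG"))
    print(relative_score("MRYRPFG", "MRYRPLG"))
```

Normalizing to the self-alignment score places hits from different queries on a common 0 to 1 scale (negative for very poor alignments), which is what allows the SmbA and CckA best BLAST hits to be compared in one plot.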

The Caulobacterales order contains prosthecate and non-prosthecate species40,42. SmbA (Q9A5E6) appears to be unique to the prosthecate Caulobacterales. Reciprocal best BLAST hits for SmbA from non-prosthecate Caulobacterales and other bacterial species are very distantly related aldo-keto reductases, which cannot be meaningfully aligned with SmbA. The central function of SmbA is that of a simple molecular switch that responds to the cellular concentrations of ppGpp and c-di-GMP to regulate Caulobacter growth31. We surmise that the presence of a SmbA ortholog is a marker for prosthecate-type Caulobacterales species that have not been morphologically characterized. This is further supported by the genes flanking SmbA, including a putative iron-sulfur glutaredoxin (Q9A5E5) and a BolA/YrbA family transcription factor (Q9A5E7), which in E. coli positively regulates the transition from the planktonic to the attachment stage of biofilm formation43.

Methods

Plasmid construction and purification of the recombinant proteins

To construct pET21b-smbA∆loop-His6 (deletion of fragment 198–215), the pET21b-smbA-His6 plasmid was amplified with the following primers: 6265_D Loop7_forward CCCCAGGCCCTGCGAGAACTGGCCGATGTGGGCGGCTA and 6266_DLoop7_reverse TAGCCGCCCACATCGGCCAGTTCTCGCAGGGCCTGGGG. The template was digested with DpnI and the mutant DNA was transformed into competent cells for nick repair. The final construct was sequenced to confirm the fragment deletion. Protein was overproduced and purified as described previously31. E. coli Rosetta 2(DE3) cells were used to overproduce recombinant protein from the pET21b expression plasmid. Cells were grown in LB-Miller supplemented with 100 μg/ml ampicillin to an OD600 of 0.4–0.6, and expression was induced with 1 mM IPTG overnight at 22 °C. Cells were harvested by centrifugation (5000 × g, 20 min, 4 °C), washed with PBS, flash-frozen in liquid N2, and stored at −80 °C until purification.
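Whole-plasmid mutagenesis of this kind typically uses a fully complementary primer pair. As a simple consistency check, the following minimal Python sketch verifies that the two primer sequences given above are exact reverse complements of each other. The helper name is hypothetical and the check says nothing about annealing temperature or primer placement on the template.

```python
# Translation table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Primer sequences as given in the text.
FORWARD = "CCCCAGGCCCTGCGAGAACTGGCCGATGTGGGCGGCTA"
REVERSE = "TAGCCGCCCACATCGGCCAGTTCTCGCAGGGCCTGGGG"

primers_are_complementary = reverse_complement(FORWARD) == REVERSE

if __name__ == "__main__":
    print(len(FORWARD), len(REVERSE), primers_are_complementary)
```

A failed check would point to a transcription error in one of the two sequences, so the same test is a cheap safeguard when primer lists are copied between documents.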

For purification, cells were resuspended in lysis buffer (30 mM Tris/HCl pH 7.5, 5 mM MgCl2, 100 mM NaCl, 1 mM DTT and 10 mM imidazole) containing 0.2 mg/ml lysozyme, DNaseI (AppliChem) and Complete Protease inhibitor (Roche), and disrupted using a French press. The suspension was clarified by centrifugation at 30,000 × g (Sorvall SLA 1500) at 4 °C for 30 min and loaded onto a 1 ml HisTrap HP column (GE Healthcare) on an ÄKTA purifier 10 system (GE Healthcare). The column was washed with 5 column volumes of wash buffer (30 mM Tris/HCl pH 7.5, 5 mM MgCl2, 100 mM NaCl, 1 mM DTT and 10 mM imidazole), and the bound protein was eluted with a linear gradient of elution buffer (30 mM Tris/HCl pH 7.5, 3 mM MgCl2, 100 mM NaCl, 1 mM DTT and 300 mM imidazole). Elution fractions enriched in SmbA (as judged by SDS-PAGE) were pooled and concentrated to around 10 mg/ml using an Amicon Ultra centrifugal concentrator with a nominal molecular weight cut-off of 30 kDa (Millipore AG). The concentrated protein was centrifuged at 16,000 × g at 4 °C for 15 min and loaded onto a Superdex 75 gel filtration column (Amersham Biosciences) equilibrated with 30 mM Tris/HCl pH 7.5, 5 mM MgCl2, 100 mM NaCl, 1 mM DTT. Fractions containing essentially pure SmbA (as judged by SDS-PAGE) were pooled and concentrated to the desired concentration for further experiments.

Crystallization

A Phoenix robot (Art Robbins Instruments) was used for broad crystallization screening. Crystallization was carried out using the sitting-drop vapour diffusion method at 20 °C by mixing the protein with the reservoir solution in a 1:1 ratio. Protein concentrations of 5.0, 2.25 and 1.75 mg/ml were used, with c-di-GMP added in 3.0-fold molar excess. Triangular, diamond-shaped 3D crystals appeared in PACT premier D11 (Molecular Dimensions) after one week in 0.2 M calcium chloride dihydrate, 0.1 M Tris pH 8.0 and 20% w/v PEG 6000. Crystals were flash-frozen using two different cryoprotectants. The best diffraction was obtained from crystals cryo-protected with 25% ethylene glycol.

For the apo protein crystals, three different protein concentrations (20, 15 and 5 mg/ml) were used at room temperature. Crystals appeared within a week and continued growing for a few additional days in a condition containing 200 mM NaCl and 10% w/v PEG 6000. The crystals were flash-frozen in liquid N2 for data collection at 100 K.

X-Ray diffraction data collection, phasing, and refinement

All single-crystal X-ray diffraction data sets were collected at the PXI and PXIII beamlines of the Swiss Light Source, Villigen, Switzerland. Data sets were collected for crystals of SmbA∆loop in the apo form and in the presence of c-di-GMP. Diffraction data sets were processed with either MOSFLM44 or XDS45, and the resulting intensities were scaled using SCALA from the CCP4/CCP4i2 suite46. To solve the apo and complex structures of SmbA∆loop, the SmbAwt structure (PDB code 6GS8) without c-di-GMP was used as the search model. Both structures were solved by molecular replacement using PHENIX PHASER47. Further refinement of the structures was carried out using REFMAC5 and Phenix refinement48. Model building was performed using COOT49 and model validation was carried out with MolProbity50. Crystallographic data processing and refinement statistics are provided in Table 1.

Isothermal titration calorimetry (ITC)

Experiments were carried out at 25 °C or 10 °C, with a syringe stirring speed of 300 rpm, a pre-injection delay of 200 s, and a recording interval of 250 s, in a Microcal VP-ITC in ITC buffer (30 mM Tris–HCl pH 7.5, 150 mM NaCl, 5 mM MgCl2). All solutions were degassed at a temperature below that used in the experiments before loading into the calorimeter cell. Baseline correction and integration of the raw differential power data, and fitting of the resulting binding isotherms to obtain dissociation constants, were performed using the Microcal ORIGIN software.

Analytical ultracentrifugation (AUC)

Sedimentation velocity (SV) centrifugation was performed on a ProteomLab™ XL-A analytical ultracentrifuge (Beckman-Coulter, Brea, CA, USA) using an AN60 Ti rotor with standard aluminum 2-channel centerpieces with quartz windows. The samples were spun at 4 °C at speeds ranging from 35,000 to 50,000 rpm, depending on the protein size. SmbAwt (38.9 μM) and SmbA∆loop (39.0 μM) in SEC buffer were subjected to ultracentrifugation in the absence and in the presence of a fivefold molar excess of c-di-GMP. Radial scans were recorded with 30 µm radial resolution at ~3 min intervals. The software package SEDFIT v 14.14 was used for data evaluation. After transformation of the recorded sedimentation velocity data taken in the intensity mode to interference data in the respective data evaluation software, time- as well as radially-invariant noise were calculated and subtracted. In SEDFIT (http://www.analyticalultracentrifugation.com), continuous sedimentation coefficient distributions c(s) were determined with 0.05 S resolution and an F-ratio = 0.95. Suitable s-value ranges between 0 and 30 S and for GA f/f0 between 1 and 4 were chosen. Buffer density (1.0136 g/ml) and viscosity (1.591 cP) were calculated with SEDNTERP v 20111201 beta (http://bitcwiki.sr.unh.edu/index.php). The partial specific volumes of the studied proteins were calculated according to the method of Cohn and Edsall as implemented in SEDNTERP. From the peak in the c(s) distribution, the frictional ratio f/f0 and the molecular weight were obtained by SEDFIT based on the Stokes–Einstein and Svedberg equations51. Data were plotted using the program ProFit (Quansoft, Zurich, Switzerland).
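The relation between a sedimentation coefficient, the frictional ratio and the molar mass can be sketched in a few lines of Python. The sketch combines the Svedberg relation s = M(1 − v̄ρ)/(N_A f) with the Stokes friction of the equivalent anhydrous sphere, f0 = 6πηR0 with R0 = (3Mv̄/4πN_A)^(1/3), and solves for M. It is an illustration of the underlying equations and not a reproduction of SEDFIT; the partial specific volume (0.73 ml/g) and the frictional ratio (1.3) are placeholder assumptions, since neither value is reported in the text, and the function name is hypothetical.

```python
import math

N_AVOGADRO = 6.02214076e23  # 1/mol


def mass_from_s(s_svedberg: float, f_ratio: float, vbar_ml_per_g: float,
                density_g_per_ml: float, viscosity_cp: float) -> float:
    """Molar mass (g/mol) from s, f/f0 and solvent properties (CGS units).

    Solves s = M(1 - vbar*rho) / (N_A * (f/f0) * 6*pi*eta*R0) with
    R0 = (3*M*vbar / (4*pi*N_A))**(1/3) for M.
    """
    s = s_svedberg * 1e-13            # Svedberg -> seconds
    eta = viscosity_cp * 1e-2         # centipoise -> poise, g/(cm*s)
    buoyancy = 1.0 - vbar_ml_per_g * density_g_per_ml
    if buoyancy <= 0:
        raise ValueError("particle would float: 1 - vbar*rho must be > 0")
    radius_factor = (3.0 * vbar_ml_per_g / (4.0 * math.pi * N_AVOGADRO)) ** (1.0 / 3.0)
    m_two_thirds = (s * N_AVOGADRO * f_ratio * 6.0 * math.pi * eta
                    * radius_factor) / buoyancy
    return m_two_thirds ** 1.5


if __name__ == "__main__":
    # s, buffer density and viscosity are taken from the text;
    # vbar = 0.73 ml/g and f/f0 = 1.3 are placeholder assumptions.
    mass = mass_from_s(1.73, 1.3, 0.73, 1.0136, 1.591)
    print(f"estimated mass: {mass / 1000:.1f} kDa")
```

With these placeholder inputs the estimate falls in the low tens of kDa, the same range as the approximately 32 kDa reported for SmbA∆loop. Because the mass scales with (f/f0)^(3/2), the result is sensitive to the assumed frictional ratio.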

AlphaFold modeling

The SmbAwt AlphaFold model was retrieved from UniProt (https://www.uniprot.org) with accession code Q9A5E6. The X-ray structures were visualized using PyMOL (https://pymol.org/2/) and compared to the AlphaFold model.

Bioinformatics

BLAST analyses were conducted using the NCBI-NR dataset. Multiple sequence alignments were generated using MAFFT in G-INS-i mode52 followed by manual refinement. The phylogenetic tree of 24 SmbA orthologs was inferred using the Maximum Likelihood method based on the JTT model53 as implemented in MEGA754. The tree with the highest log likelihood (−9134.38) is shown, with bootstrap support from 100 replicates indicated at branches. Initial tree(s) for the heuristic search were obtained automatically by applying Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and then selecting the topology with the superior log-likelihood value. A discrete Gamma distribution was used to model evolutionary rate differences among sites (5 categories, +G = 2.2328). The rate variation model allowed for some sites to be evolutionarily invariable ([+I], 7.16% sites). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. There were a total of 323 positions in the final dataset.