Introduction

Chitin is a β-1,4-linked polysaccharide of N-acetylglucosamine (GlcNAc), and is degraded by chitinases1, β-N-acetylglucosaminidases2, chitin deacetylases3 and lytic polysaccharide monooxygenases4, to produce GlcNAc, chitin oligosaccharides, (GlcNAc)n (n = 2, 3, 4, 5, and 6) and their de-N-acetylated derivatives. The products are utilized as structural components in various chitin-containing organisms5 or as carbon and nitrogen sources by chitin consumers6. Vibrio spp. live in the marine ecosystem and possess all these enzymes, enabling them to efficiently catabolize chitin as their nutritional source7. The enzymatic degradation of chitin by extracellular chitinases in Vibrios produces (GlcNAc)n8,9,10, which are taken up into the periplasm through a chitoporin localized in the outer membrane11,12,13,14. (GlcNAc)n transported into the periplasm undergo further enzymatic degradation15,16,17 and are trapped by a periplasmic solute-binding protein (SBP) specific for (GlcNAc)n18. Various SBPs have been recognized to mediate the transport of the corresponding solutes into the cytoplasm through a specific transporter localized in the inner membrane19; furthermore, they interact with signal receptor proteins to control the signal response20. SBPs specific to (GlcNAc)n (referred to as CBP) also have such a dual functionality; (GlcNAc)n-liganded or unliganded CBPs interact with a chitin sensing protein as well as a (GlcNAc)2-specific ABC-transporter21,22. A large conformational change of CBP from “open” to “closed”, or vice versa, induced by binding/release of (GlcNAc)n19 triggers a series of interactions involving sensing or transporter proteins. It is therefore highly desirable to thoroughly analyze the structure, binding mechanism of CBP/(GlcNAc)n and the CBP/(GlcNAc)n/membrane protein interactions.

Scheepers et al.23 classified SBPs into seven clusters, A, B, C, D, E, F and G, according to their crystal structures, and cluster C was subdivided into five subclusters (I, II, III, IV and V) in our previous review24. Two CBPs from Vibrios have been investigated so far. One is from V. cholerae (VcCBP), a facultative anaerobe that causes cholera, a deadly disease in humans25, and the second is from V. campbellii (formerly V. harveyi; VhCBP), a bioluminescent marine bacterium that causes vibriosis, a fatal disease of aquatic animals26. Both VcCBP and VhCBP belong to cluster C/subcluster IV (C-IV), in which SBPs are specific for oligosaccharides, including (GlcNAc)n, mannooligosaccharides27, and cellooligosaccharides28. The first crystal structures of CBP were solved for VcCBP and deposited in the PDB in 2006 (PDB codes 1ZTY for unliganded (open) VcCBP and 1ZU0 for (GlcNAc)2-liganded (closed) VcCBP) and in 2013 (PDB codes 4GF8 for unliganded (open) VcCBP and 4GFR for (GlcNAc)2-liganded (closed) VcCBP). However, no functional or mechanistic information on this CBP was reported. Suginta et al.29 solved the crystal structure of a closed form of (GlcNAc)2-liganded VhCBP, whose amino acid sequence shares 83% identity with that of VcCBP, as shown in Fig. 1. They regarded this protein as comprising two domains (the Upper and Lower domains) connected by a flexible hinge region, which enables CBP to embrace (GlcNAc)n in the substrate-binding groove lying between the two domains. Conserved Asp365, Phe437, Trp363 and Trp513 (Fig. 1) localized in the binding groove were found to contribute strongly to the interaction with (GlcNAc)230. Most other studies on SBPs proposed that the proteins adopt a two-domain conformation, which is converted from the open state to the closed state upon capture of their substrates31,32,33. Kitaoku et al.30 found a suggestive “half-open” conformation in the crystal structure of a VhCBP mutant (W513A), which provided some clues to the mechanism of (GlcNAc)n translocation by CBP. Moreover, a three-domain organization, referred to as domains I, II, and III, was proposed for the SBPs belonging to cluster C19. Further studies are needed to gain more extensive insights into the domain organization of CBPs.

Figure 1

Sequence alignment of the periplasmic solute-binding proteins specific to (GlcNAc)2 from Vibrios. Identical amino acids are written in bold. The secondary structures are designated α1, α2, α3, … (α-helices, open bars), β1, β2, β3,… (β-strands, arrows), and η1, η2, η3, … (310-helices, filled bars) from the N-terminus. The sequence of Upper1 domain is underlined (VcCBP) or overlined (VhCBP) in dark green, Upper2 domain in magenta, and Lower domain in cyan. Amino acid residues, which interact with the ligand and are conserved between VhCBP and VcCBP, are highlighted on a black background.

In this study, we conducted in-depth investigations of the structure and mechanism of CBP by means of crystallography, molecular dynamics simulation, thermal unfolding experiments and isothermal titration calorimetry (ITC), using VcCBP and (GlcNAc)n (n = 2, 3, 4, 5 and 6). Based on the crystal structures and molecular dynamics simulation, we here propose that the three domains, Upper1, Upper2, and Lower domains, which correspond to domains I, II, and III, respectively19, are required to explain the translocation process of (GlcNAc)2 by CBP. This proposal was well supported by the thermodynamic data for (GlcNAc)n binding to VcCBP.

Results

Crystal structures of VcCBP

We solved the structure of the unliganded protein (open form) at 1.6 Å (Fig. 2A), a higher resolution than that of the VcCBP structures formerly deposited (2.2–2.3 Å). The closed structure of (GlcNAc)3-liganded VcCBP was also solved at 1.22 Å resolution (Fig. 2B). The coordinates were deposited in the database with the PDB codes 8I5J and 8I5K, respectively. The data-collection and refinement statistics are listed in Table 1. Superimposition of the unliganded structure (open form) obtained here with the previously deposited 1ZTY revealed that these two structures are almost identical, with a root-mean-square deviation (RMSD) of 0.600 Å, although the space group of 8I5J (P21) was different from that of 1ZTY (P3221). The RMSD values obtained from the superimpositions of the (GlcNAc)3-liganded structure (8I5K) with those of (GlcNAc)2-liganded VcCBP (1ZU0) and (GlcNAc)3-liganded VhCBP (6LZQ)30 were 0.529 Å and 0.393 Å, respectively. Both structures obtained here for VcCBP showed three distinct domains, the Upper1, Upper2 and Lower domains (Fig. 2A,B). Compared with the unliganded structure, the angle of closure between the Upper1 and Lower domains of the (GlcNAc)3-liganded structure was found to be 54.7°. This was determined from the rotation angle of the Gln490 α-carbon atom (Upper1 domain) using the Ala486 α-carbon, which is located at the hinge, as the fixed rotation center. The Upper1 region is colored in deep green, Upper2 magenta, and Lower cyan. Upper1 (residues 1–35, 140–239, and 489 to the C-terminus) comprises α3-α5/α16, β1/β8-β12/β19-β21 and η2–η3, while Upper2 (36–139) comprises α1-α2, β2-β7 and η1. The Lower domain (240–488) comprises α6-α15, β13-β18 and η4–η6 (Figs. 1, 2A,B). The division of the upper structural region into Upper1 and Upper2 domains appeared to be reasonable from the superimposition of (GlcNAc)2-liganded (1ZU0) and (GlcNAc)3-liganded VcCBP (8I5K) (Fig. 2C), where the Lower domain fully overlapped (RMSD, 0.306 Å), while the Upper1 domain partly deviated (RMSD, 1.203 Å) and the Upper2 domain strongly deviated (RMSD, 1.930 Å). The structure liganded with (GlcNAc)3 differed slightly from that liganded with (GlcNAc)2, especially in the Upper2 domain. Accommodating the reducing-end GlcNAc of the bound (GlcNAc)3 appeared to affect the Upper2 domain most strongly.
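The closure-angle measurement described above can be sketched as the angle swept by one α-carbon about a fixed hinge point between the open and closed states. This is a minimal illustration, assuming that both structures have already been superimposed in a common frame (the text does not state which atoms were used for that superposition); the function name and the coordinates are hypothetical and are not taken from 8I5J or 8I5K.

```python
import numpy as np

def rotation_angle_deg(center, point_open, point_closed):
    """Angle (degrees) swept by a point about a fixed center between two states."""
    v_open = np.asarray(point_open, dtype=float) - np.asarray(center, dtype=float)
    v_closed = np.asarray(point_closed, dtype=float) - np.asarray(center, dtype=float)
    cos_theta = np.dot(v_open, v_closed) / (
        np.linalg.norm(v_open) * np.linalg.norm(v_closed)
    )
    # Clip guards against round-off pushing the cosine slightly outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

# Hypothetical coordinates in Å (placeholders, not crystallographic values).
hinge_ca = [0.0, 0.0, 0.0]        # stands in for the Ala486 Cα (rotation center)
moving_ca_open = [10.0, 0.0, 0.0]    # stands in for the Gln490 Cα, open form
moving_ca_closed = [0.0, 10.0, 0.0]  # stands in for the Gln490 Cα, closed form

angle = rotation_angle_deg(hinge_ca, moving_ca_open, moving_ca_closed)
```

With real data, the three points would be read from the superimposed coordinate files; the placeholder vectors here are perpendicular only to make the geometry easy to check by hand.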

Figure 2

Crystal structures of unliganded VcCBP (A, open state; 8I5J) and (GlcNAc)3-liganded VcCBP (B, closed state; 8I5K). The upper panels represent views from the front, and the lower panels show side views. Upper1 domain is colored dark green, Upper2 magenta, and Lower cyan. Bound (GlcNAc)3 is represented by stick model colored light green. The secondary structures are designated as α1, α2, α3,…, β1, β2, β3,…, and η1, η2, η3,…, from the N-terminus, and correspond to those shown in Fig. 1. (C) Superimposition of the structures of (GlcNAc)3-liganded (8I5K, black) and (GlcNAc)2-liganded VcCBP (1ZU0, red). Left, view from the front of the binding groove; middle, view from the Upper1 side; right, view from the Upper2 side.

Table 1 Data-collection and refinement statistics.

Binding mode of (GlcNAc)3

Figure 3A shows a 2Fo-Fc map of the bound (GlcNAc)3. As in the case of VhCBP29, electron density of (GlcNAc)3 was identified in the (GlcNAc)3-liganded VcCBP. Kitaoku et al.30 designated the subsites occupied by the three GlcNAc units as Site1, Site2 and Site3, and the same nomenclature is used in this study. In the (GlcNAc)3-liganded VcCBP, the electron density of the Site3 GlcNAc was lower than those of the two GlcNAc residues at Site1 and Site2, and the occupancy of the Site3 GlcNAc was set at 0.5. Two conformers of the Phe410 side chain were observed near the Site3 GlcNAc, and the occupancy of each conformer was also set at 0.5. It is most likely that the mobility of the Site3 GlcNAc is much higher than that of the other GlcNAc residues and that the Phe410 side chain flips back and forth between the two positions in the liganded structure. Both GlcNAc units at Site1 and Site2 appeared to be hydrogen-bonded or in hydrophobic contact with Asp9, Asn203, Ser220, Phe221, Trp362, Asp364, Asn408, Phe410, Arg435, Phe436 and Trp512, of which all except Asp9 are conserved between VhCBP and VcCBP (highlighted on a black background in Figs. 1 and 3B), while the GlcNAc unit at Site3 makes only a couple of hydrogen bonds, with the main chain carbonyl of Ala513 and the guanidyl nitrogen of Arg27, which are localized to the interface between the Upper1 and Upper2 domains (Fig. 3B, Supplementary Table S1).

Figure 3

State of (GlcNAc)3 bound to VcCBP. (A) 2Fo-Fc map of the bound (GlcNAc)3 and the Phe410 side chain. The occupancies of the Site3 GlcNAc and two conformers of the Phe410 side chain were each assumed to be 0.5. (B) Binding mode. The color system is identical to that in Fig. 2. The possible hydrogen bonds are shown by black broken lines. The red broken lines represent the distances between the α-carbons monitored in the molecular dynamics simulation shown in Fig. 6.

B-factor values visualized in the main chain structure of (GlcNAc)3-liganded VcCBP are shown in Fig. 4A; they were higher in the Upper2/Lower interface as well as on the Site3-GlcNAc contact surface of the Upper1 domain. Notably, the B-factors were markedly higher in the loop region immediately following the secondary structure η6 and in the loop between α14 and α15 (red broken circles, Fig. 4A). These two loops are located at the Upper2/Lower interface. Furthermore, the B-factor values of the three GlcNAc residues were 9.86, 12.67 and 16.28 Å2 (Fig. 4D), the largest value indicating the higher mobility of the Site3 GlcNAc. These B-factor data were fully consistent with the electron density data described above. The cooperative motions of the Site3 GlcNAc and the Phe410 side chain may be significant from the mechanistic viewpoint.

Figure 4

Visualization of crystallographic B-factors for the VcCBP structure. (A) and (B) entire VcCBP structure in complex with (GlcNAc)3; (C) Cross-section of the (GlcNAc)3-binding site; and (D), close-up view of the (GlcNAc)3-binding site. The colors from violet to blue, green, yellow, orange and red indicate B factors from small to large. Broken circles colored white represent B-factor-higher regions, while those colored red represent loop regions with the highest B-factors.

Molecular dynamics simulation

Figure 5 shows the time-dependent RMSD for unliganded, (GlcNAc)2-liganded and (GlcNAc)3-liganded VcCBP. The unliganded form exhibited the largest motion (RMSD, 0.8 nm; the top panel of Fig. 5A) of the entire protein molecule, while the motions of the individual domains remained at lower levels (RMSD, 0.1–0.3 nm; Fig. 5A). This clearly indicated a domain motion, in which the individual domains did not change their own conformations but changed their relative arrangement. On the other hand, the motions of the entire protein molecules as well as individual domains were very low in the (GlcNAc)2-liganded and (GlcNAc)3-liganded VcCBP, where RMSDs were only 0.1–0.2 nm (Fig. 5B,C).
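The distinction drawn here, a large whole-molecule RMSD alongside small per-domain RMSDs, can be illustrated with a toy two-domain model. The sketch below uses a NumPy implementation of the Kabsch superposition; it is not the simulation package's own RMSD routine, and the coordinates, domain sizes and rotation angle are hypothetical.

```python
import numpy as np

def kabsch_rmsd(coords_a, coords_b):
    """RMSD between two (N, 3) coordinate sets after optimal rigid superposition."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    a_c = a - a.mean(axis=0)                 # remove translation
    b_c = b - b.mean(axis=0)
    h = a_c.T @ b_c                          # 3x3 covariance matrix
    u, _s, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))   # avoid improper rotations
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    a_rot = a_c @ rot.T                      # rotate a onto b
    return float(np.sqrt(np.mean(np.sum((a_rot - b_c) ** 2, axis=1))))

# Toy model: two rigid "domains"; the second one rotates as a rigid body.
rng = np.random.default_rng(0)
dom_a = rng.normal(size=(30, 3))
dom_b = rng.normal(size=(30, 3)) + np.array([8.0, 0.0, 0.0])
theta = np.radians(40.0)
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0,            0.0,           1.0]])

frame0 = np.vstack([dom_a, dom_b])
frame1 = np.vstack([dom_a, dom_b @ rot_z.T])   # only domain B moved

rmsd_all = kabsch_rmsd(frame0, frame1)         # whole "molecule"
rmsd_a = kabsch_rmsd(frame0[:30], frame1[:30])  # domain A alone
rmsd_b = kabsch_rmsd(frame0[30:], frame1[30:])  # domain B alone
```

In this toy case each domain superimposes on itself essentially perfectly while the whole-molecule fit cannot, which is the signature of a rigid-body domain motion rather than an internal conformational change.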

Figure 5

Time-dependent backbone RMSDs calculated by molecular dynamics simulation. Unliganded VcCBP (A, 8I5J), (GlcNAc)2-liganded VcCBP (B, 1ZU0) and (GlcNAc)3-liganded VcCBP (C, 8I5K). From top to bottom: the entire protein structure, Upper1 + Upper2, Upper1, Upper2, and Lower domains.

Figure 6 shows the distances between the α-carbons of selected pairs of amino acid residues (labeled in red in Fig. 3B). The distance 203–361, which reflects the separation between the Upper1 and Lower domains (red broken line between Asn203 and Gly361 in Fig. 3B; the top panels of Fig. 6A,B), did not significantly differ between (GlcNAc)2-liganded and (GlcNAc)3-liganded VcCBP. For the distance 24–391 (red broken line between Thr24 and Ala391 in Fig. 3B; second panels from the top of Fig. 6A,B), which also reflects the separation between the Upper1 and Lower domains at another site, no significant difference was observed between (GlcNAc)2-liganded and (GlcNAc)3-liganded VcCBP. Furthermore, the distance 135–517 (red broken line between Ser135 and Glu517 in Fig. 3B; the bottom panels of Fig. 6A,B), reflecting the separation between the Upper1 and Upper2 domains, did not differ between the two liganded forms of VcCBP. However, the distances 101–432 (Gln101–Gly432) and 24–101 (Thr24–Gln101), reflecting the Upper2–Lower and Upper1–Upper2 separations, respectively, fluctuated more in (GlcNAc)3-liganded than in (GlcNAc)2-liganded VcCBP (the third and fourth panels from the top of Fig. 6A,B). Elongation of the bound oligosaccharide from (GlcNAc)2 to (GlcNAc)3 thus enhanced the molecular motion around Gln101, located in the Upper2 domain. The larger conformational change in (GlcNAc)3-liganded VcCBP was also confirmed by Fig. 7, which shows the two-dimensional projection of the trajectories onto the first two principal components, eigenvector1 and eigenvector2. In unliganded VcCBP, the essential subspace was widely extended as compared with the liganded VcCBP (Fig. 7A; eigenvector1/eigenvector2, −7 to 15 nm/−22 to 9 nm), revealing a large conformational change from the open state to the closed state (Fig. 2A,B). Although the conformational spaces of the liganded states were much less extensive (Fig. 7B, −6 to 3.5 nm/−2.5 to 3.5 nm; Fig. 7C, −4.5 to 6.3 nm/−5.5 to 4.0 nm), (GlcNAc)3-liganded VcCBP was found to occupy a larger conformational space than (GlcNAc)2-liganded VcCBP, indicating that the structural variations induced by (GlcNAc)3 binding to Site1, Site2 and Site3 were greater than those induced by (GlcNAc)2 binding to Site1 and Site2.
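The projection onto eigenvector1 and eigenvector2 can be illustrated with a generic principal component analysis of atomic coordinates. This is a minimal NumPy sketch, assuming the frames have already been superimposed on a reference; it does not reproduce the analysis tool used for Fig. 7, and the synthetic trajectory (one collective mode plus small noise) is hypothetical.

```python
import numpy as np

def pca_projection(traj, n_components=2):
    """Project pre-aligned frames onto the leading principal components.

    traj: array of shape (n_frames, n_atoms, 3).
    Returns (projections, eigenvalues), largest eigenvalue first.
    """
    frames = np.asarray(traj, dtype=float)
    flat = frames.reshape(frames.shape[0], -1)       # one row per frame
    centered = flat - flat.mean(axis=0)              # subtract mean structure
    cov = np.cov(centered, rowvar=False)             # (3N, 3N) covariance matrix
    evals, evecs = np.linalg.eigh(cov)               # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:n_components]   # indices of the largest ones
    return centered @ evecs[:, order], evals[order]

# Synthetic trajectory (hypothetical, not VcCBP data): one collective mode
# with unit amplitude plus small random noise on every atom.
rng = np.random.default_rng(1)
n_frames, n_atoms = 200, 10
mode = rng.normal(size=(n_atoms, 3))
mode /= np.linalg.norm(mode)
amplitude = np.sin(np.linspace(0.0, 4.0 * np.pi, n_frames))
traj = (rng.normal(size=(n_atoms, 3))
        + amplitude[:, None, None] * mode
        + 0.01 * rng.normal(size=(n_frames, n_atoms, 3)))

proj, evals = pca_projection(traj)
extent = proj.max(axis=0) - proj.min(axis=0)   # range covered along each component
```

The `extent` values play the role of the ranges quoted for Fig. 7: a wider range along the leading components means that the trajectory samples a larger conformational space.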

Figure 6

The selected, time-dependent interatomic distances between α-carbons of the amino acids. From top to bottom: Asn203-Gly361 (Upper1-Lower), Thr24-Ala391 (Upper1-Lower), Gln101-Gly432 (Upper2-Lower), Thr24-Gln101 (Upper1-Upper2) and Ser135-Glu517 (Upper2-Upper1), for (GlcNAc)2-liganded VcCBP (A, 1ZU0) and (GlcNAc)3-liganded VcCBP (B, 8I5K).

Figure 7

Two-dimensional projections of eigenvector1 and eigenvector2 showing the time-dependent conformational changes of unliganded (A), (GlcNAc)2-liganded (B) and (GlcNAc)3-liganded VcCBP (C). The color changes from black, violet, purple, magenta, red, orange and to yellow with progress of the simulation time.

A cross-correlation heat map was generated based on the Cα-atom positions using the ProDy Python package. The results are shown in Fig. 8. In unliganded VcCBP (Fig. 8A), strong positive correlations within a region (red) were found for residues 1–239, which correspond to the Upper1 + Upper2 domains, and also for residues 350–440, which include the Lower-domain amino acid residues interacting with the ligand (Fig. 3B and Supplementary Table S1). However, strong negative correlations between regions (blue) were found between 1–239 (Upper1 + Upper2) and 350–440 (the interacting region of the Lower domain). This clearly indicated a large domain motion from open to closed, and vice versa, as shown in Fig. 2A and B. Although the correlations were weaker in (GlcNAc)3-liganded VcCBP (Fig. 8C) than in the unliganded state, we observed positive correlations (red) within residues 36–139 (Upper2 domain); however, the correlations between 36–139 (Upper2) and 140–239 (inner Upper1) and between 36–139 (Upper2) and 1–35 (N-terminal Upper1) were rather negative (blue) as compared with the unliganded state. Furthermore, the correlations between 140–239 (inner Upper1) and 1–35 (N-terminal Upper1) were rather positive (red). It was clear that the Upper1 and Upper2 domains move as a unit in the unliganded state, but that these two domains move independently in the (GlcNAc)3-liganded state. For the (GlcNAc)2-liganded VcCBP (Fig. 8B), although the correlations were even weaker than in the (GlcNAc)3-liganded state, similar trends were observed in the region corresponding to 1–239 (Upper1 + Upper2). Weak but significant negative correlations (blue) between Upper1 + Upper2 (1–239) and the ligand-interacting region of the Lower domain (350–470) were also found in the two liganded states (Fig. 8B,C). However, the negative correlations were stronger in the (GlcNAc)3-liganded state than in the (GlcNAc)2-liganded state, indicating that the movements were enhanced in (GlcNAc)3-liganded VcCBP (Fig. 8C) as compared with the (GlcNAc)2-liganded state (Fig. 8B).
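The quantity plotted in such a heat map is the normalized cross-correlation of atomic displacements, C_ij = <Δr_i·Δr_j> / sqrt(<Δr_i²><Δr_j²>). The sketch below restates that definition in plain NumPy; it is not the ProDy API used for Fig. 8, and the three-atom trajectory is a hypothetical example chosen so that the expected signs are obvious.

```python
import numpy as np

def cross_correlation(traj):
    """Normalized cross-correlation matrix of atomic displacements.

    traj: array of shape (n_frames, n_atoms, 3), frames already superimposed.
    Returns an (n_atoms, n_atoms) matrix with values in [-1, 1].
    """
    coords = np.asarray(traj, dtype=float)
    disp = coords - coords.mean(axis=0)                    # displacement from mean position
    cov = np.einsum("fik,fjk->ij", disp, disp) / len(coords)  # <dr_i . dr_j>
    norm = np.sqrt(np.diag(cov))                           # sqrt(<dr_i^2>)
    return cov / np.outer(norm, norm)

# Hypothetical 3-atom trajectory: atoms 0 and 1 move together along x,
# atom 2 moves in the opposite direction.
t = np.linspace(0.0, 2.0 * np.pi, 100)
shift = np.sin(t)[:, None] * np.array([1.0, 0.0, 0.0])    # (n_frames, 3)
base = np.array([[0.0, 0.0, 0.0], [5.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
traj = np.stack([base[0] + shift, base[1] + shift, base[2] - shift], axis=1)

corr = cross_correlation(traj)
```

Here `corr[0, 1]` is positive (red in the figure's color scheme) and `corr[0, 2]` is negative (blue), mirroring the correlated and anti-correlated domain motions discussed above.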

Figure 8

Cross correlation heat maps of Cα atoms around their mean positions for the entire simulation period. (A) Unliganded VcCBP (8I5J); (B) (GlcNAc)2-liganded VcCBP (1ZU0); and (C) (GlcNAc)3-liganded VcCBP (8I5K). The color gradation from red to blue corresponds to extents of correlated motions (from 1 to 0, positive correlations) and anti-correlated motions (from 0 to − 1, negative correlations). The horizontal and vertical axes represent the amino acid residue No. of VcCBP. The thin guided lines in the figures represent the boundaries of the individual domains, Upper1, Upper2, and Lower. The correspondence table on the right shows which domain is located where.

Thermal unfolding experiments with VcCBP in the presence of (GlcNAc)n

A typical dataset of the thermal unfolding transitions of VcCBP in the absence or presence of (GlcNAc)n is shown in Supplementary Fig. S1. The unfolding transition was highly cooperative, with the fraction unfolded increasing sharply at the transition temperature (Tm). Each unfolding experiment was conducted twice, and the averaged Tm values are listed in Table 2. Thermal unfolding of unliganded VcCBP took place at a Tm of 43.2 °C. When (GlcNAc)2, (GlcNAc)3 or (GlcNAc)4 was added to VcCBP, Tm was elevated from 43.2 °C to 54.4 °C (ΔTm = 11.2 °C), 54.8 °C (ΔTm = 11.6 °C) and 53.8 °C (ΔTm = 10.6 °C), respectively. As in the case of VhCBP29, these three oligosaccharides bound strongly to VcCBP. However, the ΔTm values induced by the addition of (GlcNAc)5 and (GlcNAc)6 were moderate, 6.3 °C for both, indicating that these two oligosaccharides bind to VcCBP with significantly lower affinities34. ΔTm for GlcNAc was only 2.8 °C, suggesting that the monomer bound to VcCBP with much lower affinity.

Table 2 Transition temperatures of thermal unfolding of VcCBP in the presence of GlcNAc and (GlcNAc)n (n = 2, 3, 4, 5, or 6).

Thermodynamic parameters for binding of (GlcNAc)n (n = 2, 3, and 4)

ITC experiments were conducted at 25 °C by titrating a 1.0 mM solution of GlcNAc or (GlcNAc)n (n = 2, 3, 4, 5 or 6) into VcCBP solution (50 µM). As shown in Fig. 9A–F, titrations with GlcNAc, (GlcNAc)5, or (GlcNAc)6 did not result in any detectable heat release or absorption, whereas (GlcNAc)2, (GlcNAc)3 and (GlcNAc)4 exhibited clear heat release or absorption. This was consistent with the experimental data obtained from thermal unfolding experiments (Supplementary Fig. S1 and Table 2). It is noteworthy that titration with (GlcNAc)2 resulted in heat release, whereas titration with (GlcNAc)3 or (GlcNAc)4 resulted in heat absorption. The thermodynamic mechanism of the interaction with (GlcNAc)2 therefore appears to be fundamentally different from those with (GlcNAc)3 and (GlcNAc)4. The thermodynamic parameters obtained from the experimental data are listed in Table 3. For (GlcNAc)2, the enthalpic contribution (ΔH°) and the association constant (Kassoc) were obtained from the theoretical fit and were −2.40 kcal mol−1 and 4.08 × 10^6 M−1, respectively. Thus the binding free energy (ΔG°) was −8.85 kcal mol−1. The entropic contribution (−TΔS°) was calculated to be −6.45 kcal mol−1. The interaction was entropy-driven with a smaller enthalpy gain. When (GlcNAc)3 or (GlcNAc)4 was used instead of (GlcNAc)2, the entropy gain was even greater but this was compensated by the loss of enthalpy, resulting in weaker binding, with ΔG° values of −7.54 and −8.36 kcal mol−1, respectively.
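The bookkeeping behind these numbers is the relation ΔG° = ΔH° − TΔS°. The short sketch below applies it to the (GlcNAc)2 values quoted above; the function names are hypothetical, the assay temperature of 25 °C is taken as 298.15 K, and the criterion used to call an interaction "entropy-driven" (the more negative of the two terms) is a simplifying assumption of this example.

```python
T_KELVIN = 298.15  # 25 °C, the temperature of the ITC experiments

def binding_free_energy(delta_h, minus_t_delta_s):
    """ΔG° = ΔH° + (−TΔS°), all in kcal/mol."""
    return delta_h + minus_t_delta_s

def dominant_term(delta_h, minus_t_delta_s):
    """Label the more favorable (more negative) contribution to ΔG°."""
    return "entropy" if minus_t_delta_s < delta_h else "enthalpy"

def entropy_change_cal(minus_t_delta_s, temperature=T_KELVIN):
    """Convert −TΔS° (kcal/mol) into ΔS° (cal K^-1 mol^-1)."""
    return -minus_t_delta_s * 1000.0 / temperature

# Values quoted in the text for (GlcNAc)2 binding to VcCBP.
dh_glcnac2 = -2.40     # ΔH°, kcal/mol
mtds_glcnac2 = -6.45   # −TΔS°, kcal/mol

dg_glcnac2 = binding_free_energy(dh_glcnac2, mtds_glcnac2)
driver = dominant_term(dh_glcnac2, mtds_glcnac2)
ds_glcnac2 = entropy_change_cal(mtds_glcnac2)
```

The sum reproduces the reported ΔG° of −8.85 kcal mol−1, and the entropic term accounts for the larger share of it, which is what "entropy-driven with a smaller enthalpy gain" means here.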

Figure 9

Thermograms and the theoretical fits obtained by ITC analysis. Titrations of VcCBP with GlcNAc (A), (GlcNAc)2 (B), (GlcNAc)3 (C), (GlcNAc)4 (D), (GlcNAc)5 (E) and (GlcNAc)6 (F), were conducted in 20 mM Tris–HCl buffer, pH 8.0 at 25 °C. An iTC200 system (MicroCal Co.) was used to collect and analyze the experimental data. Other reaction conditions are described in the “Methods” section.

Table 3 Thermodynamic parameters obtained for the VcCBP/(GlcNAc)n (n = 2, 3, and 4) interactions by means of ITC.

Temperature dependences of thermodynamic parameters for (GlcNAc)n binding

ITC profiles of (GlcNAc)3 titration into VcCBP at various temperatures (10–35 °C) are shown in Supplementary Fig. S2. Heat absorption was observed up to 25 °C and converted to heat release at 30–35 °C. The binding stoichiometry of (GlcNAc)3/VcCBP remained almost constant at approximately 1.0. The interactions were more exothermic at higher temperatures. The temperature dependence was then examined for (GlcNAc)2 and (GlcNAc)4 at various temperatures from 5 to 35 °C, and the thermodynamic parameters obtained were plotted against temperature. The results are shown in Supplementary Figs. S3A–S3C. The individual thermodynamic parameters (ΔG°, ΔH° and −TΔS°) for the binding of (GlcNAc)2, (GlcNAc)3 and (GlcNAc)4 were linearly related to the temperature. Based on the slope of ΔH°, we calculated the heat capacity changes (ΔCp°), which are listed in Table 4. Large negative values of ΔCp° were observed for the interactions of the individual (GlcNAc)n with VcCBP, indicating significant hydrophobic interactions35. The largest negative ΔCp° (−406 cal K−1 mol−1) was observed for the VcCBP-(GlcNAc)3 interaction, while smaller magnitudes were obtained for (GlcNAc)2 (−291 cal K−1 mol−1) and (GlcNAc)4 (−307 cal K−1 mol−1). From the ΔCp° values thus obtained, we calculated the solvation entropy change (ΔS°solv) according to Eq. (2), and subsequently obtained the conformational entropy change (ΔS°conf) based on Eq. (3). The results are also listed in Table 4. Both favorable ΔS°solv and unfavorable ΔS°conf contributed to the interaction with (GlcNAc)n. The favorable ΔS°solv can be explained by the exclusion of bound water molecules upon ligand binding, and the unfavorable ΔS°conf by the reduction in the fluctuation of the VcCBP structure seen in Fig. 5, which shows the fluctuations of RMSDs in the unliganded and liganded states.
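Obtaining ΔCp° from the slope of ΔH° against temperature amounts to a straight-line fit. The sketch below shows that step with NumPy; the data points are synthetic, constructed from the reported slope (−406 cal K−1 mol−1) and the reported 25 °C enthalpy for (GlcNAc)3 (2.02 kcal mol−1), and are not the measured values plotted in Supplementary Fig. S3.

```python
import numpy as np

def heat_capacity_change(temps_c, delta_h_kcal):
    """ΔCp° from a linear fit of ΔH° (kcal/mol) versus temperature.

    Returns the slope in cal K^-1 mol^-1. A temperature difference of one
    degree Celsius equals one kelvin, so the scale used does not affect the slope.
    """
    slope, _intercept = np.polyfit(temps_c, delta_h_kcal, 1)
    return slope * 1000.0

# Synthetic, exactly linear data (hypothetical points, see lead-in).
temps_c = np.array([10.0, 15.0, 20.0, 25.0, 30.0, 35.0])
delta_h = 2.02 - 0.406 * (temps_c - 25.0)   # kcal/mol

dcp = heat_capacity_change(temps_c, delta_h)

# Temperatures at which this synthetic ΔH° is negative (heat release).
exothermic_temps = temps_c[delta_h < 0.0]
```

In this constructed series the enthalpy changes sign between 25 and 30 °C, in line with the switch from heat absorption to heat release described above; with measured data the fit would also return scatter around the line, which this example omits.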

Table 4 Dissection of entropic contribution to the VcCBP/(GlcNAc)n (n = 2, 3, and 4) interactions.

Binding experiments using VcCBP_R27A

To define the contribution of the Arg27 side chain to (GlcNAc)3 binding, we conducted ITC analysis of the interaction with (GlcNAc)3 using VcCBP_R27A. As in the case of VcCBP, heat absorption was observed up to 20 °C, and converted to heat release at 25–35 °C (Supplementary Fig. S4). The thermodynamic parameters were obtained for the individual temperatures, and the values were plotted as shown in Supplementary Fig. S3D. At all temperatures tested, the binding affinities (ΔG°) were found to be reduced by about 1 kcal mol−1 in VcCBP_R27A (Supplementary Fig. S3D, Table 3). Thus Arg27 participates in the binding of (GlcNAc)3 but its contribution is only moderate. The temperature dependence of ΔH° of VcCBP_R27A indicated that both the solvation entropy gain and the conformational entropy loss were reduced (ΔS°solv, 110.9 → 93.5 cal K−1 mol−1; ΔS°conf, −70.3 → −59.9 cal K−1 mol−1), and the reduction in the total entropic contribution (−TΔS°, −9.56 → −7.30 kcal mol−1) was compensated by the reduction in enthalpy loss (ΔH°, 2.02 → 0.92 kcal mol−1) (Tables 3 and 4).
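As a worked check of the "about 1 kcal/mol" statement, the values quoted in this paragraph can be combined through ΔG° = ΔH° − TΔS°. The arithmetic below assumes that the quoted ΔH° and −TΔS° values for the wild type and the mutant refer to the same temperature (25 °C, as in Table 3); the subscripts WT and R27A are labels introduced here for illustration.

```latex
\begin{align*}
\Delta G^\circ_{\mathrm{WT}} &= \Delta H^\circ + (-T\Delta S^\circ) = 2.02 + (-9.56) = -7.54\ \mathrm{kcal\,mol^{-1}},\\
\Delta G^\circ_{\mathrm{R27A}} &= 0.92 + (-7.30) = -6.38\ \mathrm{kcal\,mol^{-1}},\\
\Delta\Delta G^\circ &= \Delta G^\circ_{\mathrm{R27A}} - \Delta G^\circ_{\mathrm{WT}} = +1.16\ \mathrm{kcal\,mol^{-1}},\\
\Delta\Delta H^\circ &= 0.92 - 2.02 = -1.10\ \mathrm{kcal\,mol^{-1}},\\
\Delta(-T\Delta S^\circ) &= -7.30 - (-9.56) = +2.26\ \mathrm{kcal\,mol^{-1}}.
\end{align*}
```

The loss of 2.26 kcal mol−1 in the entropic term is only partly offset by the 1.10 kcal mol−1 improvement in enthalpy, leaving a net penalty of 1.16 kcal mol−1 for the R27A mutation.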

Discussion

(GlcNAc)2 binds primarily to the Upper1/Lower interface (Site1/Site2)

As listed in Table 3, the interaction of (GlcNAc)2 with VcCBP was entropy-driven (− TΔS° = − 6.45 kcal mol−1) with a smaller enthalpy gain (ΔH° = − 2.40 kcal·mol−1). This is consistent with the data reported for VhCBP (ΔH° = − 3.9 kcal·mol−1 and − TΔS° = − 6.4 kcal mol−1)28. Since the amino acid residues involved in (GlcNAc)2 binding in VhCBP are almost conserved in VcCBP as shown in Fig. 1, the mechanism of (GlcNAc)2 binding in VcCBP is similar to that in VhCBP from the viewpoints of structure and thermodynamics. As shown in Fig. 3B and Supplementary Table S1, only the Upper1/Lower interface appears to be involved in the binding of two GlcNAc residues from the non-reducing end. Since VcCBP has been regarded as being specific for (GlcNAc)2 translocation to the ABC-transporter18, (GlcNAc)2 binds primarily to the interface between the Upper1 and Lower domains (Site1/Site2) before translocation.

Enthalpic Site1/Site2 and entropic Site3/Site4

From the thermal unfolding and ITC experiments, we found that the binding affinities of (GlcNAc)n (n = 2, 3, and 4) were comparable with each other (Supplementary Fig. S1 and Fig. 9; Tables 2 and 3), but that the favorable contribution of the entropic term (−TΔS°) to the binding affinity increased in the order (GlcNAc)2 < (GlcNAc)3 < (GlcNAc)4 (Table 3). As the entropic contributions increased, compensating changes were clearly found in the enthalpic terms. As seen from the crystal structures of (GlcNAc)2-liganded (1ZU0) and (GlcNAc)3-liganded VcCBP (Fig. 3B), the reducing-end GlcNAc of bound (GlcNAc)3 lies beyond the Upper1/Lower interface, and that of bound (GlcNAc)4 may be in contact with the Upper2/Lower interface (Site3 and Site4). From the temperature dependence of ΔH°, we found that the favorable solvation entropy change (ΔS°solv) predominated over the conformational entropy change (ΔS°conf), as listed in Table 4, suggesting that Site3 and Site4 bind a relatively large number of water molecules in the open form, and that the bound water molecules may be excluded upon (GlcNAc)3 or (GlcNAc)4 binding. We calculated the solvent-accessible surface areas (ASA) for unliganded, (GlcNAc)2-liganded, and (GlcNAc)3-liganded VcCBP based on their crystal structures36, and the data are presented in Supplementary Table S2. We found that the apolar solvent-accessible surface area (ASAapolar) was reduced by 3.3% or 4.2% upon binding of (GlcNAc)2 or (GlcNAc)3, respectively. This indicated that (GlcNAc)3 binding to VcCBP excludes more water molecules from the apolar surface than (GlcNAc)2 binding. This is consistent with the positive values of ΔS°solv, which is greater for (GlcNAc)3 than for (GlcNAc)2 (Table 4). The thermodynamic data obtained for (GlcNAc)3 binding to VcCBP_R27A (Tables 3 and 4) supported the contribution of solvation entropy to (GlcNAc)3 binding. The favorable entropy change derived from solvation was suppressed by mutation of Arg27 to alanine, indicating that the bound water molecules were at least partly removed by this mutation. For the interaction of the Site3 GlcNAc involving Arg27, it appears that the hydrogen bond is less significant than the entropic contribution caused by the release of bound water molecules. Thus, we propose that Site1/Site2, corresponding to the Upper1/Lower interface, and Site3/Site4, corresponding to the Upper1/Upper2/Lower interface, can be regarded as enthalpic- and entropic-interaction sites, respectively.

Upper1 and Upper2 domains play different roles in (GlcNAc)2 translocation

In VcCBP, we found three structural domains, the Upper1, Upper2 and Lower domains (Fig. 2), which correspond to domains I, II and III of cluster C SBPs reported by Chandravanshi et al.19. The three-domain organization appears to be significant from a functional viewpoint. The first reason for this significance is the clear distinction found in the binding thermodynamics between Site1/Site2 and Site3/Site4; although both enthalpy and entropy contributions were involved in the interaction with the former sites, the entropic contribution predominates in the latter sites (Tables 3 and 4). The second reason is the larger RMSD of the Upper2 domain upon superimposition of the (GlcNAc)2-liganded and (GlcNAc)3-liganded VcCBP structures (Fig. 2C). The larger RMSD is likely to reflect the space created for accommodating the additional GlcNAc unit at the interface between the Upper1 and Upper2 domains. The third reason is the distinction in the molecular movements between the Upper1/Lower and Upper2/Lower interfaces. The distance Gln101-Gly432 fluctuated more in (GlcNAc)3-liganded than in (GlcNAc)2-liganded VcCBP, whereas the distances Asn203-Gly361 and Thr24-Ala391 were similar (Fig. 6). In (GlcNAc)3-liganded VcCBP, the fluctuations were larger in the Upper2/Lower interface (Gln101-Gly432) than in the Upper1/Lower interface (Asn203-Gly361/Thr24-Ala391), as shown in Fig. 6B. The larger fluctuation in the Upper2/Lower interface agreed well with the highest B-factor values in the loop immediately following η6 and in the loop between α14 and α15 (Fig. 4). A “half-open” conformation observed in Trp513-mutated VhCBP30 may correspond to a snapshot of this strongly fluctuating state. Cross-correlation heat maps of (GlcNAc)2-liganded and (GlcNAc)3-liganded VcCBP (Fig. 8) also showed the independence of the Upper1 and Upper2 domains in their molecular movements; thus, it is most likely that the Upper1 and Upper2 domains play different roles in the (GlcNAc)2 translocation process.

Hypothetical releasing site, Site5/Site6

As shown in Fig. 9, titrations of VcCBP with (GlcNAc)5 or (GlcNAc)6 did not result in any heat release/absorption; the binding affinities are too low to obtain the thermodynamic parameters for these oligosaccharides by ITC. Nevertheless, thermal unfolding data (Supplemental Fig. S1 and Table 2) revealed a significant elevation of the transition temperature of thermal unfolding (ΔTm = 6.3 °C), suggesting a significant interaction of (GlcNAc)5 or (GlcNAc)6 with VcCBP. In the interaction of (GlcNAc)6 with VcCBP, the two GlcNAc residues of the non-reducing end interact with the interface between the Upper1/Lower (Site1/Site2) and the neighboring (GlcNAc)2 unit also interacts with the Upper1/Upper2/Lower interface (Site3/Site4). However, the additional (GlcNAc)2 unit of the reducing-end side may be repelled from the protein surface (hypothetical Site5/Site6). This situation may bring about the lower but significant ΔTm in Table 2, accounting for the lower binding affinity of (GlcNAc)5 and (GlcNAc)6. We propose here that a specific substrate (GlcNAc)2 primarily binds Site1/Site2 with both enthalpy- and entropy-driven interactions, and is subsequently translocated to Site3/Site4, where the binding interaction is looser, leading to release of the sugar molecule from Site5/Site6 to a (GlcNAc)2-specific ABC transporter. The loosening of the interaction at Site3/Site4 may be caused by the higher mobility of the Upper2/Lower interface observed in the (GlcNAc)3-liganded structure (Figs. 4, 6 and 7). Thus, all translocation processes are conducted by the cooperative action of the three domains, Upper1, Upper2 and Lower.

Structure triggering the (GlcNAc)2 unit translocation

Kitaoku et al.30 observed electron density of the Phe411 side chain with a full occupancy (1.0) close to the N-acetyl methyl group of the Site2 GlcNAc in (GlcNAc)2-liganded VhCBP. However, in (GlcNAc)3-liganded VhCBP, the same side chain was found 3.5 Å away from the Site2 GlcNAc with a full occupancy (1.0). Here, in (GlcNAc)3-liganded VcCBP we observed the side chain of the corresponding phenylalanine residue (Phe410) at both positions with individual occupancies of 0.5 (Fig. 3A), indicating a significant flipping of the Phe410 side chain. The density of the Site3 GlcNAc was also observed with an occupancy of 0.5, indicating the higher mobility of the Site3 GlcNAc (Figs. 3A and 4D), which appeared to be coordinated with that of the Phe410 side chain. The coordinated motions of the phenylalanyl side chain and the Site3 GlcNAc suggested that Phe410 may be involved in the translocation of bound (GlcNAc)2 from Site1/Site2 to Site3/Site4. Perhaps the translocation process is triggered by interaction with the corresponding ABC transporter, which may further translocate (GlcNAc)2 to the hypothetical release site, Site5/Site6, located in the Upper2/Lower interface.

Conclusion

Taken together, the three domains found in the crystal structure of VcCBP (Upper1, Upper2 and Lower) play different roles and function cooperatively in the translocation of (GlcNAc)2. The mechanism proposed here was fully supported by binding data obtained from thermal unfolding and ITC experiments and may be applicable to other translocation systems involving SBPs that belong to the same cluster.

Materials and methods

Materials

(GlcNAc)n (n = 2–6) oligosaccharides were produced by acid hydrolysis of chitin37 and purified by gel filtration on Cellufine Gcl-25 m (JNC Co., Tokyo, 3.5 × 180 cm). Ni–NTA Agarose was purchased from QIAGEN (Tokyo, Japan). Q-Sepharose Fast Flow and HiPrep 16/60 Sephacryl S-100 were from GE Healthcare (Tokyo, Japan) and TOYOPEARL Butyl-650 M was from Tosoh (Tokyo, Japan). Other reagents were of analytical grade and commercially available.

Construction of expression plasmid for VcCBP and VcCBP_R27A

Synthetic genes encoding VcCBP and VcCBP_R27A, in which Arg27 was mutated to alanine, were obtained from Invitrogen (Carlsbad, CA, USA). The nucleotide sequences of the genes were optimized to increase expression in E. coli without changing the amino acid sequences of these proteins. The expression vectors, pRham-VcCBP and pRham-VcCBP_R27A, were constructed using the Expresso(R) Rhamnose Cloning and Expression System, N-His (Lucigen, UK).

Protein expression and purification

The expression vector, pRham-VcCBP or pRham-VcCBP_R27A, was transformed into E. coli C43(DE3). Induction with 0.2% α-L(+)-rhamnose was conducted according to the supplier’s instructions. After induction, the culture was incubated at 15 °C for 40 h, and the cells were then harvested and disrupted by sonication in 20 mM Tris–HCl buffer, pH 8.0. The sonicated extract was centrifuged at 12,000×g for 15 min at 4 °C. The soluble fraction was dialyzed against 20 mM Tris–HCl buffer, pH 8.0, and applied to a Ni–NTA column equilibrated with the same buffer. After washing the column with the Tris buffer, the bound protein fractions were eluted with a linear gradient of 0–0.2 M imidazole. The fractions containing a protein with a molecular mass of 60 kDa, which corresponds to that of VcCBP, were collected, and ammonium sulfate was added to the protein solution to a final concentration of 1 M. The solution was applied to the TOYOPEARL Butyl-650 M column equilibrated with 20 mM Tris–HCl buffer, pH 8.0, containing 1 M ammonium sulfate. The adsorbed fraction was eluted with a linear gradient of 1–0 M ammonium sulfate in 20 mM Tris–HCl buffer, pH 8.0. The protein fractions containing VcCBP were pooled and applied to a Q-Sepharose column previously equilibrated with the same buffer. The protein was eluted stepwise with 0.15 M NaCl in the same buffer. The fractions containing VcCBP were pooled and further applied to a Sephacryl S-100 HR gel-filtration column equilibrated with the same buffer containing 0.1 M NaCl. Fractions exhibiting a single protein band on SDS-PAGE38 (Supplementary Fig. S5) were pooled and stored at 4 °C.

Protein concentration

Protein concentrations were determined by reading absorbance at 280 nm, using an extinction coefficient of VcCBP (110,365 M−1 cm−1) calculated from the equation proposed by Pace et al.39.
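The conversion from absorbance to concentration described above can be sketched as follows. This is a minimal illustration that assumes the Beer-Lambert law and a 1 cm path length; the absorbance reading and the residue counts in the usage lines are hypothetical and are not taken from this study. The coefficients 5500, 1490 and 125 M−1 cm−1 are those of the Pace et al. equation for Trp, Tyr and cystine, respectively.

```python
# Sketch: protein concentration from A280 via the Beer-Lambert law (A = epsilon * c * l).
# The function names and the example inputs are illustrative assumptions.

EPSILON_VCCBP = 110_365.0  # M^-1 cm^-1, extinction coefficient stated in the text


def pace_extinction_coefficient(n_trp, n_tyr, n_cystine):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1)
    from residue counts using the Pace et al. coefficients."""
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine


def protein_concentration_molar(a280, epsilon=EPSILON_VCCBP, path_cm=1.0):
    """Return the molar protein concentration for a measured A280."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical absorbance reading, chosen only for illustration.
    conc = protein_concentration_molar(0.551825)
    print(f"concentration = {conc * 1e6:.2f} uM")
```

Samples whose absorbance lies outside the linear range of the spectrophotometer would need to be diluted before this relationship is applied.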

Thermal unfolding experiments

To obtain the thermal unfolding curve of VcCBP, the CD value at 222 nm was monitored using a Jasco J-720 spectropolarimeter (cell length 0.1 cm), while the solution temperature was raised at a rate of 1 °C min−1 using a temperature controller (PTC-423L, Jasco). To facilitate comparison of unfolding curves, the experimental data (molar ellipticities) were normalized to obtain unfolded fractions at individual temperatures. To assess the binding ability of GlcNAc and (GlcNAc)n (n = 2, 3, 4, 5, and 6), the unfolding experiments of VcCBP were conducted in the presence or absence of (GlcNAc)n. Individual unfolding experiments were repeated twice under the same conditions. The transition temperature of thermal unfolding (Tm) was elevated when the ligand was added to the VcCBP solution. The elevation of Tm (ΔTm) indicated the binding of the ligand40. The solvent condition was 20 mM Tris–HCl buffer, pH 8.0. Final concentrations of the protein and (GlcNAc)n were 8 µM and 8 mM, respectively.
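The normalization and the reading of Tm from an unfolding curve can be sketched as below. This is a simplified two-state illustration and not necessarily the procedure used here: it assumes flat native and unfolded baselines supplied by the user, and it takes Tm as the temperature at which the unfolded fraction crosses 0.5 by linear interpolation. All data values are invented for the example.

```python
# Sketch: normalize CD signals to unfolded fractions and locate the midpoint (Tm).
# Assumes flat pre- and post-transition baselines (a simplification).


def unfolded_fraction(signal, native_value, unfolded_value):
    """Map each CD reading onto a 0-1 scale between the two baselines."""
    span = unfolded_value - native_value
    if span == 0:
        raise ValueError("native and unfolded baselines must differ")
    return [(s - native_value) / span for s in signal]


def midpoint_temperature(temps, fractions):
    """Return the temperature at which the unfolded fraction crosses 0.5,
    using linear interpolation between neighbouring data points."""
    points = list(zip(temps, fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 <= 0.5 <= f1 and f1 != f0:
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("the curve does not cross an unfolded fraction of 0.5")


if __name__ == "__main__":
    temps = [40.0, 45.0, 50.0, 55.0, 60.0]          # deg C, hypothetical
    cd_222 = [-20.0, -19.0, -14.0, -9.0, -8.0]      # arbitrary units, hypothetical
    fractions = unfolded_fraction(cd_222, native_value=-20.0, unfolded_value=-8.0)
    print(f"Tm = {midpoint_temperature(temps, fractions):.1f} deg C")
```

ΔTm would then be the difference between the midpoints obtained with and without ligand.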

Isothermal titration calorimetry (ITC)

Solutions of 50 μM VcCBP or VcCBP_R27A in 20 mM Tris–HCl buffer, pH 8.0, were degassed and loaded into the sample cell (0.2028 mL). The individual ligands (1.0 mM) were dissolved in the same buffer, degassed and loaded into a syringe. Calorimetric titration was performed at 25 °C using an iTC200 system (MicroCal, Northampton, MA, USA). In the titrations, 2.5 μL of a ligand solution was injected into the sample cell at 180-s intervals with a stirring speed of 1000 rpm. The heat of dilution was measured by titrating the ligand into buffer solution without protein under identical conditions, and was subtracted from the heat change observed in the presence of protein. Individual titration experiments were repeated three times to obtain reliable values of the thermodynamic parameters. The Origin software installed in the ITC instrument was used to analyze the ITC data. Using the One Set of Sites model, individual datasets obtained from the titration experiments fitted well to the theoretical curves, providing the stoichiometries (n), equilibrium association constants (Ka) and enthalpy changes (ΔH°) of the protein–ligand interactions. The binding free energy change (ΔG°) and entropy change (ΔS°) were calculated from the following relationship,

$$\Delta G^\circ = -RT\ln K_{\text{a}} = \Delta H^\circ - T\Delta S^\circ$$
(1)
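As an illustration of Eq. (1), the sketch below derives ΔG° and ΔS° from a fitted Ka and ΔH°. The input values are hypothetical and chosen only to show the arithmetic; energies are expressed in kcal mol−1 and the gas constant is taken as 1.987204 × 10−3 kcal mol−1 K−1.

```python
# Sketch: free energy and entropy changes from ITC fit parameters, following Eq. (1).
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def binding_free_energy(ka_per_molar, temp_kelvin):
    """Delta G = -RT ln Ka (kcal/mol)."""
    if ka_per_molar <= 0:
        raise ValueError("Ka must be positive")
    return -R_KCAL * temp_kelvin * math.log(ka_per_molar)


def binding_entropy(delta_h, delta_g, temp_kelvin):
    """Delta S = (Delta H - Delta G) / T (kcal mol^-1 K^-1)."""
    return (delta_h - delta_g) / temp_kelvin


if __name__ == "__main__":
    temp = 298.15      # K, i.e. 25 deg C
    ka = 1.0e6         # M^-1, hypothetical association constant
    delta_h = -5.0     # kcal/mol, hypothetical enthalpy change
    delta_g = binding_free_energy(ka, temp)
    delta_s = binding_entropy(delta_h, delta_g, temp)
    print(f"dG = {delta_g:.2f} kcal/mol, -TdS = {-temp * delta_s:.2f} kcal/mol")
```

A negative value of −TΔS° in the output indicates a favorable entropic contribution to binding.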

The accuracy of the thermodynamic parameters obtained was assessed from the c-values calculated from the equation c = n·Ka·[M]t, where [M]t is the total concentration of protein41. The ITC measurements were conducted in 20 mM Tris–HCl buffer, pH 8.0, at various temperatures from 5 to 35 °C. The ΔH° values obtained at the individual temperatures were plotted against temperature, and the slope of the straight line fitted to the experimental points was taken as the heat capacity change (ΔCp°). As the entropy of solvation is regarded as zero for proteins near 385 K, ΔCp° was converted to the solvation entropy change (ΔSsolv°) at 25 °C (298 K) according to the following relationship,

$$\Delta S_{\text{solv}}^\circ = \Delta C_{\text{p}}^\circ \ln\left(\frac{298.15\ \text{K}}{385.15\ \text{K}}\right)$$
(2)
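The c-value check, the linear fit that yields ΔCp°, and Eq. (2) can be sketched together as below. The temperature series and ΔH° values are invented for illustration, and an ordinary least-squares slope is assumed for the straight-line fit; the fitting routine actually used is not specified in the text.

```python
# Sketch: c-value, heat capacity change from a linear fit, and solvation entropy (Eq. 2).
import math


def c_value(n_sites, ka_per_molar, protein_molar):
    """c = n * Ka * [M]t (dimensionless)."""
    return n_sites * ka_per_molar * protein_molar


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (x, y) points."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def solvation_entropy(delta_cp, temp_kelvin=298.15, ref_kelvin=385.15):
    """Delta S_solv = Delta Cp * ln(T / 385.15 K), in the units of Delta Cp."""
    return delta_cp * math.log(temp_kelvin / ref_kelvin)


if __name__ == "__main__":
    print("c =", c_value(1.0, 1.0e6, 50e-6))       # hypothetical n and Ka
    temps_k = [278.15, 288.15, 298.15, 308.15]      # 5 to 35 deg C
    delta_h = [-1.0, -3.0, -5.0, -7.0]              # kcal/mol, hypothetical
    delta_cp = least_squares_slope(temps_k, delta_h)
    print(f"dCp = {delta_cp:.3f} kcal/mol/K")
    print(f"dS_solv = {solvation_entropy(delta_cp) * 1000:.1f} cal/mol/K")
```

Because the logarithm in Eq. (2) is negative at 298.15 K, a negative ΔCp° corresponds to a positive (favorable) solvation entropy change.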

The conformational entropy change (ΔSconf°) was calculated from ΔS° obtained from Eq. (1), the solvation entropy change (ΔSsolv°) and the mixing entropy change (ΔSmix°, −8 cal K−1 mol−1), based on the following equation42,

$$\Delta S^\circ = \Delta S_{\text{solv}}^\circ + \Delta S_{\text{mix}}^\circ + \Delta S_{\text{conf}}^\circ$$
(3)
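Rearranging Eq. (3) for ΔSconf° gives the short calculation sketched below. All entropies are kept in cal K−1 mol−1 so that the mixing term of −8 cal K−1 mol−1 can be used directly; the input values are hypothetical.

```python
# Sketch: conformational entropy change by rearranging Eq. (3).

DELTA_S_MIX = -8.0  # cal K^-1 mol^-1, mixing entropy change stated in the text


def conformational_entropy(delta_s_total, delta_s_solv, delta_s_mix=DELTA_S_MIX):
    """Delta S_conf = Delta S - Delta S_solv - Delta S_mix (cal K^-1 mol^-1)."""
    return delta_s_total - delta_s_solv - delta_s_mix


if __name__ == "__main__":
    # Hypothetical total and solvation entropy changes, in cal K^-1 mol^-1.
    print(conformational_entropy(delta_s_total=10.0, delta_s_solv=50.0))
```

Values of ΔS° or ΔSsolv° computed in kcal K−1 mol−1 would need to be multiplied by 1000 before being passed to this function.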

Crystallization and data collection

Crystallization conditions for VcCBP were screened using the sparse-matrix sampling method by sitting-drop vapor diffusion at 20 °C. Under optimized crystallization conditions, 1 μL of protein solution (5 mg/mL in water) was mixed with 1 µL of 0.1 M sodium citrate containing 0.2 M ammonium acetate and 30% w/v polyethylene glycol 4000, pH 5.6. Rod-like crystals of VcCBP grew within 3 weeks to a size of up to 0.1 × 0.1 × 0.5 mm3. To prepare the crystals of (GlcNAc)3-liganded VcCBP, the unliganded crystals were transferred to the crystallization well solution containing 26 mM (GlcNAc)3 and incubated at 20 °C for 3.5 h. The crystals were successfully grown in the presence of (GlcNAc)3. For data collection, the crystals were transferred into the cryoprotectant solution containing 0.01 M zinc sulfate, 0.1 M MES (pH 6.5) and 30% PEG MME550, and then flash-cooled in a nitrogen stream at 95 K. The diffraction data were collected at beamline BL-17A of the Photon Factory (Ibaraki, Japan), using an EIGER X 16M detector (Dectris), at a cryogenic temperature (95 K). The data were integrated and scaled with XDS43. The processing statistics are summarized in Table 1.

Structural determination and refinement

The structures of unliganded and (GlcNAc)3-liganded VcCBP were solved by the molecular replacement method using the program PHASER44, where the structures of unliganded VcCBP (PDB code, 1ZTY) and (GlcNAc)2-liganded VhCBP (PDB code, 5YQW) served as search models, respectively. For unliganded VcCBP, two protein molecules were located in the crystallographic asymmetric unit. The model was improved by several rounds of refinement with PHENIX45 and COOT programs46. Occupancies of the Site3 GlcNAc and two conformers of the Phe410 side chain were set at 0.5. The structure of unliganded VcCBP was refined to an Rwork/Rfree of 16.2/19.1% at a resolution of 1.6 Å. The final model contains two protein molecules that include residues 1–532 for each molecule and 773 water molecules. The stereochemistry of the model was verified using MolProbity47, showing 96.7%, 3.3% and 0% of protein residues in the favored, allowed and disallowed regions of the Ramachandran plot, respectively. For (GlcNAc)3-liganded VcCBP, one protein molecule was located in the crystallographic asymmetric unit. The structure of (GlcNAc)3-liganded VcCBP was refined to an Rwork/Rfree of 16.9/18.7% at a resolution of 1.22 Å. The final model contained one protein molecule that includes residues 1–532 and 563 water molecules. The stereochemistry verification showed 97.3%, 2.7% and 0% of protein residues in the corresponding regions, respectively. Molecular graphics were illustrated with PyMol software (http://www.pymol.org/). The refinement statistics are summarized in Table 1.

Molecular dynamics simulation

Molecular dynamics simulations were started from three different conformations of VcCBP: unliganded, (GlcNAc)2-liganded, and (GlcNAc)3-liganded VcCBP (PDB codes: 8I5J, 1ZU0, and 8I5K, respectively). The molecular dynamics package Gromacs48, versions 2019 and 2020.7, was used for the simulations and for analysis of the simulated data. Protein topologies were generated with the Amber99SB force field49. The ligand topologies were generated with the ACPYPE server50. The Amber GAFF force field was used to generate the parameters, and partial charges were assigned with the AM1-BCC model to correspond to HF/6-31G* RESP charges51. The structures were placed in a rectangular box extending 1 nm from the protein and solvated in a pre-equilibrated water configuration with the TIP3P water model52. Counter ions were added to neutralize the system. The structure was energy-minimized with the steepest descent algorithm. To equilibrate the system to constant temperature, pressure and density, 100 ps NVT and NPT simulations with the heavy atoms constrained and a 100 ps NPT simulation without constraints were performed before the actual NPT production runs. Berendsen pressure coupling was used in the constrained NPT simulation53. In the production runs, the modified Berendsen method (velocity rescaling54) was used for temperature coupling and the Parrinello and Rahman method55 for pressure coupling. Production runs were 500 ns long with a time step of 2 fs, and the conformations and energies were collected every 10 ps. RMSDs of the whole protein and of the domains from the starting conformation of each simulation were calculated based on the backbone atoms. Principal component analysis (PCA), or essential dynamics56, based on the variations of Cα atom positions was used to extract the largest motions in each simulation. The cross-correlation data were obtained through the ProDy server57 via the interface from VMD58.
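For readers unfamiliar with cross-correlation maps, the quantity plotted in such heat maps is commonly the normalized covariance of atomic displacements, C(i,j) = ⟨Δri·Δrj⟩ / (⟨Δri²⟩⟨Δrj²⟩)^1/2. The sketch below computes it in plain Python for a toy trajectory. It is an illustration of that standard definition and not the ProDy implementation used in this study; it assumes the frames have already been superposed on a common reference, and the coordinates are invented.

```python
# Sketch: normalized cross-correlation of atomic displacements over a trajectory.
# traj[frame][atom] is an (x, y, z) tuple; frames are assumed to be pre-aligned.
import math


def cross_correlation(traj):
    """Return the matrix C[i][j] of normalized displacement correlations (-1 to 1)."""
    n_frames = len(traj)
    n_atoms = len(traj[0])
    # Mean position of each atom over all frames.
    mean = [[sum(frame[a][k] for frame in traj) / n_frames for k in range(3)]
            for a in range(n_atoms)]
    # Displacement of each atom from its mean position in every frame.
    disp = [[[frame[a][k] - mean[a][k] for k in range(3)] for a in range(n_atoms)]
            for frame in traj]

    def covariance(i, j):
        total = sum(sum(d[i][k] * d[j][k] for k in range(3)) for d in disp)
        return total / n_frames

    variances = [covariance(a, a) for a in range(n_atoms)]
    if any(v == 0 for v in variances):
        raise ValueError("an atom with zero positional variance cannot be normalized")
    return [[covariance(i, j) / math.sqrt(variances[i] * variances[j])
             for j in range(n_atoms)] for i in range(n_atoms)]


if __name__ == "__main__":
    # Toy trajectory: atoms 0 and 1 move together along x, atom 2 moves oppositely.
    traj = [[(s, 0.0, 0.0), (5.0 + s, 0.0, 0.0), (9.0 - s, 0.0, 0.0)]
            for s in (1.0, -1.0, 1.0, -1.0)]
    for row in cross_correlation(traj):
        print([round(value, 2) for value in row])
```

Values near +1 indicate atoms that move in the same direction, values near −1 indicate anti-correlated motion, and values near 0 indicate largely independent motion, which is the pattern described for the Upper1 and Upper2 domains.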

Solvent-accessible surface area (ASA)

ASA was calculated using the method reported by Fraczkiewicz and Braun36 (GetArea 1.1, TX, United States), based on the crystal structures of unliganded, (GlcNAc)2-liganded, and (GlcNAc)3-liganded VcCBPs.