The trimeric solution structure and fucose-binding mechanism of the core fucosylation-specific lectin PhoSL

The core α1–6 fucosylation-specific lectin from a mushroom Pholiota squarrosa (PhoSL) is a potential tool for precise diagnosis of cancers. This lectin consists of only 40 amino acids and can be chemically synthesized. We showed here that a synthesized PhoSL peptide formed a trimer by gel filtration and chemical cross-linking assays, and determined a structure of the PhoSL trimer by NMR. The structure possesses a β-prism motif with a three-fold rotational symmetry, where three antiparallel β-sheets are tightly connected by swapping of β-strands. A triad of Trp residues comprises the structural core, forming NH–π electrostatic interactions among the indole rings. NMR analysis with an excess amount of fucose revealed the structural basis for the molecular recognition. Namely, fucose deeply enters a pocket formed at a junction of β-sheet edges, with the methyl group placed at the bottom. It forms a number of hydrophobic and hydrogen-bonding interactions with PhoSL residues. In spite of partial similarities to the structures of other functionally related lectins, the arrangement of the antiparallel β-sheets in the PhoSL trimer is novel as a structural scaffold, and thus defines a novel type of lectin structure.

In the present study, we determined a novel structure of the PhoSL trimer by NMR, and elucidated its interaction with fucose. The results provide a basis for understanding the structural stability of PhoSL and its recognition of fucosylated glycans.

Results and Discussion
Trimerization of the PhoSL peptide. While the PhoSL molecule purified from the mushroom has been suggested to form a trimer or a tetramer 3, we determined the oligomeric state of the chemically synthesized peptide. By gel filtration analysis with marker proteins, the molecular weight of PhoSL was estimated to be 17.6 kDa (Fig. 1A,B). This value is slightly less than four times the size of the PhoSL monomer (4.5 kDa).
Next, a cross-linking reaction was carried out by using 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide (EDC) and N-hydroxysuccinimide (NHS). These reagents covalently link amino groups in Lys residues or at the N-terminus to carboxyl groups in Asp/Glu residues or at the C-terminus, when pairs of the linkable groups are located in close proximity. After the reaction, three bands were observed in sodium dodecyl sulfate (SDS)-polyacrylamide gel electrophoresis (PAGE) (Fig. 1C). These corresponded to the monomer, dimer, and trimer, as determined by mass spectrometry (Fig. 1D). The monomer and dimer bands are likely to result from incomplete reaction or from linking of pairs within the same polypeptide, which blocks inter-chain linking. In contrast, the trimer band should represent the oligomeric state of PhoSL, because no band for a tetramer was observed. This cross-linking approach cannot exclude dimerization of the trimer, i.e., hexamerization, because linkable pairs may not exist around the interface between two trimers. In the gel filtration profile, we indeed observed a minor peak for a presumable hexamer (~27 kDa), although it is now clear that the main peak is for the trimer (Fig. 1A). Thus, the oligomeric state of the PhoSL peptide was determined by a combination of the two methods.
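The size comparison in the two paragraphs above can be checked with simple arithmetic. The following minimal Python sketch divides an estimated molecular weight by the monomer weight (4.5 kDa) to obtain an apparent number of subunits. The function name `apparent_subunits` is ours and is introduced only for illustration; the input values are those quoted in the text.

```python
MONOMER_KDA = 4.5  # PhoSL monomer size quoted in the text


def apparent_subunits(estimated_kda: float, monomer_kda: float = MONOMER_KDA) -> float:
    """Return the apparent number of monomers in a species of the given size."""
    if monomer_kda <= 0:
        raise ValueError("monomer mass must be positive")
    return estimated_kda / monomer_kda


# Gel filtration estimates quoted in the text (main peak and minor peak).
main_peak = apparent_subunits(17.6)
minor_peak = apparent_subunits(27.0)
print(f"main peak: {main_peak:.1f} subunits")
print(f"minor peak: {minor_peak:.1f} subunits")
```

The main-peak value by itself does not single out a trimer, which is why the cross-linking result is needed to settle the oligomeric state.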
Solution structure of PhoSL trimer. We used the chemically synthesized peptide without isotope labeling for the NMR analyses. In preliminary analyses, the protein dissolved in phosphate buffer (pH 6.0) showed significant precipitation (data not shown). In contrast, the protein solution was clear when dissolved in Tris buffer (pH 7.5). Therefore, we used this buffer for all the NMR experiments in the present study.
The NMR spectra of PhoSL contained a single set of resonances, which implies a structural symmetry of the PhoSL trimer. Backbone nuclear Overhauser effect (NOE) connections revealed that PhoSL possesses a secondary structure rich in β-strands (Fig. 2A,B). Namely, we observed intense sequential Hα-HN NOEs, implying close proximity between these protons, which is characteristic of the extended peptide conformation of β-strands 10. In contrast, we observed neither intense sequential HN-HN NOEs nor medium-range Hα-HN NOEs (|i-j| = 3 or 4), which are characteristic of helices 10. Backbone Hα-Hα and HN-HN NOEs were observed between the β-strands, revealing the arrangement of the strands in β-sheets (Fig. 2A).
Within the β-sheets, however, we cannot distinguish between intermolecular and intramolecular NOE connections of the β-strands, due to the symmetry. Therefore, in the structure calculation, ambiguities with regard to this distinction were introduced (see Materials and Methods section; Table 1). From trial calculations with these ambiguous constraints, the connections between β-strands were determined to be those shown in Fig. 2C. In the final calculation, therefore, some of the NOEs at the interfaces between the strands were fixed as either intermolecular or intramolecular.
The resulting structure satisfied the structural constraints, i.e., NOE distance constraints and torsional angle constraints, as shown by the low root mean square deviations (RMSDs) from these constraints (Table 1). The structure also possessed ideal stereochemical properties, with a small van der Waals energy, low RMSDs from ideal geometries, and favorable distributions in the Ramachandran plot. It also showed good convergence, with a low RMSD from the mean structure (Table 1, Fig. 3A).
This solution structure revealed a β-prism scaffold with a clear three-fold rotational symmetry (Fig. 3B). It consists of three β-sheets, each containing essentially four β-strands. The N-terminal and C-terminal halves of β-strand 1, which has a kink between Thr6 and Lys7, come from different monomers and form short-range contacts, so that the β-sheets can be regarded as expanded to five strands (Fig. 2C). Also, the contacts between β-strands 3 and 4 are intermolecular. Thus, the trimer is tightly connected by swapping of β-strands among the three β-sheets.
At the symmetric center, the indole rings of the three Trp32 residues contact one another and form the major part of the structural core (Fig. 3C). Each contact is made by the edge of one ring plane lying over another ring plane, with a relative angle of 120 degrees. The indole Hε1 proton is closest to the ring plane of another Trp, at a distance of ~3 Å. It appears, therefore, that NH-π electrostatic interactions, which are considered to be a kind of hydrogen bond, are formed between the rings 11. We should point out that in these mutual interactions, polarization of the N-H bond itself enhances the density of the π-electrons of the acceptor ring through electron delocalization. Thus, these interactions should contribute significantly to stabilizing the structural core. This geometry is consistent with the extremely upfield-shifted resonance of the Hε1 proton (4.69 ppm) and its NOEs with other ring protons (Fig. S1A in the Supplementary information); a strong ring-current effect 10 may have overcome the downfield shift caused by hydrogen bonding.
Cys10 and Cys17 are located on β-strands 1 and 2, respectively, which are adjacent to each other (Fig. 3D). Because of the reducing conditions during the NMR measurements, we observed the Hγ protons of these Cys residues and their NOEs with other residues (Fig. S1B). Considering their relative positions, however, we can expect a disulfide bridge to form under oxidizing conditions, which is likely to contribute to the structural stability.
Innate PhoSL structure. Although the PhoSL peptide isolated from the mushroom is 40 amino acids in length 3, the relevant gene cloned from the mushroom codes for a polypeptide of 180 amino acids 12 (Fig. S2A). This polypeptide contains three repeats of a PhoSL-like sequence, the first of which corresponds to that used in the present study. It can be predicted that, at least immediately after synthesis, the PhoSL of a single polypeptide forms a pseudotrimer. Two homology models of the pseudotrimer were produced on the basis of the present PhoSL trimer structure with the SWISS-MODEL system 13 (Fig. S2B). The two differ in the relative positions of the three repeats in the trimeric structure; either is geometrically allowed by the substantial length of the linkers between the repeats.
Presumably by proteolytic cleavage of the linkers in the mushroom cells, PhoSL becomes a heterogeneous trimer of 40-amino-acid polypeptides. It is notable that the sequence of PhoSL determined by N-terminal sequence analysis does not exactly match any of the three repeats in the polypeptide (Fig. S2A). On the other hand, each amino acid of the former matches that of at least one of the repeats, except for position 27. This suggests the possibility that the 40-amino-acid PhoSL peptide from the mushroom is a mixture of peptides corresponding to the three repeats in the 180-amino-acid polypeptide.
It should be noted that the residues that stabilize the structure, as described above, and those involved in fucose binding (see below) are conserved among the three repeats (a conservative substitution of Tyr by Phe at position 23 is observed in repeat 2; Fig. S2A). Therefore, the innate PhoSL, or the heterogeneous trimer probably existing in the cell, should be stable and fully functional.
Recently, a core fucose-specific lectin from a bacterium Streptomyces rapamycinicus (SL2-1) was identified, which is highly homologous to PhoSL 14. Interestingly, SL2-1 is ~180 amino acids in length and has three repeats of PhoSL-homologous regions. Therefore, the tertiary structure of SL2-1 is likely to be very similar to that of the innate PhoSL protein shown in Fig. S2B. PhoSL, SL2-1, and related proteins from several bacteria and fungi cited in ref. 14, including another core fucose-specific lectin from Rhizopus stolonifer 15, should comprise a novel family of lectins that are specific to core fucosylation.

NMR analysis of binding of fucose. To understand the mechanism of molecular recognition, we examined binding of L-fucose by NMR (Figs 4A and S3A). Upon titration of fucose, chemical shift perturbations, i.e., changes in the positions of some PhoSL peaks, were observed. These gradual changes indicate that binding and release of fucose from the protein are fast enough to average the peaks of the free and bound states. Therefore, changes in the chemical shifts reflect changes in the relative populations of the two states. Theoretical fitting of the concentration dependence yielded dissociation constants (KD) of 5.8-6.2 mM (5.9 ± 0.2 mM; Fig. 4A). Note that this affinity is ~2000-fold weaker than that between PhoSL and N-glycans (~3 μM) 3. Nonetheless, this difference in KD corresponds to a difference in free energy change of 4.5 kcal/mol, which one or two hydrogen bonds could account for. We suggest, therefore, that there should be such favorable contacts between PhoSL and the other parts of the N-glycans, e.g., GlcNAc moieties, and that the above KD value for the fucose monosaccharide is reasonable. This is indeed likely, because an NMR analysis of another core α1-6 fucosylation-specific lectin, from an alga Bryothamnion triquetrum, showed that the 2nd GlcNAc of the N-glycan core is also involved in the binding, in addition to fucose 16.
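The two quantitative statements in this paragraph can be illustrated with a short Python sketch. It assumes a simple 1:1 binding model per site in fast exchange, so that the observed shift change is the maximal change multiplied by the fraction of occupied sites, and it converts the ratio of dissociation constants into a free energy difference as RT ln(ratio). The binding-site concentration (0.3 mM) and the temperature used for the free energy (298 K) are our assumptions, not values taken from the text, and the function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_bound(ligand_mM: float, sites_mM: float, kd_mM: float) -> float:
    """Fraction of sites occupied for 1:1 binding (exact quadratic solution)."""
    b = sites_mM + ligand_mM + kd_mM
    complex_mM = (b - math.sqrt(b * b - 4.0 * sites_mM * ligand_mM)) / 2.0
    return complex_mM / sites_mM


def observed_shift_change(ligand_mM: float, sites_mM: float,
                          kd_mM: float, delta_max_ppm: float) -> float:
    """Fast exchange: the observed shift change is the population-weighted average."""
    return delta_max_ppm * fraction_bound(ligand_mM, sites_mM, kd_mM)


def free_energy_difference(kd_weak: float, kd_strong: float,
                           temperature_K: float) -> float:
    """Difference in binding free energy, RT ln(kd_weak / kd_strong), in kcal/mol."""
    return R_KCAL * temperature_K * math.log(kd_weak / kd_strong)


SITES_MM = 0.3  # hypothetical binding-site concentration (assumption)
occupancy_100mM = fraction_bound(100.0, SITES_MM, 5.9)
ddg = free_energy_difference(5.9e-3, 3.0e-6, 298.0)  # KD in M; 298 K assumed
print(f"occupancy at 100 mM fucose: {occupancy_100mM:.2f}")
print(f"free energy difference: {ddg:.1f} kcal/mol")
```

In an actual fit, `observed_shift_change` would be the model function whose parameters (KD and the maximal shift change) are adjusted to the titration data.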
L-fucose is a mixture of two anomers, i.e., the α- and β-forms, of which the former is biologically relevant. The populations of the α and β anomers were ~30% and ~70%, respectively, as estimated from the NMR peak intensities of the H6 methyl protons (Fig. 4B, upper spectrum). A saturation transfer difference (STD) experiment, in which saturation of protein resonances is propagated to those of fucose in rapid association/dissociation equilibrium, indicated that both anomers bound to PhoSL (Fig. 4B). The populations of the anomers bound to PhoSL changed to ~36% and ~64%, as seen in the difference spectra, suggesting that the α anomer is slightly preferred. Thus, we should note that the KD value for fucose is an average over the two anomers at these populations. The STD experiment also showed that the H6 methyl group was more involved in the protein binding than the other protons; the only positive peaks in the difference spectrum were those for the H6 protons.
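The slight preference for the α anomer can be put into numbers with a minimal Python sketch. It assumes that the two anomers compete for the same site, that free ligand is in large excess, and that the STD-derived populations reflect the bound populations. Under these assumptions the apparent KD satisfies 1/KD = x_α/KD_α + x_β/KD_β. The per-anomer values printed below are illustrative consequences of this model, not values reported in the study, and the function names are ours.

```python
def relative_preference(free_a: float, free_b: float,
                        bound_a: float, bound_b: float) -> float:
    """Ratio KD_b / KD_a implied by the shift in populations upon binding."""
    return (bound_a / bound_b) / (free_a / free_b)


def per_anomer_kd(kd_apparent: float, free_a: float, free_b: float,
                  preference: float) -> tuple[float, float]:
    """Solve 1/kd_apparent = free_a/kd_a + free_b/kd_b with kd_b = preference * kd_a."""
    kd_a = kd_apparent * (free_a + free_b / preference)
    kd_b = preference * kd_a
    return kd_a, kd_b


# Populations quoted in the text: free 30:70 and bound 36:64 (alpha:beta).
preference = relative_preference(0.30, 0.70, 0.36, 0.64)
kd_alpha, kd_beta = per_anomer_kd(5.9, 0.30, 0.70, preference)
print(f"alpha is preferred by a factor of {preference:.2f}")
print(f"model KD (mM): alpha {kd_alpha:.1f}, beta {kd_beta:.1f}")
```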
We then analyzed the NOESY spectrum of PhoSL with an excess amount of fucose (100 mM), and compared the chemical shifts of all protons with those of free PhoSL (Fig. S3B). Because chemical shift perturbations are related to changes in the microscopic environment, they allow us to predict the residues involved in fucose binding. The results suggested that regions around the junctions of β-sheet edges were involved.
Table 1 (partial). Additionally allowed region (%): 12.4, 9.1. Generously allowed region (%): 0, 0. Disallowed region (%): 0, 0.
Mechanism of molecular recognition. In the same NOESY spectrum, we identified several NOE cross-peaks between PhoSL and fucose, although the molecules are in an equilibrium state, with rapid binding and release (Fig. 4C). We separately assigned the cross-peaks for the two anomers of fucose, and calculated the respective structures in a manner similar to that for free PhoSL (Figs 5 and S4; Table S1). With these NOEs, the structure around the bound fucose did not strongly converge (Figs 5A and S4A). Nonetheless, the binding site and orientation became clear in the structures. The following descriptions of the binding mechanism are for α-fucose, although they are essentially the same for β-fucose, unless otherwise stated. The fucose-binding sites were indeed located where the chemical shift perturbations suggested, i.e., at the junctions of the β-sheet edges (Fig. 5B). At each site, a pocket was formed by several residues from two PhoSL chains (Fig. 5C). The fucose molecule entered deeply into the pocket, with the C6 methyl group placed at the bottom, which was consistent with the STD result. Between fucose and the residues that comprise the pocket, a number of hydrophobic and hydrogen-bonding interactions are formed (Fig. 5C,D). Namely, the C6 methyl group and other carbon atoms contact the aliphatic side chains of Val3, Leu21, and Val36, the aromatic side chains of Tyr23 and Trp28, the backbone carbon of Gly12, and the aliphatic moiety of Asp13. Notable are the contacts with Tyr23, which have an attractive CH-π effect 17 in addition to the hydrophobic effect. Also, hydrogen bonds are formed between the O2, O3, or O4 hydroxyl groups and nitrogen or oxygen atoms of Ala1 and Asp13 (Gly12 is also involved in hydrogen bonds for β-fucose; Fig. S4C,D). Because fucose in the pocket is not strongly fixed in the structural ensemble, some of these contacts are formed in only a fraction of the structures; fucose may be more fixed when GlcNAc or an N-glycan is attached. It should be noted that the O1 atom of α-fucose points outward from the pocket and is open for connection to GlcNAc (Fig. 5B,C).
In the present study, we identified the binding site of fucose and a number of intermolecular interactions, which provide the structural basis for the molecular recognition. To reveal the mechanism of specificity for the α1-6 linkage, however, it will be necessary to analyze the structure of a complex with a sugar containing GlcNAc. There may be favorable contacts between GlcNAc and PhoSL, or machinery for discrimination against the other types of linkages.

Sharing four-stranded antiparallel β-sheets with other lectins. A fucose-binding lectin from a bacterium Ralstonia solanacearum (RSL) forms a symmetric trimer as well (Fig. 6A) 18. A monomer consists of a tandem of two four-stranded antiparallel β-sheets, resulting in a six-bladed β-propeller structure with a pseudo-six-fold symmetry for the trimer.
A very similar six-bladed β-propeller structure is observed for another fucose-binding lectin from a fungus Aleuria aurantia (AAL) (Fig. 6B) 19 . The latter, however, is a monomeric polypeptide possessing six repeats of four-stranded β-sheets.
A mannose-specific lectin (or agglutinin) from a higher plant Galanthus nivalis (GNA) forms a tetramer (Fig. 6C) 20. Each monomer consists of three four-stranded antiparallel β-sheets. When two of the three β-sheets in each monomer are considered, the structure of the tetramer appears to be an eight-bladed β-propeller structure partly similar to those of RSL and AAL. On the other hand, the three β-sheets in the monomeric part are arranged in a symmetric manner and comprise a pseudotrimeric β-prism structure similar to that of PhoSL (Fig. 6C, right panel). It also contains a triad of Trp residues at this pseudosymmetric center.
Nonetheless, the above resemblance of the GNA monomer to the PhoSL trimer is superficial. Namely, the β-prism of GNA is formed essentially by a single polypeptide, with swapping of one β-strand with another monomer. Also, the β-strands in the front layer run clockwise, while the corresponding β-strands of PhoSL run counterclockwise (Figs 2B and 6C); this is also true when the structures are viewed from the opposite direction. In addition, the three Trp rings of GNA contact one another through simple hydrophobic forces, not through NH-π electrostatic forces. Furthermore, the binding pockets for sugar are formed over the β-sheet planes.
As described above, RSL, AAL, and GNA commonly contain four-stranded antiparallel β-sheets as units. By rearrangements at the polypeptide level, they form β-propellers or similar structures. PhoSL is similar to them only in that it also contains a similar antiparallel β-sheet as the structural unit. Indeed, when we used the Dali program 21 to search the database for similar structures, only those possessing antiparallel β-sheets of similar length, including several β-propeller structures, were identified. The result was essentially the same when we artificially treated the PhoSL trimer as a single polypeptide before submitting it to Dali. Therefore, the arrangement of the three β-sheets in PhoSL defines a novel type of structural scaffold.

Materials and Methods
Sample preparation. A PhoSL peptide without N-terminal acetylation or C-terminal amidation (APVPVTKLVCDGDTYKCTAYLDYGDGKWVAQWDTAVFHTT) was chemically synthesized by the standard solid-phase method and purified to >95% by HPLC (BEX Co., Ltd., Tokyo, Japan). The above sequence is derived from the gene relevant to the lectin described in a published patent application 12 and is different from that in the literature 3, APVPVTKLVCDGDTYKCTAYLDFGDGRWVAQWDTNVFHTG, at four positions (23rd, 27th, 35th, and 40th amino acids from the N-terminus; see Fig. S2A). It was shown, however, that these alterations do not influence the specificity to core-fucosylated glycans 12. In this patent, it was mistakenly described that this lectin was derived from a mushroom P. terrestris. However, it was later corrected to be derived from P. squarrosa in the literature 3. The concentration of the peptide in solutions was estimated from the absorbance at 280 nm (A280), where the molar absorption coefficient was calculated from the amino acid sequence 22.
Figure 6 (partial caption). ... 20. In the right panel, a monomeric part of GNA with a swapping of a β-strand (circled in the left panel) is shown in detail. In these figures, different polypeptide chains are distinguished by colors, and bound sugars are shown in stick representations. In the right panel of (C), a triad of Trp residues in the core is also indicated by sticks.
Scientific Reports | (2018) 8:7740 | DOI:10.1038/s41598-018-25630-2
NOE spectroscopy (NOESY), total correlation spectroscopy (TOCSY), and double-quantum-filtered correlation spectroscopy (DQF-COSY) spectra 10 for structure determination and chemical shift perturbation were recorded on Bruker Avance III-900 (900.13 MHz for 1H) spectrometers at 308 K, with mixing times of 100 ms and 50 ms for NOESY and TOCSY, respectively. Fucose-titration and STD experiments were performed on a Bruker Avance III-500 (500.13 MHz for 1H) spectrometer at 308 K. For STD, a spectrum acquired with on-resonance irradiation of Leu21/Hδ1 for 2 s was subtracted from a reference spectrum acquired with off-resonance irradiation. NOESY, TOCSY, and simple 1H spectra of a sample dissolved in 99.96% D2O (Isotec Inc.) after lyophilization were recorded at 298 K, from which 20 hydrogen bond donors were identified.
Spectral analysis and structure determination of PhoSL trimer. The NOESY, TOCSY, and DQF-COSY spectra were analyzed by Felix ver. 2007 (Felix NMR, Inc., San Diego, CA). The backbone and side chain 1H resonances were assigned on the basis of the sequential NOEs 10. Chemical shifts were referenced to internal DSS.
By analyzing the spectra, 14 pairs of Hβ resonances and 5 pairs of valine Hγ resonances were assigned stereospecifically and, at the same time, the relevant rotamers around the χ1 angles were determined (methods described in refs 24,25). Also, two pairs of Leu Hδ resonances were stereospecifically assigned, and the relevant χ2 rotamers were determined. Similarly, the χ1 rotamers of three Thr residues were estimated.
A simulated annealing protocol with random initial velocities 26 and an extended initial structure was carried out by CNS ver. 1.3 27. The distance constraints derived from the NOESY spectra and torsion angle constraints on the determined χ1 and χ2 rotamers were imposed as described previously 28. For structure determination of the symmetric trimer, weak non-crystallographic symmetry (NCS) restraints were applied at 10 kcal mol⁻¹ Å⁻². Conditions for the molecular dynamics were: 50,000 K for 2,000 steps in torsion-angle space at the high-temperature annealing stage; 50,000 K (initial) for 2,000 steps in torsion-angle space at the first slow cooling stage; and 2,000 K (initial) for 2,000 steps in Cartesian space at the second slow cooling stage.
NOEs were categorized as inter- or intramolecular. All the sequential NOEs (NOEs between adjacent residues) were treated as intramolecular ones. For medium-range (2 ≤ |i-j| ≤ 4; i and j are residue numbers of the two protons for the NOE) and long-range (|i-j| > 4) NOEs, ambiguity in terms of this classification was introduced by means of the sum-averaging function. In order to reduce the search space of the random simulated annealing and increase the acceptance ratio, however, some NOEs from the interstrand interfaces were fixed as either intramolecular or intermolecular ones. This was introduced systematically, as follows.
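The sum-averaging treatment of ambiguous NOEs mentioned above can be sketched in a few lines of Python. In sum averaging, all candidate proton pairs for one NOE (here, the intramolecular pair and the intermolecular pairs related by the trimer symmetry) contribute to a single effective distance, r_eff = (Σ r^-6)^(-1/6), which is then compared with the distance constraint. The candidate distances below are hypothetical values chosen only for illustration.

```python
def sum_averaged_distance(distances):
    """Effective distance (same unit as input) from r^-6 sum averaging."""
    if not distances:
        raise ValueError("at least one candidate distance is required")
    return sum(float(d) ** -6 for d in distances) ** (-1.0 / 6.0)


# Hypothetical candidates (in angstroms) for one ambiguous NOE in a symmetric trimer:
# the intramolecular pair and the two intermolecular pairs.
candidates = [3.0, 8.0, 9.0]
r_eff = sum_averaged_distance(candidates)
print(f"effective distance: {r_eff:.3f} angstroms")
```

Because of the r^-6 weighting, the shortest candidate dominates the effective distance, so the constraint is satisfied as soon as any one of the candidate pairs is close, without deciding in advance which pair it is.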
When NOEs around the interface between β-strands 3 and 4, i.e., those between regions Lys27-Gln31 and Thr34-Thr39, which included typical interstrand backbone NOEs (Val29/HN-Phe37/HN, Ala30/Hα-Val36/Hα, and Gln31/HN-Ala35/HN in Fig. 2A), were treated as intramolecular ones, we could not obtain any acceptable structure, i.e., one with no distance violation greater than 0.2 Å and no torsion angle violation greater than 2°, from 3000 trials. Alternatively, when these NOEs were treated as intermolecular ones, we obtained eight acceptable structures from 3000 trials. In all of the accepted structures, the strands were arranged as shown in Fig. 2C. Therefore, NOEs around the other interfaces between β-strands were fixed accordingly. Namely, the NOEs were intermolecular for Pro2-Val5 and Val9-Gly12 (the N-terminal and C-terminal halves of β-strand 1); intramolecular for Val3-Val9 and Cys17-Tyr23 (β-strands 1 and 2); and intramolecular for Lys16-Leu21 and Lys27-Thr34 (β-strands 2 and 3). Based on the calculated structure, acceptors of the hydrogen bonds were determined, and the relevant distance constraints were applied. Finally, from 50 trials, 20 structures were selected on the basis of no distance violation greater than 0.2 Å, no torsion angle violation greater than 2°, and the lowest total energies.
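The acceptance and selection criteria described above can be expressed as a simple filter. The following Python sketch keeps trial structures whose largest distance violation is at most 0.2 Å and whose largest torsion angle violation is at most 2°, and then returns those with the lowest total energies. The `TrialStructure` record, its field names, and the example values are hypothetical; they do not correspond to CNS output formats or to data from the study.

```python
from dataclasses import dataclass


@dataclass
class TrialStructure:
    name: str
    max_distance_violation: float  # angstroms
    max_torsion_violation: float   # degrees
    total_energy: float            # kcal/mol


def select_structures(trials, n_keep, max_dist=0.2, max_torsion=2.0):
    """Keep trials within both violation limits, then take the lowest-energy ones."""
    accepted = [
        t for t in trials
        if t.max_distance_violation <= max_dist
        and t.max_torsion_violation <= max_torsion
    ]
    return sorted(accepted, key=lambda t: t.total_energy)[:n_keep]


# Hypothetical trial results for illustration only.
trials = [
    TrialStructure("t1", 0.15, 1.2, -410.0),
    TrialStructure("t2", 0.31, 0.8, -450.0),  # rejected: distance violation
    TrialStructure("t3", 0.10, 2.5, -430.0),  # rejected: torsion violation
    TrialStructure("t4", 0.05, 0.4, -395.0),
    TrialStructure("t5", 0.19, 1.9, -420.0),
]
selected = select_structures(trials, n_keep=2)
print([s.name for s in selected])
```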
The structures of the PhoSL-α-fucose and PhoSL-β-fucose complexes were calculated separately by a simulated annealing protocol similar to the above. Intensities of the NOEs between PhoSL and either of the two anomers were calibrated to the relative occupancies seen in the STD experiment (Fig. 4B), as if the binding pocket was fully occupied by a particular anomer. For the initial structure of the calculation, the minimized mean structure obtained by a prior calculation without the PhoSL-fucose NOEs was used, instead of the extended structure. Conditions for the molecular dynamics were made milder, i.e., 1,000 K for 1,000 steps in torsion-angle space at the annealing stage; 1,000 K (initial) for 1,000 steps in torsion-angle space at the first cooling stage; and 1,000 K (initial) for 1,000 steps in Cartesian space at the second cooling stage. The NCS restraints were applied only to the protein moiety. The calculation was performed for 20 trials, and all of the structures were accepted, with no distance violation greater than 0.2 Å and no torsion angle violation greater than 2°.