Differential recognition of Haemophilus influenzae whole bacterial cells and isolated lipooligosaccharides by galactose-specific lectins

Bacterial surfaces are decorated with carbohydrate structures that may serve as ligands for host receptors. Based on their ability to recognize specific sugar epitopes, plant lectins are extensively used for bacteria typing. We previously observed that the galactose-specific agglutinins from Ricinus communis (RCA) and Viscum album (VAA) exhibited differential binding to nontypeable Haemophilus influenzae (NTHi) clinical isolates, their binding being distinctly affected by truncation of the lipooligosaccharide (LOS). Here, we examined their binding to the structurally similar LOS molecules isolated from strains NTHi375 and RdKW20, using microarray binding assays, saturation transfer difference NMR, and molecular dynamics simulations. RCA bound the LOSRdKW20 glycoform displaying terminal Galβ(1,4)Glcβ, whereas VAA recognized the Galα(1,4)Galβ(1,4)Glcβ epitope in LOSNTHi375 but not in LOSRdKW20, unveiling a different presentation. Binding assays to whole bacterial cells were consistent with LOSNTHi375 serving as ligand for VAA, and also suggested recognition of the glycoprotein HMW1. Regarding RCA, comparable binding to NTHi375 and RdKW20 cells was observed. Interestingly, an increase in LOSNTHi375 abundance or expression of HMW1 in RdKW20 impaired RCA binding. Overall, the results revealed that, besides the LOS, other carbohydrate structures on the bacterial surface serve as lectin ligands, and highlighted the impact of the specific display of cell surface components on lectin binding.


Supplementary Introduction
The galactose-specific agglutinins from Ricinus communis (RCA) and Viscum album (VAA) belong to the family of AB-type ribosome-inactivating proteins 1, which consist of two chains linked by a disulfide bridge: an A chain with rRNA N-glycosidase activity and a B chain with carbohydrate-binding activity. VAA contains two carbohydrate-binding sites per B chain, which are characterized by the central positioning of the aromatic ring of Trp38 and Tyr249, respectively. At submicrogram-per-mL concentrations VAA forms [AB]2 dimers through contacts of adjacent B chains (Fig. S1a). Dimerization spatially restricts accessibility to the Trp sites, so that only the Tyr sites are fully operative in the dimer 2. RCA, on the other hand, forms dimers through contacts of the A chains (Fig. S1b) 3. The Trp sites are fully exposed in the RCA dimers, but the second set of carbohydrate-binding sites in the B chains is not functional due to a mutation that introduces a histidine residue in place of Tyr. Thus, the ligand-binding ability of the VAA and RCA dimers resides in the Tyr sites and Trp sites, respectively. As observed in the X-ray crystal structures of the agglutinin-galactose complexes, besides stacking interactions with the aromatic rings, the two sets of sites share strong hydrogen bonds between the side chains of asparagine and aspartic acid residues (Asn256/Asp235 and Asn46/Asp22, respectively) and the hydroxyl groups at positions 3 and 4 of galactose (Fig. S1c,d). The RCA Trp site, however, exhibits a more complex hydrogen-bonding network, including a contact of Glu26 with the hydroxyl at position 1 in  configuration. This contact is not possible in the VAA Tyr site due to the presence of an alanine residue (Ala239) in the equivalent position.

Supplementary Results and Discussion
Isolation of LOS NTHi375. The LOS of NTHi375 was extracted and quantified using a combination of the Purpald assay 4 and densitometry of LOS bands upon DOC-PAGE and silver staining, using an optimized protocol (see Supplementary Methods for experimental details) that enabled detection and reliable quantitation of LOS bands of 100 ng or even less (Fig. S2a). The yield of LOS extracted from wild type (WT) NTHi375 was insufficient to perform microarray and NMR experiments at the LOS concentrations required. Therefore, strain NTHi375ΔompP5, which lacks the major outer membrane protein P5 and retains unaltered the enzymatic machinery involved in LOS biosynthesis, was used as the source for LOS isolation. This mutant gave high signal intensity upon bacterial colony immunoblot analysis using the anti-phosphorylcholine (PCho) monoclonal antibody TEPC-15. Given that incorporation of PCho into the LOS molecule is phase variable, this observation suggests that NTHi375ΔompP5 is mostly PCho phase ON, although a contribution of higher LOS production in this mutant to the increased TEPC-15 signal cannot be completely excluded 5,6. Indeed, although batch-to-batch variations in the yield were observed, the amount of LOS extracted from NTHi375ΔompP5 was always significantly higher than that extracted from a WT NTHi375 culture of equal biomass (Fig. S2a,b), directly pointing to a higher LOS abundance in this mutant, possibly due to compensatory changes in the bacterial surface upon inactivation of the ompP5 gene. The structure of the NTHi375ΔompP5-derived LOS (hereafter referred to as LOS NTHi375) was characterized by NMR spectroscopy and found to be identical to that reported for the WT strain, as described in the next section.
NMR structural characterization of isolated LOSs. The 1H-NMR spectra of isolated LOS RdKW20 and LOS NTHi375 in D2O showed only broad undefined signals, most probably due to lipid-mediated aggregation, as previously reported for LOS RdKW20 7. To overcome this problem, the linkage between 3-  Fig. S3-10) helped resolve signals overlapping in the 1H dimension, and enabled assignment of chemical shifts and coupling constants of the anomeric protons. The results obtained for LOS RdKW20-derived OSs (Fig. S3 and S4, and Table S1) were comparable to those previously reported for this strain 7, confirming the presence of three main glycoforms bearing terminal β-Gal (Hex3), α-Gal (Hex4), or β-GalNAc (Hex5) at the Hep III branch. Regarding LOS NTHi375-derived OSs (Fig. S5-S8, and Tables S2 and S3), anomeric resonances were observed in the 1H-NMR spectrum at δ = 5.03-5.13 and 5.60-5.68 ppm, corresponding to the three Hep residues of the inner core. Their identities were confirmed by ROE cross-peaks between the respective H1-H2 intra-residue pairs (Table S3). Intense transglycosidic ROE connectivities between Hep III H1/Hep II H2 and Hep II H1/Hep I H3 (Fig. S7 and Table S3  proportion was roughly estimated. Using electrospray-ionisation mass spectrometry 9, the NTHi375 LOS was previously described to be predominantly composed of Hex4 glycoforms, bearing a Galα(1,4)Galβ(1,4)Glc extension at Hep III, together with mono- and di-sialylated Hex3 species accounting respectively for 9% and 5% of the LOS. However, no OS-bound Neu5Ac was detected in our NMR analysis. A plausible explanation is that the conditions used for mild acid hydrolysis of the Kdo-lipid A bond could also result in cleavage of the acid-labile Neu5Ac linkages 10,11. Therefore, it seems reasonable to presume that the Hex3 form observed here for LOS NTHi375-derived OSs resulted from hydrolysis of sialylated species.
The small difference in the estimated population of Hex3 compared to that reported for sialylated NTHi375 LOS species could be attributed to the different approaches used in each case. A close inspection of the STD spectrum of VAA in complex with LOS NTHi375-derived OSs hinted at protons of the terminal β-Gal of the Hex3 glycoform receiving saturation transfer. However, as this glycoform is not expected to be naturally present in LOS NTHi375, these signals were ignored. Overall, comparison of the NMR structural information obtained for the NTHi375ΔompP5 OSs with the mass spectrometry-based structure reported for WT NTHi375 LOS 9 was consistent with the mutant and WT strains bearing identical LOSs.

Supplementary Methods
Lectins. RCA and ConA were purchased from Vector Laboratories. VAA was isolated from Viscum album extracts, as previously described 12 . The concentration of RCA and ConA was determined from the absorbance at 280 nm, using the extinction coefficient calculated from the amino acid sequence with the ProtParam tool, available at http://web.expasy.org/protparam. VAA concentration was determined by the Lowry assay using ConA as standard 13 . When required, RCA and VAA were biotinylated by incubation for 1 h at 20 ºC with biotinamidocaproate ester derivative (GE Healthcare Life Sciences), according to the manufacturer's recommendations. To prevent modification of residues of the carbohydrate-binding site, biotinylation was performed in the presence of 20 mM lactose.

Bacterial strains and isolation of LOS.
NTHi strains used in this study included the otitis media isolate NTHi375, its isogenic mutants ΔompP5 (lacking the major outer membrane protein P5), ΔlgtF (whose LOS lacks the extension at Hep I), and ΔlpsA (lacking the LOS extension at Hep III), the laboratory strain RdKW20, and RdKW20hmw1 strain12 , a transformed RdKW20 strain carrying the hmw1 operon of NTHi strain 12. Bacteria were grown on chocolate agar and brain-heart infusion medium supplemented with 10 μg/mL hemin and 10 μg/mL β-nicotinamide adenine dinucleotide (sBHI), fixed with 4% paraformaldehyde, and labelled with SYTO-13 as described previously 14 .
For LOS isolation, NTHi strains were inoculated (3-5 colonies) in 400 mL sBHI and incubated at 37 °C and 5% CO2 under shaking at 180 rpm. After 12 h, OD600 was measured and the viability tested by serial dilution and plating on sBHI agar. Bacterial pellets were collected by centrifugation at 6,000 × g for 15 min at 4 °C, resuspended in water (33.33 mL water/g of dry pellet), and thoroughly mixed with an equal volume of equilibrated phenol solution (Sigma-Aldrich) preheated to 66 °C. After incubation for 15 min at 66 °C, the phenol-bacteria suspension was cooled on ice and centrifuged at 8,000 × g for 15 min at 4 °C to separate the aqueous and phenol layers. The phenol layer was discarded and the LOS was precipitated from the aqueous layer by addition of four volumes of methanol containing 1% sodium acetate-saturated methanol and incubation for at least 12 h at -20 °C. The precipitate was recovered by centrifugation at 8,000 × g for 15 min at 4 °C, resuspended in 5 mL water and exhaustively dialysed against distilled water. The dialysate was then centrifuged at 100,000 × g for 6 h at 4 °C, and the resulting pellet was resuspended in 2 mL water and lyophilised. For elimination of contaminating nucleic acids, the lyophilisate was resuspended at 10 mg/mL in 100 mM Tris-HCl, pH 7.0, containing 0.05% NaN3, and digested with 50 µg/mL DNAse II and RNAse I (Sigma-Aldrich) for 30 min at 37 °C. Next, protein traces were digested by incubating thrice with 50 µg/mL of proteinase K (Sigma-Aldrich) for 3 h at 55 °C. The LOS was then precipitated and recovered by centrifugation as described above, and the pellet was resuspended in 2 mL water and exhaustively dialysed against distilled water. The dialysate was centrifuged at 100,000 × g for 6 h at 4 °C, and the resulting pellet containing purified LOS was resuspended in 200 µL of distilled water.
The purity of LOS suspensions was assessed by polyacrylamide gel electrophoresis in the presence of sodium deoxycholate (DOC-PAGE) and silver staining, as described below. LOS concentration was determined colorimetrically by the Purpald assay 4,15, using Kdo (Sigma-Aldrich) and L-glycero-D-manno-heptose (Carbosynth) as standards, and by densitometric quantitation of electrophoretic LOS bands.
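The reagent volumes in the hot phenol-water extraction described above scale with the pellet mass and with the volume of aqueous layer recovered. As a minimal sketch of that arithmetic (the function name and dictionary keys are hypothetical, and the recovered aqueous volume is an input that must be measured, since the text does not state it):

```python
def extraction_volumes(dry_pellet_g, aqueous_layer_ml):
    """Reagent volumes (mL) for the phenol-water extraction step.

    Ratios taken from the text: 33.33 mL water per g of dry pellet,
    an equal volume of phenol, and four volumes of methanol per
    volume of recovered aqueous layer.
    """
    water_ml = 33.33 * dry_pellet_g      # resuspension water
    phenol_ml = water_ml                 # equal volume of preheated phenol
    methanol_ml = 4.0 * aqueous_layer_ml # precipitation of LOS
    return {
        "water_ml": water_ml,
        "phenol_ml": phenol_ml,
        "methanol_ml": methanol_ml,
    }


if __name__ == "__main__":
    # Illustrative numbers only, not values from the study.
    print(extraction_volumes(dry_pellet_g=1.5, aqueous_layer_ml=40.0))
```

The pellet mass and aqueous volume in the usage line are placeholders chosen for illustration.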

Sodium deoxycholate-polyacrylamide gel electrophoresis (DOC-PAGE) and silver staining.
Tricine-DOC-PAGE was carried out using a discontinuous system consisting of 1) a separating gel containing 6 M urea and composed of 16.5% total acrylamide (T), with a concentration of 6% bisacrylamide (C) relative to the total concentration, 2) a spacer gel composed of 10% T and 3% C, and 3) a stacking gel of 4% T and 3% C, essentially as described 16, except that the gels contained 0.15% DOC instead of SDS, and the sample buffer was 0.1 M Tris-HCl, pH 6.8, containing 1% DOC (w/v), 20% glycerol, and 0.1% bromophenol blue. Samples were mixed 1:1 with sample buffer and loaded onto the gels without prior boiling.
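To make the %T/%C notation above concrete, the following minimal sketch converts it into monomer amounts. It assumes the conventional definitions (T is acrylamide plus bisacrylamide in g per 100 mL; C is bisacrylamide as a percentage of that total), which match the wording "relative to the total concentration"; the function name is hypothetical.

```python
def gel_composition(total_pct, crosslinker_pct):
    """Return (acrylamide, bisacrylamide) in g per 100 mL of gel solution.

    total_pct       : %T, total monomer (acrylamide + bisacrylamide), w/v
    crosslinker_pct : %C, bisacrylamide as a percentage of total monomer
    """
    bis = total_pct * crosslinker_pct / 100.0
    acrylamide = total_pct - bis
    return acrylamide, bis


if __name__ == "__main__":
    # The three layers of the discontinuous gel described in the text.
    for name, t, c in [("separating", 16.5, 6.0),
                       ("spacer", 10.0, 3.0),
                       ("stacking", 4.0, 3.0)]:
        acr, bis = gel_composition(t, c)
        print(f"{name}: {acr:.2f} g acrylamide, {bis:.2f} g bisacrylamide per 100 mL")
```

Under these assumptions the separating gel corresponds to 15.51 g acrylamide and 0.99 g bisacrylamide per 100 mL.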
For silver staining of LOS bands, the protocol described by Schägger 17 was optimized for oligosaccharide detection. First, the gel was soaked in fixing solution composed of methanol/acetic acid/milli-Q water (40:10:50 v/v/v) for 45 min. As a crucial step, the gel was next soaked for 10 min in fixing solution containing 0.7% periodic acid for LOS oxidation. After thorough washing with milli-Q water, the gel was sensitized by incubation in 0.02% sodium thiosulfate for 1 min. Following two 1-min washes with milli-Q water to remove excess ions, the gel was incubated for 45 min at 4 °C with staining reagent containing 0.1% silver nitrate and 0.028% formaldehyde. After a brief wash with milli-Q water, developer solution consisting of 0.018% formaldehyde, 3% potassium carbonate, and 0.001% sodium thiosulfate was added. Development was stopped when considered appropriate by addition of 5% acetic acid. Gels were thoroughly washed with milli-Q water and stored in 7% acetic acid at 4 °C. For band quantitation, gels were scanned with an Epson Perfection 3200 Photo scanner and the images digitized using the UN-SCAN-IT-gel v6.1 software (Silk Scientific). The Rb-form of the lipopolysaccharide from Salmonella minnesota (Enzo Life Sciences) was used as reference for comparison of band mobilities.

NMR experiments. Two mg of NTHi375ΔompP5- or RdKW20-derived LOS was hydrolysed in 200 µL 1% acetic acid in milli-Q water (v/v) for 3 h at 100 °C with frequent vortexing. An equal volume of milli-Q water was added to the hydrolysate, which was then kept for 36 h at 4 °C and later centrifuged at 8,000 × g and 4 °C. The supernatant, containing the oligosaccharide (OS), was lyophilized. NMR samples were prepared in 99.9% D2O buffer, containing 5 mM sodium phosphate, pD 7.2 (uncorrected value), and 200 mM NaCl.
All NMR spectra (Fig. S3-S11) were acquired at 310 K on a 600 MHz Bruker Avance spectrometer equipped with a cryoprobe, and processed with TopSpin 3.0 software (Bruker). The 1H chemical shifts were referenced to the residual water signal using the equation δ (ppm) = 5.051 − 0.011 × T (°C) 18. The 1H and 13C NMR OS spectra were assigned using a combination of TOCSY (dipsi2phpr), 1H-13C HSQC (hsqcedetgp) and ROESY (roesyphpr) experiments, using standard pulse sequences included in the Bruker TopSpin software. TOCSY experiments were performed with 20 and 70 ms mixing times. ROESY experiments were performed in the phase-sensitive mode with presaturation and a 300-ms spin-lock.
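As a worked illustration of the referencing equation above, the short sketch below evaluates the residual water shift at the acquisition temperature. The function name is hypothetical, and the only assumption beyond the text is the usual conversion T (°C) = T (K) − 273.15.

```python
def water_shift_ppm(temp_kelvin):
    """Residual water 1H chemical shift (ppm) from the linear relation
    delta = 5.051 - 0.011 * T(degC) quoted in the text."""
    temp_celsius = temp_kelvin - 273.15
    return 5.051 - 0.011 * temp_celsius


if __name__ == "__main__":
    # Acquisition temperature stated in the text: 310 K.
    print(f"{water_shift_ppm(310.0):.3f} ppm")
```

At 310 K (36.85 °C) this gives a water reference of about 4.65 ppm.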
For STD experiments, the efficiency of on-resonance frequencies of δ = 7 ppm and -0.5 ppm was first compared, yielding identical STD patterns (Supplementary Fig. S9). As aromatic irradiation (7 ppm) resulted in higher protein saturation (82% vs 70% for aliphatic irradiation), this on-resonance frequency was used for the full set of STD NMR experiments.

Molecular dynamics simulations. The structures of Galα(1,4)Gal (galabiose) and Galα(1,4)Galβ(1,4)Glc (globotriose) were built using the Carbohydrate Builder tool available at GLYCAM-Web 19, and minimized and parametrized for AMBER 12 20 MD simulations using the GLYCAM 6 force field 21. Protein-sugar complexes were built by docking the ligands into the Tyr site of the crystal structure of the VAA-galactose complex (PDB code 1OQL). The structures of the two complexes were then processed with the XLEAP Amber module to get the input files for MD simulations. For the protein, the ff99 force field was used 22. A truncated octahedral box with dimensions of 10.0 Å for explicit TIP3P water molecules was defined 23. MD simulations with no restraints in explicit water solvent were carried out using the SANDER module in AMBER, with periodic boundary conditions and the particle-mesh Ewald approach 24 to account for electrostatic interactions. The protocol included four steps: 1) initial minimisation with protein and carbohydrate fixed, to allow water molecules to settle properly, 2) minimisation of the whole system, 3) a 20-ps MD simulation with position restraints on the complex, to relax the location of solvent molecules and to heat the system, and 4) a 10-ns unrestrained MD simulation of the complex at 300 K and 1 atm, with 50,600 structures saved for further analysis. The final frames were processed and analysed for robustness and equilibrium throughout the simulation, characterising rmsd, potential, kinetic and total energies, temperature, pressure, volume, and density. Frames were clustered using the PTRAJ AMBER module to determine structure populations within the MD simulation. The DBSCAN (density-based) clustering algorithm was used, with a minimum of 30 points to form a cluster, 0.7 as the distance cut-off for forming clusters, and RMSD of atoms as the distance metric. A total of 26 clusters for galabiose and 29 clusters for globotriose were obtained. The most representative structure of each cluster was selected for discussion.
Table S1. CH2 are shown in petrol; CH and CH3 in black.

Figure S4. Section of a ROESY spectrum of LOS RdKW20-derived oligosaccharides. The spectrum was acquired with a 300-ms spin-lock mixing time. Relevant intra- and inter-residue ROE contacts are labelled. In the zoomed section, S stands for substituted residues. Substitution of Gal II at position 3 results in a small but perceptible shift of the H3 resonance.

Table S2. CH2 are shown in petrol; CH and CH3 in black.

Figure S6. Section of the TOCSY spectrum of LOS NTHi375-derived oligosaccharides. Mixing time was 70 ms. The corresponding 1D 1H-spectrum was acquired applying a gradient-based water suppression pulse sequence (zgesgp). Relevant correlation peaks are labelled in the zoomed section.

Figure S7. Section of a ROESY spectrum of LOS NTHi375-derived oligosaccharides (I). The spectrum was acquired with a 300-ms spin-lock mixing time. In the zoomed section, relevant intra- and inter-residue ROE contacts are labelled, including transglycosidic ROE cross-peaks between proton pairs Hep III H1/Hep II H2 and Hep II H1/Hep I H3 (see also Table S3).

Figure S8. Section of a ROESY spectrum of LOS NTHi375-derived oligosaccharides (II). The spectrum was acquired with a 300-ms spin-lock mixing time. Relevant inter-residue ROE contacts were observed between proton pairs Gal I H1/Glc II H3/H4/H5 and Gal II H1/Gal I H4/H6 (see also Table S3).

-(a, b) and LOS RdKW20-derived (c, d) oligosaccharides were acquired with a 300-ms spin-lock mixing time. Strong inter-residue ROE contacts between the proton pair Glc II H1/Hep III H2 were common to oligosaccharide spectra from both strains (a, c). ROE contacts between Glc II H6 and Hep II H2 could be observed for LOS RdKW20-derived OSs (d), while no cross-peak was detected for NTHi375 (b).