Introduction

A key factor in the establishment of infections by most bacterial pathogens is their adherence to host epithelial cells1,2. Autotransporters (ATs) are the largest group of outer membrane and secreted proteins in bacteria and play important roles in virulence, including promoting adhesion3. ATs share a common domain organisation consisting of a Sec-dependent signal sequence, a passenger or α-domain and a C-terminal translocator β-domain4. The signal sequence and β-domain are required for transport of the α-domain through the inner and outer membranes, respectively. The α-domain is the functional portion of the protein and can drive phenotypes including cytotoxicity, aggregation, adhesion and/or invasion, features that enhance bacterial virulence, colonisation, biofilm formation, persistence and resistance to host innate defence mechanisms3,4,5. In an era of increasing antimicrobial resistance, a greater understanding of the mechanisms by which ATs augment bacterial pathogenesis is required if we are to develop new strategies to combat infections caused by multidrug-resistant pathogens.

Despite the abundance of AT genes in the GenBank sequence database and the importance of these proteins in bacterial pathogenesis, structural information about AT α-domains remains limited. To date, only 11 α-domains and some small fragments of trimeric ATs (a separate subfamily of ATs that are obligate trimers6,7,8) have been structurally characterised, and consequently, AT molecular mechanisms of action are largely unknown.

Based on limited structural information, AT α-domains adopt a general architecture comprising a long narrow right-handed β-helix that is embellished with loops and/or other small domains (e.g. trypsin-like domains) that confer different functional properties9,10,11,12. One of the best characterised ATs is the antigen 43 (Ag43) protein, which belongs to the largest and most diverse AT subfamily, the AIDA-I type class13. Ag43 is the only member with a known mechanism of action, whereby a head to tail dimerisation of Ag43 monomers on the surface of adjacent bacterial cells promotes aggregation and biofilm formation via a molecular ‘velcro-like’ mechanism14,15,16,17.

The surface-exposed α-domains of different AIDA-I type ATs exhibit extensive sequence variation13. An example of this is represented by UpaB from uropathogenic Escherichia coli (UPEC), the major aetiologic agent of urinary tract infection and a primary cause of sepsis18. UpaB is an AT that mediates UPEC adherence to extracellular matrix (ECM) proteins and enhances UPEC colonisation of the urinary tract19. Here we determine the structure of UpaB at high resolution and reveal that it adopts a unique architecture potentially comprising two distinct binding sites. One binding site is formed by significant extensions to its β-helix that form a groove that can interact with glycosaminoglycans. On its opposite face, a second binding region can interact with human fibronectin (FN) type III (FnIII). Our results suggest that the AT β-helix may have diverse roles in addition to acting as a structural scaffold5.

Results

Characterisation of UpaB reveals that it does not exhibit self-association properties

AIDA-I-type ATs including Ag43, TibA and AIDA-I belong to a group of self-associating ATs that promote bacterial aggregation and biofilm formation20. In the case of Ag43, aggregation is mediated via a series of hydrogen bonds, hydrophobic interactions and van der Waals forces that drive a head-to-tail dimerisation of Ag43 monomers on the surface of adjacent cells17. We previously showed that UpaB from the UPEC reference strain CFT073 does not mediate cell aggregation when overexpressed in E. coli laboratory strains19, but under certain conditions, it may have a slight indirect effect on these phenotypes21. UpaB is composed of an N-terminal signal sequence (residues 1–37), an α-domain (residues 38–500) and a β-domain (residues 501–776) (Fig. 1a). In order to understand the functional properties of UpaB, we cloned, expressed and purified the region encoding the UpaB α-domain (αUpaB) from UPEC CFT073 and used analytical ultracentrifugation sedimentation velocity experiments to assess its propensity to self-associate in solution. At 0.5, 1 and 2.2 mg ml−1, αUpaB produced a single sedimentation boundary and a continuous sedimentation-coefficient distribution (c(s)) (Fig. 1b) to give a single species with a standardised sedimentation coefficient of 3.1 S. Analysis by continuous mass distribution (c(M)) gave a molecular weight of approximately 52.7 kDa, consistent with a monomeric species. Likewise, small-angle X-ray scattering (SAXS) of αUpaB at concentrations <2.7 mg ml−1 was consistent with a monodisperse protein population (Supplementary Fig. 1), with the estimated mass, radius of gyration (Rg) and the maximum linear dimension (Dmax) from the experimental pair-distance distribution profile (p(r)) yielding values close to what would be expected for a solution of monomeric αUpaB (Supplementary Table 1). Thus, unlike the α-domain of Ag43 (αAg43), recombinant αUpaB does not self-associate. This fundamental difference in the functional properties of the two proteins led us to determine the crystal structure of αUpaB.
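As a rough sanity check on the monomer assignment, the expected mass of the α-domain can be estimated from its length alone. The sketch below assumes an average residue mass of 110 Da, which is a common rule of thumb and not a value from this study, and it ignores any extra residues contributed by the expression construct. The function and variable names are illustrative only.

```python
# Rough mass estimate for the UpaB alpha-domain from residue count alone.
# ASSUMPTION: 110 Da per residue is a textbook rule of thumb,
# not a value reported in this study.
AVG_RESIDUE_MASS_DA = 110.0

def estimate_mass_kda(first_residue: int, last_residue: int) -> float:
    """Approximate mass (kDa) of a segment spanning first..last inclusive."""
    n_residues = last_residue - first_residue + 1
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0

alpha_mass = estimate_mass_kda(38, 500)   # alpha-domain, residues 38-500
observed_cM = 52.7                        # kDa, from the c(M) analysis
ratio = observed_cM / alpha_mass          # near 1 for a monomer, near 2 for a dimer
print(f"estimated {alpha_mass:.1f} kDa, observed/estimated = {ratio:.2f}")
```

A ratio close to 1 is what a monomer would give under this approximation; a stable dimer would give a ratio close to 2.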

Fig. 1

The structure of the UpaB functional α-domain (αUpaB). a Domain organisation of UpaB comprising an N-terminal signal sequence (SP; residues 1–37), an α-domain (αUpaB; residues 38–500) and a β-domain (βUpaB; residues 501–776). b Analytical ultracentrifugation sedimentation velocity analysis of αUpaB. The continuous standardised sedimentation distribution [c(s)] shows that UpaB at 2.2 mg ml−1 exists as a 3.1 s20,w monomer. c Cartoon representation of the αUpaB structure, including d top view. The central domain consisting of extended β-strands is shown in dark green. The N-terminal and C-terminal β-helical domains are shown in yellow and light green, respectively. The top view has F1, F2 and F3 faces shown. e Stereo view of the 2FoFc electron density map contoured at 1σ of the cross-section of the αUpaB β-helix. Structural comparison of UpaB (green) with the α-domain of f pertactin (from B. pertussis; magenta; PDB 1DAB) and g Ag43a (from UPEC; blue; PDB 4KH3)

Overall UpaB structure

The αUpaB crystal structure was solved to 1.9 Å resolution (crystallographic Rfactor of 17.5%; Rfree 21.8%). The crystallographic refinement statistics are reported in Table 1. The structure was solved from a xenon derivative by single isomorphous replacement with anomalous scattering. One molecule of αUpaB was found in the asymmetric unit. The crystal structure of αUpaB exhibited a right-handed three-stranded β-helix consisting of 13 turns (Fig. 1c), with each triangular turn comprising three faces, F1, F2 and F3 (Fig. 1d). The β-helix is predominantly stabilised by an inter-strand network of hydrogen bonds. The interior of the β-helix is packed mostly by long stacks of aliphatic residues (Fig. 1e), whereas the exterior is largely acidic in nature. At the C-terminus of the β-helix, αUpaB forms a two-stranded β-sandwich that is capped by a three-stranded β-meander motif. This latter region resembles the autochaperone region that is required for the folding of AT α-domains on the cell surface22. A DALI structural alignment revealed that αUpaB shared only low resemblance to other AT structures in the PDB, with highest similarity to the Bordetella pertussis pertactin AT (PDB 1DAB; 18% sequence identity, Z-score 24.5 and r.m.s.d. of 2.8 Å between 358 equivalent Cα atoms) (Fig. 1f). Compared to αAg43, αUpaB shares only 12% sequence identity, with a Z-score of 14.6 and an r.m.s.d. of 3.8 Å between 242 equivalent Cα atoms (Fig. 1g). Further comparison with other ATs9,10,11,12,17 revealed the αUpaB β-helix is wider and shorter with a total length of 75 Å. However, the most striking feature of the αUpaB β-helix is the extended β-strands within turns 6–10, which reach up to 32 Å in length. In addition, the strands linking turns 2–3, 3–4, 4–5 and 5–6 are lengthened into a consecutive series of large loops. To date, these structural features have not been observed in the α-domain of any other AT protein23.

Table 1 UpaB data collection and statistics

UpaB can bind glycosaminoglycans

The β-strand extensions contributed by turns 6–10 and the long loops protruding between turns 2–6 form a long hydrophilic groove 11 Å wide and 12.5 Å deep on the F1 face of αUpaB (Fig. 2a, b). Sidechains from E165, S188, N189, Q197, T230 and E293 protrude into the groove and largely determine its slightly acidic nature. The results of the DALI search using αUpaB were further analysed to define a role for its groove and revealed that UpaB shared low structural similarity to polysaccharide degrading enzymes (1BHE, 5GKD, 4C2L), which are also composed of a β-helix with a prominent groove for binding polysaccharides24,25,26. The αUpaB groove most closely resembled the glycosaminoglycan (GAG) lyase chondroitinase B (PDB 1OFL) from Pedobacter heparinus27 (8% sequence identity, Z-score 16.5 and r.m.s.d. of 3.3 Å) (Fig. 2c). Chondroitinase B is the closest homolog known to interact with human polysaccharides. Importantly, αUpaB shares a putative active site with chondroitinase B and other GAG lyases, located just outside the groove (Fig. 2c). This site comprises UpaB Lys 256 and Lys 343 situated in similar positions to the chondroitinase B Lys250/Arg271 Brønsted base/acid pair required to break the glycosidic bonds of GAGs28. In chondroitinase B and other GAG lyases, the Lys250/Arg271 pair would be situated proximal to a bound calcium ion required for neutralisation of the GAG carboxylic group during bond cleavage. Indeed, we identified electron density associated with the αUpaB lysine pair likely to be a bound calcium (Supplementary Fig. 2). Similar to other lyases, this calcium ion would be held in place by the neighbouring αUpaB Glu 314 and Asn 316 residues. The likelihood of a GAG binding within the UpaB groove was tested using docking simulations (Fig. 2a). A model of a GAG was constructed and docked into the αUpaB groove using Autodock Vina. All of the docking conformations showed an interaction with the αUpaB groove, with one of the top conformations displaying an interaction with the putative lyase active site resembling a pre-cleavage state (Supplementary Fig. 2b). This binding conformation exhibited a favourable predicted binding affinity of −9.4 kcal mol−1 (free energy of binding), based on an extensive hydrogen bonding network between the GAG hydroxyl groups and a number of polar residues within and around the αUpaB groove.
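To put the predicted free energy of binding on a more familiar scale, it can be converted to an implied dissociation constant through ΔG = RT ln Kd. The sketch below is only a unit conversion under stated assumptions: a temperature of 298.15 K and a 1 M standard state, neither of which is specified in the text. Docking scores are approximate, so the result should not be read as a measured affinity. The function name is illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1

def kd_from_delta_g(delta_g_kcal: float, temp_k: float = 298.15) -> float:
    """Dissociation constant (M) implied by a binding free energy.

    ASSUMPTION: 298.15 K and a 1 M standard state; neither is given
    in the source text.
    """
    return math.exp(delta_g_kcal / (R_KCAL * temp_k))

kd_molar = kd_from_delta_g(-9.4)   # predicted docking free energy, kcal/mol
print(f"implied Kd ~ {kd_molar * 1e9:.0f} nM")
```

Under these assumptions the docking score corresponds to a sub-micromolar Kd, which is best treated as an indication of a favourable pose and not as a quantitative prediction.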

Fig. 2

UpaB can bind glycosaminoglycans. Surface representation of a αUpaB and b top view of αUpaB, with electrostatic potential coloured from the most negative (red) to positive (blue), with a range of ± 10 kTe−1. The β-strand extensions contributed by turns 6–10 and long loops protruding from between turns 2–6 form an acidic groove. A GAG was modelled into the αUpaB groove showing that this feature can both accommodate a GAG molecule and place it in proximity to the putative lyase active site. c Structural comparison of αUpaB (green) to P. heparinus chondroitinase B (wheat; PDB 1DBG). UpaB shares a β-helix structure, groove, bound calcium (cyan and green) and location of a putative lyase active site with chondroitinase B. UpaB has a putative GAG lyase active site (top right panel) consisting of 256 K and 343 K in proximity to a bound calcium (cyan and green) similar to that of chondroitinase B (lower right panel). d Melting curve plots showing the fluorescence intensity (relative fluorescence units (RFU)) of Sypro orange as a function of temperature for purified αUpaB in the presence of GalN-α1-O-Ser and Lacto-N-neohexaose. The addition of these compounds resulted in a Tm shift of −3.23 and −3.67 °C, respectively (mean Tm shift of the 88 carbohydrates screened was <0.7 °C)

This investigation was followed by the screening of αUpaB against 2788 compounds (including 88 carbohydrate molecules) in a fluorescence thermal shift-based assay (Fig. 2d). Significant binding was shown to two ‘GAG-like’ molecules, Tn Antigen GalN-α1-O-Ser and lacto-N-neohexaose (Fig. 2d). GalN-α1-O-Ser closely resembles the O-glycosidic-linked saccharide to serine complex that anchors most GAGs to their core proteins, and the lacto-N-neohexaose is representative of a main chain GAG29. The role of the UpaB groove in this binding was shown by repeating this assay with a UpaB mutant (αUpaB_G1), designed by alanine substitutions to the prominent residues that stabilise the GAG interaction identified in our molecular docking studies (E165A, N189A, Q197A, N200A, Q203A, K256A and N316A). Although we confirmed that these alterations did not affect the secondary structure of αUpaB_G1 (Supplementary Fig. 3a) and that αUpaB_G1 behaved in solution similarly to the native protein (Supplementary Fig. 3b), this mutant was unable to bind the ‘GAG-like’ molecules as shown by the overlapping melting curve plots of αUpaB_G1 in the presence and absence of the GAGs (Supplementary Fig. 3c). Further analyses revealed that αUpaB did not display a broad affinity for some common GAGs found in the urinary tract including chondroitin sulfate A, B, C and heparin sulfate29,30 (Supplementary Figs. 4 and 5).
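The Tm shifts reported for the thermal shift assay can be illustrated with a small sketch that estimates Tm from a melting curve as the temperature of steepest fluorescence increase. This is one common way to extract Tm and is an assumption here; the text does not state how Tm was derived. The curves below are synthetic: the protein-only Tm of 60.25 °C, the slope and the fluorescence range are made-up values, and only the −3.23 °C shift is taken from the text.

```python
import math

def melt_curve(temps, tm, slope=1.5, f_min=100.0, f_max=1000.0):
    """Synthetic two-state (Boltzmann) melting curve; all values are made up."""
    return [f_min + (f_max - f_min) / (1.0 + math.exp((tm - t) / slope))
            for t in temps]

def estimate_tm(temps, rfu):
    """Tm taken as the midpoint of the interval with the steepest rise."""
    best_i = max(range(len(temps) - 1),
                 key=lambda i: (rfu[i + 1] - rfu[i]) / (temps[i + 1] - temps[i]))
    return 0.5 * (temps[best_i] + temps[best_i + 1])

temps = [25.0 + 0.5 * i for i in range(141)]       # 25-95 C in 0.5 C steps
apo = melt_curve(temps, tm=60.25)                   # hypothetical protein-only Tm
with_ligand = melt_curve(temps, tm=60.25 - 3.23)    # shifted by the reported value
delta_tm = estimate_tm(temps, with_ligand) - estimate_tm(temps, apo)
print(f"estimated Tm shift = {delta_tm:+.2f} C")
```

The estimate is limited by the temperature step, so the recovered shift agrees with the input only to within the 0.5 °C grid spacing. A negative shift, as reported for both compounds, indicates that the ligand-bound protein unfolds at a lower temperature than the protein alone.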

Overall, these results show that αUpaB can bind GAG-like molecules and that this binding is lost when we mutate the GAG-binding site. Our data also indicate that αUpaB may display a considerable substrate specificity (chemical library screening identified only two ‘GAG-like’ molecules out of 2788 compounds and we did not observe binding to four polysaccharides found in the urinary tract) and therefore may interact with a limited range of GAGs that we are yet to identify.

UpaB contains a FN-binding site

In addition to GAG-containing proteoglycans, the epithelium comprises many other glycoproteins including ECM components31. E. coli expressing UpaB was previously found to bind these ECM proteins9. We examined this further by testing the ability of purified αUpaB to bind human FN, laminin and fibrinogen (Fig. 3a). αUpaB bound most strongly to FN, and thus we focussed on understanding this interaction at the molecular level. Using surface plasmon resonance (SPR) we determined a dissociation constant (KD) of 45.2 ± 1.4 μM between UpaB and FN (Fig. 3b), with the latter immobilised to a CM5 sensor chip using the standard coupling procedure32. Although the determined KD may be somewhat underestimated owing to the restricted conformational flexibility of the immobilised FN, this binding affinity is consistent with other bacterial FN-binding proteins (including a Fn type III-binding protein)33,34,35. This affinity is also comparable with other physiologically important protein–protein interactions that mediate cell–cell contacts (e.g., T cell receptor–major histocompatibility complex (MHC) complexes range from 2 to 112 μM KD36). Importantly, binding in vivo would not involve a single protein–protein interaction; rather, the expression of multiple copies of UpaB on the bacterial cell surface would further enhance the efficiency of bacterial binding to FN.
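The steady-state affinity model used to obtain KD can be sketched as a simple 1:1 binding isotherm, in which the equilibrium response is proportional to C / (KD + C). The code below assumes this 1:1 form and uses the reported KD of 45.2 μM; the intermediate concentrations are illustrative and are not the actual dilution series, and the function name is hypothetical.

```python
KD_UM = 45.2  # reported apparent KD for the alphaUpaB-FN interaction, micromolar

def fraction_bound(conc_um: float, kd_um: float = KD_UM) -> float:
    """Fractional occupancy for 1:1 binding at equilibrium: C / (KD + C)."""
    return conc_um / (kd_um + conc_um)

# Analyte concentrations spanning the injected range (0.8-100 uM).
# ASSUMPTION: the intermediate points are illustrative only.
for conc in (0.8, 6.25, 25.0, 45.2, 100.0):
    print(f"{conc:6.2f} uM -> {fraction_bound(conc):.2f} of Rmax")
```

By construction the occupancy is one half when the analyte concentration equals KD. Under this model, even the highest injected concentration of 100 μM reaches only about 69% of the maximal response, which is why a steady-state fit, not direct saturation, is needed to estimate KD.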

Fig. 3

Functional analysis of the UpaB fibronectin-binding site. a Assessment of UpaB binding to human fibronectin, laminin and fibrinogen by enzyme-linked immunosorbent assay (ELISA) using a UpaB-specific polyclonal antibody. UpaB showed highest affinity towards fibronectin. Statistical significance was determined by unpaired two-sample t test, *P < 0.001, n = 9; **P < 0.001, n = 9. b Surface plasmon resonance analysis of αUpaB binding to immobilised fibronectin. A series of concentrations (0.8–100 µM) of αUpaB, as indicated in the sensorgram, were injected over fibronectin. The apparent equilibrium dissociation constant KD was determined using a steady-state affinity model. The data are expressed as mean ± standard error of the mean (SEM) of three replicates. c Assessment of binding to fibronectin by UpaB deletion mutants; αUpaB-Δt6–10 (grey); αUpaB-Δt1–2 (green), αUpaB-Δt3–4 (cyan), αUpaB-Δt5–6 (red) and αUpaB-Δt7–8 (yellow) using ELISA and a fibronectin-specific polyclonal antibody. αUpaB (native) was included as control. Data are shown as the means ± standard deviation of three replicates. d Assessment of binding to fibronectin by UpaB mutants containing targeted amino acid substitutions using ELISA and a fibronectin-specific polyclonal antibody. Targeted changes were made to various surface features of UpaB including an acidic patch αUpaB_S1 (red; N116A, D119A, N146A, N175A, D217A, K245A, D246A, D281A, R310A and D336A) and polar patch αUpaB_S2 (blue; N110A, K111A, N112A, D142A, N171A, D206A, D208A, N212A, N241A, N274A, N276A, N303A, N305A, K325A, D329A, D331A and D349A) on the F2 face, a hydrophobic patch αUpaB_S3 (yellow; V151A, I221A, V249A, A252G, A253G, Y285A, Y312A and V339A) between the F2 and F3 faces, along with a hydrophobic αUpaB_G2 (green, F101A, Y130A, Y187A, F195A, L201G, L202G, Y260A) and acidic patch αUpaB_G3 (orange, E103A, D138A, E165A, E226A) within the GAG binding groove. Binding to fibronectin by αUpaB_G1 (E165A, N189A, Q197A, N200A, Q203A, K256A and N316A) was also tested. Alteration of the surface acidic patch S1 abolished the ability of UpaB to bind fibronectin. Data are shown as the mean ± standard deviation of three replicates

Unlike many other bacterial FN-binding proteins (FnBPs), UpaB does not possess a characteristic GGXXXXV(E/D)(F/I)XX(D/E)T(Xx15)EDT FN-binding repeat (FnBR) sequence37. Therefore, to determine the region of αUpaB that binds FN, we generated a series of αUpaB mutants with specific deletions in β-strand turns, followed by overexpression and purification of the corresponding proteins (Supplementary Fig. 6a). Testing of the mutant proteins for their capacity to bind human FN revealed that deletion of the region encompassing the extended β-strands in turns 6–10 (αUpaB-Δt6–10) resulted in a significant reduction in binding to FN (Fig. 3c). Further analysis involving the progressive deletion of pairs of β-strand turns from the αUpaB N-terminus through the extended β-strand region, generating αUpaB-Δt1–2, αUpaB-Δt3–4, αUpaB-Δt5–6 and αUpaB-Δt7–8, demonstrated that the greatest loss in FN binding was caused by deletion of turns 3–8. As such, most of the region encompassing the β-strand extensions comprises the primary site for binding FN. The secondary structures of the deletion mutants that were least affected (αUpaB-Δt1–2) and most affected (αUpaB-Δt5–6) in FN binding were confirmed by circular dichroism spectroscopy (Supplementary Fig. 3a). Finally, the same mutations were also generated in the full-length UpaB protein, thereby enabling us to assess its capacity to mediate binding to FN upon translocation to the E. coli cell surface. These UpaB deletion mutants, when expressed with their β-domain transporter, were all translocated to the cell surface, as confirmed by whole-cell enzyme-linked immunosorbent assay (ELISA) using polyclonal UpaB antibody (Supplementary Fig. 7a). Subsequent whole-cell ELISA experiments revealed that E. coli expressing these UpaB mutant proteins bound to FN in a manner consistent with the results obtained using purified recombinant proteins (Supplementary Fig. 7b).

To determine the specific site within turns 3–8 of αUpaB that interacted with FN, we initially investigated the GAG-binding groove on the F1 face, as it was the most prominent structural feature within this region. Utilising our αUpaB_G1 GAG-binding mutant, along with other mutants containing amino acid substitutions of hydrophobic (αUpaB_G2) and acidic (αUpaB_G3) residues within the groove, we found that these regions had no effect on FN binding as determined by ELISA (Fig. 3d). We next examined the other αUpaB faces for possible sites that could bind FN. We made amino acid substitutions to a predominantly acidic patch (αUpaB_S1) and polar region (αUpaB_S2) on the F2 face and a hydrophobic patch (αUpaB_S3) between the F2 and F3 faces (Fig. 3d).

Substitution of residues D116, D119, N146, N175, D217, K245, D246, D281, R310 and D336 on the F2 face to alanine (αUpaB_S1) caused almost complete loss of FN binding as determined by ELISA, while maintaining the correct secondary structure of αUpaB_S1 based on circular dichroism spectroscopic analysis (Supplementary Fig. 3a) and displaying a behaviour in solution similar to that of the native protein (Supplementary Fig. 3b). This result mapped the FN-binding site to a ladder of charged/polar residues that are contributed from β-strands or loops in consecutive rungs of the αUpaB β-helix. The only established mode of interaction between FnBPs and FN involves the donation of a series of structurally disordered FnBRs to FN, which upon binding each form an additional β-strand within type I FN modules37,38. Thus the αUpaB–FN interaction is unique, with interacting residues contributed from already formed β-strands held tightly within the αUpaB β-helix by a hydrogen-bonding network.

UpaB can bind FN type III

In order to further investigate the atypical mode of interaction between UpaB and FN, we determined the region of FN bound by αUpaB. FN is composed of 12 type I modules (FnI), 2 type II modules (FnII) and 15–17 type III modules (FnIII)38. We obtained commercially available fragments of human FN which included a 45 kDa gelatin-binding fragment (FnI6–9, FnII1–2), a 70 kDa heparin/gelatin-binding fragment (FnI1–9, FnII1–2), a 120 kDa cell-binding fragment (FnIII2–11) and a 40 kDa C-terminal heparin-binding fragment (FnIII12–15)32 (Fig. 4a, Supplementary Fig. 6b). The binding of αUpaB to these FN fragments determined by ELISA revealed that it displayed the strongest interaction with the cell-binding fragment (FnIII2–11) and weak binding to the gelatin (FnI6–9, FnII1–2) and heparin/gelatin (FnI1–9, FnII1–2) fragments (Fig. 4b). Given the size of UpaB, this maps its binding site on FN to the first FnIII units in the cell-binding fragment, possibly also including some interaction with the neighbouring FnI units in the gelatin-binding fragment (note that the gelatin [FnI6–9, FnII1–2] and heparin/gelatin [FnI1–9, FnII1–2] fragments overlap in this region). The difference observed between UpaB binding to full-length FN compared to its fragments could be attributed to the lack of the FnIII1 unit within any of the commercially available fragments, which given its location would make an important contribution to the UpaB–FN interaction. FnIII1–2 are valid targets for bacterial pathogens such as UPEC, as these two units are known to be involved in FN matrix assembly, whereby their disruption could facilitate bacterial spread39.

Fig. 4

UpaB binds fibronectin type III. a Fibronectin domain organisation composed of 12 type I modules (FnI), 2 type II modules (FnII) and 15–17 type III modules (FnIII). Commercially available fragments used in this work include the heparin/gelatin FnI1–9, FnII1–2, the gelatin FnI6–9, FnII1–2, the cell binding FnIII2–11 and the C-terminal heparin FnIII12–15 fragments. b Binding of fibronectin fragments, as well as full-length (FL) fibronectin, to UpaB measured by enzyme-linked immunosorbent assay using a UpaB-specific polyclonal antibody. Data are shown as the mean ± standard deviation of three replicates. c Tandem β-zipper interaction between Fn binding repeat peptides from S. aureus Fn-binding protein A (FnBPA) and Fn type I modules 2 and 3. The established mode of interaction between bacterial proteins and fibronectin involves the donation of up to 11 structurally disordered bacterial fibronectin repeats to form additional β-strands with consecutive FnI modules. d Model of the UpaB-FnIII interaction derived from NAMD simulations using the structures of UpaB and the FnIII1–2 fragment (PDB: 2HA1), showing predominantly hydrogen bonding between charged residues of UpaB (in particular, D246, R310, D336 and D375) and FnIII1 (residues K32, R36, K40 and E70). The equivalent mutant simulation did not show any appreciable hydrogen bonding

Bacterial interactions with FnIII are uncommon; most of the 100 known bacterial FnBPs bind to the FnI heparin- and gelatin-binding domains via a β-zipper interaction38 (Fig. 4c). However, there are an increasing number of bacterial proteins including Haemophilus influenzae Hap40, Staphylococcus epidermidis Embp41, Salmonella enterica serovar Typhimurium ShdA42 and Pasteurella multocida PM166532 that bind to FnIII. The AT Hap (13% sequence identity) also primarily binds to FnIII1–2 via four small binding motifs. These motifs are absent in UpaB, and the UpaB and Hap FN-binding sites do not share the same location.

Overall, the structural basis for how bacterial proteins interact with the FnIII fragment of FN has not been determined, and thus we investigated this interaction using molecular dynamics simulations with our αUpaB crystal structure and the structure of human FnIII1–2 (2HA1)39 (Fig. 4d; Supplementary Movie 1). To visualise this interaction, we also ran simulations with the αUpaB_S1 mutant that had lost its capacity to bind FN (Supplementary Movie 2). Modelling simulations were performed using NAMD 2.1243 for a cumulative total of 1.2 μs for each system (3 replicates of 400 ns each). Although the simulations are too short, relative to the timescales of protein–protein interactions, to demonstrate specific binding, they do suggest a plausible binding mechanism, showing that αUpaB could interact with FnIII via complementary charged residues without unfolding and/or donating β-strands. Specifically, our αUpaB-FnIII1–2 simulations indicate that αUpaB primarily interacts with FnIII1, through hydrogen bond interactions mediated by αUpaB D246, R310, D336 and D375 residues with complementary charged areas on FnIII, particularly residues K32, R36, K40 and E70 of FnIII1. Substitutions of the αUpaB FnIII-interacting residues to alanine in the αUpaB_S1 mutant greatly reduced hydrogen bond interactions observed in the simulations (Supplementary Movie 2). β-Helix–β-helix interactions have been observed previously for Ag43 and some trimeric ATs17,44,45. However, the present study shows how an AT β-helix could interact with another type of fold, namely the β-sandwich fold of FnIII.

UpaB is highly conserved and immunogenic during UPEC infection

To determine whether the structural features of UpaB required for binding FnIII and GAGs are conserved across the E. coli species, we screened the NCBI public database and an in-house collection, which were represented by 2818 draft and 199 complete E. coli genome sequences. Overall, the upaB gene was present in 1019 strains in this collection (34%) and was found in UPEC strains as well as intestinal pathogenic, commensal and other extra-intestinal pathogenic strains. Analysis of these 1019 translated UpaB amino acid sequences revealed that 95% (968/1019) shared an amino acid sequence identity >89%. Comparison of the translated UpaB amino acid sequence from seven completely sequenced UPEC strains showed that the putative GAG lyase active site was strictly conserved; there was also high conservation of the residues that contribute to the acidic groove, as well as the residues that interact with FnIII (Supplementary Fig. 8). We also examined whether an immunological response against UpaB was elicited during human UPEC infection and showed that plasma samples from urosepsis patients infected with UpaB-positive E. coli strains produced significantly higher anti-αUpaB antibody titres compared to healthy individuals (Supplementary Fig. 9).
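The prevalence and conservation percentages quoted above follow directly from the genome counts. The short sketch below reproduces the arithmetic; it assumes that the draft and complete genome sets do not overlap, and the variable names are illustrative.

```python
# Counts taken from the text; ASSUMPTION: draft and complete sets do not overlap.
draft_genomes = 2818
complete_genomes = 199
upab_positive = 1019     # genomes carrying upaB
high_identity = 968      # UpaB sequences sharing >89% amino acid identity

total_genomes = draft_genomes + complete_genomes
prevalence_pct = 100.0 * upab_positive / total_genomes
conserved_pct = 100.0 * high_identity / upab_positive
print(f"upaB in {upab_positive}/{total_genomes} genomes = {prevalence_pct:.1f}%")
print(f"high-identity UpaB sequences = {conserved_pct:.1f}%")
```

Rounded to whole numbers, these values match the 34% prevalence and 95% conservation stated in the text.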

Relevance of GAG and FN-binding regions in vivo

We previously showed that mutation of upaB in CFT073 led to decreased bladder colonisation in experimental mice19. In order to examine how the GAG- and FN-binding properties of UpaB impact its function in vivo, we constructed plasmids containing the S1, G1 and double S1–G1 mutations in the full-length upaB gene. These plasmids were transformed into our upaB mutant strain (CFT073upaB) to generate a set of strains with plasmid pSU2718 (vector control), pUpaB (wild-type (WT) UpaB), pUpaBG1 (UpaB with mutated GAG-binding site), pUpaBS1 (UpaB with mutated FN-binding site) or pUpaBG1, S1 (UpaB with mutated GAG- and FN-binding sites). Next, we examined the capacity of our CFT073upaB complemented strains to colonise the mouse bladder. In these experiments, CFT073upaB complemented with pUpaB, pUpaBG1 and pUpaBS1 restored bladder colonisation at 24 h post-infection to a level equivalent to colonisation by WT CFT073 (Fig. 5). In contrast, complementation with either the vector control plasmid pSU2718 or pUpaBG1, S1 did not restore bladder colonisation to WT levels, and these levels were significantly reduced at 24 h post-infection compared to colonisation by CFT073upaB containing pUpaB, pUpaBG1 or pUpaBS1 (Fig. 5). This lack of complementation by pUpaBG1, S1 was not due to lack of expression of the mutant protein on the cell surface, as demonstrated by western blot analysis and whole-cell ELISA (Supplementary Fig. 10a, b). The stability of the pUpaBG1,S1 mutant was also confirmed by purification and biophysical characterisation of recombinant αUpaB_G1,S1 (Supplementary Fig. 3a and Supplementary Fig. 10c). A similar colonisation profile was observed for each of the UPEC strains in the urine of these experimentally infected mice (Supplementary Fig. 10d).

Fig. 5

UPEC colonisation of the mouse bladder is enhanced by UpaB GAG- and fibronectin-binding interactions. C57BL/6 mice were challenged transurethrally with wild-type CFT073, CFT073upaB(pSU2718), CFT073upaB(pUpaB), CFT073upaB(pUpaBG1), CFT073upaB(pUpaBS1) and CFT073upaB(pUpaBG1, S1). The results represent log10 CFU/0.1 g bladder tissue of individual mice at 24 h post-infection, and the horizontal bars mark group medians. A minimum of 20 mice were assessed per group (pooled from at least 2 independent experiments). Data were compared using Kruskal–Wallis analysis of variance (ANOVA) with Dunn’s multiple comparisons correction (*P < 0.05; **P < 0.01)

Discussion

AT adhesins are a common group of proteins that play a central role in bacterial pathogenesis. They allow bacteria to adhere to human cells, aggregate with other bacteria and form biofilms, all key facilitators of bacterial virulence. Currently, our understanding of the function of ATs is limited, and detailed information at the level of atomic structure as well as the precise molecular mechanisms that govern their interaction with target molecules is lacking. Here we describe the structure and mechanism of action for UpaB, a recently described AT adhesin from UPEC.

UpaB differs from other characterised AIDA-I AT adhesins as it does not self-associate. Self-association of AIDA-I ATs on the bacterial cell surface is the mechanism by which AIDA-I ATs promote bacterial aggregation and biofilm formation17. The lack of this feature in UpaB is consistent with previous studies which showed that UpaB overexpression did not affect these phenotypes19,46. Subsequent determination of the αUpaB structure revealed an unprecedented departure from the common β-helix fold previously determined for all other ATs and led to the definition of several novel features in the architecture of UpaB. Extensions of β-strands within αUpaB together with large loops create a long acidic groove that can bind GAG. This site bears structural similarity to characterised GAG lyases28 and promoted the binding of UpaB to ‘GAG-like’ substrates, thus supporting a potential polysaccharide-binding site in an AT protein.

We also show that UpaB can bind to human FN. Using a series of UpaB mutants in combination with different fragments of FN, we showed that this occurs via interaction with type III FN, most likely at FnIII1–2. Despite human FN being one of the most common targets for bacterial adhesins, a detailed mechanistic description of the structural basis for binding to FnIII is lacking32,38,40,41,42. In UpaB, this interaction involves the folded UpaB β-helix presenting a ladder of charged/polar residues that interact with complementary charges on FnIII. This is in contrast to previously characterised FnBP–FN interactions, which occur through the donation of a series of disordered repeats from the FnBP to form a tandem β-zipper interaction with consecutive FnI domains37,47. This mode of UpaB binding to FnIII might also be utilised by other bacterial proteins that bind FnIII. Indeed, binding interactions involving β-helices have been increasingly observed among ATs and other proteins17,44,45.

Our previous work showed that deletion of upaB reduces UPEC colonisation of the mouse bladder19. The in vivo data we present here show that mutating either the GAG-binding site or the FN-binding site does not affect UPEC colonisation, while mutating both binding sites leads to a modest decrease in colonisation. One possible interpretation of these results would be that UPEC colonisation may be facilitated by UpaB binding to both GAG and FN. However, it is possible that the multiple mutations required to inactivate both binding domains may have altered the stability or other functional features of the UpaB protein. Therefore, further work is needed to clarify the potential relevance of UpaB binding to GAG and FN during infection.

Methods

Bacteria and growth conditions

E. coli strains and plasmids used in this study are listed in Supplementary Table 2. Bacteria were routinely grown at 37 °C on solid or in liquid Luria-Bertani (LB) medium supplemented with ampicillin (100 μg ml−1) or chloramphenicol (30 μg ml−1).

Cloning, expression and purification of αUpaB

The coding sequence for the upaB alpha domain (residues 38–500, locus tag c0426) was PCR-amplified from UPEC CFT073 genomic DNA using primers 4326-UpaB-F and 4327-UpaB-R containing ligation-independent cloning (LIC) overhangs (Supplementary Table 2). Using LIC cloning, the amplified gene was inserted into a modified version of a pMCSG717 vector, which encodes an N-terminal his6-tag followed by a thioredoxin (TRX) domain and a TEV protease cleavage site. The resulting plasmid, pUpaBα, was used for the expression of αUpaB and introduces three residues at the N-terminus (i.e. SNA) upon removal of the his6-TRX-tag with TEV. The αUpaB protein was expressed in E. coli BL21 (DE3) pLysS cells (Invitrogen) using autoinduction (24 h at 30 °C) in the presence of the appropriate antibiotics (ampicillin 100 μg ml−1, chloramphenicol 34 μg ml−1). Cells were harvested, resuspended in 25 mM Tris pH 7.5 and 150 mM NaCl and lysed by sonication. The lysate was cleared by centrifugation and loaded onto a HisTrap column (GE Healthcare). Proteins were eluted in a gradient of 0–500 mM imidazole. Fractions containing αUpaB were cleaved with TEV protease and the uncleaved protein was removed by further nickel affinity chromatography. Size exclusion chromatography (Superdex S-75, GE Healthcare) in 25 mM Hepes pH 7.0 and 150 mM NaCl was used to further purify αUpaB, and purity was assessed by sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE).

The pαUpaB plasmid was used as the parent vector for the construction of all αUpaB mutants, namely αUpaB-Δt6–10, αUpaB-Δt1–2, αUpaB-Δt3–4, αUpaB-Δt5–6, αUpaB-Δt7–8, αUpaB-G1, αUpaB-G2, αUpaB-G3, αUpaB-S1, αUpaB-S2, αUpaB-S3 and αUpaB-G1,S1 (Supplementary Table 2). All constructs were generated by Epoch Life Science, confirmed by sequencing and transformed into E. coli BL21 DE3 pLysS (Supplementary Table 2). The αUpaB mutants were expressed and purified as described for the native αUpaB.

Crystallisation

Crystals of αUpaB were grown at 20 °C using the hanging-drop vapour-diffusion technique. Crystals grew at 20 mg ml−1 in 0.1 M sodium acetate pH 4.8, 0.2 M ammonium sulfate and 28% (w/v) PEG 4000. Crystals pre-equilibrated in reservoir solution containing 20% glycerol were flash-cooled in liquid nitrogen. Xenon derivatisation was performed using a Xenon chamber (Hampton Research) at 20 bar for 1 min before flash freezing.

Structure determination and refinement

Native data were collected (λ = 0.954 Å, −163 °C) from a single crystal with an ADSC Q315r CCD detector on the MX2 micro-crystallography beamline at the Australian Synchrotron. The data were integrated and scaled with HKL200048. Anomalous data were collected (λ = 1.3776 Å, −163 °C) from two crystals at the MX2 beamline. These data were integrated, scaled and merged using XDS/XSCALE49. All crystals belonged to space group P3121 with similar cell dimensions of a ≈ 69 Å, b ≈ 69 Å, c ≈ 166 Å and α = 90.0°, β = 90.0° and γ = 120.0°. This was consistent with one αUpaB molecule per asymmetric unit. The structure of αUpaB was determined by single isomorphous replacement using anomalous signal from xenon. SHELX C,D,E50 was used to locate the xenon atoms and for phasing and density modification. Eight xenon atoms were found per asymmetric unit. ARP/wARP51 was used for initial model building against the experimental phases. This model underwent rounds of manual model building using the program COOT52 and refinement using Refmac553 and phenix.refine54 to 1.97 Å using native data. The quality of the model was monitored during refinement by the Rfree value, which was calculated from 5% of the data. The structure was validated by the MolProbity55 server and the figures were created with PyMOL56. Ramachandran statistics showed 97.87% of residues in the most favoured region and 2.13% in the allowed regions. Details of data-processing statistics and final refinement values are summarised in Table 1.
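The statement that the cell is consistent with one αUpaB molecule per asymmetric unit can be checked with a Matthews coefficient estimate. The sketch below is illustrative only: the molecular mass is a rough assumption (466 residues at about 110 Da each, not a value reported here), and the cell edges are the approximate values quoted above.

```python
import math

def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """General unit-cell volume in cubic angstroms (angles in degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca**2 - cb**2 - cg**2 + 2 * ca * cb * cg)

def matthews_coefficient(volume, n_symops, n_mol_per_asu, mass_da):
    """Crystal volume per unit of protein mass (cubic angstroms per Da)."""
    return volume / (n_symops * n_mol_per_asu * mass_da)

# Approximate cell quoted in the text; P3121 has 6 symmetry operators.
volume = unit_cell_volume(69.0, 69.0, 166.0, 90.0, 90.0, 120.0)

# ASSUMPTION: mass estimated from residue count, not taken from the paper.
ASSUMED_MASS_DA = 466 * 110

vm = matthews_coefficient(volume, 6, 1, ASSUMED_MASS_DA)

# Standard approximation for the solvent fraction of a protein crystal.
solvent_fraction = 1 - 1.23 / vm

print(f"V = {volume:.0f} A^3, Vm = {vm:.2f} A^3/Da, solvent = {solvent_fraction:.0%}")
```

With these assumed inputs the coefficient comes out near 2.2 Å3 Da−1, within the range typical of protein crystals, whereas two molecules per asymmetric unit would halve it to about 1.1 and leave no room for solvent.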

Analytical ultracentrifugation

Sedimentation velocity experiments were performed in a Beckman Coulter model XL-I analytical ultracentrifuge with an An50-Ti rotor. Double-sector quartz cells were loaded with 400 μl of buffer (25 mM Tris pH 7.5 and 150 mM NaCl) and 380 μl αUpaB at 0.5, 1 and 2.2 mg ml−1. Initial scans were performed at 725 × g to determine the optimal wavelength and radial positions. Absorbance readings were collected at 280 nm and 128,794 × g at 20 °C. Solvent density, solvent viscosity and estimates of the partial specific volume of αUpaB (0.7203 ml g−1) at 20 °C were calculated with SEDNTERP57. Data were analysed using c(s) and c(M) with SEDFIT58.
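Rotor speeds are reported here as relative centrifugal force. As a rough guide to the corresponding rotor speeds, the sketch below applies the standard conversion RCF = 1.118 × 10−5 × r × rpm2 (r in cm). The reference radius of 7.2 cm is an assumption, since the radius used for the quoted values is not stated.

```python
import math

def rcf_to_rpm(rcf, radius_cm):
    """Invert RCF = 1.118e-5 * r(cm) * rpm^2 to obtain rotor speed."""
    return math.sqrt(rcf / (1.118e-5 * radius_cm))

# ASSUMPTION: reference radius of 7.2 cm; not given in the text.
ASSUMED_RADIUS_CM = 7.2

scan_rpm = rcf_to_rpm(725, ASSUMED_RADIUS_CM)      # initial scans
run_rpm = rcf_to_rpm(128_794, ASSUMED_RADIUS_CM)   # sedimentation velocity run

print(f"initial scans: ~{scan_rpm:.0f} rpm, velocity run: ~{run_rpm:.0f} rpm")
```

Under that assumption the two forces correspond to roughly 3000 and 40,000 rpm.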

SAXS data collection and analysis

Data were collected on the SAXS–WAXS beamline at the Australian Synchrotron. Serial dilutions of a 2.7 mg ml−1 stock were made to give samples with concentrations between ~0.1 and 2.7 mg ml−1. All samples were centrifuged at 10,000 × g prior to being loaded into a 96-well plate. To minimise the effects of radiation damage, samples (~80 μl) were maintained at 283 K and drawn into a capillary from the 96-well plate and flowed past the beam. All measured two-dimensional data were averaged and corrected for transmission, solvent scattering and detector sensitivity and radially averaged to produce I(q) vs. q profiles using Scatterbrain (v 2.7.1).

The estimated molecular masses were calculated using values for contrast and partial specific volume predicted from the protein sequence using MULCh (v 1.1)59 along with the Porod volume. Data processing and Guinier analysis were performed using Primus (v 3.2)60. The pair-distance distribution function, p(r), was generated from the experimental data using GNOM (v 4.6)61, from which I(0), Rg and Dmax were determined.
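For readers unfamiliar with Guinier analysis, the following minimal sketch shows the underlying relation, ln I(q) = ln I(0) − (Rg2/3) q2, fitted by least squares at low q. It is not the Primus implementation; the data are synthetic and the Rg and I(0) values are hypothetical, chosen only to show that the fit recovers them.

```python
import math

def guinier_fit(q, intensity):
    """Least-squares line through ln I versus q^2; returns (I0, Rg)."""
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx                 # equals -Rg^2 / 3
    intercept = my - slope * mx       # equals ln I(0)
    return math.exp(intercept), math.sqrt(-3.0 * slope)

# HYPOTHETICAL values used to generate noise-free synthetic data.
TRUE_RG, TRUE_I0 = 30.0, 0.05

# Keep q * Rg below about 1, inside the usual Guinier range.
q = [0.005 + 0.002 * i for i in range(15)]
intensity = [TRUE_I0 * math.exp(-((qi * TRUE_RG) ** 2) / 3.0) for qi in q]

i0, rg = guinier_fit(q, intensity)
print(f"I(0) = {i0:.4f}, Rg = {rg:.2f}")
```

On real data the fit range has to be chosen so that q·Rg stays within the Guinier regime, and curvature at low q in the ln I versus q2 plot is a common sign of aggregation.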

The program CORAL (v 1.1)62 was used to generate 16 rigid-body models of the protein, where the missing N- and C-terminal residues from the crystal structure (PDB: 6BEA) were modelled as dummy residues. All models were qualitatively similar, and the model with the lowest χ2 was chosen as the representative structure. The program DAMMIN (v 5.3)63 was used to generate 16 molecular envelopes, which were averaged and filtered using the program DAMAVER (v 2.8)64. The SAXS data and models have been deposited in the SASBDB65.

Polysaccharide lyase assay and gel

Polysaccharide lyase assays were performed with human chondroitin sulfate A/C (Sigma), human chondroitin sulfate B (Sigma) or human heparin sulfate (Sigma) at 0.5 mg ml−1 in 100 mM Tris, 50 mM sodium acetate pH 8.0 and 10 mM CaCl2 to a final volume of 100 μl. Purified αUpaB was added to a final concentration of 0.05 mg ml−1 along with the negative control Antigen 43, with the positive control chondroitin lyase ABC (Sigma) used at 0.005 mg ml−1. Assays were set up in 96-well microplates (UV-Star, Greiner) and substrate cleavage was followed by A232 measurements every 4 min for 2 h at 37 °C, using an EnSpire 2300 multilabel reader (Perkin Elmer). Assay samples were then analysed by SDS-PAGE (4–12%) with Alcian Blue/Silver staining.

ELISA of glycosaminoglycans, FN, laminin or fibrinogen with αUpaB

ELISAs of the glycosaminoglycans human chondroitin sulfate A, B, C and heparin sulfate were performed by coating the molecules onto Nunc Maxisorp flat-bottom 96-well plates (Thermo Scientific) at 5, 10, 20 and 40 μg ml−1. Plates were blocked with 1% w/v bovine serum albumin and probed with 10 µg ml−1 of αUpaB. The binding of αUpaB was detected using a UpaB-specific polyclonal antibody (1 in 500 dilution in phosphate-buffered saline (PBS))19 followed by alkaline phosphatase-conjugated goat anti-rabbit IgG (1 in 10,000 dilution in PBS (Sigma A3687)). For human laminin (10 µg ml−1, Sigma), human fibrinogen (10 µg ml−1, Sigma) and full-length human FN (10 µg ml−1, Sigma), ELISAs were performed in the same manner. To detect binding to immobilised UpaB or UpaB mutants, ELISA plates were coated with purified αUpaB, truncates or αUpaB surface/groove mutants (10 µg ml−1) and probed with 10 µg ml−1 each of full-length human FN or FN fragments, including the gelatin-binding (Sigma), heparin/gelatin (Sigma), cell-binding (Merck Millipore) or C-terminal heparin-binding (Merck Millipore) α-chymotryptic FN fragments. The binding of FN and FN fragments was detected using anti-FN antibody (1 in 1000 dilution in PBS (Sigma F3648)) followed by alkaline phosphatase-conjugated goat anti-rabbit IgG (1 in 10,000 dilution in PBS (Sigma A3687)). The reaction was developed in the presence of alkaline phosphatase substrate (Sigma) and absorbance was read at 405 nm.

Fluorescence thermal shift assays

Fluorescence thermal shift assays were conducted in 384-well plate format with an assay volume of 10 µl. Recombinant αUpaB protein sample and 5000× Sypro-Orange (Invitrogen) were diluted and mixed in a Hepes buffer (20 mM HEPES, 150 mM NaCl, pH 7.5) to a protein concentration of 0.5 µg per well and 5× Sypro-Orange. Because of the low signal intensity of αUpaB, a higher than usual protein concentration was used. After adding the protein/Sypro-Orange mix to a PCR plate, test compounds were added using a Labcyte Echo550 acoustic liquid transfer robot. Plates were mixed, sealed with an optically clear plastic seal and centrifuged. Thermal scanning coupled with fluorescence detection was performed on a CFX384 qPCR machine at 1.5 °C min−1 from 10 °C to 85 °C. Data analysis was performed using the in-house software excelFTS, which uses IDBS XLfit for fitting the fluorescence data to a Boltzmann function to determine the melting temperature Tm and other thermal transition parameters. Two compound libraries of 2700 molecules and 88 molecules were screened with αUpaB. No hit compounds were found from the 2700 molecule Spectrum collection of known and experimental drugs and natural products (MicroSource Discovery). The screen of 88 carbohydrates yielded two hits.
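To illustrate what fitting a melt curve to a Boltzmann function involves, the sketch below defines the usual sigmoid, F(T) = Fmin + (Fmax − Fmin) / (1 + exp((Tm − T)/s)), and recovers Tm from synthetic data by a coarse grid search. This is not excelFTS or XLfit, whose internals are not described here; the grid search is a deliberately simple stand-in for a proper non-linear fit, and all numerical values are hypothetical.

```python
import math

def boltzmann(t, f_min, f_max, tm, slope):
    """Two-state Boltzmann sigmoid for a fluorescence melt curve."""
    return f_min + (f_max - f_min) / (1.0 + math.exp((tm - t) / slope))

def fit_tm(temps, fluor):
    """Coarse grid search over Tm and slope; plateaus taken from the data."""
    f_min, f_max = min(fluor), max(fluor)
    best_sse, best_tm, best_slope = float("inf"), None, None
    for i in range(151):                  # Tm from 10 to 85 in 0.5 degree steps
        tm = 10.0 + 0.5 * i
        for j in range(46):               # slope from 0.5 to 5.0 in 0.1 steps
            slope = 0.5 + 0.1 * j
            sse = sum(
                (f - boltzmann(t, f_min, f_max, tm, slope)) ** 2
                for t, f in zip(temps, fluor)
            )
            if sse < best_sse:
                best_sse, best_tm, best_slope = sse, tm, slope
    return best_tm, best_slope

# HYPOTHETICAL noise-free curve spanning the 10-85 degree scan range.
temps = [10.0 + 0.5 * k for k in range(151)]
fluor = [boltzmann(t, 100.0, 1000.0, 55.0, 2.0) for t in temps]

tm, slope = fit_tm(temps, fluor)
print(f"Tm = {tm:.1f} C, slope = {slope:.1f}")
```

Real melt curves usually show a fluorescence decay after the transition, so the fit is normally restricted to the region up to the fluorescence maximum. A ligand hit is then scored as a shift in Tm relative to the protein-only control.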

Circular dichroism spectroscopy

An Aviv model 420 Circular Dichroism Spectrometer was used to investigate the structural properties of αUpaB proteins at 0.3 mg ml−1 in 25 mM Hepes pH 7.0 and 150 mM NaCl. Wavelength scans were performed with 0.5 nm steps between 200 and 250 nm at 20 °C.

ELISA of whole cells expressing UpaB with FN

Full-length upaB from UPEC CFT073 was cloned into plasmid pSU2718 at XbaI-HindIII restriction sites with primers 6460-UpaBF1 and 6461-UpaBR (Supplementary Table 2). The resultant parent plasmid (pUpaB) was used for construction of all UpaB deletion mutants used in this study (Supplementary Table 2). Specific mutations and deletions were introduced into upaB by Epoch Life Science to generate the following plasmids: pUpaBG1, pUpaBS1, pUpaBΔt1–2, pUpaBΔt3–4, pUpaBΔt5–6 and pUpaBΔt7–8. All constructs were confirmed by sequencing and transformed into MS427 (Supplementary Table 2). For all assays, overnight cultures of bacterial cells were normalised to an optical density at 600 nm (OD600 nm) of 1.0. Whole-cell ELISAs were performed as described for purified UpaB proteins. Whole-cell ELISAs for the detection of UpaB and UpaB deletion mutants on the E. coli cell surface were performed using polyclonal anti-αUpaB antibody (1 in 500 dilution in PBS) and detected using alkaline phosphatase-conjugated goat anti-rabbit IgG (1 in 10,000 dilution in PBS (Sigma A3687)). The interaction of UpaB deletion mutants with FN was examined by whole-cell ELISA using anti-FN antibody (1 in 1000 dilution in PBS (Sigma F3648)) and detected using alkaline phosphatase-conjugated goat anti-rabbit IgG (1 in 10,000 dilution in PBS (Sigma A3687)).

SPR measurements

A Biacore T200 biosensor instrument was used to measure the affinity of the interaction of UpaB with full-length human FN. FN was covalently immobilised onto a CM5 chip at two different densities, 1000 RU and 5000 RU, using the amine coupling method. SPR experiments were performed at 25 °C using PBS-T (1× PBS pH 7.4 and 0.05% Tween 20) as the running buffer. To generate binding data, UpaB at concentrations ranging from 100 to 0.8 µM was injected over immobilised FN at a constant flow rate of 90 µl min−1 for 50 s; UpaB dissociation was monitored by flowing running buffer at 90 µl min−1 for 150 s. The surface was regenerated after each cycle by injecting 0.1% SDS. Steady-state equilibrium analysis was carried out using the Biacore T200 evaluation software. KD is expressed as mean ± standard error of the mean (SEM). Experiments were conducted on three independent occasions with fresh immobilisation.
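Steady-state affinity analysis rests on the 1:1 binding isotherm Req = Rmax·C / (KD + C). The sketch below evaluates that model over a concentration series and is not the Biacore evaluation software. Two things are assumptions: the twofold dilution factor (a twofold series from 100 µM reaches about 0.8 µM after seven steps, which matches the stated range, but the factor itself is not given) and the KD and Rmax values, which are hypothetical placeholders rather than measured results.

```python
def dilution_series(top, factor, n_points):
    """Concentrations obtained by repeated dilution from the top value."""
    return [top / factor**i for i in range(n_points)]

def steady_state_response(conc, kd, r_max):
    """1:1 Langmuir isotherm: equilibrium response at analyte concentration conc."""
    return r_max * conc / (kd + conc)

# ASSUMPTION: twofold series, 8 points, 100 uM down to ~0.8 uM.
concs_um = dilution_series(100.0, 2.0, 8)

# HYPOTHETICAL parameters for illustration only; not values from this study.
HYPOTHETICAL_KD_UM = 25.0
HYPOTHETICAL_RMAX_RU = 200.0

responses = [
    steady_state_response(c, HYPOTHETICAL_KD_UM, HYPOTHETICAL_RMAX_RU)
    for c in concs_um
]

for c, r in zip(concs_um, responses):
    print(f"{c:8.3f} uM -> {r:6.1f} RU")
```

The response is half of Rmax when the analyte concentration equals KD, which is why the concentration series should bracket the expected KD for the fit to be well determined.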

Mouse infections

Plasmids containing the full-length upaB gene with mutations in the GAG-binding site (pUpaBG1), FN-binding site (pUpaBS1) and both binding sites (pUpaBG1, S1) were generated by Epoch Life Science. These plasmids, together with a plasmid containing the full-length upaB gene (pUpaB) and the vector control (pSU2718), were each transformed into CFT073upaB to generate the strains used in the mouse UTI model. Female C57BL/6 mice aged 10–12 weeks (Animal Resources Centre) were inoculated by transurethral infection to deliver approximately 10^8 bacteria to the bladder, as described elsewhere66. Briefly, mice (n = 10 per group) were anaesthetised by inhalation exposure to isoflurane and a sterile Teflon-coated catheter attached to a 1 ml syringe was used to deliver 50 µl of PBS containing ~10^8 colony-forming units (c.f.u.) of bacteria (at a rate of 5 µl s−1) to the bladder. Urine samples were collected 24 h after challenge and were diluted in PBS and plated on LB agar or LB agar supplemented with 30 µg ml−1 chloramphenicol, as appropriate, for colony counts. Subsequently, the mice were euthanised (using isoflurane overdose followed by cervical dislocation), and the bladders and kidneys were collected, weighed and homogenised in sterile PBS. The tissue homogenates were diluted and plated on agar as above to quantify c.f.u. per 0.1 g tissue. Data are compiled from at least two independent experiments.
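The arithmetic behind the inoculum and the c.f.u. per 0.1 g tissue readout can be sketched as follows. The inoculum figures come from the text; the plate count, dilution factor, plated volume, homogenate volume and tissue weight in the example are hypothetical and serve only to show the calculation.

```python
def inoculum_density_per_ml(total_cfu, volume_ul):
    """Bacterial density of the inoculum in c.f.u. per ml."""
    return total_cfu * 1000.0 / volume_ul

def delivery_time_s(volume_ul, rate_ul_per_s):
    """Time needed to deliver the inoculum at a constant rate."""
    return volume_ul / rate_ul_per_s

def cfu_per_0_1_g(colonies, dilution_factor, plated_ml, homogenate_ml, tissue_g):
    """Scale a plate count back to c.f.u. per 0.1 g of tissue."""
    cfu_per_ml = colonies * dilution_factor / plated_ml
    total_cfu = cfu_per_ml * homogenate_ml
    return total_cfu / tissue_g * 0.1

# Values from the text: ~1e8 c.f.u. in 50 ul delivered at 5 ul per second.
density = inoculum_density_per_ml(1e8, 50)
seconds = delivery_time_s(50, 5)

# HYPOTHETICAL bladder sample: 42 colonies from 0.1 ml of a 1000-fold
# dilution, 1 ml homogenate, 0.025 g tissue.
load = cfu_per_0_1_g(42, 1000, 0.1, 1.0, 0.025)

print(f"inoculum: {density:.1e} c.f.u./ml over {seconds:.0f} s")
print(f"tissue load: {load:.2e} c.f.u. per 0.1 g")
```

Normalising to tissue weight allows bladders and kidneys of different sizes to be compared on the same scale.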

ELISA to detect anti-αUpaB antibodies from urosepsis patients

Blood plasma was collected from 45 patients presenting with urosepsis at the Princess Alexandra Hospital (Brisbane, Australia) along with matching blood culture UPEC isolates. Isolates were screened for the presence of upaB, resulting in 33 plasma samples with matching infecting strains positive for upaB to be assessed. Forty-two plasma samples from healthy volunteers were obtained as controls. Recombinant UpaB α-domain (αUpaB; 10 µg ml−1) was coated onto Nunc Maxisorp flat-bottom 96-well plates (Thermo Scientific), plasma samples were added and peroxidase-conjugated anti-human IgG (1:30,000 dilution in 5% skim milk (Sigma A0170)) was applied as a secondary antibody for detection (incubated at 37 °C for 90 min). Plates were developed with 3,3’,5,5’-tetramethylbenzidine with absorbance determined using a SpectraMax 190 Absorbance Microplate Reader at 450 nm. Statistical analysis between patient and healthy plasma was performed using an unpaired two-sample t test.
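As an illustration of the unpaired two-sample t test used here, the sketch below computes the t statistic and degrees of freedom in the pooled-variance (Student) form. Whether equal variances were assumed is not stated, so the pooled form is an assumption, and the absorbance values are hypothetical. Conversion of t to a P value requires the t distribution and is omitted.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Student's unpaired t statistic and degrees of freedom (equal variances)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, na + nb - 2

# HYPOTHETICAL absorbance readings; not data from this study.
hyp_patient = [0.82, 0.95, 1.10, 0.77, 1.02]
hyp_healthy = [0.41, 0.38, 0.55, 0.47, 0.36]

t_stat, dof = pooled_t_statistic(hyp_patient, hyp_healthy)
print(f"t = {t_stat:.2f}, df = {dof}")

# With the group sizes reported in the text (33 patients, 42 controls)
# the pooled test would have this many degrees of freedom.
study_dof = 33 + 42 - 2
```

If the two groups had clearly different variances, Welch's form of the test, which does not pool the variances, would be the more cautious choice.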

Sequence analysis

The upaB sequences from the NCBI database and an in-house collection were compared and aligned using CLC Main Workbench (Qiagen).

Modelling

Models for dynamics simulations were constructed for both αUpaB and αUpaB_S1 using the UpaB crystal structure and the FnIII1–2 crystal structure (2HA1). UpaB models were positioned approximately 10 Å from the FN fragment, solvated with TIP3P water and electrically neutralised with 0.15 M sodium chloride. The models, with initial dimensions of 108 × 81 × 120 Å, contained 100,010 and 100,015 atoms, respectively. Each model was run independently three times for 400 ns, for a cumulative total of 2.4 µs of simulation. Simulations were performed with NAMD2.1243. Long-range Coulomb forces were computed with the Particle Mesh Ewald method with a grid spacing of 1 Å. Two-fs timesteps were used with non-bonded interactions calculated every 2 fs and full electrostatics every 4 fs while hydrogens were constrained with the SHAKE algorithm. The cut-off distance was 12 Å with a switching distance of 10 Å and a pair-list distance of 14 Å. The temperature was set to 310 K. Pressure was controlled to 1 atmosphere using the Nosé–Hoover Langevin piston method employing a piston period of 100 fs and a piston decay of 50 fs. Trajectory frames were captured every 100 ps. Simulation trajectories were viewed and mapped with VMD67.
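The simulation bookkeeping implied by these settings can be checked with a few lines of arithmetic. The sketch uses only the numbers quoted above; the variable names are illustrative.

```python
# Settings taken from the text.
N_MODELS = 2            # alphaUpaB and alphaUpaB_S1
N_REPLICAS = 3          # independent runs per model
RUN_NS = 400            # length of each run in nanoseconds
TIMESTEP_FS = 2         # integration timestep in femtoseconds
FRAME_INTERVAL_PS = 100 # trajectory output interval in picoseconds

# Cumulative sampling across all runs, in microseconds.
total_us = N_MODELS * N_REPLICAS * RUN_NS / 1000

# Integration steps and saved frames for one 400 ns run.
steps_per_run = RUN_NS * 1_000_000 // TIMESTEP_FS
frames_per_run = RUN_NS * 1000 // FRAME_INTERVAL_PS
total_frames = frames_per_run * N_MODELS * N_REPLICAS

print(f"cumulative time: {total_us} us")
print(f"steps per run: {steps_per_run:,}; frames per run: {frames_per_run:,}")
print(f"frames across all runs: {total_frames:,}")
```

Each run therefore corresponds to 200 million integration steps and 4000 saved frames, and the six runs together give the 2.4 µs total.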

Docking of the GAG into UpaB utilised a model based on the NMR structure of unsulfated chondroitin (PDB: 2KQO). Docking of the GAG model was performed against the UpaB structure using Autodock Vina68. A 52 × 24 × 46 Å search space was set up to cover the UpaB surface and groove. Standard chemical bond torsions were applied to the GAG molecule and UpaB was kept rigid apart from the following residues: N189, Q197, K256 and K343. Docking conformations were ranked by the predicted free energy of binding (kcal mol−1).

Ethics statement

All animal experimentation was conducted in accordance with the guidelines of the National Health and Medical Research Council. The Griffith University Animal Ethics Committee approved this study (MSC/01/18/AEC). The use of human blood plasma from patients was approved by the institutional review board of the Princess Alexandra Hospital (2008/264). The need for patient informed consent was waived, as the primary purpose for the collection of these samples was for other diagnostic procedures, and all patient information was de-identified. The collection of human blood from control subjects was approved by the institutional review board of Griffith University (MSC/18/10/HREC). Informed consent was obtained from all control subjects.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.