Bacterial cell envelope protein (CEP) complexes mediate a range of processes, including membrane assembly, antibiotic resistance and metabolic coordination. However, only limited characterization of relevant macromolecules has been reported to date. Here we present a proteomic survey of 1,347 CEPs encompassing 90% of the inner- and outer-membrane and periplasmic proteins of Escherichia coli. After extraction with non-denaturing detergents, we affinity-purified 785 endogenously tagged CEPs and identified stably associated polypeptides by precision mass spectrometry. The resulting high-quality physical interaction network, comprising 77% of targeted CEPs, revealed many previously uncharacterized heteromeric complexes. We found that the secretion of autotransporters requires the translocation and assembly module TamB to nucleate proper folding from periplasm to cell surface through a cooperative mechanism involving the β-barrel assembly machinery. We also establish that an ABC transporter of unknown function, YadH, together with the Mla system preserves outer membrane lipid asymmetry. This E. coli CEP 'interactome' provides insights into the functional landscape governing CE systems essential to bacterial growth, metabolism and drug resistance.
As in eukaryotes such as yeast1, an extensive, but still largely unknown, network of physical associations mediates the diverse and vital roles of CEPs in bacterial membrane biology, pathogenesis and antibiotic resistance2,3. Knowledge of cell envelope protein–protein interactions (cePPIs) and their physical and functional organization into multi-protein complexes (MPCs) therefore provides critical insight into the basic systems governing prokaryotic physiology, which in turn is essential to combat infections by antibiotic-resistant pathogens. Yet while previous small-scale investigations in E. coli have identified certain MPCs required for peptidoglycan synthesis, chemotaxis, lipopolysaccharide (LPS) export and envelope assembly2,4,5, experimental data concerning the binding partners and functions of most bacterial CEPs are still lacking6.
To address this important gap, we performed large-scale affinity-tagging for hundreds of engineered E. coli CE strains bearing C-terminal sequential peptide affinity (SPA) chromosomal tags to maintain cognate promoter regulation and physiologic expression levels7,8. We then purified these SPA-tagged fusion proteins under non-denaturing conditions and analyzed them by mass spectrometry (MS) to elucidate the compositions of native CE-associated macromolecular assemblies. The resulting high-confidence physical interaction map substantially extends knowledge of the CE interactome, and complements our previous characterization of soluble cytoplasmic protein complexes in E. coli7,8.
Generating a high-quality CEP interaction data set
Most CEPs are not readily solubilized using the standard affinity purification and MS (AP/MS) workflow we developed to isolate soluble protein complexes7,8. To facilitate extraction of hydrophobic assemblies1,6, we used non-denaturing detergent solubilization procedures adapted from our previous study of yeast MPCs1. After evaluating 14 diverse detergents (1% w/v or v/v; Supplementary Fig. 1a–c), we selected C12E8 (octaethylene glycol monododecyl ether), DDM (n-dodecyl-D-maltoside) and Triton X-100 (octylphenol ethoxylate) as the most effective non-ionic detergents for recovery of diverse endogenous SPA-tagged MPCs from late log-phase E. coli (DY330) cultures grown in rich media (Supplementary Note 1).
Based on current subcellular annotations available in public databases, as well as widely used prediction programs, literature curation, and other information sources (Supplementary Note 1), we compiled a list of 1,617 annotated or predicted CEPs. Of these, we attempted to create chromosomal SPA-fusions for 1,347 proteins (Supplementary Table 1), whereas the remaining were not targeted due to questionable 'legacy' annotations. In total, we tagged over half (58%, 785 of 1,347) of the CEPs that were confirmed by immunoblotting or PCR; the remaining (562) were not detected under standard culture conditions, but a sizeable (41%; 229) fraction were subsequently identified as co-purifying interactors with the successful bait purifications.
From our proteome-wide analysis, in total, we successfully purified 785 CEPs (Fig. 1a and Supplementary Table 1), many of which (88%, 693 CEPs) have been associated with drug sensitivity and the intrinsic antibiotic 'resistome'9 (Supplementary Fig. 1d). These include 549 inner membrane proteins (IMPs), 45 outer membrane proteins (OMPs), and 129 periplasmic proteins (PEPs) involved in diverse processes encompassing transport, CE assembly and biofilm formation; 45 lipoproteins (6 IM, 39 OM); 15 extracellular space proteins; and two putative membrane-related proteins6,10.
To evaluate data quality and reproducibility, we affinity-purified 290 SPA-tagged CEPs using one to three different detergents, with multiple biological replicates per detergent, for a total of 1,751 AP/MS analyses (Supplementary Note 1). The remaining 495 baits were subjected to AP/MS once, in the presence of one to three different detergents, for an additional 1,103 analyses (Supplementary Fig. 2a). In total, 2,854 affinity-purified CEP samples were analyzed by Orbitrap MS. CEP recovery (coverage) was consistently high regardless of protein topology (68% IMPs, 79% OMPs, 86% PEPs; Supplementary Fig. 2b), but was influenced by protein abundance (Supplementary Fig. 2c) and the number of predicted transmembrane helices or β-strands (Supplementary Fig. 2d). We successfully recovered most (84%, 147 of 175 tagged) CEPs harboring annotated Pfam domains (Supplementary Fig. 2e), including ATP-binding cassette (ABC), 4Fe-4S ferredoxin and polypeptide transport associated (POTRA) domain-containing proteins (e.g., BamA), which are under-represented in public PPI databases. The average correlation between replicates (calculated based on protein spectral counts detected by MS) was high (r = 0.7), with low variance (σ = 0.1; Fig. 1b), indicating good overall reproducibility, comparable to previous AP/MS-based analyses of soluble protein interaction networks11,12.
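As an illustration only, the replicate comparison described above can be sketched as a Pearson correlation over protein spectral counts. This is a minimal sketch, not the analysis code used in this study: the function names, the choice to count a protein missing from one replicate as zero, and the spectral counts themselves are assumptions made for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of spectral counts."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 entries")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def replicate_correlation(rep_a, rep_b):
    """Correlate two replicates given as dicts of protein -> spectral count.

    Assumption: a protein absent from one replicate is counted as 0 there.
    """
    proteins = sorted(set(rep_a) | set(rep_b))
    counts_a = [rep_a.get(p, 0) for p in proteins]
    counts_b = [rep_b.get(p, 0) for p in proteins]
    return pearson_r(counts_a, counts_b)


# Hypothetical spectral counts for two replicate purifications of one bait
# (illustrative values only, not data from this study).
rep1 = {"BamA": 40, "BamB": 22, "BamD": 15, "SecY": 3}
rep2 = {"BamA": 36, "BamB": 25, "BamD": 12, "OmpA": 2}
r = replicate_correlation(rep1, rep2)
print(round(r, 2))
```

Averaging such values over all replicate pairs, and taking their spread, would give summary statistics of the kind quoted above.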
Interaction scoring and benchmarking
To derive a high-fidelity CE network, we mapped the MS/MS spectra obtained from each SPA-tagged bait purification to reference E. coli protein sequences using both the SEQUEST/ STATQUEST8 and alternate MS-GF+13 algorithms to boost peptide identification (Supplementary Note 1). The PPI data from each bait CEP were then filtered using a protein identification probability score of 90% or greater. The cut-off was chosen because the majority of known interactions curated in EcoCyc (a comprehensive database resource for E. coli) passed this threshold (Supplementary Fig. 3a), eliminating promiscuous non-specific interactors and retaining reliable interactions with adequate specificity. These filtered associations were then subjected to an integrative statistical framework to computationally assign interaction confidence scores (Supplementary Note 1), and we benchmarked the results against reference cePPIs curated in EcoCyc. We maximized coverage and accuracy using Bayesian inference14 to normalize and integrate the output of CompPASS (Comparative Proteomic Analysis Software Suite)15 and HGSCore (HyperGeometric Spectral Counts score)16 algorithms into a single unified probabilistic log-likelihood score (LLS) for each putatively interacting bait–prey and prey–prey protein pair (Fig. 1c and Supplementary Fig. 3b).
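The exact scoring procedure is given in Supplementary Note 1 and is not reproduced here. The sketch below assumes a commonly used log-likelihood formulation, LLS = ln[(P(L|E)/P(¬L|E)) / (P(L)/P(¬L))], in which the odds of a true interaction among pairs supported by evidence E are compared with the prior odds in the reference set, and assumes that per-algorithm scores are combined by summation. All names and reference pairs are hypothetical.

```python
from math import log


def pair(a, b):
    """Order-independent key for a protein pair."""
    return frozenset((a, b))


def log_likelihood_score(evidence_pairs, positives, negatives):
    """LLS for one evidence set, benchmarked against reference pairs.

    evidence_pairs: pairs supported by the evidence (e.g. one score bin)
    positives / negatives: gold-standard interacting / non-interacting pairs
    """
    pos_hits = len(evidence_pairs & positives)
    neg_hits = len(evidence_pairs & negatives)
    if pos_hits == 0 or neg_hits == 0:
        raise ValueError("evidence must overlap both reference sets")
    posterior_odds = pos_hits / neg_hits
    prior_odds = len(positives) / len(negatives)
    return log(posterior_odds / prior_odds)


def integrate_scores(lls_by_source):
    """Sum per-source LLS values for one pair (assumes independent sources)."""
    return sum(lls_by_source.values())


# Hypothetical reference sets and evidence (not data from this study).
positives = {pair("P1", "P2"), pair("P3", "P4")}
negatives = {pair("N%d" % i, "M%d" % i) for i in range(8)}
evidence = {pair("P1", "P2"), pair("P3", "P4"), pair("N0", "M0"), pair("X", "Y")}

lls = log_likelihood_score(evidence, positives, negatives)  # ln((2/1) / (2/8))
total = integrate_scores({"CompPASS": lls, "HGSCore": 1.5})  # 1.5 is made up
print(round(lls, 3), round(total, 3))
```

A positive score means the evidence enriches for reference interactions relative to chance; unlabeled pairs such as ("X", "Y") do not affect the score.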
We retained only associations at or above a stringent integrated score threshold (Σ LLS ≥ 5.27) that faithfully recapitulated known (positive reference) cePPIs according to area-under-the-ROC (receiver operating characteristic) curve analysis (Fig. 1c and Supplementary Table 2). The final probabilistic network consists of 12,801 putative interactions and includes 77% (604) of all targeted E. coli CEPs (Supplementary Table 2), with broad representation across protein size, topology and function. Notably, despite ample anecdotal supporting evidence (Supplementary Table 3 and Supplementary Note 2), most (12,523) of these physical interactions, and one-quarter (217) of the CEPs have not been detected before8,17,18,19 (Fig. 1d). Similar to yeast1, each E. coli CEP had slightly fewer interaction partners by AP/MS than soluble proteins8 (P = 4.2 × 10−10; Supplementary Fig. 3c), possibly due to detergent-mediated disruption of lower abundance or weaker binding partners.
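For readers who wish to reproduce this kind of benchmarking, the following is a minimal sketch of a rank-based area under the ROC curve together with a score cut-off filter. The threshold value is the one quoted above; the function names, scores and labels are hypothetical, and the AUC is computed by the pairwise (Mann-Whitney) definition rather than by whatever software was used for Fig. 1c.

```python
LLS_THRESHOLD = 5.27  # integrated score cut-off quoted in the text


def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win; equivalent to the area under the ROC curve.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def filter_network(scored_pairs, threshold=LLS_THRESHOLD):
    """Keep only pairs whose integrated score meets the threshold."""
    return {k: s for k, s in scored_pairs.items() if s >= threshold}


# Hypothetical integrated scores with reference labels (True = known cePPI).
scores = [9.1, 6.0, 5.3, 4.0, 2.2]
labels = [True, True, False, True, False]
auc = roc_auc(scores, labels)

network = filter_network({("A", "B"): 9.1, ("C", "D"): 5.3, ("E", "F"): 4.0})
print(round(auc, 3), sorted(network))
```

In this toy example five of the six positive-negative comparisons favour the positive pair, and two of the three scored pairs survive the cut-off.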
Multiple criteria support the overall reliability of the network and underlying AP/MS data. We categorized interactions as one of two types (Supplementary Table 2): high-confidence (HC), representing nearly half (46%, or 5,927) of the cePPIs, which were reproducibly detected in multiple replicates, with different detergents, by alternate methods (yeast and/or bacterial two-hybrid assays or biochemical fractionation coupled to MS; see below), or reported in a previous interaction study7,8,19; and medium-confidence (MC; 53%, 6,770), which are predicted to have a functional association by STRING20, GeneMANIA21, or genomic context8, or which co-localize to the same compartment (Fig. 1e).
While the few remaining (<1%, 104) prey–prey associations were among cytoplasmic protein pairs, we also gained meaningful functional insights from these subnetworks into membrane transport metabolons (i.e., physical coupling of transporters and enzymes; see below) and other pathways22 (Fig. 1e and Supplementary Note 1).
When benchmarked against manually curated MPCs (Fig. 1f), both the high and medium confidence cePPIs recapitulated annotated assemblies more faithfully (e.g., average semantic similarity23) than previously published large-scale8,18,19 and small-scale E. coli studies17 (Supplementary Fig. 3d). As expected for functional macromolecules, genes encoding putatively interacting CEPs also had more similar genetic interaction patterns10,24,25 (Supplementary Fig. 3e) and drug sensitivity profiles9 (Supplementary Fig. 3f and Supplementary Table 3) as compared to random gene pairs, indicating enrichment for physiologically relevant associations. We note that one-third (30%, 3,853) of the total number (12,801) of protein pairs derived from the HGSCore algorithm were “prey–prey” associations (i.e., co-purifying non-bait interactors; Fig. 1e), most (3,710) of which are medium confidence and involve at least one CEP. Nevertheless, these interactions are functionally informative; for example, the high and/or medium confidence cePPIs both showed high sensitivity and reasonable specificity (based on fivefold cross-validation) as compared to previously published E. coli PPI data sets8,18,19 (Supplementary Fig. 3g).
To independently verify the global reliability of the AP/MS results, we employed a complementary proteomic approach to detect endogenous (untagged) macromolecules based on co-fractionation during size-exclusion chromatography of detergent-solubilized (0.020% Triton and 0.05% DDM) E. coli extracts (Supplementary Fig. 4a and Supplementary Note 1). As with previous soluble protein complexes26,27, we used quantitative MS to compare the co-elution profiles of proteins across the collected fractions (Supplementary Fig. 4b). From these we computed interaction likelihood metrics (Supplementary Note 1), which we then benchmarked against the EcoCyc reference set (Supplementary Fig. 4c). Co-purifying protein pairs showing high similarity (r = 0.8; Supplementary Fig. 4d) recapitulated an additional 13% (1,678) of the putative candidate cePPIs detected by AP/MS (Supplementary Table 2), providing further evidence of a stable native biophysical association.
We next subjected a subset of cePPIs to further independent experimental evaluation based on binary yeast and/or bacterial two-hybrid assays (Supplementary Note 1). Of the 103 pairs randomly selected from high confidence (81) and medium confidence (22) cePPI categories, we confirmed almost half (44, or 43%; Supplementary Fig. 4e and Supplementary Table 3), which agrees favorably with the validation rates obtained from analogous assays for E. coli soluble PPIs19. Confirmed association pairs include the putative fimbrial adhesion protein YehA with the periplasmic chaperone YehC, representing a putative chaperone-usher assembly pathway required for pilus/fimbrium biogenesis and biofilm formation, and the OM channel CusC and the membrane fusion protein CusB, representing a physical coupling underlying the copper/silver efflux system.
Topological properties of the MPC network
We examined the global topological organization of the cePPI network in comparison to the E. coli cytoplasmic interactome7,8. As seen for soluble protein complexes8, we found that essential CEPs (59, 4.4%), many of which (18) are targeted by antibiotics9 (Supplementary Fig. 5a), had higher connectivity than non-essential CEPs (P = 4.5 × 10−7; Supplementary Fig. 5b), as did evolutionarily conserved CEPs (Supplementary Fig. 5c), particularly peptidoglycan and cell wall proteins, whereas extracellular proteins tended to have the fewest partners (Supplementary Fig. 5d).
Nearly half (46%, 5,939) of the interacting proteins co-localized to the same compartment, while many other PPIs (54%, 6,862; P = 4.2 × 10−2) bridged the Gram-negative CE compartments, of which over three-fourths (79%, 5,451) occurred between CEPs and cytosolic proteins (Supplementary Fig. 5e and Supplementary Table 2). The latter include links between the IM Sec translocon (SecDEY) and cytosolic proteases (ClpAPX) involved in degradation and folding, suggesting bacterial protein homeostasis is coupled to protein translocation28 (Supplementary Note 2). Similarly, interactions between the periplasmic hydrogenase 2 small subunit (HybO) and cytoplasmic fumarate reductase (FrdB) suggest coordination in substrate reduction (such as quinones) during anaerobic respiration29. The β-barrel assembly OMP (BamA) and OM lipoproteins (BamCDE) also co-purified with the IM Sec translocon (SecDFGY), consistent with a primary role in β-barrel translocation, transport and membrane insertion during OMP biogenesis30.
To define complex membership, we partitioned the cePPI network (high + medium confidence) into densely connected regions using a 'core-attachment' clustering algorithm31. In total, we identified 540 MPCs, of which 420 contain at least one CEP and 120 involve only cytosolic proteins (Fig. 2 and Supplementary Table 4). Based on the degree of PPI connectivity, the predicted topologies comprised 54 matrix (i.e., interaction among all components), 68 spoke (i.e., bait connectivity with prey proteins), and 123 socio-affinity (i.e., integration of spoke and matrix) models; the remaining 295 complexes were either binary or involved prey–prey associations. These clusters encompass 246 heterodimers (Supplementary Fig. 6a). Of the 420 CEP-containing complexes, over half (252, 60%) matched known (EcoCyc curated) assemblies (Supplementary Fig. 6b), while one-quarter (103, 25%) were bridged by cePPIs (88; as determined by bacterial or yeast two-hybrid assay; Supplementary Fig. 6c). The remaining 168 CEP complexes have not been reported before (Supplementary Table 4), representing a rich resource for biological discovery.
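The connectivity-based distinction between matrix and spoke topologies can be illustrated with a simplified classifier. This is a sketch under stated assumptions, not the procedure used to generate Supplementary Table 4: a complex is called "matrix" when every pair of members is connected, "spoke" when a single hub connects to all other members and no other edges exist, and "mixed" otherwise (a stand-in for the socio-affinity case). The function name and example complexes are hypothetical.

```python
from itertools import combinations


def classify_topology(members, edges):
    """Classify a complex by its internal connectivity (simplified).

    members: iterable of protein names in the complex
    edges:   iterable of (protein, protein) interactions
    """
    members = set(members)
    # Keep only edges whose two endpoints both lie within the complex.
    edge_set = {frozenset(e) for e in edges
                if len(set(e)) == 2 and set(e) <= members}
    if len(members) < 3:
        return "binary"
    all_pairs = {frozenset(p) for p in combinations(members, 2)}
    if all_pairs <= edge_set:
        return "matrix"  # every member contacts every other member
    for hub in members:
        spokes = {frozenset((hub, m)) for m in members if m != hub}
        if edge_set == spokes:
            return "spoke"  # one bait-like hub, no prey-prey edges
    return "mixed"  # partial connectivity between the two extremes


# Hypothetical complexes used only to exercise the function.
full = classify_topology("ABC", [("A", "B"), ("A", "C"), ("B", "C")])
star = classify_topology("ABC", [("A", "B"), ("A", "C")])
both = classify_topology("ABCD", [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")])
print(full, star, both)
```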
Several of the MPCs represent potential membrane transport metabolons32 encompassing sugar permeases, components of the phosphoenolpyruvate-dependent phosphotransferase system (PTS), and the histidine-containing phosphocarrier energy coupling protein (HPr or PtsH; Supplementary Table 3). HPr and other soluble energy-coupling proteins (e.g., sugar-specific IIA proteins) are known to associate with PTS permeases32, and our data link different transport complexes of the PTS. For instance, fructose PTS permease (FruA) co-purified with enzyme IIC constituents (e.g., GatC, NagE, TreB, MngA) in addition to its cognate energy-coupling factor, FruB33. Although not strictly membrane-bound22, HPr also co-purified with multiple regulatory targets (e.g., Adk, DhaK, PfkB, PykF; Supplementary Note 2). HPr is known to bind pyruvate kinase A (PykA) in the opportunistic pathogen Vibrio vulnificus34, whereas E. coli HPr interacts with an isozyme, pyruvate kinase 1 (PykF). We confirmed the latter association by steady-state kinetics (Supplementary Fig. 7a), and found that PykF is activated upon binding to non-phosphorylated HPr, suggesting that, similar to PykA34 of V. vulnificus, HPr regulates PykF by increasing its affinity for phosphoenolpyruvate.
Other interactions indicative of transporter-based metabolons (Supplementary Fig. 7b and Supplementary Note 2) include the housekeeping dipeptide permease Dpp, which co-purified with heterodimeric ATPases (DppDF), IMPs (DppBC) and the periplasmic peptide-binding protein DppA of the ABC peptide uptake system (Supplementary Fig. 7c), consistent with joint participation in peptide transport35. Likewise, the zinc transporter ZnuA co-purified with the PEP ZinT (YodA), and a zinc-uptake ABC transporter composed of the membrane permease ZnuB and the ATPase ZnuC36 (Supplementary Fig. 7d). It has been proposed that ZinT cooperates with ZnuA in zinc periplasmic uptake36, although its role in zinc homeostasis is not clear. Interaction of ZnuA with the transcriptional regulator MntR points to a possible control mechanism37, and the association of ZinT/ZnuA with the periplasmic receptor AraF (of the arabinose ABC transporter AraFGH; Supplementary Fig. 7d) implies metabolic coupling. The physical link we observed is also consistent with the known role of AraF, which binds L-arabinose in the periplasm and delivers it to the IMP complex AraGH38. Similarly, multiple polycistronic operons, wherein a substrate-producing enzyme is co-expressed with a cognate transporter32, exhibited high physical connectivity (Supplementary Fig. 7e).
MPCs linked to antibiotic susceptibility
Defects in envelope assembly often disrupt the cell permeability barrier, leading to drug hypersensitivity9. We therefore examined the cePPI/MPC network for links to antibiotic susceptibility. A significant enrichment (P ≤ 0.05) for drug hypersensitivity was evident among E. coli strains lacking components of 68 MPCs (Fig. 3a and Supplementary Table 5), indicating a critical role in envelope integrity. These include mutants of core-oligosaccharide assembly factor waa (formerly rfa), which are hypersensitive to numerous antimicrobials9, presenting an attractive target for antibiotic potentiation. Mutations in components of the Tol-Pal envelope (tolBQ, pal) and purine (purDHLMT) systems also confer hypersensitivity to similar inhibitors9, consistent with overlapping cellular roles.
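The statistical test behind the enrichment P value is not specified in this section. One common choice for this kind of question is a one-sided hypergeometric test, sketched below as an assumption rather than as the method used here; the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, successes, draws, observed):
    """One-sided hypergeometric P value, P(X >= observed).

    population: total number of strains screened
    successes:  strains in the population that are drug hypersensitive
    draws:      strains lacking a component of the complex of interest
    observed:   hypersensitive strains among those draws
    """
    if not (0 <= observed <= draws <= population and successes <= population):
        raise ValueError("inconsistent counts")
    upper = min(draws, successes)
    total = comb(population, draws)
    tail = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(observed, upper + 1))
    return tail / total


# Hypothetical example: 20 strains screened, 5 hypersensitive overall;
# a complex with 4 mutant strains, 3 of which are hypersensitive.
p = hypergeom_enrichment_p(population=20, successes=5, draws=4, observed=3)
print(round(p, 4), p <= 0.05)
```

With many complexes tested in parallel, a multiple-testing correction would normally be applied to such P values; whether and how this was done is not stated in this section.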
Because multidrug resistance (MDR) is a major emerging clinical concern39, we assessed the association of the multidrug transporter MdtF (YhiV) with a periplasmic lipoprotein (AcrA) of the AcrAB-TolC efflux pump (Supplementary Table 2). We found that if either mdtF or acrA was expressed alone (in an acrAB mdtEF quadruple mutant), export of an efflux test substrate (ethidium bromide) or antibiotics (levofloxacin, carbenicillin) was impaired. However, when acrA is co-expressed with mdtF, substantial transport of substrate antibiotics was observed (Fig. 3b), to a similar extent as upon co-expression of MdtE and MdtF (which are known to interact), consistent with AcrA and MdtF associating to form a functional drug efflux pump.
Another notable MPC had unexpected connections between a membrane-spanning translocation and assembly module (Tam) with the OMP biogenesis (BAM) machinery (Fig. 4a). TamA and TamB have been implicated in autotransporter secretion40, which is consistent with the interaction observed with an autotransporter processing protein (Ag43), but their precise role in Type V secretion is unclear (Supplementary Note 2). Whereas TamA, an OMP of the Omp85 family and a BamA homolog41, consisting of an N-terminal periplasmic domain composed of three POTRA repeats preceding a 16-stranded C-terminal β-barrel (Supplementary Fig. 8a), is restricted to Proteobacteria and Bacteroidetes41, TamB, an IMP with an N-terminal transmembrane helix, a C-terminal DUF490 domain (crucial for envelope integrity), and a large inter-domain with no assigned function, is widespread among Gram-negative bacteria42. Of note, we found that loss of either tamA or tamB enhanced the slow growth phenotype of E. coli mutants lacking bamA (Supplementary Note 2) but not other bam components of the BAM export system (Fig. 4b), suggesting involvement in BamA-mediated export, the dominant secretion mechanism for Type V autotransporters.
To examine this, we performed agglutination assays (Fig. 4c). Notably, tamB but not tamA mutants showed impaired autotransporter-mediated agglutination (Ag43 (Flu), AidA and TibA (Type Va), YadA (Type Vc)), with sedimentation profiles similar to those of control cells not expressing any autotransporters (Fig. 4d). The translocation of the N-terminal passenger domain of Ag43 to the outer cell surface, assessed by both flow cytometry (Fig. 4e,f) and transmission electron microscopy (Supplementary Fig. 8b), was normal in tamB mutants, but folding was deficient (Fig. 4g; Ag43 shown). This is evident because Ag43 autotransporters are far more sensitive to proteinase K treatment in tamB knockouts (complete digestion after 1 h) as compared to wild-type cells, which is indicative of misfolding of Ag43 in the absence of TamB (Supplementary Note 2).
To exclude a translocation defect, we monitored Ag43 secretion based on thermal release of the processed N-terminal passenger domain, which is proteolytically cleaved (after amino acid D551 by an unidentified protease) and detached by heat denaturation (60 °C)43. As in wild-type cells, the mature N-terminal passenger domain completely disappeared from the cell surface of heat-shocked tamB mutants, as shown by flow cytometry (Fig. 4e,f), while the released αAg43 fragment was detected in the culture supernatant (Supplementary Fig. 8c–e). This leads to a working model (Fig. 4h and Supplementary Note 2) in which BamA-TamA/B function together during TamB-nucleated folding of the mature N-terminal passenger domain, catalyzing β-domain membrane insertion to facilitate translocation of partially folded autotransporters.
Transporter complex confers lipid asymmetry
Another notable MPC consisted of an IM ABC transporter of unknown function, YadH, in association with four (of six) subunits of a broadly conserved Mla-family phospholipid transporter (composed of the OM lipoprotein MlaA, a substrate-binding IMP MlaD, the cytosolic STAS domain protein MlaB, and a nucleotide-binding IMP MlaF) that maintains OM lipid asymmetry44 (Fig. 5a). Consistent with joint participation in phospholipid transport, we verified these interactions by bacterial two-hybrid and repeat AP/MS experiments (Fig. 5b). Functionally, we found that yadH null cells were as hypersensitive as mlaA mutants to reagents like SDS/EDTA that destabilize LPS, causing phospholipid buildup in the OM44 (Fig. 5c). This hyperpermeability phenotype was not enhanced in yadH mla double mutants (Fig. 5c), consistent with a cooperative function, and was suppressed by overexpressing the phospholipase pldA, which targets surface-exposed phospholipid. As with mla44 mutants, yadH null cells showed no change in sensitivity to lipophilic drugs (e.g., erythromycin; data not shown), or in the LPS and OMP levels (Fig. 5d), or phosphatidylethanolamine and phosphatidylglycerol phospholipid species (data not shown). Rather, the cardiolipin profile of yadH mlaD double mutants was altered (Fig. 5e), consistent with the notion that Mla targets a relatively minor population of OM phospholipid molecules44.
Cardiolipin is thought to be synthesized at the IM and transported to the OM by the IMP PbgA (YejM) in E. coli and other Gram-negative bacteria45. Consistent with this, PbgA reproducibly co-purified with MlaE in replicate AP/MS analyses, but failed to pass our stringent scoring threshold. Further experiments are required to delineate how YadH mediates selective phospholipid trafficking in conjunction with Mla/PbgA, but our data suggest a role in the maintenance of OM lipid asymmetry via selective phospholipid substrate flow (Fig. 5f).
MPC conservation and diversification
We examined the evolutionary significance of the MPCs based on their patterns of conservation, as predicted by the presence or absence of any given MPC subunit ortholog encoded in a genome46. Using InParanoid47 to map orthologous relationships across bacteria (Supplementary Table 5), we observed that essential CEPs tended to be more highly conserved (P = 4.2 × 10−2; Supplementary Fig. 9a), even among the most highly divergent species (e.g., Mycoplasma, Rickettsia). Most (387, 68%) MPCs, however, were restricted to γ-proteobacteria (Supplementary Fig. 9a,b), which include important human pathogens (e.g., Pseudomonas, Vibrio). The degree of cePPI retention, both within and between complexes, was inversely proportional to phylogenetic distance (Fig. 2 and Supplementary Fig. 9a). For example, orthologs of subunits of the E. coli sulfonate-sulfur utilization (SsuEAD, Sbp) and sulfate/thiosulfate transport (Sbp-CysP) complexes were absent in Firmicutes and Planctobacteria (Supplementary Fig. 9b), suggesting trait loss occurred in a common ancestor.
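A presence/absence analysis of this kind can be sketched as the fraction of a complex's subunits that have an ortholog in each genome. This is an illustration only: it assumes ortholog assignments have already been made (for example, by InParanoid), and the function name, genome labels and ortholog sets are hypothetical.

```python
def conservation_profile(subunits, orthologs_by_genome):
    """Fraction of complex subunits with an ortholog in each genome.

    subunits:            list of E. coli subunit names for one complex
    orthologs_by_genome: dict of genome -> set of E. coli proteins that
                         have an ortholog in that genome
    """
    if not subunits:
        raise ValueError("complex has no subunits")
    return {
        genome: sum(1 for s in subunits if s in present) / len(subunits)
        for genome, present in orthologs_by_genome.items()
    }


# Subunit names are taken from the text; the genomes and their ortholog
# sets are invented for the example.
complex_subunits = ["SsuE", "SsuA", "SsuD", "Sbp"]
orthologs = {
    "genome_A": {"SsuE", "SsuA", "SsuD", "Sbp", "BamA"},
    "genome_B": {"SsuA", "Sbp"},
    "genome_C": set(),
}
profile = conservation_profile(complex_subunits, orthologs)
print(profile)
```

A complex would then be called conserved in a lineage when this fraction exceeds a chosen cut-off; the cut-off used in this study is not stated in this section.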
Strikingly, among the 28 putative paralogous groups (39 CEPs) in the MPC network (Supplementary Fig. 10a and Supplementary Table 5), only one-quarter (7, 25%) had interactions in common. Paralogs with shared binding partners tended to have significantly (P = 1.6 × 10−5) more interaction partners (Supplementary Fig. 10b), but the degree of overlap did not reflect protein sequence similarity or abundance (Supplementary Fig. 10c,d). For example, despite >80% sequence identity between the paralogs AcrB and AcrF, the former had more interactions in common with the RND-family pump MdtF, a more divergent paralog, than with the more closely related AcrF (Supplementary Fig. 10e). On the other hand, methyl-accepting chemotaxis receptors with >60% sequence similarity (Tar, Tsr, Trg, Tap) had few overlapping PPIs, possibly reflecting environmental adaptations48. Interactions of Trg with IMPs involved in flagellar export and motor/switch function (FliGLMN, FlhA) or basal-body assembly (FlgGI, FliF), and of Tap with the cytosolic phosphatase CheZ, which controls phosphorylated CheY binding to FliM, agree with their known roles in chemotaxis2. There were also few interactions in common between the OM capsule polysaccharide exporters Wza and GfcE, or between the functionally related IM tyrosine protein kinases (Wzc and Etk), both with >70% sequence identity and critical roles in bacterial pathogenesis. Yet Etk co-purified with GfcE (functionally similar to Wzc/Wza), its tyrosine phosphatase partner Etp, and the predicted OM lipoprotein GfcD (Supplementary Fig. 10e), consistent with Etk's known roles in polysaccharide secretion49 and capsule production50.
The physical interaction map presented here offers insights into the global functional organization of the E. coli CE, serving as the first biochemical blueprint of the interconnected modular architecture of this unique, conserved, and adaptive bacterial system. While we cannot exclude loss of important transient associations and disruption of hydrophobic interactions upon detergent solubilization, our systematic AP/MS approach captured thousands of known and unexpected cePPIs, revealing MPCs involved in processes central to microbial CE establishment, maintenance and integrity, and environmental responsiveness. Notably, our probabilistic network implicated previously unannotated CEPs in MP export, folding and assembly, OM biogenesis, and multidrug-resistance, examples of which we subsequently validated by independent experimentation (e.g., YadH in Mla-mediated OM lipid asymmetry maintenance). We also showed for the first time that autotransporters do not spontaneously self-assemble, but instead require chaperone assistance to nucleate folding in the periplasm before delivery to the bacterial cell surface through a cooperative mechanism involving the BAM and TAM machineries. While neither TamA nor TamB is required for passenger domain secretion, we established that BamA is the primary translocator in E. coli, while TamB is involved in folding and maturation.
Many additional mechanistic insights can be gleaned from the network, including CEP cross-talk critical for envelope formation and homeostasis, functional constraints underlying the evolution of Gram-negative bacteria, and potentially promising targets to potentiate existing antibiotics to combat the scourge of drug resistance. We also identified regulatory links by which HPr of the E. coli phosphotransferase system controls carbohydrate and energy metabolism, and unexpected associations between multiple RND-type MDR pumps, with wide-ranging relevance to bacterial pathophysiology. By making all of our data accessible via a dedicated web portal (http://ecoli.med.utoronto.ca/membrane) and public repositories (BioGRID, PRIDE accession number: PXD006247), we expect this resource will spur further exploration of other CEP complexes and interactions underlying critical CE-linked processes in other prokaryotes.
Topological classification of the E. coli proteome and target CEP selection.
We have made substantial efforts to systematically re-annotate the entire collection of 4,436 putative ORFs of E. coli K-12 on the basis of the most current (Supplementary Table 1), carefully filtered, subcellular annotations available in respected public databases and a hand-picked set of well-regarded and widely used prediction programs (i.e., α-helical inner membrane proteins by Phobius51, β-barrel outer membrane proteins (OMPs) by BOCTOPUS2 (ref. 52), and signal peptides intended for extracellular and periplasmic proteins by SignalP53), effectively supplanting questionable 'legacy' annotations.
While we do see the merit in using uniform annotations, the use of universally standardized GO terms to define topology is challenging, as the GO hierarchical classifications are very general (e.g., membrane - GO: 0016020; intracellular - GO: 0005622; cell - GO: 0005623), with distinctions such as the outer or inner lipoprotein classes largely missing. Therefore, we have compiled subcellular location information for E. coli K-12 proteins from six different sources: (i) our previously published E. coli subcellular annotation survey6, (ii) STEPdb ver 2.0 (ref. 54), (iii) EchoLOCATION55, (iv) EcoCyc, (v) UniProt, and (vi) GO (cellular component terms). For consistency, we list nine different subcellular compartments as broadly as possible: (i) IM, inner membrane; (ii) OM, outer membrane; (iii) PE, periplasm; (iv) LPI, IM lipoprotein; (v) LPO, OM lipoprotein; (vi) EC, extracellular; (vii) PG, peptidoglycan; (viii) MR, membrane-related; and (ix) CY, cytosol.
For each E. coli protein, a primary subcellular annotation was assigned based on maximum agreement (votes/counts) between these sources (referred to as “consensus” in Supplementary Table 1). In the absence of a consensus, a judgment call was made based on additional evidence such as protein function and domain information from EcoCyc as well as a manual literature search. Also, if most of the votes assigned a protein to the “no subcellular localization” category, we assigned the protein to the next most likely subcellular compartment. Nevertheless, we also provide access to both the original terms provided by a given source as well as the consensus, allowing readers to view and sort annotations according to their preference.
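The voting scheme described above can be sketched as follows. This is a minimal illustration, assuming one vote per source and the compartment codes listed earlier; the label used for "no subcellular localization", the function name, and the choice to return None for ties (standing in for the manual judgment call) are assumptions made for the example.

```python
from collections import Counter

NO_LOCALIZATION = "none"  # hypothetical label for "no subcellular localization"


def consensus_compartment(votes):
    """Pick the compartment with the most votes across annotation sources.

    Votes for "no subcellular localization" are set aside, so that when they
    dominate, the next most supported compartment is chosen. Returns None on
    a tie, which here stands in for manual curation.
    """
    counts = Counter(v for v in votes if v != NO_LOCALIZATION)
    if not counts:
        return NO_LOCALIZATION  # no source proposes any compartment
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no clear consensus: needs a manual judgment call
    return ranked[0][0]


# Hypothetical votes from six sources for three proteins.
clear = consensus_compartment(["IM", "IM", "PE", "none", "IM", "none"])
fallback = consensus_compartment(["none", "none", "none", "OM", "OM", "PE"])
tied = consensus_compartment(["IM", "OM"])
print(clear, fallback, tied)
```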
With respect to the use of the term “membrane-associated,” we note that this assignment is used in EchoLOCATION, while UniProt refers to “membrane-related”54. In the “Compilation localization terms” sheet of the revised Supplementary Table 1, we have compiled annotations from the six aforementioned sources annotated as “membrane anchored,” “membrane associated,” “membrane,” “cell membrane,” “lipid anchor,” “cytoplasmic or periplasmic side of the peripheral membrane protein,” “single- and multi-pass membrane protein,” “intrinsic and extrinsic component of membrane,” “anchored component of membrane,” “cell envelope,” and “intracellular membrane-bounded organelle.” This is vital to derive meaningful information and trends regarding the membrane-related portion of the interaction network (Supplementary Figs. 2b and 5e). Nevertheless, the readers can track original annotations and curation sources.
Using this new scheme, we provide annotations documenting 1,347 putative CEPs, and during the manuscript preparation, we found an additional 270 proteins assigned to various CE categories (Supplementary Table 1) that were not purified by AP/MS. However, a significant fraction (141/270) of these proteins were detected as prey in other bait purifications, and 99 form part of our final PPI network.
Strains and plasmids.
All bacterial strains, plasmids, detergents, and DNA oligonucleotides used in this study are listed in Supplementary Table 5. Primers for SPA (sequential peptide affinity)-tagging were ordered in a 96-well plate format, and transformation, integration, and western-blot verification were performed essentially as previously described7,8,56. While we attempted to tag 1,347 CEPs, precise recombination and perfect in-frame fusion of the SPA-tag to the natural C terminus of the target protein were confirmed for only 785 CEPs by PCR or immunoblotting using the anti-FLAG antibody that recognizes the FLAG epitope located on SPA-tagged fusion strains7,8,56. Detailed detergent selection for CEP target purification and AP/MS procedures are described in Supplementary Note 1. Donor query mutant strain construction and conjugation procedures were as described57. The F− recipient single-gene-deletion strains (marked with the kanamycin resistance cassette) were from the Keio knockout library58 or the selection cassette was integrated into the 3′-UTR of essential genes to perturb transcript abundance57. To construct Hfr knockout mutants (marked with the chloramphenicol resistance cassette) with non-essential genes, we completely replaced the coding sequence of each open reading frame using λ-Red recombination57.
All strains used for assaying different pump substrates were derived from E. coli K-12 BW-RI, which constitutively produces the tet repressor TetR and the lac repressor LacI. The acrA and mdtF single mutants were purchased from the E. coli Genetic Stock Center. The mdtEF double mutant was made using the method of Datsenko and Wanner59. To create the acrAB mdtEF quadruple mutant (i.e., 'Quad KO ΔacrAB mdtEF' in Fig. 3b), the acrAB mutation (the substitution of a kanamycin resistance gene (Kanr) for acrAB) from the previously made acrAB mutant TG1 acrAB60 was transferred to the mdtEF background by P1 transduction.
To construct the mdtEF overexpression strain (i.e., 'Quad KO mdtEF+' in Fig. 3b), the synthetic tet promoter (Ptet, a strong promoter that is repressed by TetR and the repression is released by a tetracycline or its analogs61) was substituted for the native mdtEF promoter as previously described62. Briefly, using the plasmid pKDT-Ptet as a template, the region ('Kanr:rrnBT:Ptet') containing a Kan resistance gene (Kanr), and an rrnB terminator (rrnBT) was amplified using the oligos Ptet.EF-P1 and Ptet.EF-P2 (Supplementary Table 5). Ptet.EF-P1 is composed of a 20-bp region at the 3′ end that is complementary to the FRT-flanking Kanr sequence, and a 50-bp region at the 5′ end that is homologous to the upstream region of the mdtEF promoter. Ptet.EF-P2 is composed of a 20-bp region at the 3′ end that is complementary to Ptet, and a 50-bp region at the 5′ end that is homologous to the first 50 bp of the mdtE gene. The PCR products (i.e., Kanr:rrnBT:Ptet) were purified and then electroporated into BW-RI acrAB cells, expressing the λ Red proteins encoded by plasmid pKD46. The cells were applied onto Luria–Bertani (LB) + Kan plates. Kanr colonies were screened for the replacement of native mdtEF promoter (−326 to −1 relative to the start codon of mdtE) by the 'Kanr:rrnBT:Ptet' fragment by colony PCR, followed by sequencing. The resultant strain was named 'Quad KO mdtEF+', in which acrAB is deleted and mdtEF is under the control of Ptet.
Similarly, to make the overexpression strain 'acrA' (i.e., 'Quad KO acrA+'), the 'Kanr:rrnBT:Ptet' fragment was substituted for the native acrAB promoter (−78 to −1 relative to the start codon of acrA) in the acrB and mdtEF deletion background, whereas for the overexpression strain 'mdtF' in Quad KO (i.e., 'Quad KO mdtF+'), the 'Kanr:rrnBT:Ptet' fragment was substituted for the native mdtEF promoter, mdtE and the mdtE/mdtF intergenic region in the acrAB deletion. Likewise, the 'acrA mdtF' overexpression strain (i.e., 'Quad KO acrA+mdtF+') was created by transferring the 'Kanr:rrnBT:Ptet-mdtF' from strain 'Quad KO mdtF+' into strain 'Quad KO acrA+' by P1 transduction. All constructs were confirmed by PCR and DNA sequencing.
Enriched CEP complexes sensitive to drugs.
To determine drug hypersensitivity among the components of CEP complexes, we compiled the phenotypic fitness score |≤ ± 2| from large-scale phenomics screens9 (focusing on 69 drugs that specifically target the cell wall, DNA synthesis, ribosome biogenesis and other cellular processes) for single-gene-deletion mutants corresponding to the CEP-encoding genes of the predicted putative complexes (Fig. 3a and Supplementary Table 5). Gene set enrichment analysis (GSEA)63 was then employed with ranked lists containing drug-sensitive (fitness score ≤ −2) phenotypes from phenomics screens9 using the predicted CEP complexes as gene sets. This analysis resulted in significant enrichment (P ≤ 0.05) for drug hypersensitivity among the components of 69 of the 420 distinct CEP complexes tested.
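To illustrate the enrichment idea, the sketch below computes an unweighted, Kolmogorov–Smirnov-style running-sum enrichment score for one complex against a ranked gene list, with a simple permutation P value. It is a simplified stand-in and not the GSEA implementation used in the study: the function names, the gene identifiers, the assumption that genes are ranked from most to least drug sensitive, and the permutation scheme (random gene sets of equal size) are all illustrative choices.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (maximum deviation from zero)."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up on members of the complex, step down on all other genes.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """One-sided P value from random gene sets of the same size (assumed scheme)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = sum(g in gene_set for g in ranked_genes)
    as_extreme = sum(
        enrichment_score(ranked_genes, set(rng.sample(ranked_genes, size))) >= observed
        for _ in range(n_perm)
    )
    return (as_extreme + 1) / (n_perm + 1)


# Hypothetical ranking, most drug-sensitive deletion mutant first.
ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]
print(enrichment_score(ranking, {"g1", "g2"}))
```

A complex whose subunits cluster at the sensitive end of the ranking yields a score near 1, whereas subunits scattered through the list give a score near 0.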
Phylogenetic tree construction and determination of orthologous sequences.
The phylogenetic kinship of 20 different or diverse classes (Fig. 2 and Supplementary Fig. 9a,b) of bacteria was determined by retrieving the complete bacterial proteomes from the NCBI database. Proteomes were compared using CVTree64 and T-REX65, and the corresponding phylogenetic trees were visualized. Specifically, this approach accounts for all protein sequences of an organism's proteome and counts the number of overlapping k-tuples to form a raw compositional vector with 20^k components. Random background frequencies were subtracted by predicting the number of k-tuples from (k−1)- and (k−2)-mers through a simple Markovian model. By setting these 'normalized' frequencies in a fixed order, a normalized 20^k-dimensional vector for each organism was obtained. Finally, a correlation between two species was determined by calculating the projection of one normalized vector on another, from which a normalized distance between organisms could be gauged, allowing the construction of a phylogenetic tree.
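The composition-vector idea can be sketched in a few lines. The code below is a toy illustration under stated assumptions, not the CVTree program itself: it stores vectors sparsely as dictionaries, takes the alphabet from the letters present in the input, predicts each k-tuple frequency as f(prefix)·f(suffix)/f(overlap), uses the cosine of the angle between vectors as the correlation C, and converts it to a distance as (1 − C)/2. The function names and the example sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_freqs(proteins, k):
    """Relative frequencies of overlapping k-tuples across all proteins."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def composition_vector(proteins, k=3):
    """Sparse vector of (observed - predicted) / predicted k-tuple frequencies."""
    if k < 3:
        raise ValueError("k must be at least 3 for the Markov prediction")
    f_k = kmer_freqs(proteins, k)
    f_k1 = kmer_freqs(proteins, k - 1)
    f_k2 = kmer_freqs(proteins, k - 2)
    alphabet = {c for seq in proteins for c in seq}

    vec = {}
    for prefix in f_k1:
        for c in alphabet:
            suffix = prefix[1:] + c
            if suffix not in f_k1:
                continue  # predicted frequency is zero, component stays zero
            predicted = f_k1[prefix] * f_k1[suffix] / f_k2[prefix[1:]]
            observed = f_k.get(prefix + c, 0.0)
            vec[prefix + c] = (observed - predicted) / predicted
    return vec


def cv_distance(vec_a, vec_b):
    """Distance (1 - C) / 2, where C is the cosine between two vectors."""
    dot = sum(v * vec_b.get(key, 0.0) for key, v in vec_a.items())
    norm_a = sqrt(sum(v * v for v in vec_a.values()))
    norm_b = sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.5  # correlation undefined; treated as uncorrelated here
    return (1.0 - dot / (norm_a * norm_b)) / 2.0


# Two tiny hypothetical "proteomes".
species_a = composition_vector(["MKTAYIAKQRQISFVKSHFSRQ", "MKLVTAYIAGGSHFSR"])
species_b = composition_vector(["MSDNELLKKAQQ", "MADNELKRAQQW"])
print(cv_distance(species_a, species_b))
```

A matrix of such pairwise distances is what a distance-based tree-building method would take as input.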
Using all-versus-all BLASTP searches with the InParanoid script66 of the respective proteomes for a given pair of bacterial species, protein sequence pairs appearing as reciprocally best-scoring BLAST hits between each species were selected as central orthologous pairs. Proteins of both species that showed such an elevated degree of homology were clustered around these central pairs, forming orthologous groups. The quality of the clustering was further assessed by a standard bootstrap procedure. We only considered the central orthologous sequence pair with a confidence level of 100% as the genuine orthologous relationship.
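The seed step of this procedure, selecting reciprocally best-scoring pairs, can be illustrated as below. This sketch covers only that step and assumes the BLASTP bit scores have already been parsed into dictionaries keyed by (query, subject); it does not reproduce InParanoid's clustering of additional homologs or its bootstrap confidence values. The function names and protein identifiers are hypothetical.

```python
def best_hits(scores):
    """Map each query to its highest-scoring subject.

    scores: dict of (query_id, subject_id) -> bit score.
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical bit scores between species A and species B proteins.
a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50, ("a2", "b2"): 120, ("a3", "b1"): 90}
b_vs_a = {("b1", "a1"): 210, ("b2", "a2"): 115, ("b2", "a1"): 40}
print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here `a3` is excluded because its best hit `b1` pairs more strongly with `a1`, which is the behavior the reciprocal criterion is meant to enforce.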
Ethidium bromide (EtBr) accumulation.
E. coli BW-RI strains were grown overnight in 5 mL LB media from frozen stocks. About 150 μL of overnight-grown cultures were inoculated into 5 mL of fresh LB containing 0.2% glucose and 150 ng of chlorotetracycline for the tetracycline promoter (Ptet). Strains carrying the Ptet promoter were grown with and without 150 ng chlorotetracycline for ∼1 h at 37 °C until an OD600 of ∼0.5 was reached. Cells were washed once in 1× PBS (phosphate buffered saline) and normalized to an OD600 of 0.15 for the EtBr retention assay.
The assay buffer consisted of 5 μg/mL EtBr, 0.4% glucose and 1× PBS in a volume of 1 mL. The cell suspensions were incubated for 30 min at room temperature to equilibrate EtBr that was retained inside the E. coli cells. Roughly 250 μL of the 1 mL suspension was aliquoted into three wells (analytical triplicates) on 96-well clear flat-bottom plates. The fluorescence readings were measured on a Tecan plate reader at 535 nm excitation and 612 nm emission. The cell suspension absorbance (600 nm) was also recorded to normalize the fluorescence signals.
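As a worked illustration of the normalization step, the sketch below divides each well's fluorescence reading by its OD600 and summarizes the analytical triplicate. Dividing by absorbance is an assumption about how the normalization was done (the text states only that absorbance was recorded for this purpose), and the function name and readings are hypothetical.

```python
from statistics import mean, stdev


def normalized_fluorescence(fluorescence, od600):
    """Return (mean, standard deviation) of fluorescence / OD600 across wells."""
    if len(fluorescence) != len(od600):
        raise ValueError("need one OD600 reading per fluorescence reading")
    if any(od <= 0 for od in od600):
        raise ValueError("OD600 readings must be positive")

    ratios = [f / od for f, od in zip(fluorescence, od600)]
    spread = stdev(ratios) if len(ratios) > 1 else 0.0
    return mean(ratios), spread


# Hypothetical triplicate: fluorescence (535/612 nm) and OD600 per well.
avg, sd = normalized_fluorescence([300.0, 330.0, 360.0], [0.15, 0.15, 0.15])
print(avg, sd)
```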
Drug resistance assay.
The assay was performed in triplicate on 1.5% LB agar plates with carbenicillin or levofloxacin (100 mg/ml; 100 μl) added in a central well. E. coli (BW-RI) wild-type or mutant strains were streaked from the central well to the plate periphery. As the drug diffuses out into the plate, a gradient of compound is created, killing more sensitive cells at lower concentrations. After incubating plates at 37 °C for at least 24 h, the relative sensitivities of the mutant strains were expressed in terms of the distance of the diffused killing zones (in mm) relative to wild type.
Phenotypic and immunoblot analyses.
For assessing SDS-EDTA sensitivity, growth curve analyses were performed by inoculating the exponentially growing overnight cultures (OD600 ≈ 0.3–0.4) of the wild-type and single or double mutant strains into 96-well microtiter plates containing 100 μl of LB medium in the presence or absence of SDS (0.5%) and EDTA (1.1 mM). Plates were incubated with shaking at 32 °C, and the absorbance of the culture at 600 nm was measured every 15 min for 24 h using an automated Tecan Sunrise microplate reader.
For assessing the suppression of OM permeability defects of strains expressing multicopy pldA, respective mutant strains were transformed with pCA24N::pldA-His (marked with chloramphenicol) retrieved from ASKA overexpression library, following the protocol as previously described67. Sensitivity of the gene deletion mutant strains defective for OM was assayed using 6 mm filter paper disks (Sensi-Disc Susceptibility Test Discs; BD) impregnated with 5 μg of erythromycin by disk diffusion assay, essentially as described44.
Immunoblots for the LPS or OM marker protein levels in the wild-type and mutant strains were performed according to standard methods described10 with antisera raised against LptD, OmpA and MBP epitopes (a gift from T. Silhavy and C. Whitfield; see also Life Sciences Reporting Summary).
Phospholipid isolation and MS analyses.
Total phospholipids from the wild-type and mutant strains were extracted using the procedure essentially as previously described68. Extracted lipids were analyzed by direct-infusion electrospray ionization, at a flow rate of 5 μl/min, in an IonMax API ion source coupled to an LTQ Orbitrap Elite MS. The instrument was calibrated with Pierce ESI Negative Ion Calibration solution. Measurements were carried out in negative ion mode with the following experimental conditions: the ion-spray potential was 3.5 kV, all other voltages were optimized for maximum molecular ion transmission, and the transfer capillary temperature was set at 275 °C. Full-scan high-resolution mass spectra (R = 60,000 at m/z 400) were collected at a selected m/z range of 200 to 2,000 with a maximum injection time of 200 ms. Xcalibur software was used for data acquisition and processing.
Specifically, we looked at the mass-to-charge ratio (m/z) of the product ions for phosphatidylethanolamine (PE) at 688.5 (corresponding to C15:0/cyC17:0 and C16:0/C16:1), 702.5 (C16:0/cyC17:0) and 714.5 (C18:1/C16:1 and cyC17:0/cyC17:0), as well as for phosphatidylglycerol at m/z 733.5 (corresponding to C16:0/cyC17:0) and 761 (C16:0/cyC19:0) using the product ion information described for phospholipid species in E. coli69. Similarly, as per the literature evidence70,71,72, we looked for the product ions of cardiolipin species at m/z 1360–1540.
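For readers who wish to screen a peak list against these values, a minimal lookup is sketched below. The target m/z values and acyl-chain assignments are taken from the text; the matching tolerance of ±0.5 m/z, the function name `assign_peak`, and the example peaks are assumptions made for illustration only.

```python
# Target ions listed in the text (negative ion mode).
TARGETS = {
    688.5: "PE (C15:0/cyC17:0; C16:0/C16:1)",
    702.5: "PE (C16:0/cyC17:0)",
    714.5: "PE (C18:1/C16:1; cyC17:0/cyC17:0)",
    733.5: "PG (C16:0/cyC17:0)",
    761.0: "PG (C16:0/cyC19:0)",
}
CARDIOLIPIN_RANGE = (1360.0, 1540.0)  # m/z window for cardiolipin species


def assign_peak(mz, tol=0.5):
    """Return a lipid label for an observed m/z, or None if nothing matches.

    tol is an assumed matching tolerance in m/z units.
    """
    for target, label in TARGETS.items():
        if abs(mz - target) <= tol:
            return label
    low, high = CARDIOLIPIN_RANGE
    if low <= mz <= high:
        return "cardiolipin (CL) range"
    return None


# Hypothetical observed peaks.
for peak in (688.52, 761.1, 1400.9, 500.0):
    print(peak, assign_peak(peak))
```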
Cloning of TamA, TamB and Ag43.
The coding sequence of the predicted mature TamA protein from E. coli BL21 (EcTamA22-577) was cloned between the BamHI and XhoI endonuclease sites of a modified pET26 expression plasmid, downstream of a PelB signal sequence, a 6× histidine tag and a thrombin cleavage site. Full-length and truncated TamB sequences from E. coli BL21 (EcTamB1-1259; EcTamB32-1259; EcTamB32-923) were cloned between the NdeI and XhoI endonuclease sites of the pET28 expression vector, downstream of a 6× histidine tag and a thrombin cleavage site. The Ag43 premature autotransporter was cloned from E. coli K-12 using a restriction-free cloning method73 into a modified pHERD20T vector, downstream of a constitutive promoter.
To monitor the secretion of autotransporters, we designed an autotransporter-dependent agglutination assay in wild-type, tamA and tamB mutant strains of E. coli K-12 BW25113 (ref. 58). Targeted gene deletion mutants were PCR verified to confirm the replacement of the genes with a kanamycin resistance cassette. Bacterial cells were transformed with the pHERD20T-Ag43, pAgH74, pTngH75 and pBAD18-YadA76 expression vectors, and individual colonies were grown in LB media supplemented with appropriate antibiotics until the optical density at 600 nm reached between 2.5 and 2.8 arbitrary units (AU). The culture was measured at OD600 using 100 μl aliquots taken 1 cm below the surface of the liquid culture at specific time intervals. All experiments were performed in triplicate originating from three separate colonies.
Surface translocation of the Ag43 passenger domain was evaluated by flow cytometry. E. coli K-12 BW25113 wild-type, tamA and tamB mutant strains transformed with an empty vector or the pHERD20T-Ag43, pAgH, pTngH and pBAD18-YadA plasmids were grown in 5 ml LB media until the OD600 reached 3.0 AU. Cells were washed, pelleted, and resuspended in 1 ml PBS (1× PBS pH 7.0; 0.9 mM MgCl2). Cells were washed a second time and resuspended in 200 μl PBS supplemented with mouse anti-His-tag primary antibodies (Pierce, Rockford, IL) at 1:200 dilution for 1 h at 4 °C.
Bacterial cells from the indicated samples were washed two times, with the last resuspension performed in 200 μl PBS containing anti-mouse IgG secondary antibody conjugated to phycoerythrin (PE) (Pierce, Rockford, IL) at 1:200 dilution, followed by 1 h incubation at 4 °C. The bacterial cells were then washed two times, and cells were fixed with 2% formaldehyde for 30 min at 25 °C. The fixed cells were analyzed with a FACSCalibur cytometer using 488 nm and 585 nm as the excitation and emission wavelengths, respectively. The threshold trigger was set on side scatter to eliminate background noise and only the intact cells were analyzed.
Proteinase-K shaving assay.
Wild-type and tamB mutant E. coli cultures transformed with the pHERD20T His-Ag43 expression vector were resuspended in PBS buffer supplemented with 1 mM MgCl2 and 0.8 M urea, and incubated at 37 °C in the presence or absence of 0.1 mg/ml proteinase-K. About 500 μL aliquots were removed at various time points and proteinase-K was inactivated by the addition of 5 mM phenylmethanesulfonylfluoride. To ensure that all clipped peptides from surface protein substrates were released into the extracellular medium, the aliquots were incubated at 55 °C for 3 min. The cells were then centrifuged at 1,500g for 3 min and the supernatant was analyzed by immunoblotting using anti-His-tag (HIS.H8) primary antibody to detect N-terminal His-Ag43.
For cell surface immunogold labeling, overnight bacterial cultures grown in LB media were harvested, washed with 1× PBS and then blocked with PBS containing 1% bovine serum albumin (BSA) for 1 h at room temperature. The cells were then incubated with mouse anti-His-tag (HIS.H8) primary antibody at a 1:20 dilution in PBS containing 1% BSA at 4 °C overnight. Cells were washed thrice with PBS and then blocked with PBS containing 1% BSA for 1 h at room temperature. The cells were labeled with 10 nm colloidal gold-conjugated goat anti-mouse IgG (H+L chain) at 1:20 dilution in PBS containing 1% BSA for 1 h at room temperature. Labeled cells were washed thrice with PBS followed by distilled water. Cells resuspended in distilled water were adsorbed onto Formvar/carbon-coated nickel grids and stained with 1% uranyl acetate for 20 s. Air-dried grids were examined under a Hitachi H-7000 transmission electron microscope at 100 kV.
The query mutant donor strains, tamA and tamB, in Hfr Cavalli (marked with chloramphenicol resistance) were screened against an arrayed select set of F− recipient non-essential and hypomorphic alleles of essential genes using our established conjugation-based synthetic genetic array screening procedure57.
Source code availability.
Code for scoring PPIs is provided in Supplementary Note 1.
Life Sciences Reporting Summary.
Further information on experimental design is available in the Life Sciences Reporting Summary.
The PPI data from AP/MS experiments have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository (data set identifier: PXD006247). The high-confidence PPIs can be accessed via our publicly available web portal (http://ecoli.med.utoronto.ca/membrane) or from the open access BioGRID database dedicated to archiving PPIs for humans or model organisms. All other data are available from corresponding authors upon request.
We thank T. Silhavy, C. Whitfield, J.P. Côté, M. Mourez and P. Dersch for generously providing plasmids and reagents. We are also grateful to R. Reithmeier and M. Jessulat (Babu Lab) for advice. This work was supported by grants from the Canadian Institutes of Health Research to J.F.G., J.P. and A.E. (MOP-106449), J.P., A.E. and M.B. (PJT-148831) and T.F.M. (MOP-115182); the Ontario Ministry of Education and Innovation to T.F.M. and A.E.; the National Institutes of Health to P.U., M.H.S. and A.E. (GM109895); the Natural Sciences and Engineering Research Council of Canada to T.F.M. (DG-40197), A.Go. (DG-315735), J.P. (DG-06664), and M.B. (DG-20234); and the Canada Foundation for Innovation to T.F.M., M.B. and A.E.
Integrated supplementary information
Classification of the E. coli proteome and target cell envelope protein selection for AP/MS screens
Co-purifying protein pairs compiled from the reference EcoCyc database and from this study
The physiological relevance and quality of PPIs by anecdotal supporting evidence, drug sensitivity profiles, and two-hybrid screens
Putative (or novel) CEP complexes and their subunits identified by the core-attachment based clustering algorithm are indicated with their respective topology model
Antibiotic susceptibility, evolutionary conservation, and paralogous analyses on CEPs or complexes, and bacterial strains/plasmids used in this study