The S-component fold: a link between bacterial transporters and receptors

The processes of nutrient uptake and signal sensing are crucial for microbial survival and adaptation. Membrane-embedded proteins involved in these functions (transporters and receptors) are commonly regarded as unrelated in terms of sequence, structure, mechanism of action and evolutionary history. Here, we analyze the protein structural universe using recently developed artificial intelligence-based structure prediction tools, and find an unexpected link between prominent groups of microbial transporters and receptors. The so-called S-components of Energy-Coupling Factor (ECF) transporters, and the membrane domains of sensor histidine kinases of the 5TMR cluster share a structural fold. The discovery of their relatedness manifests a widespread case of prokaryotic “transceptors” (related proteins with transport or receptor function), showcases how artificial intelligence-based structure predictions reveal uncharted evolutionary connections between proteins, and provides new avenues for engineering transport and signaling functions in bacteria.

Reviewers' comments:

Reviewer #1 (Remarks to the Author):

In this manuscript by Partipilo, M. and Slotboom, DJ. the authors utilize recent developments in AI-based protein structure prediction (AlphaFold) and fold recognition (Foldseek) to identify a structural link between the S-component of ECF transporters and the transmembrane region of 5TMR sensor histidine kinases. The manuscript illustrates a beautiful example of how recent developments in AI-based protein structure prediction and bioinformatics resources can provide new and novel insights into common protein structural folds, which could not have been achieved with methods such as protein sequence comparison alone. The finding of the S-component fold being widespread throughout both ECF transporters and histidine kinases in various bacterial phyla highlights the importance of this fold in mediating both signal (nutrient) perception and acquisition.
Overall, this manuscript is very well written and easy to understand. The figures are also very well made and easy to comprehend. I believe that the authors' results will be of immense interest to a wide variety of structural biologists, microbiologists, and bioinformaticians. The authors have done an excellent job illustrating the power of AI-based protein structure prediction in facilitating novel biological discoveries with widespread implications across bacterial species.

10.1038/s41598-020-62337-9. PMID: 32214184

X.X. Zhang, J.C. Gauntlett, D.G. Oldenburg, G.M. Cook, P.B. Rainey. Role of the transporter-like sensor kinase CbrA in histidine uptake and signal transduction. J. Bacteriol., 197 (2015), pp. 2867-2878.

We agree with the reviewer that the definition of transceptor used in this manuscript requires clarification, as several cases of small domains or full-length proteins bridging signal sensing and membrane transport are already known in prokaryotes. We have now clarified this in our Introduction (lines 48-52): 'Here, we use a narrow definition of transceptors as integral membrane proteins that structurally resemble transporters, yet function as receptors. This definition excludes cases in which a dedicated soluble domain acts as a bridge between transport proteins and signal transduction systems (e.g. STAC domain alone or incorporated into CbrA from Pseudomonas putida). 10,11'

2. Along similar lines as point 1 above, membrane protein complexes consisting of an ABC transporter and histidine kinase working in tandem to transport molecules and mediate signaling are well documented in bacteria. For instance, the PstSCAB-PhoUR system in E. coli is well known to couple phosphate transport and signaling in a single membrane protein complex... Similarly, the BceAB-RS system of Bacillus subtilis consists of an ABC transporter interfacing with a histidine kinase to mediate signaling in response to antimicrobial peptides...

George NL, Orlando BJ. Architecture of a complete Bce-type antimicrobial peptide resistance module. Nat Commun. 2023 Jul 1;14(1):3896. doi: 10.1038/s41467-023-39678-w. PMID: 37393310

While such systems are distinct from the S-component containing proteins identified in the current manuscript as they are not confined to a single polypeptide, one could envision them being considered "transceptor complexes". It may be worthwhile to mention proteins like CbrA, and complexes like PstSCAB-PhoUR and BceAB-RS in the discussion to highlight that combined signaling/transport has been documented in bacterial proteins (complexes) other than the S-component proteins identified in the current manuscript.
We thank the reviewer for these remarks. In order not to lose the focus on AI predictions and S-components in our Discussion section, we decided to mention well-established examples of interactions between transporters and receptors in our Introduction instead.

BceAB-S is used to exemplify the formation of complexes occurring between transporters and receptors, together with DctA-DcuS from Escherichia coli (10.1111/j.1365-2958.2012.08143.x). This is stated as follows (lines 41-46): 'Although mediated by distinct protein components, transport and signal sensing can synergistically take place within larger complexes, as demonstrated by the C4-dicarboxylate transporter DctA that forms a complex with the fumarate sensor DcuS in Escherichia coli, 5 and the structurally characterized BceAB-S module from Bacillus subtilis, 6 where an ABC transporter (BceAB) interfaces a histidine kinase (BceS) to respond to antimicrobial peptides.'

3. In figure 2A many of the proteins identified to contain an S-component fold do not have annotated function (grey boxes). I am curious, do these proteins without functional annotation tend to have AlphaFold predicted structures that may hint at their function? How many of these non-annotated proteins are also likely to be S-component containing histidine kinases?
Most of the proteins resulting from the Foldseek search are without functional annotation in the related UniProt entry. They are likely S-components without functional annotation. This is now explicitly stated in the Results section (lines 144-146): 'Finally, entries not functionally annotated represented 20 and 60% of the hits, depending on the specific search, which in most cases are likely to be S-components (although not annotated as such in protein databases).' Only a marginal number of the non-annotated protein hits (mostly close homologues to each other) are 5TMR-SHKs. Below we attach a table of representative results from the SWISSPROT Foldseek search (with a larger number of hits) supporting this conclusion, in which repetitions among different search-results are omitted.

4. In the paragraph starting on line 208 the authors describe features of 5TMR SHK helix-0, and suggest that this helix may mediate dimerization. Given the power of AlphaFold-multimer in predicting protein-protein interactions, it seems likely that the authors could rapidly obtain a dimerized model of one of the S-component containing 5TMR-HKs. Such a model may lend credence to their hypothesis that H0 is important in mediating dimerization.
We thank the reviewer for the suggestion. We have now modified Figure 4A, including an AlphaFold multimeric prediction of YpdA highlighting the AI-predicted interactions at the level of helix 0. We modified the text accordingly (lines 232-235): 'AlphaFold predictions for the homodimeric conformation of 5TMR-SHKs highlight the close proximity - and probable interaction - of the conserved polar side chains (around 4 Å), corroborating the hypothesis of the formation of an intramembranous salt bridge between two protomers (Figure 4A).'
A dedicated paragraph describing the procedure behind the AlphaFold-prediction generated via ColabFold is available in the Material and Methods section (lines 476-481).
5. Similarly to point 4 above, in the final model proposed in the discussion (line 370) and figure 6 the authors mention how substrate binding to the S-component domain of SHKs may promote their dimerization and initiation of the phosphorelay cascade. Alternatively, one could envision that the SHKs exist predominantly as dimers in the membrane even in the absence of substrate, and substrate binding to the S-component domain would trigger conformational changes that propagate throughout the histidine kinase to initiate signaling. In this reviewer's opinion, the alternate scenario with pre-dimerized SHK may be more consistent with current models of SHK structure/function. However, I acknowledge that more experimental evidence with S-component containing SHKs will be required to delineate between these two hypothetical models.
We thank the reviewer for raising the point about the pre-existing dimeric state, regardless of substrate binding, and have now modified the text (lines 246-251 and 384-398):

'While in ECF transporters, the binding of substrate leads to association with the ECF module, we speculate that in SHKs, conformational changes upon substrate binding may either cause dimerization of monomeric receptors, or a specific reorganization of pre-existing interactions in the dimeric receptor complex, which subsequently leads to the transmission of the signal from the membrane to the soluble domains.'

and

'Rather than toppling and associating with the ECF module, potential conformational changes at the membrane domain interface induced by the substrate recognition may culminate into an orchestrated series of intramolecular rearrangements of the cytosolic domains, and eventually enable the downstream phosphorelay cascade.'
Thus, the cartoon in figure 6 has been modified accordingly.
6. line 119: "the best-scoring matches were homologues S-components" This wording is a little odd. Maybe "homologues" should be "homologous"?
We thank the reviewer for noticing the odd wording and changed 'homologues S-components' into 'homologous S-components'.
Reviewer #2 (Remarks to the Author):

The manuscript submitted by Drs Partipilo and Slotboom represents a new bioinformatics analysis of a unique family of microbial transporter proteins that has attracted the attention of many microbiologists, biochemists and structural biologists during the last decade. All ECF transporters contain a substrate-specific S-component and many of them share the energy-coupling components termed EcfAA'T, which provide the essential ATP-dependent energy for a unique uptake mechanism. The tertiary structures of many S-components have been solved during recent years and the authors explored their structural homologs among both existing tertiary structures (from within the PDB Databank) and also from the recently combined new databases containing the AlphaFold model predicted protein structures. Using bioinformatics tools to search for structurally similar proteins that cannot be identified using protein primary structure similarity search tools (such as BLAST), the authors were able to identify (i) a high structural similarity between various S-components that are specific to different substrates and share only limited protein sequence similarity, and (ii) a significant structural similarity between S-components and proteins representing families of sensor histidine kinases (their transmembrane domains) and the MreD family of poorly characterized proteins involved in the shaping of the bacterial cell. This is a very interesting and important finding that sheds light on a possible evolutionary origin of these protein families. The authors further investigated the phylogenetic distribution and domain compositions of the studied and identified protein families and provided a few hypotheses on fold specialization of sensor kinases and ECF transporters. The study is well planned, the bioinformatics methods used are solid and the manuscript is well written and I found it easy to follow. Congratulations to the authors for this excellent example of practical use of AlphaFold generated structures for protein evolutionary studies.
We thank the reviewer for the flattering words about our overall work, from the methods used and the clarity of the writing to the potential scientific output.
Reviewer #3 (Remarks to the Author):

This is a well-written paper that provides support for an interesting hypothesis about evolution, structure, and function of important protein families in bacteria. The strength of this manuscript is in its clarity and interesting results. Weaknesses of this manuscript are the lack of any experimental data. The authors could also more clearly state the level of confidence in their approach. They cite papers related to the strengths/weaknesses of Alphafold analyses, such as citation #47, but don't elaborate much on why this work is cited.
We appreciate the reviewer's description of the strengths of our manuscript and agree with the comments on the lack of experimental data. We now explicitly state that experimental work is required in future (lines 355-356): "While it is important that the predicted structural relatedness between 5TMR-SHKs and S-components will be tested experimentally in future work 53,…"

Nonetheless, the main aim of our work was to provide a new structural AI-driven approach for the development of biological hypotheses that would otherwise be impossible using the conventional sequence similarity-based tools. This work alone required a full manuscript, and we feel that our analyses have served this purpose, with the hope of inspiring scientists from diverse backgrounds to look for hitherto unknown connections in other protein families, and subsequently test them experimentally.
In the absence of experimental evidence, we are not currently able to absolutely prove or disprove the hypotheses, and we strongly rely on AlphaFold predictions that have limitations described in detail in reference 47 (now ref. 53). We now provide a clearer explanation of these limitations, while stating in more detail the level of confidence in our approach in the Discussion section (lines 356-366): 'several indicators show that the confidence of the prediction is high: i) with both AI-predicted and experimentally determined structures of S-components used as queries structural homology between S-components and 5TMR-SHKs was found (Fig. 2A), ii) the confidence (TM) scores obtained for the hits were high, in almost all cases above 0.5 (Table S2), iii) the 5TMR-SHK hits were interspersed in the ranking among hits of validated S-components, iv) fold recognition from the sequence of the membrane domain of the 5TMR-SHKs always found crystallographic structures of S-components (Tables S3-S4), v) well conserved amino acids in the pocket for substrate recognition in 5TMR-SHKs (Fig. 4B) correspond to amino acids involved in substrate binding in S-[...]

Fig. 1 - Since the figure is already in color, change EcfA and EcfA' to a different color (shades of yellow, perhaps). I realize that they are quite different, but the cell membrane is also light grey, so it would improve the figure if the EcfA proteins were not grey.
We understand the concern of the reviewer about too many elements in grey, and thus decided to change the color of the membrane cartoon to avoid confusion. Consistently, we did the same with figure 6.

Line 132: awkward sentence structure. Perhaps change to: most likely because the sole function of MreD is determining cell shape in rod-shaped bacteria
To improve the sentence structure, we rephrased it into 'because the sole function of MreD is tightly coupled to determining cell architecture in rod-shaped bacteria', taking care to avoid bold statements on the precise function of MreD in the complex process of shape determination of rod bacteria.

Line 200: The AlphaFold structures of representative proteins with the S-component fold from ECF transporters, SHKs and MreDs were used for structural multi-alignments and build a phylogenetic tree (Figure S1 and Table S5). Change this to: were used for structural multi-alignments and to build a phylogenetic tree (Figure S1 and Table S5).
We rephrased line 200 as suggested.

Also - I don't think that the trees need to be shown, even as a supplemental figure. As built and shown, they are not very useful.
The reviewer's point here is understandable, since we (authors) carefully considered before submitting the article whether it would be appropriate to show these results in the manuscript. Our aim with Figure S1 was to highlight the limitations of a structural alignment built on entries that are extremely different in sequence and function, regardless of the predictive or experimental nature of the protein structures. We believe it is very important to understand whether this shared fold between transport proteins, receptors and scaffold proteins (proposed function of MreD) had a common origin, or is the result of independent evolutionary events in the fold landscape of membrane proteins. While at present we have not been able to answer this intriguing question, even including distinct analyses with only the membrane domain or full-length 5TMR-SHKs, we hope that our efforts represented in Figure S1 can be the starting point for further research. We prefer to keep the figure, but leave it to the editor to decide.
Fig 3B, 4 and 6 - can this figure be changed to make it simpler for people who can't see red or green? Using brown, purple, and orange shades in place of red and green will make the figure more accessible.
We agree, and to make our figures more accessible, we modified figures 4 and 6.

Fig. 3 could be a table - why not just provide the numbers used to make these figures? It is simple for the reader to make a figure if they wish.
We believe that this is a matter of taste. We prefer figures over tables. Furthermore, the numbers are continuously being updated with new InterPro releases (on average there are four new releases every year), and would soon become partially inaccurate, while the pie charts and bar diagrams will by and large remain unchanged. The bar graph in Figure 3A and the pie graph in Figure 3B make readers immediately aware of the relationships between S-components, 5TMR-receptors and MreD proteins in terms of taxonomic distribution, regardless of what the absolute numbers are. For this reason, we believe the choice of visual representation of the data is more suitable to the test of time and to the constant update of database annotations, without compromising the take-home message we want to convey to the readership.
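The point that relative distributions are less sensitive to database growth than absolute counts can be illustrated with a minimal sketch. The phylum labels and species counts below are hypothetical placeholders chosen for illustration, not values taken from InterPro or from the manuscript, and the helper names `proportions` and `max_shift` are likewise assumptions.

```python
def proportions(counts):
    """Convert absolute counts per category into fractions of the total."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {name: n / total for name, n in counts.items()}


def max_shift(old, new):
    """Largest absolute change in any category's fraction between two releases."""
    p_old, p_new = proportions(old), proportions(new)
    names = set(p_old) | set(p_new)
    return max(abs(p_old.get(n, 0.0) - p_new.get(n, 0.0)) for n in names)


# Hypothetical species counts for two database releases (illustrative only):
# every count grows by 10% between the two releases.
release_old = {"phylum_A": 600, "phylum_B": 300, "phylum_C": 100}
release_new = {"phylum_A": 660, "phylum_B": 330, "phylum_C": 110}

print("old fractions:", proportions(release_old))
print("new fractions:", proportions(release_new))
print("largest shift in any fraction:", max_shift(release_old, release_new))
```

In this toy case the absolute numbers all change while the fractions stay essentially the same. Real releases need not grow uniformly, so the shift would have to be checked against actual counts.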
For information we added the tables here:

We thank the reviewer for the kind words. We indeed meant substitution per site, and modified it accordingly.
The response seems convincing and the proposed changes, in my opinion, suitably address R3's concerns.Of course, the authors have not added experimental work but I agree with them that this seems to go beyond the goal of this work, which was to demonstrate the use of AI based prediction for identifying conserved folds in distantly related proteins.

Fig 6 is nicely put together and would be useful for teaching and developing further hypotheses about these proteins.

Fig S1. This line in the legend doesn't make sense: The tree scale indicates the number of amino acid substitutions for site. Do you mean per site?



Table 1
The occurrence among different species of the proteins with the shared fold.Data was obtained from the respective PFAM entries available in the 'Taxonomy' section of the InterPro database https://www.ebi.ac.uk/interpro/entry/pfam/ (PF12822 for S-components, PF04093 for MreD proteins and PF07694 for 5TMR-containing SHKs).

Table 2
The distribution of S-components, SHKs and MreDs among bacterial phyla.The analysis on the number of species refers to the release InterPro 96.0.