Unanticipated functional diversity among the TatA-type components of the Tat protein translocase

Twin-arginine translocation (Tat) systems transport folded proteins that harbor a conserved arginine pair in their signal peptides. They assemble from hexahelical TatC-type and single-spanning TatA-type proteins. Many Tat systems comprise two functionally diverse, TatA-type proteins, denominated TatA and TatB. Some bacteria in addition express TatE, which thus far has been characterized as a functional surrogate of TatA. For the Tat system of Escherichia coli we demonstrate here that different from TatA but rather like TatB, TatE contacts a Tat signal peptide independently of the proton-motive force and restricts the premature processing of a Tat signal peptide. Furthermore, TatE embarks at the transmembrane helix five of TatC where it becomes so closely spaced to TatB that both proteins can be covalently linked by a zero-space cross-linker. Our results suggest that in addition to TatB and TatC, TatE is a further component of the Tat substrate receptor complex. Consistent with TatE being an autonomous TatAB-type protein, a bioinformatics analysis revealed a relatively broad distribution of the tatE gene in bacterial phyla and highlighted unique protein sequence features of TatE orthologs.

Bacteria, archaea, and plant chloroplasts have the capability to transport precursor proteins in a folded state across membranes. Precursor proteins that qualify for this mode of transport are primarily distinguished by an SRRxFLK sequence motif in the N-terminal part of their signal sequences. This consensus motif, of which the double arginine is largely invariant, is recognized by so-called Tat translocases (Tat stands for twin-arginine translocation). A second determinant for the specific recognition of signal peptides by Tat translocases is the hydrophobicity of their core region 1,2 .
Tat translocases do not pre-exist in the membrane as stable protein complexes but rather assemble on demand 3,4 from TatA-type and TatC-type membrane proteins. The TatC subunits of Tat translocases possess six transmembrane helices 5,6 . They associate with a varying number of homologous TatA-type proteins, whose N-terminal structure is characterized by a single transmembrane helix followed by an amphipathic domain 7-10 . Our model organism, the Gram-negative bacterium Escherichia coli, expresses three TatA-type proteins, TatA, TatB and TatE. Judging by their degree of homology, TatA and TatB evolved early from a common ancestor whilst TatE emerged from TatA by a late gene duplication event.
TatC functions as the primary docking site for the Tat signal peptide at the Tat translocase 11,12 directly recognizing the RR-consensus motif 13 . TatC provides binding sites for TatA and TatB 12,14-16 . In concert with TatB, TatC forms a dome-shaped intramembrane binding cavity allowing the hairpin-like insertion of the Tat signal peptide 14,17 .
Despite their structural homology, TatA and TatB have distinct, non-interchangeable functions during Tat-dependent translocation. By interacting with two different sites on TatC, TatB links neighboring TatC monomers thereby forming circular, hetero-multimeric substrate receptor complexes 14,15 . TatB interacts with the Tat signal peptide downstream of the RR-consensus motif in a proton-motive force independent manner 11,14,18 . In addition, it functions as a major binding platform for the folded mature domain of Tat substrates 19 .
TatA is more abundant than all the other Tat subunits, although the actual stoichiometry is still debated 20-22 . Depending on the proton-motive force, TatA associates with the TatBC complex 23 and the signal peptide 14,24 . Through its homo-oligomerization 25,26 it facilitates the transmembrane translocation of the substrate in a still elusive manner. TatA competes with TatB for binding to TatC 14-16 .
The function of the smallest TatA-type protein of E. coli, TatE, remains to be elucidated. In vivo studies demonstrated that TatE can maintain Tat transport in the absence of a functional TatA 3,27,28 , suggesting that TatE is a functional paralog of TatA. This, however, would be insufficient to explain why TatE has persisted during evolution as an individual isoform. Previously we demonstrated that TatE is a regular constituent of the Tat translocase in E. coli 28 . We found TatE to interact with TatA, TatB, and TatC and to localize to active Tat translocases in vivo. Here we show that TatE exhibits distinct properties rendering it a functional hybrid between TatA and TatB. Using a bioinformatic approach we demonstrate that tatE genes are more abundant among the bacterial kingdom than anticipated, further emphasizing an individual relevance of this Tat subunit.

Results and Discussion
TatE and TatB of E. coli share functional properties. We previously demonstrated that TatE displays the properties of a constitutive component of the E. coli Tat translocase, as it localizes to functional Tat translocases in living cells and interacts individually with TatA, TatB, and TatC 28 . However, whereas TatA 11,24,29 , TatB 14,18,30 , and TatC 11,13,18 have all been shown to come into contact with the RR-signal sequence of Tat substrate proteins, direct interactions of TatE with substrate have not yet been demonstrated. We therefore equipped the model Tat substrate TorA-MalE335 with the photo cross-linker p-benzoyl-phenylalanine (Bpa). In TorA-MalE335, the RR-signal peptide of the natural E. coli Tat substrate TorA (trimethylamine oxide reductase) is fused to a C-terminally truncated version of the periplasmic maltose-binding protein MalE 12,17 . Figure 1A highlights the four positions within the TorA signal peptide that were individually replaced by Bpa during cell-free transcription/translation, using an amber stop codon-based approach. The in vitro synthesized Bpa variants of TorA-MalE335 (pTMal) were incubated with inside-out inner membrane vesicles, which had been prepared from a derivative of E. coli strain DADE overexpressing various combinations of the plasmid-encoded TatABCE proteins in a ΔtatABCE background. When cross-linking was initiated by irradiation with UV-light, the previously described 11,14,18,31 site-specific interactions with TatC and TatB were obtained. Thus, Bpa located at position F14 within the RR-consensus motif of the TorA signal peptide cross-links to TatC (Fig. 1B, lane 4, blue star), whereas Bpa incorporated into the hydrophobic core and the c-region of the TorA signal sequence (cf. Fig. 1A) yields adducts to TatB and TatA (Fig. 1B, lanes 10, 16, 22, green and red stars, respectively).
When in this experimental setup membrane vesicles containing TatEBC instead of TatABC were employed (lanes 6,12,18,24), cross-linking of the TorA signal sequence to TatB and TatC persisted (blue and green stars), whereas the TatA cross-links (red stars) were no longer obtained. Instead, lower molecular mass adducts (orange stars) appeared that by size correspond to a cross-link between TorA-MalE335 and TatE. This was confirmed by immuno-precipitation using antibodies directed towards TatE (Fig. 1C, lane 7). When vesicles contained both TatE and TatA, the signal sequence of TorA-MalE335 was found cross-linked to both Tat subunits (Fig. 1D, lane 6). These findings indicate that in the assembled Tat translocase, TatE locates close to the hydrophobic core and c-region of an RR-signal peptide exactly like TatA and TatB do. No competition between TatE and TatA for interacting with the Tat substrate was observed under these experimental conditions.
A characteristic feature of the cross-links between RR-signal peptides and TatA is their sensitivity towards dissipation of the H+-motive force 11 . This is illustrated in Fig. 1E, where the UV-dependent adduct between the TorA-MalE335 precursor and TatA (red star) disappears upon addition of the protonophore carbonyl cyanide m-chlorophenyl-hydrazone (CCCP), whereas the TatB adducts (green stars) persist in the presence of CCCP (lanes 5 and 6). In contrast to TatA and exactly like TatB, TatE was found cross-linked to TorA-MalE335 regardless of whether CCCP was present or not (lanes 8 and 9, orange stars). The results presented in Fig. 1 therefore reveal a property of TatE that would not be expected if TatE were a functional homologue solely of TatA. Interaction with a Tat signal sequence independently of the H+-motive force is rather a typical feature of TatB.
We previously reported that in the absence of TatA and TatB, TatC by itself enables RR-signal sequences of Tat substrates to insert into the cytoplasmic membrane of E. coli. Insertion was shown to proceed to the point that RR-signal sequences are recognized by signal peptidase and prematurely cleaved without the actual Tat substrates being translocated 17 . In this scenario, a typical feature of TatB, which is not shared by TatA, is to prevent this TatC-mediated premature cleavage of the signal peptide. This is addressed in Fig. 2A. When the precursor form of TorA-MalE335 (pTMal) is synthesized in vitro in the presence of membrane vesicles lacking all Tat components (ΔTat), it is completely sensitive towards digestion by proteinase K (compare lanes 1 and 2). In the presence of TatABC-containing vesicles, however, pTMal becomes processed by the signal peptidase of the vesicles to the mature form (lane 3, mTMal), which due to transport into the vesicle lumen is now resistant towards proteinase K (lane 4). Only a minor fraction of uncleaved precursor (pTMal) is translocated under these conditions as indicated by protease resistance (lane 4). Note that the slightly smaller size of the protease-treated pTMal (compare lanes 3 and 4) is the result of proteinase K removing a few N-terminal amino acids from the membrane-embedded signal peptide of translocated yet non-processed pTMal 17 .
If in these experimental conditions membrane vesicles are used that contain only TatA and TatC (lane 5), roughly 30% of pTMal is still converted to mTMal (Fig. 2B), similar to what is seen with TatABC vesicles (Fig. 2A, compare lanes 3 and 5). As previously shown, the cleavage of pTMal by TatAC vesicles requires an intact RR-motif, i.e. recognition by TatC, as well as a functional signal peptidase cleavage site 17 . Although caused by signal peptidase, cleavage of pTMal is premature, because the TatAC vesicles do not allow translocation, as demonstrated by the accessibility of the cleaved mTMal to proteinase K (lane 6). In contrast, TatBC vesicles lacking TatA do not allow for the conversion of pTMal to mTMal (Fig. 2A, lane 7; Fig. 2B). A minimal cleavage of pTMal by TatBC vesicles to a product slightly bigger in size than mTMal (lane 7) was also observed with vesicles entirely lacking the Tat translocase (lane 1) and is therefore caused by an unknown protease. Prevention of premature processing by TatBC vesicles is consistent with TatB and TatC concertedly forming an intramembrane binding cavity for the RR-signal peptide. Because TatA is not a primary constituent of this binding cavity, it is not able to prevent the signal peptide from crossing the membrane and being prematurely processed by signal peptidase 14,17 . Membrane vesicles harboring only TatE and TatC, however, showed a significantly reduced premature processing of pTMal, although they were not as inhibitory as TatBC vesicles (Fig. 2A, lane 9; Fig. 2B). Thus, like TatB and unlike TatA, TatE is also able to counteract the TatC-mediated premature cleavage of the Tat substrate, yet with less efficiency than TatB. The data presented in Figs. 1 and 2 collectively suggest that TatE might play a role as part of the TatBC receptor complex for Tat substrates.
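The processing efficiencies compared in Fig. 2B can be read as the share of total signal that ends up in the mature band. The following is a minimal sketch, assuming that this share is computed as mTMal / (pTMal + mTMal) from background-corrected band intensities. The formula, the function name and the example intensities are illustrative assumptions and are not taken from the quantification procedure used in this study.

```python
def processed_fraction(precursor: float, mature: float) -> float:
    """Return the fraction of total band signal found in the mature form.

    Both arguments are background-corrected band intensities in the same
    arbitrary units (assumed inputs, e.g. exported from a phosphorimager).
    """
    if precursor < 0 or mature < 0:
        raise ValueError("band intensities must be non-negative")
    total = precursor + mature
    if total == 0:
        raise ValueError("no signal in either band")
    return mature / total


# Hypothetical intensities (arbitrary units), not measured values.
example = processed_fraction(precursor=700.0, mature=300.0)
print(f"{example:.0%} of the signal is in the mature band")
```

A ratio of this kind makes lanes with different total loading comparable, which is why it is a common choice for gel-based processing assays.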

TatE paralogs show distinct sequence motifs and occur also outside of Enterobacteria. A TatB-like function of TatE would also be consistent with the about 50-fold lower expression level of tatE compared to tatA 20 . Nevertheless, TatE of E. coli shows a much higher sequence identity with TatA than with TatB and was shown to partially compensate the phenotype of a tatA deletion mutant under certain experimental conditions 3,15,27,28 . Given such a seeming bifunctionality and the fact that TatE has been characterized as a constitutive member of the E. coli Tat translocase 28 , TatE of E. coli seems to be a distinct member of the TatA family rather than a dormant surrogate for TatA or TatB. Consistent with this assumption, the N-terminal amino acid sequences of TatA, TatB, and TatE from E. coli reveal individual differences as manifested by a disparate distribution of charged amino acids (Fig. 3A). TatE and TatB each possess two charged amino acid residues in position 3 (N-tail) and position 8 (transmembrane helix), although the latter one is of opposite polarity (Lys in TatE, Glu, however, in TatB). In contrast, TatA of E. coli is uncharged in its N-terminus.
In order to investigate the significance of the two N-terminal charged residues of E. coli TatE, we searched the NCBI Reference Sequence protein database for homologous sequences. 3120 sequences were obtained, of which 121 were annotated as TatE, 99 as SecE, 2041 as TatA, and 859 were annotated differently. The sequences annotated as SecE clearly share characteristics with TatE orthologues, whereas SecE is a structurally entirely different subunit of the functionally unrelated Sec translocase 32 . It is therefore likely that these 99 TatE homologues have erroneously been annotated as SecE in the database used. To avoid ambiguity, however, we excluded them from our analysis.
Among the 121 TatE sequences, 111 are from the Order of Enterobacterales that contains the Family of Enterobacteriaceae, for which almost exclusively TatE paralogs had hitherto been described. The 10  Fig. 3B. Lysine at position 8 was found to be conserved to 99%. The majority (72%) of those TatE sequences display glutamate at position 3 with a preceding Gly. In the remaining 28% of sequences, Glu is found at position 2 and Gly at position 3, resulting in either a Gly-Glu or a Glu-Gly pair at this place. Thus in all enterobacterial TatE sequences, the N-termini are distinguished by an E3xxxxK8 or an E2xxxxxK8 motif.
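The annotation breakdown of the database hits can be checked with simple arithmetic. The sketch below only restates the counts given in the text and derives each category's share of the 3120 hits. The dictionary keys and variable names are chosen here for illustration.

```python
# Annotation counts as reported in the text for the 3120 BLASTP hits.
annotation_counts = {"TatE": 121, "SecE": 99, "TatA": 2041, "other": 859}

# The four categories should account for every retrieved sequence.
total = sum(annotation_counts.values())
if total != 3120:
    raise ValueError(f"categories sum to {total}, expected 3120")

# Share of each annotation category among all hits.
shares = {name: count / total for name, count in annotation_counts.items()}
for name, share in shares.items():
    print(f"{name}: {annotation_counts[name]} sequences ({share:.1%})")
```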
In line with similar functional properties of TatE and TatB, an N-proximal negatively charged residue, as found here to be conserved among the enterobacterial TatE orthologs, was previously shown to be associated with TatB-like functions. The Freudl group isolated mutants in E. coli tatA that phenotypically suppressed the deletion of TatB. Most of the suppressors had gained an aspartate at the otherwise uncharged N-terminus of TatA and thereby the ability to functionally replace TatB 33 .
Out of the 3120 sequences obtained from our database search for homologues of the E. coli TatE, 2041 entries were annotated as TatA (Table 1), which reflects the high sequence identity between both Tat proteins 15 . These TatE homologues are almost entirely of proteobacterial origin (Table 1). When screened for the occurrence of charged N-terminal amino acids, 108 of those 2041 TatA proteins were found to harbor the E3xxxxK8 or E2xxxxxK8 motif, 44 merely a Lys at position 8, whereas 1889 do not carry similar charge patterns in their N-termini. Interestingly, of those bacterial species encoding TatA homologues with the aforementioned charge patterns, 50 possess an additional TatA paralog, of which the N-terminus is not charged. This is exemplified in Fig. 4 for selected species from the Order of Vibrionales, of which both types of TatA paralogs were separately aligned. The one with the higher homology to E. coli TatE and the shorter length (63-78 amino acids) is labeled TatA_1. Seven of them in fact display the E3xxxxK8 motif (boxed), whilst all of them possess the Lys at position 8. The second group of TatA paralogs (denominated TatA_2) lacks any N-terminal charges and consistently contains the Gln8 of E. coli TatA. These findings demonstrate that the occurrence of TatE-type orthologs harboring distinct N-terminal charged amino acids is not limited to enterobacteria but also encompasses other Gamma-proteobacteria. Furthermore, their co-existence in the genomes with TatA-type paralogs, which do not carry TatE-specific N-terminal charge patterns, would support the idea that TatE-type proteins when co-expressed might serve a unique functional purpose.
A TatE paralog had also been described for the Gram-positive organism Corynebacterium glutamicum 34 . In contrast, the inclusion threshold for TatE homologues that we defined for our database search did not yield any result, neither from C. glutamicum nor from other Gram-positive organisms. We therefore performed a sequence alignment of the TatABE proteins from C. glutamicum and E. coli and found that the annotated TatE of C. glutamicum actually shares a higher sequence similarity with TatA than with TatE from E. coli (Fig. 5A). Moreover, the annotated TatE of C. glutamicum lacks any charged amino acids within its N-terminal region but shares the Gln8 with E. coli TatA (Fig. 5A). However, without further functional analyses, a clear assignment of the two TatA paralogs of C. glutamicum to the TatA and TatE families is difficult to undertake.
On the other hand, we found the ExxxxK motif characteristic for E. coli TatE also contained in TatAc, which is one of the three annotated TatA paralogs of the Gram-positive bacterium Bacillus subtilis (Fig. 5B). Intriguingly, TatAc was recently shown to functionally compensate for mutation-borne defects of TatAy, which is another TatA paralog of B. subtilis 35 . TatAc was, however, not able to complement the complete absence of TatAy, suggesting that TatAc and TatAy might functionally interact 35 . Thus the minimal Tat translocases of B. subtilis consisting of single TatC and TatA components 36,37 might in fact associate with an auxiliary TatA component. In Fig. 5B we aligned the three TatA paralogs TatAc, TatAy, TatAd of B. subtilis with the protein sequences of E. coli TatA, TatE, and TatB. Strikingly, the sequence alignment of B. subtilis TatAc and E. coli TatE, both exhibiting the ExxxxK motif, gave 45% identical amino acid residues within the aligned parts, indicating that TatAc might more likely be a functional paralog of E. coli TatE than of E. coli TatA. Hence, all these sequence compilations demonstrate a wide distribution of TatA-family members that possess a Lys at a position equivalent to the 8th position of enterobacterial TatE (Fig. 3B). This suggests that unique TatE-type paralogs of TatA might operate in a much larger number of bacterial Tat translocases than previously appreciated.
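The screen for N-terminal charge patterns described above sorts sequences into three groups: those with an E3xxxxK8 or E2xxxxxK8 motif, those with only a Lys at position 8, and those with neither. A minimal sketch of such a classifier follows. It assumes that positions are 1-based, that the initiator Met counts as residue 1, and that "x" stands for any residue. The function name and the short test fragments are illustrative and are not database entries.

```python
import re

# Positions are 1-based and include the initiator Met (assumption).
E3_K8 = re.compile(r"^.{2}E.{4}K")  # Glu at position 3, Lys at position 8
E2_K8 = re.compile(r"^.E.{5}K")     # Glu at position 2, Lys at position 8


def classify_n_terminus(seq: str) -> str:
    """Assign a protein sequence to one of the N-terminal charge patterns."""
    seq = seq.upper()
    if E3_K8.match(seq):
        return "E3xxxxK8"
    if E2_K8.match(seq):
        return "E2xxxxxK8"
    if len(seq) >= 8 and seq[7] == "K":
        return "K8 only"
    return "no TatE-like charge pattern"


# Illustrative N-terminal fragments built to match each category.
fragments = {
    "Glu3 and Lys8": "MGEISITKLL",
    "Glu2 and Lys8": "MEGISITKLL",
    "Lys8 alone": "MGGISITKLL",
    "uncharged, Gln8": "MGGISIWQLL",
}
for label, fragment in fragments.items():
    print(f"{label}: {classify_n_terminus(fragment)}")
```

The E3 pattern is tested first, so a sequence carrying Glu at both positions 2 and 3 would be reported as E3xxxxK8. That ordering is a design choice of this sketch.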
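An identity value such as the 45% quoted for TatAc and TatE is the number of identical residue pairs divided by the length of the aligned region. The sketch below computes this for two pre-aligned strings, assuming that gapped columns count towards the alignment length, as they do in the identity line of a BLAST report. The function name and the example fragments are illustrative only.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identical residues over all columns of a pairwise alignment.

    Both strings must already be aligned (equal length, gaps as `gap`).
    Columns containing a gap count towards the length but never as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Illustrative ten-residue fragments differing at three positions.
print(percent_identity("MGEISITKLL", "MGGISIWQLL"))
```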

TatE-specific interactions with the Tat proteins.
Following an analysis of co-evolutionarily predicted residue contacts between TatC and TatA-family proteins as well as the results of molecular dynamics simulations, it was recently proposed that Glu8, which is conserved among almost all TatB orthologs, is part of a polar cluster that ligates TatB to TatC in a functional manner 15 . Similarly, Gln or His that populate residue 8 in the vast majority of TatA paralogs would allow TatA to interact with TatC via the same polar cluster 15 . In contrast, the Lys8 residue, which obviously is a hallmark of TatE-type orthologs, is likely to mediate different contacts with TatC. This is strongly suggested by the phenotype of an E. coli TatB variant, which due to a Glu8 to Lys8 substitution associates with TatC in a manner that allows proficient recognition of otherwise transport-defective signal peptides 38-40 . By inference, TatA-family members naturally displaying a positively charged residue at position 8, such as most TatE paralogs, are also likely to interact with TatC in a manner different from the mostly negatively charged TatB orthologs. Association of TatE paralogs with TatC could even represent an advanced step in the assembly of a functional Tat translocase following the initial TatBC interaction.
In order to obtain more information on how TatE might associate with the other Tat proteins, we explored N,N'-dicyclohexylcarbodiimide (DCCD) as a cross-linking agent. DCCD is known to form amide bonds between carboxyl groups located in hydrophobic environments and primary amines 41 . We recently realized that TatB and TatC present in inner membrane vesicles of E. coli can be cross-linked in this way (unpublished results). As shown in Fig. 6A, dependent on the addition of DCCD to membrane vesicles containing TatABC a ~45 kDa product appears (green diamond), which is recognized by anti-TatB and anti-TatC antibodies (αTatB, lane 2; αTatC, lane 8). If the vesicles contain TatE instead of TatA (TatEBC), DCCD treatment results in an additional cross-link, ~37 kDa in size, which is recognized by anti-TatB and anti-TatE antibodies (lane 4, orange triangle, αTatB, αTatE). We therefore conclude that in these vesicles, TatE is so closely spaced to TatB that both proteins can be cross-linked by DCCD. This is consistent with previously identified contact sites between TatE and TatB, which photo cross-linking revealed in the transmembrane and amphipathic helices of TatE 28 . An immediate proximity between TatE and TatB within the substrate receptor complex is further reflected by the findings of Fig. 1 demonstrating that in the absence of the H+-motive force, the same sites of the TorA signal peptide contact both TatB and TatE.
We next asked where on TatC such a TatB-TatE heterodimer could be located. As to TatB, several recent studies have identified the transmembrane helix 5 of TatC as a docking site for the transmembrane helix of TatB 5,14,15,42 . One of the residues located in transmembrane helix 5 of TatC that repeatedly was shown to mediate contacts with TatB is methionine 205. We therefore incorporated Bpa into E. coli TatC at position 205 using an in vivo amber stop codon approach and prepared membrane vesicles that carried the TatC 205Bpa mutant in the presence of TatAB and TatEB (Fig. 6B). In the context of the TatAB proteins, TatC 205Bpa when activated by UV light yielded four prominent adducts (lane 2, αTatC). Adducts running at ~65 and ~40 kDa (blue stars) were recognized only by antibodies directed against TatC and therefore represent dimers and trimers of TatC 14,42 . The ~50 kDa cross-linking product (blue diamond) was also recognized by anti-TatB antibodies (αTatB, lane 6) confirming this known contact site of TatC for TatB. The ~37 kDa adduct of TatC 205Bpa (red square) was also detected by anti-TatA antibodies (αTatA, lane 6) consistent with the previously established overlap of the TatA and TatB Interestingly, the DCCD-mediated contact between TatB and TatE (Fig. 6A, lane 4, orange triangle), was almost gone in the presence of TatA (lane 6). Instead, membrane vesicles containing all four Tat components gave rise to new adducts that by size and immuno-reactivity represent TatE-TatA oligomers (orange dots, compare lanes 4 and 6, 10 and 12). Whether this reflects a true exchange of TatB for TatA as the binding partner of TatE following the recruitment of TatA to the substrate-bound TatBC receptor complex remains speculative at this point. The DCCD-mediated cross-links that we obtained between TatE and TatA, however, suggest that TatE could fulfil a role in the recruitment of TatA to the TatBC receptor complex through hetero-oligomerization with TatA.
Because in vivo, TatE is present only in sub-stoichiometric amounts compared with TatA 20 , TatE could conceivably function as a nucleation point for a TatBC-dependent oligomerization event of TatA.
In conclusion, our results suggest that TatE paralogs have overlapping functions between TatA and TatB. Similar to TatB, TatE is involved in substrate binding yet possibly at a later step. Although it docks at transmembrane helix five of TatC much like TatA and TatB do, the individual interacting epitopes of TatC are likely to vary. We propose that TatE could play a role in the oligomerization of TatA.

Methods
Plasmids. The plasmids used in this study are listed in Table S1. To construct plasmid pEC, vector pEBC_LinkRBS was used as a template. Primers flanking tatB were designed (pECKI for and rev, Table S1) and phosphorylated. The tatB gene was deleted during vector amplification. The PCR product was purified and ligated prior to transformation. Plasmid pEBC_LinkRBS was also used to prepare the amber stop codon mutant at position Met205 in tatC in the context of tatEB using mutagenesis PCR 19 and the primers Met205 for and rev listed in Table S1. Construction of the same TatC variant in the context of tatAB was previously described 14 .
In vitro reactions. The plasmid-encoded Tat substrate TorA-MalE335 and its Bpa variants were synthesized in a cell-free transcription/translation system in 50 μl reactions 44 . Cell extracts were prepared using E. coli strain SL119 45 or Top10 (Invitrogen) transformed with plasmid pSup-BpaRS-6TRN(D286R) for Bpa incorporation into TorA-MalE335. The vesicles were added 12 min after starting protein synthesis at 37 °C and incubated for 18 min at 37 °C. In order to disrupt the proton-motive force, 0.1 mM CCCP was added at the onset of synthesis. After incubation with vesicles, Bpa cross-linking was induced by UV irradiation for 20 min on ice. To visualize transport of TorA-MalE335 into the vesicles, completed reactions were treated with 0.5 mg/ml proteinase K.
For immuno-precipitation, the samples were denatured in 1% SDS for 10 min at 95 °C after cross-linking. Antisera against TatA, TatE and TatB were incubated with Protein A-Sepharose beads for 90 min at 4 °C. Denatured samples were cleared by brief centrifugation and applied to the antibody-loaded Protein A-Sepharose for 3 h at 4 °C on a spinning wheel. After 4 washing steps, antibody-bound material was released by incubating in SDS-PAGE loading buffer for 10 min at 37 °C and 1,400 rpm.
The proteins were resolved on SDS-PAGE using 10% polyacrylamide gels. Radioactive gels were developed by phosphorimaging using a Storm 845 instrument. Quantification was performed using ImageQuantTL.
To detect crosslinks between the Tat subunits, 5 µl vesicles were diluted with 95 µl INV-buffer 44 . When indicated 1 µl DCCD (50 mM) was added. The samples were exposed to UV light for 20 min on ice, precipitated with trichloroacetic acid and resuspended in 100 µl Tricine sample buffer 47 . Proteins were then resolved using 9% Tricine SDS-PAGE and identified by Western blotting using antibodies against TatE (10 µl sample per lane), TatA (7 µl per lane), TatB and TatC (20 µl per lane).
Protein sequence analysis. All sequences used in this research were obtained from Uniprot or NCBI Reference Sequence (RefSeq) protein databases. The NCBI RefSeq protein database 48 release 79 was queried with the TatE from Escherichia coli using BLASTP 49 with cut-off e-value set to 1e-3. After removing incomplete entries, 3120 sequences were obtained. Among them, the 111 enterobacterial homologues were aligned with Clustal Omega 50 and the consensus sequence logo was plotted using WebLogo3 51 . The degree of sequence similarity of Tat proteins between E. coli (TatA, TatE, TatB) and C. glutamicum (TatA, TatE, TatB) or B. subtilis (TatAy, TatAc, TatAd) was investigated by pairwise alignment using NCBI BLAST.
Data availability. All data generated or analysed during this study are included in this published article (and its Supplementary Information files).
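The e-value cut-off used in the BLASTP search described under Protein sequence analysis can also be applied to saved results. The sketch below assumes that hits are stored in the default 12-column tabular format of BLAST+ (-outfmt 6) and keeps each subject once if any of its hits has an e-value of at most 1e-3. The function name, the query label and the two subject identifiers are hypothetical, and the study may well have applied the threshold within the search itself.

```python
import csv
import io

# Default column order of BLAST+ tabular output (-outfmt 6).
FIELDS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tabular_text: str, max_evalue: float = 1e-3) -> list[str]:
    """Return subject IDs with at least one hit at or below the cut-off."""
    reader = csv.DictReader(
        io.StringIO(tabular_text), fieldnames=FIELDS, delimiter="\t"
    )
    kept: list[str] = []
    for row in reader:
        if float(row["evalue"]) <= max_evalue and row["sseqid"] not in kept:
            kept.append(row["sseqid"])
    return kept


# Two hypothetical hits: one below and one above the cut-off.
example = "\n".join([
    "\t".join(["TatE_query", "hit_A", "62.5", "64", "24", "0",
               "1", "64", "1", "64", "2e-20", "85.1"]),
    "\t".join(["TatE_query", "hit_B", "30.0", "40", "28", "0",
               "5", "44", "9", "48", "0.5", "25.0"]),
])
print(filter_hits(example))
```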