EMC rectifies the topology of multipass membrane proteins

Most eukaryotic multipass membrane proteins are inserted into the membrane of the endoplasmic reticulum. Their transmembrane domains (TMDs) are thought to be inserted co-translationally as they emerge from a membrane-bound ribosome. Here we find that TMDs near the carboxyl terminus of mammalian multipass proteins are inserted post-translationally by the endoplasmic reticulum membrane protein complex (EMC). Site-specific crosslinking shows that the EMC’s cytosol-facing hydrophilic vestibule is adjacent to a pre-translocated C-terminal tail. EMC-mediated insertion is mostly agnostic to TMD hydrophobicity, favored for short uncharged C-tails and stimulated by a preceding unassembled TMD bundle. Thus, multipass membrane proteins can be released by the ribosome–translocon complex in an incompletely inserted state, requiring a separate EMC-mediated post-translational insertion step to rectify their topology, complete biogenesis and evade quality control. This sequential co-translational and post-translational mechanism may apply to ~250 diverse multipass proteins, including subunits of the pentameric ion channel family that are crucial for neurotransmission.


EMC rectifies the topology of multipass membrane proteins
Haoxi Wu , Luka Smalinskaitė & Ramanujan S. Hegde   Most eukaryotic multipass membrane proteins are inserted into the membrane of the endoplasmic reticulum.Their transmembrane domains (TMDs) are thought to be inserted co-translationally as they emerge from a membrane-bound ribosome.Here we nd that TMDs near the carboxyl terminus of mammalian multipass proteins are inserted post-translationally by the endoplasmic reticulum membrane protein complex (EMC).Site-speci c crosslinking shows that the EMC's cytosol-facing hydrophilic vestibule is adjacent to a pre-translocated C-terminal tail.EMC-mediated insertion is mostly agnostic to TMD hydrophobicity, favored for short uncharged C-tails and stimulated by a preceding unassembled TMD bundle.Thus, multipass membrane proteins can be released by the ribosome-translocon complex in an incompletely inserted state, requiring a separate EMC-mediated post-translational insertion step to rectify their topology, complete biogenesis and evade quality control.This sequential co-translational and post-translational mechanism may apply to ~250 diverse multipass proteins, including subunits of the pentameric ion channel family that are crucial for neurotransmission.
Multipass membrane proteins, defined by the presence of more than one TMD, play crucial roles in the transfer of information and molecules across biological membranes 1,2 .Most eukaryotic multipass membrane proteins are inserted co-translationally at the endoplasmic reticulum as individual TMDs emerge from a membrane-bound ribosome 3 .Recent work suggests that at the mammalian endoplasmic reticulum, different parts of a multipass protein are inserted by different factors.The first TMD can be inserted by the EMC or by passing through a lateral gate in the Sec61 translocation channel [4][5][6][7] .Subsequent TMDs can also be inserted through the Sec61 lateral gate or into a lipid-filled cavity behind Sec61 created by the multipass translocon (MPT), an assembly of three complexes termed PAT, GEL and BOS [8][9][10] .GEL is structurally and evolutionarily related to the EMC, suggesting that it may be the insertion factor within the MPT [11][12][13] .
It is thought that both EMC and MPT insert TMDs only when the flanking translocated domain is shorter than ~50 amino acids, whereas Sec61 can mediate TMD insertion flanked by longer translocated domains 4,8,9,14 .Sec61 can translocate long hydrophilic polypeptides because it houses an aqueous translocation channel [15][16][17][18] , a feature lacking in either EMC or any of the MPT complexes 8,[19][20][21][22][23] .By using EMC, Sec61 and MPT in these ways, membrane proteins of widely varying topology and translocated domains can be co-translationally weaved into the lipid bilayer 3,9 .However, the last TMD poses unique problems if it is located within ~50 amino acids of the C terminus (hereafter termed a terminal TMD).As termination occurs before it can engage either Sec61 or the MPT, its insertion is necessarily post-translational.How a terminal TMD of a multipass protein is inserted is not clear.
The most straightforward case is one in which the penultimate TMD and the final TMD are separated by a long translocated loop.In this instance, the translocated loop would already be threaded into the ribosome-bound Sec61 channel at the time of termination (Fig. 1a).The terminal TMD would then necessarily enter the Sec61 channel after termination, from where it would enter the membrane post-translationally, presumably through Sec61's lateral gate.By contrast, the insertion mechanism for a terminal TMD preceded by a short translocated loop (Fig. 1b) or for a terminal TMD followed by a short translocated tail (Fig. 1c) is not clear.In the first situation, the penultimate TMD would not have enough of a tether to have engaged Sec61 before termination.The final two TMDs would be released from the ribosome and both would need to insert post-translationally Artic e https://doi.org/10.1038/s41594-023-01120-6 This problem is exemplified by the large and important family of Cys-loop pentameric ion channels.This family includes acetylcholine receptors, γ-aminobutyric acid type A (GABA A ) receptors, glycine receptors and others 24 .They play crucial roles in neurotransmission, and their incorrect biogenesis would lead to complex neurologic consequences 25 .Each subunit is composed of four TMDs with both the N terminus and C terminus facing the extracellular environment.The last TMD is followed by a tail that is typically only ~10-20 amino acids.This means that TMD4 will be mostly or entirely within the ribosome when translation terminates.The preceding three TMDs would have already been inserted given the ~100 amino acid long cytosolic loop between TMD3 and TMD4.Hence, TMD4 would be released from the ribosome and must be inserted post-translationally by the route shown in Fig. 1c.Given the exceptional importance of this class of proteins, we investigated how the final TMD of a GABA A receptor subunit is inserted as a model for the general problem of terminal TMD insertion outlined above.

The C-terminal TMD of GABRA1 is inserted by EMC
In considering factors that might mediate terminal TMD insertion of GABA A receptor subunits, we were intrigued by the earlier observation that the loss of EMC impairs the expression of multiple members of Cys-loop pentameric ion channels in worms 26 .As subunits of these channels use an N-terminal signal sequence for endoplasmic reticulum targeting and initiation of N-terminal translocation, it is now appreciated that insertion of the first TMD is not expected to be EMC-dependent 4 .TMD2 and TMD3 would then insert via the MPT given the short translocated loop between them 8,10 .This suggests that the EMC requirement might be at a later stage of insertion, folding or assembly.Among these possibilities, a potential role in the insertion of TMD4 was attractive because the biochemical reaction is similar to the post-translational insertion of tail-anchored proteins, which is an established role for EMC 7,14 .
To examine a potential role for EMC in Cys-loop channels, we focused on GABA A receptors, whose reliance on EMC has also been seen in mammals 27 .Using a previously characterized HEK293 cell line with stable inducible expression of a heteropentameric GABA A receptor 28,29 , we tested the effect of acute EMC depletion.As seen in earlier studies, induced surface expression of the GABA A receptor was reduced to less than ~50% in cells knocked down for EMC4, which generally phenocopies the insertase deficiency seen with a loss of EMC2, EMC3, EMC5 or EMC6, other core subunits of EMC [30][31][32] (Fig. 2a).No effect was seen for a dual-color fluorescent reporter of the Sec61-inserted asialoglycoprotein receptor 1 (ASGR1), but a strong reduction was seen for a similar reporter of the known EMC-inserted tail-anchored protein squalene synthase (SQS).Immunoblotting for GABRA1, the α1 subunit of the GABA A receptor, indicated that its lack of surface expression corresponded to its degradation, consistent with a failure in biogenesis (Fig. 2b).Importantly, cells lacking EMC contain the normal complement of all major factors involved in endoplasmic reticulum protein biogenesis (Extended Data Fig. 1 and ref. 4), consistent with normal biogenesis of most secretory pathway proteins 4,14,[31][32][33] .This suggests that the effect of EMC on GABA A receptor biogenesis is unlikely to be indirect.
To examine whether the EMC-dependent step involves membrane insertion of a GABA A receptor, we reconstituted the insertion of GABRA1 in vitro.Translation of 35 S-labeled GABRA1 in reticulocyte lysate in the presence of semi-permeabilized cells (SPCs) resulted in signal sequence cleavage and glycosylation of the N-terminal domain in the endoplasmic reticulum lumen (Fig. 2c).Identical results were seen using SPCs from EMC knockout (∆EMC) cells, consistent with previous work showing that signal sequences use Sec61 to initiate translocation of the N-terminal domain independent of EMC 4 .
We used a protease protection assay to monitor TMD insertion of GABRA1 downstream of N-terminus translocation.Proteinase K cleaved concomitant with translocation of the intervening loop.In the second situation, the terminal TMD would also need to be inserted post-translationally concomitant with translocation of the C-terminal tail (Fig. 1c).The factors that mediate these post-translational insertion reactions are not known.

Artic e
https://doi.org/10.1038/s41594-023-01120-6 the long cytosolic loop between TMD3 and TMD4, generating two protected fragments: an N-terminal protected fragment corresponding to the first three TMDs of GABRA1 and a C-terminal protected fragment corresponding to TMD4 and a short translocated C-terminal tail (Fig. 2c).Wild-type and ∆EMC SPCs generate similar levels of N-terminal protected fragments after proteinase K digestion, indicating that EMC does not participate in the insertion of the first three TMDs of GABRA1.
By contrast, The C-terminal protected fragment (recovered using an antibody against the C-tail of GABRA1) is reduced in ∆EMC SPCs to less than half that seen in wild-type SPCs, suggesting an impairment in TMD4 insertion.To corroborate this conclusion, we introduced a glycosylation site into the GABRA1 C-tail (GABRA1-glyc) to monitor C-tail translocation as a proxy for TMD4 insertion (Fig. 2d).Glycosylation of the C-tail was impaired by more than 50% in the absence of
As with SQS, insertion of GABRA1 TMD4 was not entirely eliminated in ∆EMC SPCs, presumably explaining why GABA A receptor surface expression is not completely eliminated in ∆EMC cells.We considered whether the residual GABRA1 TMD4 insertion in ∆EMC SPCs could potentially be mediated by the lateral gate of Sec61, the GEL complex or the GET complex, which inserts tail-anchored proteins of high hydrophobicity.Of these, the GET insertase seemed unlikely because later crosslinking experiments did not observe an interaction between a membrane-tethered C-terminal TMD and GET3, the targeting factor required for access to the GET insertase 14,34,35 .As the Sec61 and GEL complexes are probably used for GABRA1 N-terminal translocation and insertion of the TMD2-TMD3 module, respectively, we devised a simplified reporter to test TMD4 insertion in isolation.This C-tail translocation reporter consists of an artificial signal-anchor comprising 23 leucine residues (23L) followed by a cytosolic loop, TMD4 of GABRA1 and a translocated C-tail (23L-GABRA1).As 23L can be inserted independently of EMC, GEL or Sec61, we can test the role of these factors in TMD4 insertion.Glycosylation sites in the N-terminal and C-terminal tails were used to monitor translocation.
As with full-length GABRA1, C-tail translocation of 23L-GABRA1 was EMC-dependent.Elimination of the GEL complex (by knockout of its TMCO1 subunit) or treatment with Apratoxin A (ApraA), a potent

Artic e
https://doi.org/10.1038/s41594-023-01120-6 inhibitor of Sec61's lateral gate [36][37][38] , had no effect on 23L-GABRA1 insertion (Extended Data Fig. 2).Combining one or both manipulations with the elimination of EMC showed no further impairment of C-tail translocation, suggesting that neither Sec61 nor GEL can contribute to TMD4 insertion.These results indicate that TMD4 insertion is primarily mediated by EMC, with residual EMC-independent insertion occurring unassisted or through an unknown insertase.Given the experimental and theoretical support for unassisted insertion 39,40 , a membrane-tethered TMD4 could readily access this insertion route.Nonetheless, such alternative mechanisms seem to be minor contributors relative to EMC.EMC uses a cytosol-facing hydrophilic vestibule and membraneembedded hydrophilic groove, both housed primarily in EMC3, to facilitate translocation of a flanking hydrophilic segment concomitant with TMD insertion 7,19,21,23 .To test whether this established mechanism was used for TMD4 insertion of GABRA1, we analyzed the effect of EMC3 mutations along the translocation route.In these experiments, ~70-90% of endogenous EMC3 is replaced by long-term stable overexpression of FLAG-tagged EMC3, with the excess EMC3 being effectively degraded by cellular quality control 6 .Mutation of EMC3 at either a cytosolic methionine-rich loop at the entry of the hydrophilic vestibule (M cyt-1 -S) or a charged residue within the hydrophilic groove (R31A) impaired insertion of both SQS and TMD4 of GABRA1 (Fig. 2e).This result shows that EMC's insertase activity is involved in terminal TMD insertion, not some other part of EMC such as its putative chaperone surface 41 .We infer this conclusion because the degree of impairment in these mutants is similar for both SQS and TMD4 of GABRA1, and because these EMC mutants are fully assembled and intact 23 .Thus, the last TMD of GABRA1 is post-translationally inserted into the endoplasmic reticulum membrane through EMC using a similar mechanism as previously known substrates.

Translocating C-tail samples EMC during insertion
The 23L-TMD strategy provided a simplified reporter to analyze substrate parameters that influence EMC-dependent terminal TMD insertion.However, GABRA1 TMD4 insertion efficiency in this reporter was relatively low (a point we will return to later), so we devised a more efficiently inserted construct containing the TMD of SQS (23L-SQS; Fig. 3a).The linker between 23L and SQS was of the same length (~100 amino acids) as the cytosolic loop preceding the final TMD of GABRA1, so the 23L domain would have been inserted and diffused away from the translocon by the time translation terminates and the terminal TMD emerges from the ribosome 42 .We confirmed that 23L-SQS retains EMC-dependent insertion of the SQS TMD, which was impaired by mutations in EMC's insertase path (Fig. 3a).
With this construct, we used site-specific chemical crosslinking to probe the local environment during C-terminal TMD insertion.A single cysteine in the C-tail of 23L-SQS crosslinked to a single cysteine engineered in the cytosolic vestibule of EMC at position 216 of EMC3 (Fig. 3b).EMC3 crosslinked to non-glycosylated and singly glycosylated substrate, but not to doubly glycosylated substrate.This indicates that EMC's cytosolic vestibule can crosslink to a substrate whose C-terminal TMD has yet to be inserted but whose N-terminal TMD has already been inserted (and is glycosylated).Importantly, a cysteine in the middle of the adjacent SQS TMD showed almost no crosslinking to EMC3 (Fig. 3c).This provides a specificity control for the observed tail-mediated crosslinks and supports a model in which EMC mediates translocation of the hydrophilic tail through its hydrophilic vestibule.
In further support of this idea, two independent EMC3 mutants that partially impair translocation led to a ~1.9-fold increase in crosslinks between the C-tail and EMC's cytosolic vestibule (Extended Data Fig. 3).This observation is consistent with a longer residence time at this pre-translocation step when the translocation reaction is impaired, similar to earlier observations for an N-terminal EMC-dependent TMD 6 .Although less efficient, a UV-activated photo-crosslinker in the C-tail of 23L-SQS crosslinked to EMC3 in wild-type EMC (Fig. 3d).These results indicate that an incompletely inserted membrane protein encounters EMC, presumably by diffusional sampling within the membrane, and uses EMC's hydrophilic vestibule for translocation of the C-terminal tail concomitant with TMD insertion.In the absence of a functional EMC insertase, the C-terminal TMD presumably binds rapidly to the nearby membrane surface 40 but evidently cannot be inserted as efficiently by an unassisted mechanism or by another insertion factor (Extended Data Fig. 2).Binding to the membrane surface might explain why the TMD does not show any obvious increase in crosslinking to cytosolic chaperones or targeting factors in the absence of EMC as might otherwise be expected for a TMD exposed to the cytosol 14,[43][44][45] .

Determinants of EMC-mediated C-terminal TMD insertion
The 23L-SQS reporter was then modified to test the substrate features that influence terminal TMD insertion.In the first series of experiments, we found that extending the C-tail shifted the insertion pathway from EMC to Sec61 (Fig. 4a).At lengths of 25 and 35 amino acids, the TMD is insufficiently exposed outside the ribosome to engage the Sec61 lateral gate in the hairpin topology required for C-tail translocation.These constructs were, therefore, unaffected by the Sec61 inhibitor ApraA and were sensitive to the loss of EMC.By contrast, at 55 amino acids or longer, C-tail translocation was unaffected in ∆EMC SPCs but completely inhibited by ApraA.The 45 amino acid tail showed an intermediate effect, being partially sensitive to both EMC loss and Sec61 inhibition.Thus, EMC and Sec61 are largely non-redundant in terminal TMD insertion.The critical switchover point between EMC-dependence and Sec61-dependence is ~45 amino acids.This length is close to the translocation limit for the Oxa1 family (of which EMC is a member 46,47 ) but long enough for the TMD to reach Sec61's lateral gate in the appropriate looped topology 18 .
Charged residues in the C-terminal tail of 23L-SQS were seen to modestly but clearly reduce tail translocation (Fig. 4b).No difference was seen for tails with net positive or net negative charges.Importantly, all of these charge variants were translocated in an EMC-dependent manner.The charge-imperviousness for C-tail translocation of 23L-SQS was unexpected because EMC-mediated insertion of an N-terminal TMD (that is, a signal-anchor) or a tail-anchored protein is selectively disfavored if the tail to be translocated contains multiple positive charges 4,6,48 .The basis of this different behavior may relate to the length of time available for EMC-mediated insertion for a signal-anchor or tail-anchor versus the terminal TMD of a multipass protein.A ribosome displaying a signal-anchor has limited time for EMC-mediated insertion before docking onto Sec61, at which point EMC becomes inaccessible.Similarly, moderately hydrophobic tail-anchored proteins that are not promptly inserted by EMC can instead be inserted into mitochondria.Consistent with this interpretation, insertion of such signal-anchors and tail-anchors improves when Sec61 is depleted or when mitochondrial targeting is impaired, respectively 6,49 .By contrast, a multipass protein is already committed to endoplasmic reticulum insertion by the time EMC must insert the final tethered TMD, which would have prolonged and repeated access to EMC.Thus, EMC's substrate preference against positive charge translocation can be overcome by simply providing more time, implying that it can accommodate a broader range of substrates as a terminal TMD insertase.
A similar rationale probably explains the finding that in the context of 23L-SQS, EMC is able to accommodate a broad range of TMD hydrophobicity (Fig. 4c).For tail-anchored protein insertion, EMC-dependence is strongly influenced by hydrophobicity, with TMDs of higher hydrophobicity becoming dependent on the GET pathway.For example, SQS can progressively be made EMC-independent and GET-dependent by replacing less-hydrophobic residues in the TMD with leucine residues 14 .The same changes in 23L-SQS not only did not shift EMC-dependence but improved insertion overall.This indicates that EMC's preference for low-hydrophobicity TMDs seen for

Artic e
https://doi.org/10.1038/s41594-023-01120-6tail-anchored proteins is not a reflection of EMC limitations but rather a consequence of high-hydrophobicity TMDs being captured by the GET pathway in the cytosol.Considered together, these findings illustrate that EMC's substrate specificity for terminal TMDs of multipass proteins is broad with respect to flanking charge and hydrophobicity but is limited by tail lengths shorter than ~50 amino acids.

Immature membrane domains facilitate EMC targeting
As noted above, 23L-GABRA1 shows EMC-dependent insertion of the terminal TMD (Extended Data Fig. 2) but with clearly lower efficiency than that seen in native GABRA1 (~38% compared to ~65%; Fig. 5).In considering possible explanations, we recognized that EMC has been proposed to function as a chaperone in addition to its insertase activity 33,41 .This suggested the possibility that in native GABRA1, TMD4 is brought in proximity to EMC by its preceding TMDs engaging EMC through its putative chaperone function.Replacing the first three TMDs of GABRA1 with 23L would eliminate this 'targeting' activity, explaining the loss of TMD4 insertion efficiency.
As a test of this idea, we asked whether TMD4 insertion could be rescued if TMD4 were preceded by an unrelated membrane domain that might be a putative chaperone substrate.Earlier studies have shown that until insertion of all of its TMDs, G-protein coupled receptors are targets for intramembrane chaperoning 50 .Therefore, we preceded TMD4 of GABRA1 with the first three TMDs of rhodopsin (termed Rho(1-3)) and tested the efficiency of terminal TMD insertion.To ensure that Rho(1-3) was not an EMC insertion substrate, we initiated its translocation using an N-terminal signal peptide and lumenal domain preceding its first TMD.Relative to 23L-GABRA1, terminal TMD insertion of Rho(1-3)-GABRA1 was markedly higher and similar to that seen with native GABRA1 (Fig. 5).
As Rho(1-3) is unrelated to TMD1-3 of GABRA1, these results indicate that the efficient insertion of TMD4 is not due to its complementarity with the earlier TMDs.Instead, the findings support a model in which the earlier TMDs target an otherwise poorly inserted TMD4 to EMC for efficient insertion.Consistent with this idea, the insertion of the SQS TMD improved when preceded by Rho(1-3) relative to 23L-SQS (Fig. 5), which itself was slightly more efficient than SQS inserted as a tail-anchored protein.This suggests that EMC-mediated insertion improves slightly by tethering a TMD close to the membrane (for example, with 23L) and improves further if

An expanded substrate repertoire of EMC
Using the insights gained from GABRA1 and 23L-SQS, we sought to predict all other terminal TMD substrates of EMC in the human genome.Leveraging the structural prediction afforded by AlphaFold2 (ref.51) combined with the positive-inside rule 52 , we manually curated the orientation of all 1,784 annotated multipass endoplasmic reticulum membrane proteins in the human genome.By plotting the C-tail lengths by topology, number of total TMDs and hydrophobicity of the terminal TMD (Fig. 6a,b), we found that 244 multipass proteins contain a terminal TMD whose downstream tail is non-cytosolic and 50 amino acids or shorter (Supplementary Table 1).From this manually curated list, we chose for analysis six proteins containing terminal TMDs of varying hydrophobicity and C-tail lengths.The functional expression of five of these proteins, or their close homologs, have been observed in earlier studies to be dependent on EMC 26,27,31,33 .Therefore, we tested their terminal TMDs in the 23L-TMD reporter with a C-terminal glycosylation site (Extended Data Fig. 4) to determine whether impaired C-tail translocation could explain their EMC dependence.Each of these showed at least partial dependence on EMC for C-terminal TMD insertion (Fig. 7a,b).The variable levels of insertion in the absence of EMC among the different substrates might reflect their capacity for unassisted insertion.Alternatively, they might be able to access other insertases such as GET, Sec61 or GEL depending on either TMD hydrophobicity or length of the C-tail, but this remains to be investigated.
We also tested two unrelated native proteins, SOAT1 and YIPF1, that also contain a terminal TMD with a translocated C-tail.These were chosen based on the prediction that earlier steps in their insertion would be EMC-independent, thereby allowing us to monitor terminal TMD insertion using a single C-tail glycosylation site (see Fig. 7a).As per our predictions (Fig. 6), both were observed to be partially EMC-dependent for C-tail translocation (Fig. 7c).As the separation of glycosylated from non-glycosylated products of full-length SOAT1 was challenging, we verified our conclusion with a better-resolved SOAT1 construct lacking the N-terminal cytosolic domain.EMC-dependent terminal TMD translocation of full-length and N-terminally deleted SOAT1 was further validated by selective recovery of the glycosylated products using the lectin conconavalin A. Of note, both endogenous and exogenously expressed SOAT1 were shown in earlier studies to be strongly dependent on EMC in cells, but the basis of this dependence was not clear 32 .Our finding that insertion of the final TMD of SOAT1 is EMC-dependent to a comparable level as SOAT1 biogenesis in cells now provides an explanation.The ~250 new putative substrates of EMC are highly diverse in their topology and function and may help to explain the complex and pleiotropic phenotypes of EMC loss in a wide range of organisms 7,53 .

Discussion
Our study provides three insights into the problem of membrane protein topogenesis.First, we reveal that unlike previous suggestions 4,14,33 , EMC is not specific for poorly hydrophilic TMDs and does not have a strong discriminatory capacity against the translocation of positive charges.Instead, it seems that EMC's preferences against positive charges and high hydrophobicity that were observed in earlier studies are consequences of competing reactions.Thus, EMC's intrinsic capacity for TMD insertion is broader than had been thought.Second, various substrates that show an EMC requirement in cells, such as pentameric ion channels 26,27 and SOAT1 32 , can probably be ascribed to a failure of terminal TMD insertion.These substrates had been puzzling because they are neither tail-anchored proteins nor initiated with a signal-anchor in the N exo topology (in which the N terminus faces the exoplasmic side of the membrane), which previously were the known targets for EMC function.Although we cannot exclude additional roles for EMC The topology of all annotated multipass proteins in the human genome (1,784 total), determined using a combination of AlphaFold2 structure prediction to identify TMDs and the positive-inside rule to deduce orientation (see Supplementary Table 1).The C-tail locations (C exo or C cyt for exoplasmic and cytosolic, respectively) and lengths were tabulated and plotted versus the numbers of TMDs in the protein.b, The data from a were re-plotted relative to the ∆G app 65 score of the final TMD.Representative substrates analyzed in Fig. 7 for EMC-dependent translocation of their C-terminal TMDs are highlighted in red.

Artic e
https://doi.org/10.1038/s41594-023-01120-6 in GABRA1 or SOAT1, the magnitude of the defect seen in C-terminal TMD insertion can explain the consequence in cells.Third, and perhaps of most conceptual importance, is the finding that the topology of multipass proteins can be determined by a combination of co-translational and post-translational reactions.Co-translational events result in committing the protein for endoplasmic reticulum targeting and insertion of most TMDs, whereas other TMDs can be inserted post-translationally after the substrate has presumably departed the ribosome-translocon complex.Diffusion away from the ribosome (or dissociation of the ribosome from the membrane) would be a prerequisite for EMC-mediated insertion because EMC cannot access substrates near the exit tunnel of a Sec61-bound ribosome 6,22 .Hence, our findings show that at least some multipass proteins are released from the ribosome-translocon complex in a topologically immature form, with EMC rectifying the topology at a later step.Targeting to EMC for this post-translational rectification step may be facilitated by EMC's putative chaperone activity 33,41 , which would preferentially engage immature membrane-embedded domains such as those lacking a terminal TMD.In cells, a failure to rectify the topology promptly would presumably lead to engagement by quality control factors within the membrane of the topologically incomplete protein 54,55 or in the cytosol of the uninserted TMD 45,56,57 , thereby explaining why these EMC substrates are degraded in ∆EMC cells.
Although we have demonstrated post-translational topological rectification for C-terminal TMDs, it is plausible that in other circumstances, pairs of TMDs separated by a short loop are similarly inserted post-translationally (for example, Fig. 1b).The Oxa1 family (which, in eukaryotes, is composed of EMC, GEL and GET complexes) has been shown to be capable of such insertion 8,10,[58][59][60] , which might be needed if some TMDs are skipped during co-translational insertion.Skipping of TMDs followed by post-translational rearrangement has been suggested in earlier work 61 , but the basis of such a mechanism is unclear.Our findings now suggest a plausible mechanism utilizing an Oxa1 family member, which is an idea that warrants experimental analysis in future work.

Artic e
https://doi.org/10.1038/s41594-023-01120-6 Our study adds to the emerging principle that there is a straightforward segregation of function between the Oxa1 family and the SecY family 3,9 .Oxa1 family members mediate TMD insertion when the translocated segment of a flanking polypeptide is shorter than ~50 amino acids, whereas Sec61 (or SecY in prokaryotes) mediates TMD insertion when the translocated flanking domain is longer.In eukaryotes, tail-anchored proteins are inserted by the GET and EMC insertases 14,62,63 , signal-anchored proteins with a short translocated N-terminal tail are inserted by EMC 4,6 , internal TMD pairs with short loops are inserted by GEL 8,10 and, as shown here, terminal TMDs with a short translocated C-tail use EMC.In bacteria, the sole Oxa1 family member YidC would perform all of these jobs, perhaps explaining why it is essential 64 .The larger and more diverse membrane proteome in eukaryotes might have driven an expansion and specialization of endoplasmic-reticulum-localized Oxa1 family members, which perhaps also affords a degree of redundancy and robustness to the essential process of membrane protein biogenesis.

Online content
Any methods, additional references, Nature Portfolio reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41594-023-01120-6.

Small interfering RNA knockdown and flow cytometry
For monitoring the surface expression of GABA A receptors, either negative control small interfering RNA (siRNA) (Invitrogen, 4390843) or EMC4 siRNA (Ambion, s27733) was transfected into the 293 cell line expressing the human GABA A receptor using Lipofectamine RNAiMAX according to the manufacturer's protocol.A final concentration of 10 nM of siRNA was used and the total knockdown time was 66 h.At 60 h, doxycycline was added to a final concentration of 0.1 µg ml -1 to induce expression of the GABA A receptor for 6 h.Cells were then collected and subjected to surface labeling.Pelleted cells were resuspended with 100 µl of cold PBS, supplemented with 1 µl of phycoerythrin-labeled FLAG antibody (BioLegend, 637310) and incubated in the dark at 4°C for 1 h.Cells were then washed with cold PBS, passed through a 70 µm filter and then analyzed on a BD LSR II flow cytometer for appropriate fluorescent channels.A total of 30,000 events were analyzed, and phycoerythrin fluorescence, reflective of GABA A receptor surface levels, was plotted as a histogram using FlowJo.To analyze the stability of ASGR1 or SQS, 293 cell lines stably expressing GFP-P2A-RFP-ASGR1 or GFP-P2A-RFP-SQS were used as previously described 50 .The P2A sequence in these constructs causes ribosome skipping, resulting in the translation of equimolar amounts of GFP and the RFP-tagged protein.Therefore, a steady-state RFP:GFP ratio reflects the stability of the RFP-tagged protein.Failure in biogenesis will lead to degradation by cellular quality control pathways and a decreased RFP:GFP ratio.Knockdown was performed as for GABA A receptor cells, with the GFP and RFP fluorescence monitored by flow cytometry on 30,000 cells.The RFP:GFP ratio was plotted as a histogram using FlowJo.

Preparation of endoplasmic-reticulum-enriched membranes
Approximately 80% of confluence cells were collected by trypsinization.Cells were pelleted, washed once with cold 1×PBS and flash-frozen in liquid nitrogen.Thawed cell pellets were mixed with 5 volumes of 20 mM HEPES, pH 7.4; 5 mM KCl; 1.5 mM MgCl 2 ; 2 mM dithiothreitol (DTT); and protease inhibitor (Roche, 10106399001) and incubated on ice for 15 min.Cells were lysed on ice by 35 strokes of dounce homogenization (DWK Life Sciences, 357542).Cell lysates were adjusted to 20 mM HEPES, pH 7.4; 210 mM mannitol; 70 mM sucrose; 0.5 mM EDTA; 2 mM DTT; and protease inhibitor.Cell debris and nuclei were cleared by centrifugation at 4 °C for 10 min at 700×g.Membranes were then pelleted by centrifugation at 4°C for 10 min at 8,500×g, washed once and resuspended in a buffer containing 20 mM HEPES, pH 7.4; 210 mM mannitol; 70 mM sucrose; 0.5 mM EDTA; 2 mM DTT; and protease inhibitor to give an OD280 of 20.Different amounts of resuspended Artic e https://doi.org/10.1038/s41594-023-01120-6membranes were used to assay the levels of key translocation components by blotting.
For incorporating the photoreactive amino acid p-benzoyl-lphenylalanine (Bpa) through amber suppression, the following components are included in the translation reaction 68 : suppressor Bacillus stearothermophilus tRNA CUA Tyr (5 µM); E. coli Bpa tRNA synthetase (0.25 µM); and Bpa (100 µM).These components were pre-mixed into a 10× solution (in 50 mM HEPES, pH 7.4, 100 mM KOAc, 1 mM Mg(OAC) 2 ) and were pre-incubated at 32°C for 15 min before adding to the translation reaction.Where indicated, the Sec61 lateral gate inhibitor ApraA (obtained from V. Paavilainen and K. McPhail) was included in the translation reaction at 2 µM.

Protease protection assays
The 60 µl in vitro translation reactions were chilled on ice, and the SPCs were pelleted (20,000×g for 2 min), washed once with 1×RNC buffer (50 mM HEPES, pH 7.4, 100 mM KOAc, 5 mM Mg(OAc) 2 ) and resuspended in 30 µl 0.5×RNC buffer.Samples lacking SPCs were used directly without pelleting.Samples were divided into two aliquots; one aliquot (two-thirds of the total volume) was adjusted to 0.5 mg ml -1 proteinase K and incubated on ice for 50 min.Proteinase K was quenched by adding 250 mM of PMSF for 2 min, then transferring the entire reaction to a tenfold excess volume of 1% SDS, 100 mM Tris-HCl, pH 8.0 pre-heated to 100°C and heated for 10 min.The samples were either analyzed directly or subjected to immunoprecipitation as indicated in the figure legends.

Site-specific crosslinking
The 120 µl in vitro translation reactions were used for bismaleimidohexane (BMH) crosslinking experiments.All steps following the translation reaction were at 4°C until the reaction was denatured in SDS.SPCs were pelleted and resuspended in 60 µl 0.5×RNC buffer.One aliquot was removed as the no-crosslinking control and the remainder of the sample was adjusted to 250 µM BMH (Thermo Scientific, 22330) and incubated on ice for 10 min.The crosslinking reaction was quenched by adjusting the final concentration of DTT to 25 mM.After denaturation in 1% SDS, 100 mM Tris-HCl, pH 8.0, the samples were either analyzed directly or processed further for immunoprecipitation or deglycosylation.SMPH (Succinimidyl 6-((beta-maleimidopropionamido) hexanoate)) and UV crosslinking experiments were performed similarly to the BMH crosslinking experiments, with the following differences: SMPH (Thermo Scientific, 22363) crosslinking (200 µl total reaction) was at 200 µM final concentration for 30 min, and quenched with 50 mM Tris-HCl pH 7.4 and 5 mM DTT; UV crosslinking (100 µl total reaction) was on ice with UV irradiation by a UVP Blak-Ray B-100AP high-intensity lamp with the bulb positioned ~10 cm above the samples.

Immunoprecipitation and PNGase F treatment
Denaturing immunoprecipitation after crosslinking, proteinase K digestion and glycanase digestion have been previously described 6 .SDS-denatured samples were diluted tenfold in ice-cold immunoprecipitation buffer (1×PBS, 250 mM NaCl, 0.5% TX-100, 10 mM imidazole) and mixed with 2.5 µl of anti-FLAG resin (Millipore, A2220), 2.5 µl of Monoclonal Anti-HA resin (Millipore, A2095), or 5 µl of protein A resin (Repligen, CA-HF-0100) along with the appropriate antibody.A total of 1.25 µg of GABRA1 antibody (Invitrogen, PA5-79291) was used per immunoprecipitation.The mixture was rotated end-over-end for 1.5 h (for FLAG or HA) or 3 h (for GABRA1 immunoprecipitation) at 4°C.Beads were washed twice with cold immunoprecipitation buffer and eluted by boiling in 10 µl of 2.5× SDS-PAGE sample buffer for 10 min.For deglycosylation experiments, crosslinked samples (Fig. 3) or total translation reactions (Extended Data Fig. 4) were split into two halves after denaturation with 0.5% SDS and 50 mM Tris-HCl, pH 8.One half was untreated and the other was adjusted to 1% NP-40, 1× GlycoBuffer 2 and 25 U ml -1 of PNGase F (NEB, P0704S) and digested at 32°C for 30 min.Both halves were subjected to immunoprecipitation as described above (Fig. 3) or analyzed directly (Extended Data Fig. 4).

Bioinformatic analysis of membrane proteins
All proteins containing TMDs were retrieved from the UniProt database 69 .The UniProt annotations were used to define the start and end of the TMD helices.Proteins containing a single TMD or multipass membrane proteins localized to mitochondria were manually removed from this set.The AlphaFold2 (ref.51) predicted structure, available from the UniProt database for each of the remaining 1,784 multipass membrane proteins, was inspected manually to annotate the number of TMD helices and the overall charge of each TMD-flanking side of the structure.The overall basic flank was designated cytosolic as per the positive-inside rule 52 .Then the C-terminal TMD was identified and assigned the appropriate orientation, hydrophobicity (as calculated using the ∆G app predictor 65 ) and flanking C-tail length.C-tails facing the cytosol were designated 'C cyt ' and those facing the opposite orientation were designated 'C exo '.The curated list is provided in Supplementary Table 1; information from this table was used to generate the plots in Fig. 6.

Quantification of C-tail translocation
Quantification was performed on raw phosphorimager files using Fiji.The pixel intensity and area of each band were measured, from which the background intensity was subtracted.For the C-terminal TMD reporters, C-tail translocation was calculated by dividing the value for the C-tail translocated band (typically the 2×-glycosylated Artic e https://doi.org/10.1038/s41594-023-01120-6 Extended Data Fig. 3 | Analysis of EMC3-substrate crosslinking in EMC3 mutants. 35S-methionine labeled 23L-SQS that contains a single cysteine within its C-tail was translated in the presence of SPCs derived from cells expressing FLAG tagged WT or each of two EMC3 mutants that impair its insertase function (F148L or R13E).Translation products were analyzed directly (-SMPH) or subject to SMPH crosslinking and FLAG denaturing IP (+SMPH, EMC3 denat.IP).Differentially glycosylated substrates and EMC3 crosslinks are indicated.Relative crosslinking was quantified by normalizing intensity of crosslinked products relative to totals.Note that earlier analysis has shown that these mutants do not affect either the expression level or assembly of EMC 6 .

Fig. 1 |
Fig.1| Three modes of terminal TMD insertion for multipass membrane proteins.TMDs located within ~50 amino acids of the stop codon of a multipass membrane protein will be partially or completely inside the ribosome exit tunnel at the time of translation termination.This means that they will necessarily be inserted by one of three post-translational mechanisms depending on the preceding membrane domain and intervening loop.a, When the penultimate TMD is followed by a long (more than 50 amino acids) translocated loop, its translocation will be Sec61-dependent8 .Hence, the loop will already be threaded through the Sec61 channel by the time of termination (left diagram).The terminal TMD inside the ribosome will then necessarily enter Sec61 (middle diagram), from where it presumably accesses the membrane through Sec61's lateral gate.b, When the final two TMDs are closely spaced, neither one can be inserted co-translationally because the tether downstream of the penultimate TMD is too short to engage Sec61 at the time of termination (left diagram).Therefore, both TMDs will be released and inserted post-translationally by an unknown mechanism.c, When the final TMD is located near a translocated C terminus, the TMD and tail are mostly or entirely inside the ribosome at the time of termination (left diagram).Their post-translational insertion mechanism is also unknown.Note that the machinery involved in the insertion and chaperoning of earlier TMDs is not shown for simplicity.

Fig. 2 |
Fig. 2 | EMC is required for C-terminal TMD insertion of GABRA1.a, Stable cell lines expressing the indicated inducible constructs were treated with nontargeting or EMC4-targeting siRNAs for 3 days, induced for 6 h and analyzed by flow cytometry for surface GABA A receptor levels using phycoerythrin (PE) labeled antibody (left), total levels of RFP-SQS (middle) or total levels of RFP-ASGR1 (right).SQS and ASGR1 levels are normalized to GFP, an internal expression control that is separated from the reporter by a ribosome-skipping viral P2A sequence.KD indicates knockdown.b, Total levels of GABRA1 and EMC4 were analyzed by blotting in cells treated with control or EMC4 siRNA.β-actin was used as a loading control.c, Topology diagram of human GABRA1; the predicted fragments resulting from proteinase K (PK) digestion are shown on top.Tail and loop lengths are indicated.NPF, N-terminal protected fragment; CPF, C-terminal protected fragment.The bottom panel shows the topology analysis of GABRA1 in the endoplasmic reticulum of wild-type (WT) and EMC6-knockout (∆E) 293 cells. 35S-methionine-labeled GABRA1 was translated in rabbit reticulocyte lysate in the absence (Ø) or presence of SPCs derived from WT or ∆E 293 cells.After translation, the SPCs were recovered by centrifugation and analyzed directly (−PK) by SDS-PAGE and autoradiography or subjected to PK digestion (+PK).Reactions lacking SPCs were analyzed similarly without centrifugation.Aliquots of both −PK and +PK samples were subjected to immunoprecipitation via an antibody against the C terminus of GABRA1 (antigen labeled in purple).The glycosylated (+glyc) and non-glycosylated (−glyc) products are indicated.NPF and CPF are indicated with green and purple arrowheads, respectively.d, 35 S-methionine-labeled SQS and GABRA1 each with a C-terminal glycosylation site were translated in the presence of SPCs derived from ∆EMC cells or cells stably overexpressing either wild-type EMC3-FLAG or insertase-deficient mutants of EMC3-FLAG variants (M cyt-1 -S and R31A).SS indicates a cleavable signal sequence.e, Quantification of three independent experiments as shown in d, with mean ± s.d.plotted.C-tail translocation was quantified by plotting per cent glycosylation (for SQS) or the per cent of glycosylated products that contain two glycans (For GABRA1).C-term, C-terminal.

Fig. 3 |
Fig. 3 | Substrate C-tail samples EMC before translocation.a, The 23L-SQS reporter (top) consists of a short translocated N-tail, a first TMD made of 23 leucine residues, an ~100 amino acid cytosolic loop, a second TMD and flanking sequences from SQS and a short translocated C-tail.Both terminal tails have glycosylation sites to monitor translocation.35S-methionine labeled 23L-SQS was translated in the presence of SPCs derived from ∆EMC cells or cells expressing variants of EMC3 (WT, M cyt-1 -S and R31A).Products with different glycosylation states are indicated.Quantification of three independent experiments (mean ± s.d.) is plotted.C-tail translocation is determined as the per cent of glycosylated products that contains two glycans.b,35  S-methionine-labeled 23L-SQS without or with a single cysteine within the C-tail, translated in the presence of SPCs derived from ∆EMC (∆E) cells or cells expressing EMC3-216C (in which a single cysteine is introduced into cysteine-free EMC3 at residue 216 located in its cytosolic vestibule)6 .One aliquot was analyzed directly (−BMH) and another was treated with BMH.Crosslinked samples were analyzed directly (+BMH) or after EMC3 denaturing immunoprecipitation via FLAG tag (+BMH EMC3 denat.IP).The positions of 23L-SQS with zero, one or two glycans, and crosslinks between 23L-SQS and EMC3 or lumenal proteins are indicated.Lanes containing EMC3 crosslinks were digested with PNGase F to confirm that the crosslinks contain either zero or one glycan and hence are not crosslinks to fully translocated double-glycosylated 23L-SQS.c,35  S-methionine-labeled 23L-SQS containing a single cysteine within the C-tail or the SQS TMD, translated in the presence of EMC3-216C SPCs and analyzed similarly to b. d,35  S-methioninelabeled 23L-SQS without or with a single Bpa incorporated within the C-tail (through an amber suppression system) was translated in the presence of SPCs derived from WT or ∆E cells.One aliquot was analyzed directly (−UV) and another was subjected to UV crosslinking.Crosslinked samples were analyzed after EMC3 denaturing immunoprecipitation via FLAG tag (+UV EMC3 denat.IP).Labeling is similar to b.

Fig. 4 |
Fig. 4 | Determinants of substrate C-tail translocation.a, 35 S-methionine-labeled 23L-SQS variants with the different C-tail lengths indicated (top) translated in the presence of SPCs derived from WT or ∆E cells.Where indicated, 2 µM the Sec61 inhibitor ApraA was included in the translation reaction.Translation reactions were analyzed directly, and substrates with zero, one or two glycans attached are indicated.The two-glycan product is indicative of successful C-tail translocation.C-tail translocation was quantified by calculating the percentage of all glycosylated products that contain two glycans.b, 35 S-methionine-labeled 23L-SQS variants with charged residue mutations in the C-tail flanking the TMD were analyzed as in a.The sequences of the variants are shown with the net C-tail flanking charge indicated.c, 35 S-methionine-labeled 23L-SQS variants with leucine mutations in the TMD were analyzed as in a.The sequences of the variants are shown along with the hydrophobicity of each TMD as a ∆G app score 65 , in which negative values indicate favored membrane insertion.

Fig. 5 |Fig. 6 |
Fig. 5 | Targeting to EMC facilitates C-terminal TMD insertion.The GABRA1 TMD4 or SQS TMD was preceded by either the 23L TMD or TMD1-3 of rhodopsin, separated by a 100 amino acid linker.These constructs were translated in the absence (Ø) or presence of SPCs from WT or ∆E cells and analyzed by SDS-PAGE and autoradiography.Substrates with zero, one or two glycans attached are labeled.Per cent C-tail translocation, calculated as in Fig. 4, is indicated.

Fig. 7 |
Fig. 7 | EMC facilitates C-terminal TMD insertion of distinct classes of substrates.a, Domain organization of a C-terminal TMD reporter, which contains 23L and test TMD (yellow), a cytosolic loop (~100 amino acids) and two glycosylation sites in both N-tail and C-tail (left); topology and domains of full-length SOAT1 (middle) and YIPF1 (right) containing C-terminal glycosylation sites.b, 35 S-methionine-labeled C-terminal TMD reporters with indicated TMDs, translated in the absence (Ø) or presence of SPCs from WT or ∆E cells.Translation reactions were analyzed directly by SDS-PAGE and autoradiography.The positions of translated reporters with zero, one or two glycans are indicated.In the absence of endoplasmic reticulum membranes, cytosolic quality control factors recognize non-translocated substrates and modify them with mono-or poly-ubiquitin (Ub (n) ) as indicated by the black dots.Note that S38A1 contains two C-tail glycosylation sites, resulting in a triply glycosylated product (blue arrow).C-tail translocation was quantified as previous figures.c, 35 S-methionine-labeled SOAT1 or an N-terminal domain deletion of SOAT1 (SOAT1∆N), translated in the absence (Ø) or presence of SPCs from WT or ∆E cells.One aliquot was subject to denaturing immunoprecipitation through an N-terminal HA tag (Totals); another was subjected to conconavalin A pulldown (ConA) to recover the glycosylated population.Non-glycosylated and glycosylated products are indicated, and percent glycosylation was calculated.In ConA recovered lanes, glycosylation was normalized to WT. d, 35 S-methionine-labeled YIPF1 was translated in the absence (Ø) or presence of SPCs from WT or ∆E cells and analyzed directly by SDS-PAGE and autoradiography.Substrate glycosylation state and per cent C-tail translocation are indicated.

Fig. 4 |
Glycosylation analysis of C-terminal TMD reporters.The diagram shows the domain organization of the C-terminal TMD reporters consisting of an N-terminal translocated tail, a TMD with 23 leucines, a cytosolic loop (~100aa) and a test TMD followed by a C-terminal translocated tail.If properly inserted, both N-and C-tails are glycosylated. 35S-methionine labeled reporters were translated in the absence or presence of WT SPCs.Translated products were either analyzed directly or subject to PNGase F digestion to definitively identify the glycosylated products as indicated.Ubiquitinated non-translocated products [Ub (n) ] are indicated in black dots.A triply glycosylated form of S38A1 due to a second consensus site in the C-tail is marked by a blue arrow.