Main

Eight months into the COVID-19 pandemic, no vaccines or antiviral drugs are available against the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of the pandemic, owing to a lack of knowledge about the detailed structures and functions of the essential virus proteins. The RNA genome of SARS-CoV-2 encodes three membrane proteins (Fig. 1a): the spike protein, which binds the cell-surface receptor to mediate virus entry; the membrane protein, which contributes to virus assembly and budding1; and the envelope protein E. E is a 75-residue viroporin (Fig. 1b) that forms a cation-selective channel across the endoplasmic reticulum–Golgi intermediate compartment (ERGIC) membrane2,3. In SARS-CoV-1, E mediates the budding and release of progeny viruses4 and activates the host inflammasome5. E’s channel activity is blocked by hexamethylene amiloride (HMA)6 and amantadine (AMT)7; the latter also inhibits the viroporins of influenza A virus and HIV-1 (refs. 8,9). E deletion gives rise to attenuated viruses in some coronaviruses10,11,12, whereas E mutations that abolish channel activity cause reduced virus pathogenicity12. Thus E is a potential antiviral drug target and vaccine candidate against SARS-CoV-2.

Fig. 1: Function, amino acid sequence and fingerprint NMR spectra of the SARS-CoV-2 E protein.

a, E forms a cation-selective ion channel and mediates SARS-CoV-2 budding and release from the host cell’s ERGIC lumen. b, Domain architecture of E and sequence alignment of E’s transmembrane segment among several human-infecting coronaviruses. Highly conserved polar residues are shown in red. CTD, C-terminal domain; NTD, N-terminal domain. c,d, 2D 15N-13Cα correlation spectrum (c) and 2D 13C-13C correlation spectrum (d) of ERGIC-membrane-bound ETM. The spectra, measured at ambient temperature, show high sensitivity and resolution, indicating that ETM is structurally homogeneous in lipid bilayers.

Despite its importance to SARS-CoV-2 pathogenesis, E’s high-resolution structure, particularly for the ion-conducting transmembrane (TM) domain (residues 8–38) (Fig. 1b)2,3, has been elusive. Sedimentation equilibrium and gel-electrophoresis data for the homologous SARS-CoV-1 E indicate that the TM domain assembles into a homopentamer in detergents such as sodium dodecyl sulfate (SDS) and perfluorooctanoic acid6,13,14. Although early X-ray scattering data have suggested a helical hairpin model for E15, subsequent solution NMR studies of E bound to several detergent micelles, including dodecylphosphocholine (DPC)10, SDS6 and lyso-myristoylphosphatidylglycerol (LMPG)16, consistently indicate a single-span TM helix. However, the pore-facing residues and the pentameric assembly are not well-established. Fourier-transform infrared dichroic data suggest that the ETM helix orientation in lipid bilayers may be sensitive to the presence or absence of charged residues at the two termini of the TM domain, and by inference, the membrane surface charge17,18.

Here, we use solid-state NMR to determine the structure of the SARS-CoV-2 ETM in phospholipid bilayers, thereby avoiding potential structural distortion caused by detergents. The structure sets the stage for the design of E inhibitors as antiviral drugs.

Results

Backbone conformation of ETM in lipid bilayers

We reconstituted ETM into an ERGIC-mimetic lipid bilayer containing phosphatidylcholine, phosphatidylethanolamine, phosphatidylinositol, phosphatidylserine and cholesterol. For comparison, we also incorporated the protein into a dimyristoylphosphocholine (DMPC): dimyristoylphosphoglycerol (DMPG) model membrane, abbreviated as DMPX below. ETM was expressed in Escherichia coli using a hexahistidine (His6)–small ubiquitin-like modifier (SUMO) fusion tag and purified first by nickel affinity column chromatography and then by reverse-phase HPLC after cleavage of the solubility tag (Extended Data Fig. 1).

One-dimensional (1D) 13C and 15N NMR spectra of the protein in ERGIC and DMPX membranes show temperature-insensitive high intensities (Extended Data Fig. 2a,b), indicating that the protein is immobilized in lipid bilayers at ambient temperature. Two-dimensional (2D) 15N-13C and 13C-13C correlation spectra show well-resolved peaks (Fig. 1c,d), with 13C and 15N linewidths of 0.5 ppm and 0.9 ppm, respectively, indicating that the protein conformation is highly homogeneous. We assigned the chemical shifts using three-dimensional (3D) correlation NMR experiments (Extended Data Fig. 3a). These chemical shifts indicate that residues 14–34 form the α-helical core of the TM domain (Extended Data Fig. 3b,c and Supplementary Table 1). Comparison of spectra between the two membranes and at different temperatures (Extended Data Fig. 2d–f) indicates that the N-terminal segment (residues Glu8–Ile13) is dynamic at high temperature but is mostly α-helical, whereas the C-terminal segment (residues Thr35–Arg38) is more rigid but displays temperature-dependent conformations. Acidic pH perturbed the chemical shifts of C-terminal residues Leu34 to Arg38 (Extended Data Fig. 4), supporting the conclusion that the C terminus is conformationally plastic.

Oligomeric structure and hydration of ETM

The overall temperature insensitivity of the protein spectra suggests that ETM is oligomerized in lipid bilayers. To determine the oligomeric structure, we prepared two mixed labeled samples to measure intermolecular contacts. An equimolar mixture of 13C-labeled protein and 4-19F-Phe-labeled protein (Extended Data Fig. 1e) was used to measure intermolecular 13C-19F distances using the rotational-echo double-resonance (REDOR) technique19 (Fig. 2a). ETM contains three regularly spaced Phe residues, Phe20, Phe23 and Phe26, at the center of the TM segment. 1D and 2D 13C NMR spectra were measured without and with 19F pulses. The resulting difference spectra show the signals of carbons that are in close proximity to a fluorinated Phe on a neighboring helix (Fig. 2b and Extended Data Fig. 5a–c). As expected, residues Val17 to Leu31 are affected by 4-19F-Phe, while residues Ile13 to Ser16 and Ala36 to Arg38 show no REDOR dephasing. Moreover, the three Phe residues display two resolved 19F chemical shifts with a roughly 2:1 intensity ratio, indicating that one of the residues has a distinct side chain conformation. A 2D 13C-19F correlation spectrum (Fig. 2c) shows a cross-peak between the −118 ppm 19F signal and Ala22 Cβ, indicating that this −118 ppm peak is due to either Phe20 or Phe23. The −113 ppm 19F peak shows strong cross-peaks with aromatic and numerous aliphatic 13C chemical shifts. Since Phe20 and Phe26 are too far away from each other to form intermolecular contacts, the −118 ppm 19F peak must be assigned to Phe20, while the −113 ppm peak must be assigned to Phe23 and Phe26. To constrain the interhelical packing at the two termini of the TM domain, we prepared a sample with mixed 13C and 15N labels and measured 2D NHHC correlation spectra to identify exclusively intermolecular 15N-13C correlations (Fig. 2d). These experiments together yielded 35 interhelical 13C-19F distance restraints and 52 interhelical 15N-13C correlations, which are crucial for determining the oligomeric structure of ETM.

Fig. 2: Measurement of interhelical distances and water accessibility of membrane-bound ETM.

a, Schematic of mixed [19F]ETM and [13C]ETM in a five-helix bundle. b, 2D 13Cα-F REDOR spectra of ERGIC-membrane-bound ETM. The control spectrum (S0, black) shows the signals of all residues, whereas the difference spectrum (ΔS, red) shows the signals of residues that are close to fluorinated Phe on a neighboring helix. c, 2D 13C-19F correlation spectrum allows assignment of the –118 ppm peak to Phe20 due to a cross-peak with Ala22, whereas the −113 ppm peak is assigned to Phe23 and Phe26 on the basis of correlations with Phe23 and Phe26 and with Val24 and Val25. A 1D 1H-19F cross-polarization (CP) spectrum is shown on the left. d, 2D NHHC correlation spectrum of mixed [13C]ETM and [15N]ETM, measured using 0.5 ms (red) and 1 ms (black) 1H mixing. All peaks arise from interhelical contacts. Selected assignments are given. e, Residue-specific water accessibilities of ERGIC-bound ETM obtained from the intensity ratios of water-edited spectra measured with 9 ms and 100 ms 1H mixing. Higher values (blue) indicate greater water accessibility. f, Water-edited intensities of ETM (black) and influenza BM2 (blue) obtained as the peak intensity ratios of the 9-ms and 100-ms spectra. Closed and open symbols indicate resolved and overlapped peaks, respectively. Error bars indicate the random uncertainty, which is propagated from the signal-to-noise ratios of the two spectra. ETM shows lower water-edited intensities than does BM2, indicating that the ETM pore is drier than the closed BM2 pore. g, Water- and lipid-edited 13C spectra of membrane-bound ETM. The Phe signals are high in the lipid-edited spectra but very low in the water-edited spectra, indicating that the three Phe residues are poorly hydrated and point to the lipids or the helix–helix interface.

To further constrain the architecture of ETM self-assembly, we measured residue-specific water accessibilities using water-edited 2D 15N-13C correlation experiments (Fig. 2e and Extended Data Fig. 5d)20,21. Water 1H magnetization transfer is highest to the N-terminal residues, lowest to the central residues Val17 to Ala32 and moderate to the C terminus (Fig. 2f). Thus, the hydration gradient of the protein is primarily along the bilayer normal. The preferential hydration of the N terminus is especially manifested by the high water-transferred intensity of Leu19 compared with that of Thr30, despite favorable chemical exchange to the Thr side chain22,23,24. For the dehydrated center of the TM domain, Leu28 and Val25 show higher hydration than do their neighboring residues, suggesting that these two residues face the pore. A complementary lipid-edited experiment (Fig. 2g) showed much higher intensities for the Phe side chain carbons than their corresponding water-transferred intensities, indicating that the Phe residues are largely lipid-facing. The ERGIC-bound ETM shows twofold lower water accessibility than that of the closed state of the influenza BM2 at the same pH25 (Fig. 2f).

Structure calculation of ETM in ERGIC membranes

We calculated the structure of ETM using the above 56 (ϕ, ψ) torsion angles, 87 interhelical distance restraints (Supplementary Tables 2 and 3) and 196 intrahelical 13C-13C contacts obtained from 250-ms 2D 13C spin diffusion spectra (Extended Data Fig. 6)26. Initial calculation using directionally ambiguous interhelical contacts where the observed helix is assumed to contact either of the two neighboring helices did not converge. Since previously reported micelle-bound ETM structures show substantial variations in pore residue identities and handedness of the helical bundle, we evaluated various pentamer packing models (Extended Data Fig. 7 and Supplementary Table 4) for their agreement with experimentally measured constraints, including the water and lipid accessibilities, interhelical Phe–Phe contact in the 13C-19F REDOR data and 13C secondary chemical shifts. A single pentamer model, characterized by having Asn15 and Val25 at similar pore-facing orientations and all three Phe residues facing lipids, was found to best describe the experimental data. This model was subsequently used to disambiguate the direction of interhelical contacts.

The lowest-energy structure ensemble, calculated using XPLOR-NIH (Supplementary Table 5 and Table 1), shows a long and tight five-helix bundle with a vertical length of ~35 Å for residues Val14–Leu34. The structure resolution is higher for the middle of the TM domain, where 13C-19F REDOR distance restraints are available, and lower for the two termini, where fewer distance restraints are available (Fig. 3a and Extended Data Fig. 8a,b). The side chain rotamers are not precisely defined, especially for side chains well away from the central three Phe residues (Extended Data Fig. 8b). The channel diameter, represented by backbone Cα–Cα distances between helices i and i + 2 for pore-facing residues, varies from 11 Å to 14 Å. The helix is tilted by a small angle of 5–10° from the bilayer normal (Fig. 3b), but the orientation is not uniform along the length of the peptide, because the helix is not ideal and exhibits a rotation angle change, or twist, between residues Phe20 and Phe23 (refs. 10,16). Consistent with the small tilt angle, the helical bundle does not display a strong handedness. The pore of the channel is occupied by predominantly hydrophobic residues, including Asn15, Leu18, Leu21, Val25, Leu28, Ala32 and Thr35 (Fig. 3b,c and Extended Data Fig. 8a,b), explaining the poor hydration of the protein. The N-terminal pore is constricted by Asn15, which forms interhelical side chain hydrogen bonds (Fig. 3g)27. The pore-facing positions of Asn15 and Val25 are consistent with single-channel conductance data showing that p.N15A and p.V25F abolish cation conductance3,7. The helix–helix interface is stabilized by aromatic stacking of Phe23 and Phe26 (Fig. 3e,g) and van der Waals packing among methyl-rich residues such as the Val29–Leu31–Ile33 triad (Fig. 3f). These extensive hydrophobic interactions give rise to a tighter helical bundle than do the viroporins influenza BM2 and HIV-1 Vpu (Extended Data Fig. 8d).
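As a rough geometric reading of the 11–14 Å interhelical distances, the following minimal sketch assumes an ideal fivefold-symmetric bundle in which equivalent Cα atoms lie on a circle around the pore axis. This symmetry is an idealization introduced here for illustration, and the function name is hypothetical. Under that assumption, helices i and i + 2 subtend a central angle of 144°, so the chord length is d = 2R sin(72°).

```python
import math

def calpha_radius_from_i_i2_distance(d_angstrom: float, n_helices: int = 5) -> float:
    """Radius of the circle through equivalent Calpha atoms in an ideal
    n-fold symmetric bundle, given the distance between helices i and i+2.

    Assumes perfect rotational symmetry, which a real ensemble need not obey.
    """
    # Helices i and i+2 subtend a central angle of 2 * (2*pi/n);
    # the chord is d = 2 * R * sin(central_angle / 2) = 2 * R * sin(2*pi/n).
    half_central_angle = 2.0 * math.pi / n_helices
    return d_angstrom / (2.0 * math.sin(half_central_angle))

for d in (11.0, 14.0):
    r = calpha_radius_from_i_i2_distance(d)
    print(f"d(i, i+2) = {d:.0f} A -> Calpha radial distance = {r:.2f} A")
```

These values are radial positions of backbone atoms, not the open pore radius, since side chains project into the lumen.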

Table 1 NMR and refinement statistics
Fig. 3: Structure of SARS-CoV-2 envelope protein’s transmembrane domain in ERGIC-mimetic lipid bilayers.

a, Ensemble of ten lowest-energy structures. b, Side view of the most representative structure together with the HOLE-calculated pore water (gray). Pore-lining residues are shown as sticks. c, Simplified two-helix view with the pore-facing residues and their distributions in the lowest-energy ensemble. d, Pore radius of ETM obtained from the HOLE program. e–h, Additional snapshots from the most representative structure. e, Lipid-facing and helix-interface positions of the three Phe residues. f, Two clusters of methyl-interdigitating Leu, Ile and Val residues, stabilizing the helix–helix interface. g, Top views of the N-terminal Glu8, the pore-facing Asn15, and the three Phe residues. h, Surface plots of the pentamer, showing the N- and C-terminal vestibules where Asn15 and Leu28 are the first pore-facing residues.

ETM interactions with hexamethylene amiloride and amantadine

To investigate how ETM interacts with drugs, we measured the chemical shifts of the protein in the presence of HMA and [3-19F]amantadine. At a drug:protein molar ratio of 4:1, HMA caused significant chemical shift perturbations (CSPs) to N-terminal residues, including Thr9, Gly10, Thr11, Ile13 and Ser16, followed by more modest CSPs for the C-terminal Ala36 and Leu37 (Fig. 4a–c). This trend is consistent with the micelle data10,16, but the CSPs in lipid bilayers are much larger, with the N-terminal 9TGT11 triplet giving per-residue CSPs of 0.35–0.70 ppm. Moreover, the CSPs in lipid bilayers were measured under only fourfold drug excess, while in micelles, the smaller CSPs were measured under higher drug excesses of 10- to 31-fold10,16.

Fig. 4: Effects of HMA and AMT binding to ETM in DMPC: DMPG membranes.

a, 2D 15N-13Cα correlation spectra of HMA-free (apo, black) and HMA-bound ETM (orange), showing chemical-shift perturbations (CSPs) by HMA. b, 2D 15N-13Cα correlation spectra of the apo (black) and HMA-bound ETM (orange). c, Residue-specific CSPs induced by HMA and AMT. N-terminal residues are the most perturbed by the drugs, and HMA causes greater perturbation than AMT. Dashed lines indicate the average CSPs. d, A representative docking pose of HMA. The drug lies in the N-terminal vestibule, with the guanidinium group interacting with polar residues such as Thr11.

The higher sensitivity of ETM to HMA in lipid bilayers strongly suggests that the bilayer-bound protein conformation is more native. A docking pose based on these CSPs found that HMA intercalates shallowly into the N-terminal lumen with a distribution of orientations (Fig. 4d and Extended Data Fig. 9), suggesting a dynamic binding mode wherein HMA exchanges between multiple helices and inhibits cation conduction by steric occlusion of the pore. Within the ensemble of docked structures, more HMA molecules point the guanidinium into the pore and the hexamethylene ring towards the lipid headgroups than in the reverse orientation. AMT caused smaller CSPs than HMA (Fig. 4c and Extended Data Fig. 10a,b), but the binding site remains at the N terminus. Using the 3-19F label on adamantane, we measured protein-drug proximities using 13C-19F REDOR. The spectra showed modest dephasing for the N-terminal Asn15 and C-terminal Ile33 (Extended Data Fig. 10c–e), in qualitative agreement with the observed CSPs. The CSPs of HMA are larger than those of AMT and are consistent with the stronger affinity of HMA6 than AMT7 for SARS-CoV E, as well as with the micromolar half-maximal effective concentration (EC50) reported for HMA against other human coronavirus E proteins28.

Discussion

The current lipid-bilayer-based structural model of SARS-CoV-2 ETM has similarities with, but also considerable differences from, micelle-derived structural models (PDB 5X29)16. In LMPG micelles, the TM domain of a longer E construct (residues 8–65) also displays a kinked helix and a disordered N terminus, but the helical bundle is right-handed16, and the helices are more tilted and loosely packed (Extended Data Fig. 8c). In comparison, the bilayer-based ETM structural model does not have a strong handedness, consistent with the small helical tilt angle, and both reflect the measured interhelical distance restraints (Supplementary Tables 2 and 3). The heavy-atom r.m.s. deviation (r.m.s.d.) for residues 14–34 between the two structural models is 6.1 Å, and the positions of various important residues differ. For example, in the LMPG-derived structural model, Phe26 is pore-facing and Thr30 is interhelical16, but in the bilayer-derived structural model, both residues point to lipids. The lipid-facing position of Thr30 in the current model is supported by single-channel conductance data showing that mutations of residues such as Thr30 and Thr11 to Ala do not affect the channel activity3. Another structural model of ETM determined in DPC micelles10 showed a left-handed and coiled helical bundle that differs qualitatively from the LMPG-bound model. These structural differences likely result from a combination of insufficient experimental restraints and an inherent conformational plasticity of the ETM. The LMPG-based structural model was obtained from ten unambiguous interhelical distances but no orientational restraints16, whereas the DPC-based structural model was built with orientational restraints but no unambiguous interhelical distance restraints10. For comparison, the current bilayer-derived ETM structural model was calculated from 87 interhelical distance restraints (Table 1).

Apart from experimental limitations, ETM’s oligomeric structure may be intrinsically sensitive to the membrane environment29 because the highly hydrophobic nature of the long central portion of the TM segment makes interhelical interactions non-specific. Indeed, SARS-CoV viruses with a p.V25F mutation develop escape mutants p.L27S, p.L19A, p.T30I and p.L37R in mice, implying that E’s channel activity is restored by these compensatory double mutations12. We speculate this could result from moderate changes of the helix rotation angle to give rise to alternate packing of the helical bundle. Future studies of E mutants are required to elucidate the structural basis for the loss and restoration of ion-channel activity.

How does the SARS-CoV-2 ETM structure compare with the structures of equivalent viroporins of influenza and HIV-1 viruses in lipid bilayers? The ETM helical bundle is compact and rigid, while AM2 and BM2’s TM domains, which have a higher percentage of polar residues such as His and Ser, form wider and more hydrated pores (Extended Data Fig. 8d)9,25. The HIV-1 Vpu TM domain has a high percentage of hydrophobic residues, similarly to SARS-CoV-2 E, but forms a shorter (~20 Å vertical length) pentameric helical bundle with more tilted helices (~20°)30,31. The ETM helical bundle is more immobilized than M2 and Vpu helical bundles32, and does not undergo rigid-body fast uniaxial rotation at high temperatures in DMPX membranes (Extended Data Fig. 2). This immobilization suggests that ETM may interact extensively with lipids3. Finally, the helix distortion at residues Phe20–Phe23 may cause the two halves of the protein to respond semi-independently to environmental factors such as pH, charge, membrane composition and other viral and host proteins.

Which structural features of this ETM helical bundle might be responsible for cation conduction? We hypothesize that the N terminus, which contains a (E/D/R)8X(G/A/V)10 XXhh(N/Q)15 motif (Fig. 1b), where h is a hydrophobic residue, contains the cation selectivity filter. In this conserved motif, the most exposed residue, Glu8, belongs to a dynamic N terminus whose residues (for example Thr9 and Gly10) manifest intensities only at high temperature (Extended Data Fig. 2d–f). The Glu8 side chain carboxyl is deprotonated at neutral pH and protonated at acidic pH, as manifested by 13C chemical shifts (Extended Data Fig. 2c). We speculate that the protonation equilibria of this loose ring of Glu8 quintet, together with the anionic lipids in the ERGIC membrane, may regulate the ion selectivity of ETM at the channel entrance. Such a ring of negatively charged Glu residues has been observed as selectivity filters in the hexameric Ca2+-selective Orai channels33 and designed K+ channels34. The third residue of the motif (G/A/V) is conserved among coronaviruses to be small and flexible (Fig. 1b), which might permit N-terminus motion and/or prevent occlusion of the channel lumen. The last residue of the motif is conserved to be either Asn or Gln, whose polar side chains can coordinate ions and participate in interhelical hydrogen bonds to stabilize the channel27. At the C-terminal end of the TM segment, the conserved small residues Ala32 and Thr35 provide an open cavity for ions. In contrast to these small polar residues, the central portion of the TM domain contains four layers of hydrophobic residues, Leu18, Leu21, Val25 and Leu28, which narrow the pore radius to ~2 Å (Fig. 3d). This narrow pore can permit only a single file of water molecules, thus partially dehydrating any ions that move through the pore. Therefore, the structure determined here may represent the closed state of SARS-CoV-2 E, while the open state might have a larger and more hydrated pore. Narrow pores with multiple hydrophobic layers have also been observed in larger ion channels, including the tetrameric K+ channel TMEM175 (ref. 35) and the pentameric bestrophin channels36,37. Thus, it is possible to achieve charge stabilization and ion selectivity in such a hydrophobic environment, although the detailed mechanisms remain to be understood.

The present membrane-bound ETM structure suggests that small-molecule drugs should have high-affinity binding to both the acidic Glu8 and the polar Asn15 in order to occlude the N-terminal entrance of the protein. The membrane topology of SARS-CoV-2 E is now recognized to be Nlumen–Ccyto on the basis of antibody-detected selective permeabilization assays38 and glycosylation data39. This orientation may prime the protein to conduct Ca2+ out of the ERGIC lumen to activate the host inflammasome5. Thus, small-molecule drugs should ideally be targeted and delivered to the Golgi and ERGIC of host cells to maximally inhibit SARS-CoV-2 E’s channel activity40.

Methods

Cloning of recombinant ETM(8–38)

The gene encoding full-length SARS-CoV-2 E protein (NCBI reference sequence YP_009724392.1, residues 1–75) was purchased from Genewiz. The gene encoding the TM domain (residues 8–38, ETGTLIVNSVLLFLAFVVFLLVTLAILTALR) was isolated using PCR and cloned into a Champion pET-SUMO plasmid (Invitrogen). The plasmid was transformed into E. coli BL21 (DE3) cells (Invitrogen) to express the SUMO–ETM fusion protein containing an N-terminal His6 tag (Extended Data Fig. 1a). The construct’s DNA sequence was verified by Sanger sequencing (Genewiz).

Expression and purification of [13C,15N]ETM

A glycerol cell swab stored at –70 °C was used to start a 10-ml LB culture containing 50 μg ml–1 kanamycin. The starter culture was used to inoculate 2 l of LB medium. Cells were grown at 37 °C until an optical density at 600 nm (OD600) of 0.6–0.8 was reached, and were collected by centrifugation for 10 min at 20 °C and 4,400g. These LB cells were resuspended in 1 l of M9 medium (pH 7.8, 48 mM Na2HPO4, 22 mM KH2PO4, 8.6 mM NaCl, 4 mM MgSO4, 0.2 mM CaCl2, 50 mg kanamycin) containing 1 g l–1 [15N]NH4Cl. The cells were incubated in M9 medium for 30 min at 18 °C, then 1 g l–1 [U-13C]glucose dissolved in 5 ml sterile H2O and 3 ml 100× MEM vitamins were added. The cells were grown for another 30 min, then protein expression was induced by addition of 0.4 mM IPTG along with 2 g l–1 [U-13C]glucose in 10 ml sterile H2O. Additional IPTG was added after 1 h to bring the final concentration to 0.8 mM. Protein expression proceeded overnight for 16 h at 18 °C, reaching an OD600 of 2.5.

The cells were spun down at 4 °C and 5,000 r.p.m. for 10 min and resuspended in 35 ml lysis buffer I (pH 8.0, 50 mM Tris-HCl, 100 mM NaCl, 1.0% Triton X-100, 0.5 mg ml–1 lysozyme, 10 μl benzonase nuclease, 1 mM Mg2+, 10 mM imidazole). Cells were lysed at 4 °C by sonication (5 s on and 5 s off) for 1 h using a probe sonicator. The soluble fraction of the cell lysate was separated from the inclusion bodies by centrifugation at 17,000g for 20 min at 4 °C. The supernatant was loaded onto a gravity-flow chromatography column containing ~6 ml nickel affinity resin (Profinity IMAC, BioRad) that was pre-equilibrated with lysis buffer I. The protein was bound to the resin for 1 h by gentle rocking at 4 °C. The column was washed with 50 ml of wash buffer I (pH 8.0, 50 mM Tris-HCl, 100 mM NaCl, 0.1% DDM, 30 mM imidazole). SUMO–ETM was eluted with 10–15 ml elution buffer (pH 8.0, 50 mM Tris-HCl, 100 mM NaCl, 0.1% DDM, 250 mM imidazole) (Extended Data Fig. 1b). The eluted protein was diluted to one-third of the original concentration by adding twice the elution volume of dilution buffer (pH 8.0, 50 mM Tris-HCl, 100 mM NaCl, 0.1% DDM) to reduce the imidazole concentration before protease cleavage. Approximately 20% of the protein was found in the insoluble membrane and inclusion body fraction. To purify this fraction, the pelleted mass was resuspended in lysis buffer II (lysis buffer I with added 6 M urea) and rocked gently at 4 °C overnight. Soluble protein was isolated by centrifugation at 17,000g for 20 min at 4 °C. Nickel affinity column chromatography proceeded as described above for the soluble fraction, except that wash buffer II (wash buffer I with added 3 M urea) was used in place of wash buffer I.

The purified SUMO–ETM from both the soluble and inclusion body fractions was cleaved by adding 1:10 (wt/wt) SUMO protease:SUMO–ETM and 5 mM TCEP for 2 h at room temperature with gentle rocking. The cleavage efficiency was assessed by analytical HPLC to be ~75%. ETM was purified using preparative RP–HPLC on a Varian ProStar 210 System using an Agilent C3 column (5-μm particle size, 21.2 mm × 150 mm). The protein was eluted using a linear gradient of 5–99% (9:1, acetonitrile:isopropanol):water containing 0.1% trifluoroacetic acid over 35 min at a flow rate of 10 ml min–1 (Extended Data Fig. 1c). The purified protein was dried down to a film with a stream of nitrogen gas and placed under vacuum overnight. The protein film was stored at −20 °C. The yield of the purified protein was 10 mg l–1 of M9 medium. Labeling efficiency was ~94%, as estimated by MALDI mass spectrometry (Extended Data Fig. 1d). [U-13C]ETM and [U-15N]ETM were expressed and purified using the same protocol but substituting [15N]NH4Cl or [13C]glucose with unlabeled reagents.

Expression of 4-19F-Phe fluorinated ETM

A glycerol cell swab was used to start a 10 ml LB culture containing 50 μg ml–1 kanamycin. The starter culture was then used to inoculate 2 l of M9 medium (pH 7.8, 48 mM Na2HPO4, 22 mM KH2PO4, 8.6 mM NaCl, 4 mM MgSO4, 0.2 mM CaCl2, 50 mg kanamycin) containing 3 g l–1 unlabeled glucose and 1 g l–1 unlabeled NH4Cl. The cells were grown in M9 medium at 37 °C for 8 h until an OD600 of 0.5 was reached. The cells were collected by centrifugation at 4,400g for 10 min at 20 °C, then concentrated into a fresh 1-l M9 culture and incubated at 30 °C for 60 min. Subsequently, 1.5 g l–1 glyphosate was added to halt the pentose phosphate pathway41, followed by addition of 115 mg l-Trp, 115 mg l-Tyr and 400 mg of 4-19F-l-Phe to the culture. After 30 min, IPTG was added to a final concentration of 0.4 mM, and protein expression proceeded at 30 °C for 5.5 h. The cells were collected by centrifugation at 4,400g for 10 min at 4 °C. The pellet was stored at −70 °C until purification. Cell lysis and protein purification followed the same protocol, except that the ETM peak during preparative HPLC was collected in two fractions of ~1 min each. Fluorine incorporation in the two fractions was measured using MALDI mass spectrometry. The first fraction had a higher incorporation level, with 83% of the protein having all three Phe residues labeled with 19F, indicating a per-residue labeling efficiency of 94% (Extended Data Fig. 1e). Only this fraction was used to prepare the mixed 13C- and 19F-labeled protein for distance measurement. The final yield of the Phe fluorinated ETM expression was 1.5 mg l–1 of M9 medium. When the protocol was originally tested using 100 mg l–1 4-19F-Phe, 1.0 g l–1 glyphosate, 6 g l–1 unlabeled glucose and with expression at 18 °C for 5.5 h, a much lower per-residue labeling efficiency of ~35% was obtained.
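The relationship between the 83% fully labeled fraction and the 94% per-residue efficiency can be checked with a short sketch. It assumes that the three Phe sites are labeled independently and with equal probability p, so that the fully labeled fraction is p^3; this independence assumption and the function name are introduced here for illustration.

```python
def per_residue_efficiency(fully_labeled_fraction: float, n_sites: int) -> float:
    """Per-site incorporation probability p, assuming independent and equal
    labeling at each site so that fully_labeled_fraction == p ** n_sites."""
    return fully_labeled_fraction ** (1.0 / n_sites)

# 83% of the protein carries 19F at all three Phe residues (Phe20, Phe23, Phe26).
p = per_residue_efficiency(0.83, 3)
print(f"per-residue labeling efficiency = {p:.2f}")
```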

Membrane sample preparation

Eight protein samples were prepared for this study. Five membrane samples contained [13C,15N]ETM and one contained [13C]ETM. Another sample contained a 1:1 mixture of 13C-labeled protein:15N-labeled protein. The last sample contained a 1:1 mixture of 13C-labeled protein:4-19F-Phe-labeled protein. Six of the 8 samples were prepared in a pH 7.5 Tris buffer (20 mM Tris-HCl, 5 mM NaCl, 2 mM EDTA and 0.2 mM NaN3). One sample was prepared in a pH 5 citrate buffer with calcium (20 mM citrate, 5 mM CaCl2 and 0.2 mM NaN3), while the final sample was prepared in the same pH 5 citrate buffer without calcium chloride. Further details about membrane sample preparation and 3-19F-amantadine synthesis are given in Supplementary Note 1.

Solid-state NMR experiments

Most solid-state NMR spectra were measured on a Bruker AVANCE NEO 900 MHz (21.1 T) spectrometer and an Avance II 800 MHz (18.8 T) spectrometer using 3.2 mm HCN probes. 13C-19F REDOR experiments were conducted on an Avance III HD 600 MHz (14.1 T) spectrometer using a 1.9 mm HFX probe. Magic-angle-spinning (MAS) frequencies were 11.8 kHz for 900-MHz experiments and 14 kHz for the 800- and 600-MHz experiments. Radiofrequency (RF) field strengths on the 3.2-mm probes were 50–91 kHz for 1H, 50–63 kHz for 13C and 33–42 kHz for 15N. RF field strengths on the 1.9-mm MAS probe were 83–130 kHz for 1H, 62.5 kHz for 13C and 71 kHz for 19F. Sample temperatures are direct readings from the probe thermocouple, whereas actual sample temperatures are 5–15 K higher at the MAS frequencies employed. 13C chemical shifts are reported on the tetramethylsilane scale using the adamantane CH2 chemical shift at 38.48 ppm as an external standard. 15N chemical shifts are reported on the liquid ammonia scale using the N-acetylvaline peak at 122.00 ppm as an external standard.

2D 13C-13C correlation experiments were conducted using combined-driven (CORD) mixing42 for 13C spin diffusion. 2D and 3D 15N-13C correlation spectra, namely, NCACX, NCOCX and CONCA43, were measured on the 900-MHz spectrometer. These experiments used spectrally induced filtering in combination with cross-polarization (SPECIFIC-CP)44 for heteronuclear polarization transfer. Water-edited 2D 15N-13Cα correlation spectra were measured under 11.8-kHz MAS20,21 using 1H mixing times of 9 ms and 100 ms. 2D 15N-13C correlation spectra were measured using an out-and-back transferred-echo double resonance (TEDOR) pulse sequence on the 800-MHz spectrometer45. Intermolecular 2D NHHC correlation spectra46 were measured using 0.5 ms and 1 ms 1H-1H mixing. 1D and 2D 13C-19F REDOR experiments19,47,48 were used to measure distances between 4-19F-Phe-labeled and 13C-labeled ETM, and between 13C-labeled ETM and 3-19F-AMT. Detailed parameters for the solid-state NMR experiments are given in Supplementary Table 6. Details for the 13C-19F REDOR simulations and fitting are given in Supplementary Notes.
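For orientation on how REDOR dephasing relates to distance, the sketch below evaluates the standard two-spin 13C-19F dipolar coupling constant, which scales as r^-3. It is not the simulation or fitting procedure used in this work (that is described in the Supplementary Notes); the gyromagnetic ratios are approximate literature values and the function name is hypothetical.

```python
import math

# Physical constants (SI units); gyromagnetic ratios are approximate literature values.
MU0_OVER_4PI = 1.0e-7        # T m A^-1
HBAR = 1.054571817e-34       # J s
GAMMA_13C = 6.728284e7       # rad s^-1 T^-1
GAMMA_19F = 2.518148e8       # rad s^-1 T^-1

def c_f_dipolar_coupling_hz(r_angstrom: float) -> float:
    """Two-spin 13C-19F dipolar coupling constant (Hz) at distance r (Angstrom):
    d = (mu0 / 4 pi) * gamma_C * gamma_F * hbar / (2 pi r^3)."""
    r_m = r_angstrom * 1.0e-10
    return MU0_OVER_4PI * GAMMA_13C * GAMMA_19F * HBAR / (2.0 * math.pi * r_m ** 3)

for r in (5.0, 8.0, 10.0):
    print(f"r = {r:4.1f} A -> coupling = {c_f_dipolar_coupling_hz(r):6.1f} Hz")
```

The steep r^-3 falloff is why couplings of only tens of hertz, and hence long mixing times, are involved for distances near 10 Å.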

NMR spectral analysis

NMR spectra were processed in the TopSpin software and chemical shifts were assigned in Sparky49. TALOS-N50 was used to calculate torsion angles (ϕ, ψ) after converting the 13C chemical shifts to the DSS scale. Residue-specific chemical shift differences (Δδ) between drug-bound and apo samples were calculated from the measured 13C and 15N chemical shifts (δ) according to:

$$\Delta\delta = \sqrt{\sum_{C_i}\left(\delta_{C_i}^{\rm drug} - \delta_{C_i}^{\rm apo}\right)^2 + \frac{\left(\delta_N^{\rm drug} - \delta_N^{\rm apo}\right)^2}{2.5}} \tag{1}$$
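
As an illustration of equation (1), the following minimal Python sketch computes Δδ for one residue. The function name, the dictionary layout and the example shift values are hypothetical and are not taken from the authors' analysis; only the formula itself comes from the text.

```python
import math

def shift_perturbation(drug, apo, n_key="N", n_scale=2.5):
    """Equation (1): sqrt(sum over 13C sites of (d_drug - d_apo)^2 + (dN)^2 / 2.5).

    `drug` and `apo` map atom names to chemical shifts in ppm.
    Only atoms assigned in both samples contribute.
    """
    shared = set(drug) & set(apo)
    carbon_term = sum((drug[k] - apo[k]) ** 2 for k in shared if k != n_key)
    nitrogen_term = 0.0
    if n_key in shared:
        nitrogen_term = (drug[n_key] - apo[n_key]) ** 2 / n_scale
    return math.sqrt(carbon_term + nitrogen_term)

# Hypothetical shifts (ppm) for a single residue; illustrative values only.
apo = {"CA": 55.0, "CB": 40.0, "N": 120.0}
drug = {"CA": 55.3, "CB": 39.6, "N": 121.0}
delta = shift_perturbation(drug, apo)
```

Dividing the squared 15N difference by 2.5 down-weights the 15N contribution relative to the 13C sites, as written in equation (1).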

2D heatmaps of normalized water-edited 2D NCA spectra were generated using an in-house Python script that removes spectral noise while calculating intensity ratios. The intensities of the 9 ms and 100 ms spin diffusion spectra of the ERGIC-bound ETM were read using the NMRglue package51. Spectral intensity was noise-filtered by setting signals lower than 3.5 times the average noise level in an empty region of the 2D spectrum to zero for the S spectrum and to a large number for the S0 spectrum24,25. The intensities were divided and scaled by the number of scans to obtain a 2D contour map that reflects the peak intensity ratios between the 9-ms and 100-ms spectra.
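
The noise filtering and ratio calculation described above can be sketched as follows. This is not the in-house script: it works on plain nested lists instead of spectra read with NMRglue, the function and argument names are hypothetical, and it assumes that "scaled by the number of scans" means dividing each spectrum by its own scan count before taking the ratio.

```python
def water_edited_ratio(s, s0, noise_s, noise_s0, scans_s, scans_s0,
                       threshold=3.5, large=1e12):
    """Point-by-point S/S0 ratio of two 2D intensity grids (lists of rows).

    Points below threshold * noise are set to zero in S and to a large
    number in S0, so the ratio vanishes wherever either spectrum is noise.
    """
    ratio = []
    for row_s, row_s0 in zip(s, s0):
        out = []
        for a, b in zip(row_s, row_s0):
            a = a if a >= threshold * noise_s else 0.0
            b = b if b >= threshold * noise_s0 else large
            # Assumed normalization: intensity per scan for each spectrum.
            out.append((a / scans_s) / (b / scans_s0))
        ratio.append(out)
    return ratio

# Hypothetical 2 x 2 intensity grids; values are illustrative only.
s = [[10.0, 1.0], [40.0, 2.0]]     # short mixing (S)
s0 = [[20.0, 30.0], [1.0, 50.0]]   # long mixing (S0)
ratio = water_edited_ratio(s, s0, noise_s=1.0, noise_s0=1.0,
                           scans_s=2, scans_s0=1)
```

Replacing noise in S0 with a large number, rather than zero, avoids division by zero and drives the ratio to effectively zero at those points.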

The water accessibility data for the high-pH influenza BM2 proton channel (Fig. 2f) were originally measured in 2D 13C-13C correlation spectra with 4 ms (S) and 100 ms (S0) 1H spin diffusion25. To allow comparison with the ETM spectra measured at 9 ms and 100 ms mixing, we scaled the BM2 S (4 ms)/S0 (100 ms) ratios by the integrated aliphatic intensity ratio of 1.976 between the 1D BM2 water-edited spectra measured with 9 ms and 4 ms of mixing. This scaling factor was verified to be accurate for two resolved sites, Thr24 and Gly26, in the 1D 13C spectra.
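
The rescaling is a single multiplication, shown here as a minimal sketch. Only the factor 1.976 comes from the text; the function name and the input ratio are hypothetical.

```python
# Integrated aliphatic intensity ratio between the 9 ms and 4 ms
# 1D water-edited BM2 spectra, as stated in the text.
BM2_SCALE_9MS_OVER_4MS = 1.976

def scale_bm2_ratio(ratio_4ms):
    """Estimate the S(9 ms)/S0(100 ms) ratio from a measured S(4 ms)/S0(100 ms) ratio."""
    return ratio_4ms * BM2_SCALE_9MS_OVER_4MS

scaled = scale_bm2_ratio(0.25)  # hypothetical input ratio
```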

XPLOR-NIH structure calculations and analysis

Initial structure calculation using ambiguous interhelical restraints, where each helix can contact both neighboring helices, did not converge. Thus, we generated parallel pentameric models to specify the direction of 13C-19F and NHHC distance restraints where possible. The models take into account the water- and lipid-edited spectra to qualitatively identify the pore- versus lipid-facing orientations of the residues. The best-case ideal helix model (Extended Data Fig. 7a), with 3.5 residues per helical turn, places Asn15 at the pore-facing d position and Phe20 at the lipid-facing b position, in agreement with the water-edited spectra. However, the model conflicts substantially with other data. For example, Thr35(c) (Thr35 at position c) and Leu31(f) are lipid-facing in this model, which contradicts the water-edited spectra; Val29(d) and Phe26(a) are pore-facing, which contradicts the water- and lipid-edited spectra. The arc of Phe20(b), Phe23(e) and Phe26(a) on the helical wheel makes it unlikely to establish interhelical Phe–Phe contacts, thus contradicting the 13C-19F distance data.
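
The helical-wheel positions quoted for the ideal-helix model follow from simple heptad arithmetic: with 3.5 residues per turn, the wheel position repeats every seven residues. The sketch below is illustrative only; the function name is hypothetical, and the only input taken from the text is that Asn15 sits at position d.

```python
HEPTAD = "abcdefg"

def heptad_position(residue, ref_residue=15, ref_position="d"):
    """Wheel position of `residue` on an ideal helix with 3.5 residues per turn,
    given that `ref_residue` occupies `ref_position` (Asn15 at d in the text)."""
    offset = HEPTAD.index(ref_position)
    return HEPTAD[(residue - ref_residue + offset) % 7]

residues = {"Asn15": 15, "Phe20": 20, "Phe23": 23, "Phe26": 26,
            "Val29": 29, "Leu31": 31, "Thr35": 35}
positions = {name: heptad_position(num) for name, num in residues.items()}
```

This arithmetic applies only to the ideal-helix model; it does not describe models in which the helix deviates from 3.5 residues per turn.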

Since the ideal-helix geometry cannot agree with all experimental data, we sought better models by including slight deviations from an ideal helix. We turned to the measured chemical shifts to determine where such a deviation is most likely to occur. The Cβ chemical shift of Leu21 is 1.4 ppm downfield from the average of all other helical Leu residues (Extended Data Fig. 3b), suggesting that the helix is disordered between Phe20 and Phe23. Indeed, such a disorder was already noted in previous solution NMR data10. We generated four alternative pentamer models with varying positions and degrees of helix disorder (Extended Data Fig. 7b-e and Supplementary Table 4). Only one model (model 5), generated by a small rotation angle advance of ~50° at Phe23, adequately reproduces all key features of the experimental data. This model places Asn15(d) and Val25(d) at the same pore-facing position and the three aromatic residues at the arc of Phe20(c), Phe23(f) and Phe26(b). This model was then used to disambiguate the NHHC and 13C-19F distance restraints (Supplementary Tables 2 and 3) by considering mainly residues that are fewer than four residues away in the primary sequence and that are in close proximity between two helical wheels. With this approach, 42 of the 87 interhelical restraints were set to be unambiguous. In principle, the handedness of the helical bundle can be determined from the registry of interhelical contacts if the positions of interfacial residues are known. However, remaining 13C and 15N chemical-shift overlap among the many hydrophobic residues precluded unequivocal determination of the handedness of the helical bundle. Orthogonal experimental constraints, such as backbone N-H bond orientations, which would directly probe the helix tilt angle, will be needed to obtain a higher-resolution structure.

As has been previously described25, the ETM structure was calculated using XPLOR-NIH26 hosted on the NMRbox52. The calculation contained two stages. In the first stage, five extended ETM monomers were placed in a parallel pentamer geometry with each monomer located 20 Å from the center of the pentamer. A total of 120 independent simulated annealing runs were performed with 5,000 steps of torsion angle dynamics at 5,000 K, followed by annealing to 20 K in decrements of 20 K with 100 steps at each temperature. After the annealing, energy minimizations in torsion angle and Cartesian coordinates were carried out. The five monomers were restrained to be identical in the annealing step using the non-crystallographic symmetry term PosDiffPot and the translational symmetry term DistSymmPot. Chemical-shift-derived torsion angles (ϕ, ψ) predicted by TALOS-N were implemented with the dihedral-angle restraint term CDIH, with ranges set to the higher value between twice the TALOS-N predicted uncertainty and 20°. Measured interhelical distance restraints were implemented using the NOE potential. Distance upper limits were set to 9.0 Å and 11.5 Å for 500 μs and 1,000 μs of 1H-1H mixing for the NHHC constraints. Negative REDOR contacts, that is, 13C sites without dephasing, were implemented as two NOEs, one to each neighboring helix. Implicit hydrogen bonds using the hydrogen-bonding database potential term HBDB were implemented during annealing to favor the formation of the α-helical conformation. Finally, standard XPLOR potentials were used to restrain the torsion angles using a structural database with the term TorsionDB, and standard bond angles and lengths were set with terms BOND, ANGL, IMPR and RepelPot. The structures were sorted by energy, using all the potentials in the calculation. The scales for all potentials are given in Supplementary Table 5.
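
Some of the numerical settings of the first stage can be written out as a short sketch. This is plain Python and not XPLOR-NIH input: the helper names are hypothetical, and the schedule assumes that the starting temperature counts as the first temperature step.

```python
def cooling_schedule(t_start, t_end, step):
    """Temperatures visited when cooling from t_start to t_end (inclusive) in fixed decrements."""
    return list(range(t_start, t_end - 1, -step))

def cdih_range(talos_uncertainty_deg, floor_deg=20.0):
    """Dihedral restraint range: the larger of twice the TALOS-N uncertainty and 20 degrees."""
    return max(2.0 * talos_uncertainty_deg, floor_deg)

# NHHC distance upper limits (in angstroms) keyed by 1H-1H mixing time in microseconds.
NHHC_UPPER_LIMIT_A = {500: 9.0, 1000: 11.5}

stage1 = cooling_schedule(5000, 20, 20)   # 5,000 K down to 20 K in 20 K steps
steps_during_cooling = 100 * len(stage1)  # 100 dynamics steps at each temperature
```

Under the stated assumption, the schedule visits 250 temperatures, so the cooling phase of each run comprises 25,000 dynamics steps in addition to the 5,000 high-temperature steps.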

In the second stage, the three lowest-energy structures from the annealing stage were used as independent inputs for structure refinement. A total of 64 independent XPLOR-NIH runs from each of the three starting structures were performed with 5,000 steps of torsion angle dynamics at 1,000 K followed by annealing to 20 K in decrements of 10 K with 100 steps at each temperature. This was followed by energy minimizations in torsion angle and Cartesian coordinates. All the potentials employed in annealing were also used during refinement, with two additions. The 13C-13C correlations were implemented as intramolecular NOE restraints with an upper limit of 8.0 Å. Inter-residue cross-peaks to long hydrophobic side chains, such as Phe, Ile, and Leu, were sometimes violated. Consequently, the upper limits for these 5% of restraints were increased to 12.0 Å. Explicit hydrogen bonds for residues Ile13 (hydrogen-bonded to Val17)–Asn15 (hydrogen-bonded to Leu19) and Phe23 (hydrogen-bonded to Leu27)–Thr30 (hydrogen-bonded to Leu34) were substituted for implicit hydrogen bonds using the same HBDB potential. Finally, the scales of the NOE, Repel and TorsionDB potentials were increased (Supplementary Table 5). All 192 structures from the three independent runs were pooled and sorted using the CDIH, NOE, HBDB, BOND, ANGL, IMPR, Repel and Repel14 potentials, while excluding PosDiffPot, DistSymmPot and TorsionDB potentials. The ten structures with the lowest energies across the specified potentials were included in the final structural ensemble. Where single-structure images are shown, the most representative conformer, selected as the model with the lowest average r.m.s.d. for residues 10–36 with respect to all the other structural models, is shown. The Ramachandran plot statistics for the final structure ensemble are as follows: 93% of residues are in favored regions, 5% of residues are in allowed regions and 2% of residues are in disallowed regions. The only outlier is Leu37, which is outside the TM helix, near the C terminus.
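
The ensemble selection and the choice of the most representative conformer can be sketched as follows. The function names and example numbers are hypothetical; the sketch assumes that per-structure energies (summed over the chosen potentials) and a symmetric pairwise r.m.s.d. matrix have already been computed elsewhere.

```python
def lowest_energy(energies, n=10):
    """Indices of the n structures with the lowest energies, in ascending order of energy."""
    return sorted(range(len(energies)), key=lambda i: energies[i])[:n]

def most_representative(rmsd):
    """Index of the model with the lowest average r.m.s.d. to all other models.

    `rmsd` is a symmetric square matrix (list of lists) of pairwise r.m.s.d. values.
    """
    n = len(rmsd)

    def mean_to_others(i):
        return sum(rmsd[i][j] for j in range(n) if j != i) / (n - 1)

    return min(range(n), key=mean_to_others)

# Hypothetical toy inputs; values are illustrative only.
energies = [5.0, 1.0, 3.0, 2.0]
rmsd = [[0.0, 1.0, 2.0],
        [1.0, 0.0, 0.5],
        [2.0, 0.5, 0.0]]
selected = lowest_energy(energies, n=2)
representative = most_representative(rmsd)
```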

Graphical images depicting the structures were generated in PyMOL v2.3.4. The reported channel radii were calculated using the HOLE program53 and represent, at each XY plane along the Z channel coordinate, the radius of the largest sphere that can be accommodated without overlapping the van der Waals spheres of any atoms; the Z coordinate is collinear with the bilayer normal and the putative direction of ion permeation. The cutoff radius for the calculation was 5 Å. The HOLE output was visualized in PyMOL by setting the van der Waals radius of the HOLE-generated spheres ‘SPH’ to the B-factor values of the SPH output. Details of HMA docking to ETM are given in Supplementary Notes.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.