As the global burden of SARS-CoV-2 infections escalates, so does the evolution of viral variants with increased transmissibility and pathology. In addition to this entrenched diversity, RNA viruses can also display genetic diversity within single infected hosts with co-existing viral variants evolving differently in distinct cell types. The BriSΔ variant, originally identified as a viral subpopulation from SARS-CoV-2 isolate hCoV-19/England/02/2020, comprises in the spike an eight amino-acid deletion encompassing a furin recognition motif and S1/S2 cleavage site. We elucidate the structure, function and molecular dynamics of this spike providing mechanistic insight into how the deletion correlates to viral cell tropism, ACE2 receptor binding and infectivity of this SARS-CoV-2 variant. Our results reveal long-range allosteric communication between functional domains that differ in the wild-type and the deletion variant and support a view of SARS-CoV-2 probing multiple evolutionary trajectories in distinct cell types within the same infected host.
SARS-CoV-2 spike (S) glycoprotein prominently differs from other betacoronavirus S proteins in the insertion of a furin cleavage site in the S1/S2 junction site1. The S trimer glycoprotein is responsible for binding to the ACE2 receptor and for viral cell entry after cleavage at the S1/S2 junction and S2´ sites2. Critical to this process is proteolytic processing of S by host cell proteases3. After intracellular cleavage at the S1/S2 junction by a furin-like protease to produce the S1 and S2 subunits, S gets destabilized and can be further primed by cleavage at the S2´ site by host serine proteases on the plasma membrane such as TMPRSS24,5 or the endosomal cysteine proteases cathepsin B/L6. S1 comprises the N-terminal domain (NTD), the receptor-binding domain (RBD), and the SD1 and SD2 domains7,8. S2 contains the S2´ cleavage site, the fusion peptide, a fusion peptide proximal region (FPPR), a HR1 heptad repeat, a central helix and a connector domain followed by a HR2 heptad repeat, the transmembrane domain and the C-terminal cytoplasmic domain7,8. Receptor binding destabilizes S, allowing S2´ cleavage, leading to shedding of S1 while S2 reorganizes to mediate fusion of viral and cellular membranes, enabling entry of SARS-CoV-2 into the host cells9. The furin cleavage site is a four amino acid motif located on a solvent-exposed flexible loop of S7. Furin-cleaved S was shown to open more efficiently suggesting an increased binding to human ACE2 than uncleaved S10. The furin cleavage site thus contributes substantially to the high infectivity of SARS-CoV-2, adding to the lethality of the virus.
After the growth of a low passage isolate of SARS-CoV-2 from February 2020 in the African green monkey kidney cell line Vero E6, a cell line routinely used to propagate viruses from clinical isolates, we discovered a virus subpopulation with an S variant (termed here BriSΔ) exhibiting an in-frame 8 amino acid deletion encompassing the furin recognition motif and S1/S2 cleavage site (amino acids 679-687 NSPRRARSV, replaced by I)11. Subsequently, further deletion variants abrogating S1/S2 cleavage were identified after viral passaging in cell culture12,13,14 and at low frequency in clinical samples, attenuating infection in animal models15,16,17,18. Moreover, deleting only PRRA in S by reverse genetics resulted in a recombinant ΔPRRA SARS-CoV-2 which exhibited increased infectivity and viral titer in Vero E6 cells, but a 10-fold reduced viral titer in Calu-3 2B4 lung epithelial carcinoma cells compared to the wild-type (WT) virus19, indicating the acquisition of a furin cleavage site increased SARS-CoV-2 fitness for replication in respiratory cells.
Here, we dissect the structure, dynamics and mechanism of the BriSΔ deletion variant S we identified, to gain insight into how diversification of the virus by elimination of a loop-region comprising the furin recognition motif and S1/S2 cleavage site impacts viral cell tropism, infectivity, spike protein stability and receptor binding, revealing molecular communication between functional regions within the spike glycoprotein allowing SARS-CoV-2 to evolve intra-host diversity in distinct cell types.
BriSΔ variant and wild-type SARS-CoV-2 clonal isolation
Direct RNA sequence analysis of a virus stock of SARS-CoV-2 isolate hCoV-19/England/02/2020, produced by a single passage in Vero E6 cells, revealed the presence of the WT SARS-CoV-2 and the BriSΔ variant (Fig. 1a). To obtain homogenous virus populations, the mixed virus stock was subjected to two rounds of limiting dilution in Vero E6 and human Caco-2 cells (Supplementary Fig. 1). Nanopore direct RNA sequencing confirmed that the limiting dilution yielded WT SARS-CoV-2 from Caco-2 cells. In contrast, BriSΔ was selected for in Vero E6 cells (Fig. 1a) as expected11. The differences in the infectivity of the WT and BriSΔ viruses were then compared on Vero E6, Vero E6/TMPRSS2, Caco-2, Caco-2-ACE2 and Calu-3 cells using a range of virus dilutions for infection (Fig. 1b–f). The starting virus volumes for the infections were based on equal viral genome copy numbers as determined by qRT-PCR (equating to a starting multiplicity of infection (MOI) of 10 for the WT virus, based on the Vero E6 cell titer) rather than MOI values. Although viral genome copy numbers do not necessarily reflect virus infectivity, viral growth assays on Vero E6 and Caco-2-ACE2 cells using MOIs determined on either Vero E6 or Caco-2-ACE2 cells showed that the infectivity of the two viruses differed, depending on the cell type used to determine the MOI (Supplementary Fig. 2). The percentage of virus-infected cells for the five different cell lines was analyzed 18 hours after virus infection, before multiple rounds of virus replication. In Vero E6 cells, half-maximal infection was achieved with an ~6-fold higher dilution of BriSΔ as compared to WT virus (Fig. 1b). Overexpression of TMPRSS2 protease in Vero E6 cells5 resulted in a substantially higher infection efficiency for both viruses; close to 100% of cells were infected with an up to 16-fold dilution of WT virus and an up to 64-fold dilution of BriSΔ virus (Fig. 1c). 
Thus, the lack of the TMPRSS2 protease contributes to, but is not the only reason for, the BriSΔ variant infecting Vero E6 cells better than WT virus. Differences in the route of cell entry either via fusion at the plasma membrane or receptor-mediated endocytosis20,21,22 may account for this result. Interestingly, our results using Vero E6/TMPRSS2 cells are similar to those of Zhu et al.18, who compared the replication of WT SARS-CoV-2 and a virus (Sdel) containing a 7 amino acid deletion encompassing the furin cleavage site, but differ from those based on a competition assay between the WT and ΔPRRA viruses, which infected Vero E6/TMPRSS2 cells equally well19. Even though we selected WT SARS-CoV-2 from Caco-2 cells by serial dilution, BriSΔ infected Caco-2 cells better than the WT at high virus titers; 25% versus 10% infected cells were observed with the starting dilutions of BriSΔ and WT, respectively (Fig. 1d). Overexpression of the ACE2 receptor in Caco-2 cells led to 70% infection of cells up to 32-fold dilution of WT, whereas only about 35% of cells could be infected by the BriSΔ variant at the same dilution (Fig. 1e). Thus, both the WT and BriSΔ viruses infect Caco-2 cells better when ACE2 expression is increased, but the improvement for WT virus was substantially greater. Calu-3 lung cells were infected about 2-fold better by WT virus for all dilutions except the starting dilutions (Fig. 1f), corroborating the contribution of the furin site in SARS-CoV-2 S to improved infection of lung cells19. Differences in the maximal level of infection of the different cell lines at 18 hours after virus infection were observed, most likely due to either differences in the expression levels of ACE2 and cellular proteases required for virus entry or intrinsic cellular factors restricting initial viral replication22.
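The comparison of half-maximal infection across a two-fold dilution series can be illustrated with a minimal sketch. It assumes log-linear interpolation between neighboring dilutions, which is not necessarily how the values in Fig. 1 were derived; the function name and all numbers are hypothetical and are not data from this study.

```python
import math

def half_max_dilution(fold_dilutions, percent_infected):
    """Estimate the fold-dilution at which infection falls to half its maximum.

    fold_dilutions: increasing fold-dilutions of the virus stock (1, 2, 4, ...).
    percent_infected: percentage of infected cells measured at each dilution.
    Assumption: linear interpolation on a log2 dilution axis.
    """
    half = max(percent_infected) / 2.0
    points = list(zip(fold_dilutions, percent_infected))
    for (d0, p0), (d1, p1) in zip(points, points[1:]):
        # Find the first interval in which the curve drops through the half-maximum.
        if p0 >= half >= p1 and p0 != p1:
            frac = (p0 - half) / (p0 - p1)
            log_d = math.log2(d0) + frac * (math.log2(d1) - math.log2(d0))
            return 2.0 ** log_d
    return None  # the curve never reaches half of its maximum

# Made-up illustration values, not measurements from the study.
dilutions = [1, 2, 4, 8, 16, 32]
virus_a = [80, 60, 40, 20, 10, 5]
virus_b = [80, 80, 70, 60, 40, 20]
fold_shift = half_max_dilution(dilutions, virus_b) / half_max_dilution(dilutions, virus_a)
```

A `fold_shift` above 1 means the second virus reaches half-maximal infection at a higher dilution, which is the sense in which one virus infects a cell line "better" in the comparisons above.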
Next, we tested neutralization of the WT and BriSΔ viruses. No difference was found in neutralization of the two viruses when Vero E6/TMPRSS2 and Vero E6 cells were infected with equal amounts of infectious virus based on cell infectivity, in the presence of a commercial antibody binding the RBD (Fig. 1g) or human serum from a convalescent COVID-19 patient (Supplementary Fig. 3), respectively, indicating that both virus species were neutralized with equal potency by the antibodies.
Cryo-EM structure of BriSΔ glycoprotein
To understand the structural impact of the deletion of the furin cleavage site on SARS-CoV-2 S architecture, we produced the BriSΔ spike by MultiBac/insect cell expression23. We purified the glycoprotein by affinity purification and size exclusion chromatography (SEC) (Supplementary Fig. 4a, b). We used the peak fraction from SEC for negative stain EM quality control (Supplementary Fig. 4c) and cryogenic electron microscopy (cryo-EM) (Supplementary Fig. 5 and Supplementary Table 1). We determined the BriSΔ structure without applying symmetry (C1) at 3.0 Å resolution (Supplementary Fig. 6a). In our analysis, all BriSΔ particles exhibited the locked conformation of the S trimer we had described previously23. After applying 3-fold symmetry (C3) we obtained a 2.8 Å cryo-EM map (Fig. 2 and Supplementary Fig. 6b). In this compact locked S conformation, the receptor-binding motif (RBM) is buried inside the RBD trimer, obstructing ACE2 receptor binding (Supplementary Fig. 7). Previously, we discovered a free fatty acid (FFA) binding pocket in the locked structure of SARS-CoV-2 S, and identified a small molecule tightly bound in the pocket, with the molecular mass of linoleic acid (LA) as determined by electrospray ionization mass spectrometry (ESI-MS)23, a feature subsequently corroborated in coronavirus S from pangolin24. Similar density was later also identified in other S structures (PDBIDs 6ZB5, 7JJI, 6ZGI, 6ZGE, 6XR8, 6ZP2, and 7DF3). In the locked BriSΔ structure, all three pockets are again occupied by a small molecule (Fig. 2a, b). We chose a method orthogonal to ESI-MS, namely hydrophilic interaction liquid chromatography followed by tandem mass spectrometry (HILIC-MS-MS) with highly purified LA as a calibration standard, to analyze our BriSΔ glycoprotein samples. Our HILIC-MS-MS analysis provides unambiguous, complementary evidence that the small molecule is indeed LA.
In our structure, LA is bound in a bi-partite binding pocket where one RBD provides a hydrophobic ‘greasy’ tube to accommodate the hydrocarbon tail of LA, while residues R408 and Q409 of the adjacent RBD provide a polar lid coordinating the carboxy head group of LA (Fig. 2b). In the BriSΔ C1 structure, we identified virtually identical tube-shaped densities in all three RBD domains (Supplementary Fig. 7), indicating high occupancy of all three pockets. Using masked 3D classification, we scrutinized the data set for potential heterogeneity in LA binding and found that at least 95% of the RBDs were LA-bound in our structure (Supplementary Fig. 8). Our previous ESI-MS results and the present HILIC-MS-MS results thus are consistent and together identify the small molecule bound in the FFA-pocket unambiguously as the essential free fatty acid LA.
We scrutinized our BriSΔ structure and compared it with previously determined S structures for conserved stabilizing features (Fig. 2). Disulfide bonds are known to play a crucial role in stabilizing the S trimer and individual domains. Five out of 14 annotated disulfide bonds in S25 stabilize the RBD including the disulfide bond linking C336 and C361 (Fig. 2d, e). Three arginine R1039 residues, one each from the three polypeptide chains in the S trimer, form a hydrogen bond cluster (Fig. 2g). In this cluster, the arginine residues are symmetrically arranged around the central trimer axis with short-range contacts of 4.65 Å present between the carbon atoms of the guanidino groups. The guanidinium planes stack in a parallel manner on top of the aromatic plane of the juxtaposed F1042, and a salt bridge is formed to E1031 of the adjacent S polypeptide chain (Fig. 2g). R1039, E1031, and F1042 are conserved in all human coronaviruses, highlighting their central importance. In the vicinity, a disulfide bond is formed by conserved residues C1032 and C1043, arranging E1031 and F1042 at the required distance and in the proper conformation to stabilize the R1039-mediated interaction (not shown). Opening of the RBDs was shown previously to induce an asymmetry in the trimer structure that breaks this H-bond cluster8,26. LA binding in the FFA-pockets induces conformational changes to the residues surrounding the FFA-binding pocket in the RBD and beyond, including the NTD, SD2, and the fusion peptide proximal region (FPPR). Re-organization of SD2 in the locked structure results in a stabilization of the region around R634 (Fig. 2k). This arginine residue is stabilized by π-stacking on Y837 and thus connects to the FPPR of the neighboring subunit. Such π-stacking interactions were observed in a previous structure that had an intact furin site and unassigned density in the FFA-binding pocket10. R634 stacking to Y837 is additionally stabilized by a hydrophobic interaction (Fig. 2k). Fixed in a rigid position through these interactions, C840 can form a disulfide bond to C851 which additionally stabilizes the FPPR (Fig. 2f). This disulfide bond-mediated stabilization of the FPPR has been described in a cryo-EM structure of full-length S protein comprising the native transmembrane domain9. We scrutinized S structures in the protein data bank (PDB), and the sandwiched R634 appears to be a hallmark of the locked conformation. In contrast, S structures in the closed, but not locked, conformation show no π-stacking interaction of R634. Instead, residues 620–640, as well as parts of the FPPR, are disordered, and residues Y636 and R634 adopt different conformations, underscoring a functional link between the locked conformation, the sandwiched R634 and the FFA-binding pocket.
Our BriSΔ construct harbors WT residues K986 and V987, which are often mutated to prolines to stabilize S in a prefusion state. In BriSΔ, the valine fits well into the density while the lysine sidechain appears to be somewhat flexible (Fig. 2h). Importantly, BriSΔ lacks 8 amino acids including the furin cleavage site and the S1/S2 cleavage site located on a flexible loop. This loop is now shorter due to the deletion and thus more rigid, as evidenced by density in the C1 map which allowed us to build a poly-alanine chain (Fig. 2i).
N-glycosylation of BriSΔ is comparable to previous S structures (Supplementary Table 2). Interestingly, WT residues S673, T676, T678, and S680 close to the furin site are all candidates for O-glycosylation, which is dependent on proline P68127. It was shown that O-glycosylation of these residues negatively affects furin cleavage, contributing to S stability and infectivity27. P681 and S680 are lacking in BriSΔ, and P681 is mutated in other lineages, including the B.1.1.7 variant that emerged in Kent, UK28 and rapidly spread globally. The enzymes responsible for O-glycosylation (GALNTs) are expressed differently depending on cell type29, and the absence of P681 in BriSΔ could thus contribute to the observed dominance of this variant in certain cell types. Indeed, it was shown that mutation of P681 to R, which is present in several variants of concern including the recent rapidly spreading Indian ‘delta’ variant, increases viral fusion30.
Functional analysis of BriSΔ
The cryo-EM structure of BriSΔ evidenced exclusively particles in the locked conformation in which RBM binding to ACE2 is obstructed (Fig. 2 and Supplementary Figs. 5–7). In cell-based assays, however, BriSΔ virus remains infectious (Fig. 1). To address this apparent discrepancy, we biochemically analyzed the interaction of a range of S proteins with the ACE2 receptor. We compared BriSΔ binding to ACE2 with S protein lacking the RBM, S protein where the furin site residues are replaced by an alanine (SRRAR->A), uncleaved WT S (SRRAR) and furin-cleaved WT S (SRRAR*) (Fig. 3a and Supplementary Table 3). ACE2-binding ELISA (performed using a surrogate virus neutralization test kit (sVNT)) indicated that all S proteins efficiently bind ACE2, except the S protein lacking the RBM used as a control (Fig. 3b). For the other S proteins, half-maximal binding was observed between 64 and 128 nM S protein in the assay. We determined the dissociation constant (KD) of BriSΔ and ACE2 by surface plasmon resonance (SPR) with biotinylated ACE2 immobilized on a streptavidin-coated chip (Fig. 3c and Supplementary Fig. 9). The binding of BriSΔ (KD = 2.5 nM) to ACE2 is not significantly different from that of SRRAR->A (1.4 nM)23. In agreement, the maximal RU values, indicating mass deposited on the ACE2-coated chip, were virtually identical for BriSΔ, SRRAR->A and uncleaved WT S (Fig. 2d and Supplementary Fig. 9a, b). The partially cleaved WT S yielded lower RUmax signals, possibly due to partial dissociation of the S trimer and binding of the smaller S1 fragment to ACE2. Our biochemical assays thus establish that the BriSΔ trimer can adopt an open, ACE2-binding competent conformation, which, however, was not observed on the cryo-EM grids. This finding is in agreement with our results that the BriSΔ variant and WT virus can be neutralized efficiently with similar amounts of a commercial monoclonal antibody recognizing the RBD (Fig. 1g), and with human convalescent sera (Supplementary Fig. 3). Also, similar observations were made in a recent study that identified a fully closed cryo-EM structure of S with no significant change in ACE2-binding affinity31.
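To put the two reported dissociation constants in perspective, the sketch below evaluates the equilibrium occupancy predicted by a simple 1:1 binding model, in which the fraction of receptor bound is [S] / (KD + [S]). The 1:1 model and the example concentration are assumptions made here for illustration; the text does not state which model was used to fit the SPR data.

```python
def fraction_bound(conc_nM, kd_nM):
    """Equilibrium fraction of receptor occupied under a 1:1 binding model.

    conc_nM: free analyte concentration (nM); kd_nM: dissociation constant (nM).
    Assumption: single-site, non-cooperative binding.
    """
    return conc_nM / (kd_nM + conc_nM)

# KD values quoted in the text; the 10 nM concentration is an arbitrary example.
kd_brisdelta = 2.5  # nM
kd_srrar_a = 1.4    # nM
example_conc = 10.0 # nM, hypothetical

occupancy_brisdelta = fraction_bound(example_conc, kd_brisdelta)
occupancy_srrar_a = fraction_bound(example_conc, kd_srrar_a)
```

By construction, occupancy is exactly one half when the concentration equals KD, so under this model a less than two-fold difference in KD translates into a modest difference in occupancy at any given concentration.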
Targeted molecular dynamics simulations indicate that LA stabilizes locked BriSΔ more than wild-type S
We next characterized the conformational changes in BriSΔ and WT S by using targeted molecular dynamics simulations (Fig. 4) to explore the force required to open a single RBD from a closed position in the S trimer for apo and LA-bound systems corresponding to WT and BriSΔ. Initially, a range of harmonic force restraints was applied to the WT spike system to find a restraint force under which the RBD opened in half of the 10 ns test simulations. Such conditions allowed a number of repeats large enough to yield statistically significant results from these non-equilibrium simulations. Thus, we performed 50 simulations over 10 ns for each system (Fig. 4a–d) applying a harmonic restraint force constant of 0.2 kJ/mol/nm to raise a single RBD from an equilibrated, closed conformation to a target, open state, corresponding to the energy-minimized open S model built on EMD-114623. RMSDs were calculated for C-alpha positions corresponding to the single RBD in each case. Openings in which this RBD moved closer to the open than closed state (i.e. greater than 50% open) were counted as open, as this conformation could still feasibly accommodate an ACE2 receptor interaction32. As the same harmonic restraints and RMSD calculation methods were used to compare the WT and BriSΔ systems (apo and LA-bound), the relative differences reflect the degree of stability afforded by either the mutated furin site, the bound LA, or both.
Starting from their closed conformation, we observed that all apo BriSΔ and almost all apo WT S opened during the 10 ns MD simulations. In contrast, bound LA in the WT S substantially reduces the number of opening events (the cross-over time-point when the RBD opens more than halfway) compared to the apo WT spike (Fig. 4a, b). Notably, the binding of LA to BriSΔ (Fig. 4d) completely prevents RBD opening at 0.2 kJ/mol/nm in these simulations. Increasing the force constant to 0.3 kJ/mol/nm was required to raise the RBD of LA-bound BriSΔ over this time period (Fig. 4e). Example plots illustrate how, during the course of the simulation, the force applied either resulted in the RBD opening (Fig. 4f and Supplementary Movie 1) or failed to pull the RBD open (no cross-over) as determined by root-mean-square deviation (RMSD) (Fig. 4g). The time at which the RMSD traces cross (cross-over time) was binned (in ns steps) for all 50 replicates for each system, as shown in Fig. 4a–e. In summary, 98% of the apo WT S simulations and 100% of the apo BriSΔ simulations open the RBD with a force constant of 0.2 kJ/mol/nm applied, while only 14% of the WT and 0% of the BriSΔ systems complexed with LA open with the same force applied (Fig. 4h). Thus, LA binding significantly stabilizes the locked conformation of both WT S (P = 0.0027) and BriSΔ (P = 0.0024). Importantly, LA apparently shifts the equilibrium favoring the locked conformation for BriSΔ more than for WT S (P = 0.039) (Fig. 4h). This hypothesis is supported by our cryo-EM structure which shows that virtually all BriSΔ trimers were LA-bound and in the locked conformation (Fig. 2). The spike protein is glycosylated, and although glycosylation probably influences the ease with which the RBD opens, the glycosylation sites of BriSΔ are the same as those of WT, so the effects exerted by glycosylation are likely to be similar in both cases.
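The cross-over criterion and the binning described above can be sketched as follows. The sketch assumes that each replicate provides two RMSD traces for the pulled RBD (one against the closed starting structure, one against the open target) and that the first frame at which the RBD is closer to the open target counts as the opening event; the function names and trace values are hypothetical.

```python
def crossover_time(times_ns, rmsd_to_closed, rmsd_to_open):
    """Return the first time at which the RBD is closer to the open target
    than to the closed start, or None if the traces never cross."""
    for t, r_closed, r_open in zip(times_ns, rmsd_to_closed, rmsd_to_open):
        if r_open < r_closed:
            return t
    return None

def summarize_openings(crossovers, n_bins=10):
    """Bin cross-over times into 1 ns bins and report the percentage of
    replicates that opened. None entries are replicates that stayed closed."""
    bins = [0] * n_bins
    for t in crossovers:
        if t is not None:
            bins[min(int(t), n_bins - 1)] += 1
    percent_open = 100.0 * sum(bins) / len(crossovers)
    return bins, percent_open

# Hypothetical trace for one replicate (RMSD in arbitrary units).
times = [0.0, 1.0, 2.0, 3.0]
to_closed = [0.0, 2.0, 5.0, 8.0]
to_open = [10.0, 8.0, 4.0, 2.0]
t_cross = crossover_time(times, to_closed, to_open)

# Hypothetical set of four replicates, one of which never opened.
bins, percent_open = summarize_openings([2.0, None, 2.5, 9.9])
```

With 50 replicates per system, the reported 98% and 14% correspond to 49 and 7 opening events, respectively.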
Communication between the furin cleavage site, the FFA-binding pocket and the FPPR
We next performed 180 short dynamical-nonequilibrium simulations to analyze intra-spike communication and identify any structural networks connecting the FFA-binding pocket to functionally important motifs within WT S and BriSΔ (Fig. 5 and Supplementary Fig. 10a). In these simulations, the LA molecules were removed from the FFA pockets (from equilibrated simulations, Supplementary Fig. 10b) and the response of the systems to this perturbation was determined using the Kubo-Onsager approach33,34,35. The simulations revealed a cascade of structural changes occurring in response to the LA removal and highlight the route by which such changes are transmitted through the S protein (Supplementary Figs. 11 and 12). Strikingly, we find that the furin cleavage site responds swiftly to LA removal in the WT S, despite being 40 Å away from the FFA pocket (Fig. 5). The nonequilibrium simulations also show allosteric connections between the FFA pocket and V622-L629, D808-S813 and the fusion peptide proximal region (FPPR, residues 833–855) in WT S (Fig. 5 and Supplementary Fig. 11). Similar structural responses are observed for the three WT S subunits (Supplementary Fig. 11). The V622-L629 region is part of a larger loop structure spanning residues 617-641, close to the R634-Y837 cation-π interaction site (Fig. 2k)36. The V622-L629 region, close to R634, and the furin-cleavage site respond rapidly to the perturbation in WT S: 0.1 ns after LA removal, conformational rearrangements are observed in both regions (Fig. 5 and Supplementary Fig. 11). As the simulations proceed, a gradual increase in deviations is observed for both the furin-cleavage and V622-L629 regions in WT S, with the conformational changes being further propagated to the segments adjacent to the fusion peptide, namely the FPPR and D808-S813. These results show direct coupling between the FFA-binding pocket and important, distant regions of the protein, including the furin-cleavage site.
Our simulations also highlight differences in the dynamic response and in the effect of LA between the WT and BriSΔ spikes (Fig. 5 and Supplementary Figs. 11 and 12): in BriSΔ, only the V622-L629 region responds rapidly to LA removal. The deletion site itself shows smaller conformational rearrangements in BriSΔ when compared to WT S, confirming a more rigid arrangement of the shortened loop which in WT S comprises the furin cleavage and S1/S2 cleavage sites. These results indicate that BriSΔ has different allosteric and dynamical behavior from the WT S.
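The idea behind this type of analysis can be sketched in a few lines: each perturbed (LA-removed) trajectory is compared with its unperturbed equilibrium counterpart at the same time after the perturbation, and the per-residue deviation is averaged over all replicate pairs. The sketch assumes that the response is measured as the distance between matching C-alpha positions; the data layout and function name are hypothetical and this is not the analysis code used in the study.

```python
import math

def mean_ca_deviation(equilibrium, perturbed):
    """Average per-residue C-alpha displacement between paired simulations.

    equilibrium, perturbed: lists with one entry per replicate pair; each entry
    is a list of (x, y, z) C-alpha coordinates, one per residue, taken at the
    same time after the perturbation in both simulations.
    Returns one mean displacement per residue.
    """
    n_pairs = len(equilibrium)
    n_residues = len(equilibrium[0])
    totals = [0.0] * n_residues
    for eq_frame, pert_frame in zip(equilibrium, perturbed):
        for i, (eq_xyz, pert_xyz) in enumerate(zip(eq_frame, pert_frame)):
            totals[i] += math.dist(eq_xyz, pert_xyz)
    return [total / n_pairs for total in totals]

# Two hypothetical replicate pairs with two residues each.
eq = [[(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 0)]]
pert = [[(0, 0, 0), (1, 3, 4)], [(0, 0, 0), (1, 0, 0)]]
deviation = mean_ca_deviation(eq, pert)
```

Averaging over many pairs suppresses thermal noise, so residues whose mean deviation rises early after the perturbation can be read as responding to it; repeating the calculation at successive times traces how the response propagates.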
We investigated here the structure and functional characteristics of BriSΔ, a patient-derived SARS-CoV-2 variant we identified as a viral subpopulation by passaging SARS-CoV-2 isolate hCoV-19/England/02/2020. We find important shared features, and also marked differences, between BriSΔ and an artificial ΔPRRA SARS-CoV-2 in which only the four residues of the furin cleavage site had been removed by reverse genetics19 as well as Sdel, a variant containing a cell passage acquired deletion encompassing the furin cleavage site18. On the one hand, all three deletion variants, BriSΔ, ΔPRRA, and Sdel, replicate substantially better in Vero E6 cells but show impaired replication in Calu-3 cells, underscoring that WT SARS-CoV-2 evolved to efficiently infect and replicate in respiratory cells. On the other hand, we find differences highlighting the impact of the exact sequence context of the deletion: expression of serine protease TMPRSS2 in Vero E6 cells reduced the replication advantage of ΔPRRA SARS-CoV-219, but not of BriSΔ or Sdel18. In our experiments, BriSΔ and WT SARS-CoV-2 both exhibited increased infection and replication in Vero E6/TMPRSS2 cells, but BriSΔ retained higher infection and replication rates as compared to WT virus (Fig. 1c). Importantly, the engineered ΔPRRA virus required more antibodies for neutralization of the virus compared to WT SARS-CoV-2, while monoclonal antibody and human sera neutralize BriSΔ and WT SARS-CoV-2 with equal efficacy (Fig. 1g and Supplementary Fig. 3). Interestingly, ACE2 receptor overexpression in Caco-2 cells improved the infection and replication of BriSΔ and WT SARS-CoV-2 as compared to Caco-2 cells with a lower constitutive expression of ACE2 (Fig. 1d, e).
Mechanistically, the different infectivity of cells expressing high levels of TMPRSS2 is probably explained by the fact that WT SARS-CoV-2 S is cleaved at the furin cleavage site (S1/S2 cleavage), which primes S for cleavage with TMPRSS2 (at S2´) resulting in activation of the fusion peptide1. The virus can then fuse at the plasma membrane to enter cells20,22. It has been shown that the furin-cleaved S can also interact with neuropilin (NRP1) following the CendR rule, potentially enhancing viral entry37,38. Lacking these modalities, BriSΔ appears to have evolved to be preferentially taken up by receptor-mediated endocytosis and then potentially cleaved by an intracellular protease before fusing via the endosome. Thus, Calu-3 cells, which express TMPRSS2, are better infected by WT virus, and they replicate WT virus more efficiently than BriSΔ virus. In addition, intracellular defence pathways exist such as the interferon-induced transmembrane proteins (IFITMs) that may be more active against virus entering through the endosome rather than fusing at the plasma membrane39. Vero E6 cells express low to very low levels of TMPRSS2; therefore, WT virus cannot enter efficiently by fusing at the plasma membrane, and it has recently been suggested that WT SARS-CoV-2 is also taken up by receptor-mediated endocytosis in these cells21. Our data suggest this process is more efficient for the BriSΔ virus. Notably, Vero E6 cells are deficient in the interferon pathway, and entry via the endosome may not be inhibited in these cells17. We hypothesize that deletion of the furin cleavage site potentially results in a more stable viral particle secreted from cells, with increased expression of full-length BriSΔ in a prefusion state, which might result in lower infectivity in TMPRSS2-expressing cells compared to WT.
BriSΔ, compared to WT virus, showing higher infectivity of Vero E6 cells which have low levels of TMPRSS2 protease and are deficient in the interferon pathway also suggests that in the heterogenous environment of the human body such deletion variants can emerge in suitable cell types serving as potential niches for SARS-CoV-2 to further evolve or specialize.
The cryo-EM structure revealed a locked BriSΔ glycoprotein trimer (Fig. 2). Our HILIC-MS-MS results unambiguously identify LA as the small molecule tightly bound in the three bipartite pockets. The LA-bound locked form obstructs both ACE2 binding and integrin-mediated cell entry. We observe two motifs stabilizing LA-bound locked BriSΔ that appear to be characteristic for this S conformation, namely the H-bond cluster formed by R1039 stabilizing the trimer interface (Fig. 2g) and the cation-π interactions between R634 and Y837 stabilizing the interaction of S1 with the S2 of the adjacent trimer subunit (Fig. 2k). Y837 is part of the FPPR motif which is folded in the locked, but not in the closed conformation of S36. R634 is part of an extended loop structure that holds the FPPR in place, and alterations in this region, for instance the D614G S mutation, favor the opening of the RBDs and disorder the FPPR40, possibly exposing the S2´ cleavage site upstream of the fusion peptide9.
Biochemical interaction assays show that BriSΔ glycoprotein binds ACE2 with an affinity comparable to that of S protein that does not have the deletion (Fig. 3). Thus, BriSΔ can adopt open conformation(s), consistent with the infectivity of the BriSΔ variant (Fig. 1). We conclude that cryo-EM analysis, while indicating a clear preference for the locked form by the BriSΔ trimer, may not fully reflect the dynamics of the S trimer, which is subject to a range of stabilizing and destabilizing modalities41. A decreased stability of the open BriSΔ trimer likely derives from the more rigid loop (T676-Q690) lacking 8 amino acids including two protease cleavage sites. In agreement with this, it has been suggested that furin cleavage facilitates the opening of WT S10 and the presence of ACE2 receptors enhances the opening of the RBDs42. Our study indicates that the locked BriSΔ conformation is stabilized as compared to WT S by the additional interactions we described within the glycoprotein trimer, and the shortened loop may help to keep these properly in place. Thus, BriSΔ may have evolved different opening kinetics from WT S and other variants, without noticeably affecting the equilibrium binding constant to ACE2. Consistent with this, our molecular dynamics simulations clearly support different stability and kinetics of RBD opening of LA-bound BriSΔ and WT S. In our targeted MD simulations, LA-bound BriSΔ requires significantly more force than LA-bound WT S to adopt an open conformation (Fig. 4). Nonequilibrium MD simulations show that in WT S, both the furin site and the V622-L629 region respond rapidly to LA removal; after 0.1 ns conformational changes are observed in both regions (Fig. 5). In contrast, in BriSΔ, a fast conformational response is only observed for the V622-L629 region, indicating effectively no communication between the shortened loop and the LA pocket (Fig. 5).
The nonequilibrium simulations confirm allosteric connections between the LA pocket and regions V622-L629, D808-S813 and the FPPR (Fig. 5), which are stabilized by R634 π-stacking interaction with Y837 in the FPPR in the locked conformation (Fig. 2k). LA removal destabilizes these regions.
Appreciation of the scale of intra-patient sequence variability in infection by HIV – likewise an RNA virus - led to a much better understanding of the HIV infection lifecycle and the realization of the need for combination drug therapy43. Similarly, evidence accumulates for intra-host genetic diversity in SARS-CoV-244,45. The identification of viruses with furin cleavage site deletions in clinical samples from COVID-19 patients16,17 and the differential infectivity and replication of the WT and BriSΔ viruses suggests that in the heterogenous environment of the human body, SARS-CoV-2 variants with abrogated S1/S2 cleavage replicate preferentially in specific cell types. While WT SARS-CoV-2 clearly has a competitive advantage in respiratory tract cells19, the BriSΔ virus could preferentially infect and replicate in other human cell types as a niche and reservoir to delay full clearance of SARS-CoV-2 infection by the immune system. Indeed, post-mortem tissue analysis revealed viruses with a furin cleavage site deletion in heart and spleen tissue which could represent such reservoirs17. Previous studies of RNA viruses have empirically demonstrated that the phenotype of a viral population can change measurably without changes in the consensus sequence due to increased diversity in evolved virus subpopulations46. We propose that BriSΔ exemplifies such SARS-CoV-2 evolution, with variants of the virus exploiting different cells and tissue types as niches for probing evolutionary space resulting in intra-host genetic and functional diversity, underscoring the need for analyzing individual virus genomes rather than viral populations for better understanding of transmission chains and pathology, and development of future treatments to overcome SARS-CoV-2.
SARS-CoV-2 propagation and assay
Human Calu-3 (ATCC® HTB-55™), Caco-2 (ATCC® HTB-37™; a kind gift from Dr Darryl Hill) and African green monkey Vero E6 (ATCC® CRL 1586™) cell lines were obtained from the American Type Culture Collection. A Caco-2 cell line expressing ACE2 (Caco-2-ACE2; a kind gift from Dr Yohei Yamauchi, University of Bristol) and Vero E6 cells modified to constitutively express TMPRSS2 (Vero E6/TMPRSS2 cells5; obtained from NIBSC, UK) were also used in the study. Cells were cultured in Dulbecco’s modified Eagle’s medium plus GlutaMAX (DMEM, Gibco™, ThermoFisher) supplemented with 10% fetal bovine serum (FBS) and 0.1 mM non-essential amino acids (NEAA, Sigma Aldrich), except Calu-3 cells, which were grown in Eagle’s minimal essential medium plus GlutaMAX (MEM, Gibco™, ThermoFisher) supplemented with 10% FBS, 0.1 mM NEAA, and 1 mM sodium pyruvate. All cells were grown at 37 °C in 5% CO2.
SARS-CoV-2 isolation and sequencing
A mixed virus population containing the SARS-CoV-2 WT isolate hCoV-19/England/02/2020 (GISAID ID: EPI_ISL_407073) and the “Bristol” variant derived from it (BriSΔ), in which spike amino acids 679-687 (NSPRRARSV) had been deleted and replaced with Ile11, was grown on either Vero E6 cells or Caco-2 cells, and a single virus population was isolated after two rounds of limiting dilution in either cell line (Supplementary Fig. 1). In brief, the virus samples were serially diluted and grown on either Vero E6 or Caco-2 cells in order to favor the growth of the BriSΔ variant or the WT virus, respectively. After 5 days of incubation, the culture supernatants in wells showing cytopathic effect (CPE) at the highest dilution were again diluted and the process was repeated. An aliquot of culture supernatant from wells showing CPE at the highest dilution was used for RNA extraction and RT-PCR using a primer set designed to discriminate the WT and BriSΔ viruses. Stocks of the purified viruses were produced and quantified for genome copy number by qRT-PCR and titered on Vero E6 and Caco-2/ACE2 cells as described previously11,23. To verify the viral genome sequences, the WT and BriSΔ variant viruses were grown for 24 h in Vero E6 cells before harvesting the cells and extracting the total RNA with Trizol reagent (ThermoFisher) prior to direct RNA sequencing using an Oxford Nanopore flow cell as previously described47. The sequenced reads were aligned to the SARS-CoV-2 genome using minimap2, and the S gene was assessed for deletions both visually and with an in-house script designed to look for significant deletions in this region and across the whole genome. All work with infectious SARS-CoV-2 was done inside a class III microbiological safety cabinet in a containment level 3 facility at the University of Bristol.
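The in-house deletion script is not described further. As a minimal sketch of how such a check could work, assuming the minimap2 alignments are available in SAM format, the following scans a read's CIGAR string for deleted reference intervals and reports those overlapping a region of interest. The function names, the `min_len` threshold and the example read are hypothetical and are not taken from the authors' script.

```python
import re

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
# CIGAR operations that advance the position on the reference sequence.
REF_CONSUMING = set("MDN=X")


def deletions_from_cigar(ref_start, cigar, min_len=3, gap_ops=("D", "N")):
    """Return 1-based inclusive (start, end) reference intervals missing from a read.

    ref_start is the 1-based SAM POS field. Both 'D' and 'N' are treated as
    gaps because spliced-alignment presets can report long gaps as 'N'.
    """
    pos = ref_start
    found = []
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op in gap_ops and length >= min_len:
            found.append((pos, pos + length - 1))
        if op in REF_CONSUMING:
            pos += length
    return found


def overlaps(interval, region):
    """True if two 1-based inclusive intervals share at least one position."""
    return interval[0] <= region[1] and interval[1] >= region[0]


# Hypothetical read: 98 aligned bases, a 24-nt gap (eight codons), 100 aligned bases.
gaps = deletions_from_cigar(23500, "98M24D100M")
hits = [g for g in gaps if overlaps(g, (23550, 23650))]  # hypothetical region of interest
```

Counting, per reference interval, how many reads carry the same gap would then give a simple measure of how prevalent a deletion is in the sequenced population.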
SARS-CoV-2 infection and growth assays
Cells were seeded the day prior to infection in appropriate media in µClear 96-well Microplates (Greiner Bio-one). The culture supernatants were removed, and cells were infected with WT or BriSΔ SARS-CoV-2 in infection medium (MEM with GlutaMAX supplemented with 2% FBS and 0.1 mM NEAA). For infection assays, virus was diluted in an 8-step 2-fold dilution series from neat virus (equal genome copy numbers) in triplicate. For growth assays, cells were infected at an MOI of 0.5 based on cell type-dependent titers from Caco-2/ACE2 and Vero E6 cells; one plate was infected per time point, containing 6 replicates per condition. After 1 h of infection at room temperature, the virus was removed and replaced with infection medium, and the cells were incubated at 37 °C in 5% CO2. At assay-dependent times (18 h for infection assays and 24, 48 and 72 h for growth assays) the cells were fixed in 4% (v/v) paraformaldehyde for 60 min for image analysis. Fixed cells were permeabilized with 0.1% Triton-X100 in PBS and blocked with 1% (w/v) bovine serum albumin before staining with a monoclonal antibody against the SARS-CoV-2 nucleocapsid protein (N) (1:2000 dilution; 200-401-A50, Rockland) followed by an appropriate Alexa Fluor-conjugated secondary antibody (1:2000 dilution; ThermoFisher-11008 and Life Technologies-11036) and DAPI (Sigma Aldrich). To determine the number of virus-infected cells, images were acquired on an ImageXpress Pico Automated Cell Imaging System (Molecular Devices) using the 10X objective. Stitched images of 9 fields covering the central 50% of the well were analyzed for virus-infected cells using Cell Reporter Xpress software (Molecular Devices). The cell number was determined by automated counting of DAPI-stained nuclei, and infected cells were identified as those cells in which positive N staining was detected in association with nuclear DNA staining.
For growth assays, supernatants from the six replicate wells were collected and pooled prior to cell fixation, and viral RNA content was quantified by qRT-PCR as previously described23. Statistical significance was assessed using an unpaired Student’s t-test in GraphPad Prism v8.4.3.
SARS-CoV-2 neutralization assays
For virus neutralization assays, Vero E6/TMPRSS2 cells were seeded the day prior to infection in appropriate media in µClear 96-well Microplates. Before infection, a commercial monoclonal antibody (Absolute Antibody; Sb#15) recognizing the S protein receptor-binding domain (RBD) or, alternatively, heat-inactivated (30 min at 56 °C) convalescent serum from a SARS-CoV-2 infected individual was serially diluted, in duplicate, in infection medium over an 8-fold dilution range. Equal amounts of the WT and BriSΔ viruses (based on Vero E6 cell infectivity), diluted in infection medium, were mixed with the antibody/antiserum dilutions and incubated for 60 min at 37 °C. Following the incubation, the culture supernatants were removed from the cells and replaced with the virus:antibody/serum mixtures, followed by incubation for 18 h at 37 °C in 5% CO2. Cells were then fixed, and the number of infected cells was determined by immunofluorescence assay and image analysis as described above. The percentage of infected cells relative to control wells infected with virus and infection medium only was calculated, and variable-slope non-linear curves were fitted using GraphPad Prism v8.4.3.
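The curve fitting itself was done in GraphPad Prism. To illustrate the two calculations involved, the sketch below normalizes infection to the virus-only control wells and evaluates the standard four-parameter variable-slope dose-response equation. The function and parameter names are ours, and the numbers in the usage lines are placeholders rather than measured values.

```python
def percent_of_control(infected_fraction, control_fractions):
    """Express infection in a treated well as a percentage of the mean of the
    virus-only control wells."""
    control_mean = sum(control_fractions) / len(control_fractions)
    return 100.0 * infected_fraction / control_mean


def variable_slope(log_conc, bottom, top, log_ic50, hill_slope):
    """Four-parameter logistic (variable-slope) dose-response model.

    log_conc and log_ic50 are log10 concentrations (or log10 serum dilutions).
    A negative hill_slope gives a response that falls as concentration rises,
    as expected for a neutralizing antibody.
    """
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ic50 - log_conc) * hill_slope))


# Placeholder values for illustration only.
y = percent_of_control(0.2, [0.38, 0.42])
midpoint = variable_slope(-1.0, bottom=0.0, top=100.0, log_ic50=-1.0, hill_slope=-1.2)
```

By construction, the model returns the midpoint between `bottom` and `top` when `log_conc` equals `log_ic50`, which is how the half-maximal concentration is read from a fitted curve.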
Protein expression and purification
The construct encoding BriSΔ was synthesized at Genscript (Genscript Inc, New Jersey USA) followed by cloning into pACEBac1 plasmid (Geneva Biotech, Switzerland). The expression construct comprises SARS-CoV-2 spike amino acids 1 to 1208 with the 8 amino acid deletion, followed by a linker, a T4-foldon trimerization domain, another linker and finally an octahistidine affinity purification tag (Supplementary Table 3). BriSΔ was produced with the MultiBac baculovirus expression system (Geneva Biotech)48 in Hi5 cells using ESF921 media (Expression Systems Inc.). Supernatants from transfected cells were harvested 3 days post-transfection by centrifugation of the culture at 1000×g for 10 min followed by another centrifugation of supernatant at 5000×g for 30 min. The final supernatant was incubated with 10 mL HisPur Ni-NTA Superflow Agarose (Thermo Fisher Scientific) per 3 L of culture for 1 h at 4 °C. Subsequently, a gravity-flow column was used to collect the resin bound with BriSΔ, followed by washing with 30 column volumes (CV) of wash buffer (65 mM NaH2PO4, 300 mM NaCl, 20 mM imidazole, pH 7.5), 30 CV high salt buffer (65 mM NaH2PO4, 1000 mM NaCl, 20 mM imidazole, pH 7.5), and again 30 CV wash buffer. BriSΔ protein was eluted using a step gradient of elution buffer (65 mM NaH2PO4, 300 mM NaCl, 235 mM imidazole, pH 7.5). Elution fractions were analyzed by reducing Coomassie-stained SDS-PAGE. Fractions containing BriSΔ were pooled, concentrated using 50 kDa MWCO Amicon centrifugal filter units (EMD Millipore) and buffer-exchanged in SEC buffer (20 mM Tris, pH 7.5, 100 mM NaCl). Concentrated BriSΔ was subjected to size exclusion chromatography (SEC) using a Superdex 200 increase 10/300 column (GE Healthcare) equilibrated in SEC buffer. Peak fractions were analyzed by reducing SDS-PAGE and negative stain electron microscopy (EM) (Supplementary Fig. 4); fraction 8 was used for cryo-EM.
The construct encoding SΔRBM was synthesized by Genscript (New Jersey USA) and subcloned into pACEBac1 plasmid (Geneva Biotech). SΔRBM comprises, as described above for BriSΔ, a T4-foldon trimerization domain and an octahistidine tag. In SΔRBM the ACE2-interacting receptor-binding motif (RBM) in the RBD is deleted and replaced with a glycine-serine rich linker. In addition, SΔRBM comprises mutations K986P and V987P (Supplementary Table 3). Protein was produced and purified as described above except that the SEC was performed in 1x PBS at pH 7.5.
The construct encoding SRRAR was synthesized and cloned into pACEBac1 (Genscript). SRRAR comprises an intact furin site, mutations K986P and V987P, the T4 foldon trimerization motif and an octahistidine affinity purification tag (Supplementary Table 3). SRRAR was produced and purified as described above for SΔRBM.
SRRAR* was obtained by incubating SRRAR with furin protease (New England Biolabs) in the presence of 1 mM CaCl2 at room temperature for 8 h.
Negative stain sample preparation and microscopy
In all, 4 µL of 0.05 mg/mL SARS-CoV-2 spike protein was applied onto a freshly glow discharged (1 min at 10 mA) CF300-Cu-50 grid (Electron Microscopy Sciences), incubated for 1 min, and manually blotted. In total, 4 µL of 3% uranyl acetate was applied onto the same grid and incubated for 1 min before the solution was blotted off. The grid was loaded onto a FEI Tecnai12 120 kV BioTwin Spirit TEM. Images were acquired at a nominal magnification of ×49,000.
Cryo-EM sample preparation and data collection
In total, 4 µL of 1.25 mg/mL BriSΔ was loaded onto a freshly glow discharged (2 min at 4 mA) C-flat R1.2/1.3 carbon grid (Agar Scientific), blotted using a Vitrobot MarkIV (Thermo Fisher Scientific) at 100% humidity and 4 °C for 2 s, and plunge frozen. Data were acquired using the EPU software on a FEI Talos Arctica transmission electron microscope operated at 200 kV and equipped with a Gatan K2 Summit direct detector and a Gatan Quantum GIF energy filter operated in zero-loss mode with a slit width of 20 eV. Data were collected in super-resolution mode at a nominal magnification of ×130,000 with a virtual pixel size of 0.525 Å. The dose rate was adjusted to 6.1 counts/physical pixel/s. Each movie was fractionated into 55 frames of 200 ms. In all, 8639 micrographs were collected in a single session with a defocus range of −0.8 to −2 µm.
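The total exposure per movie is not stated, but it can be estimated from the numbers above. The sketch below assumes that one detector count corresponds to roughly one electron and that the physical pixel is twice the super-resolution pixel (1.05 Å); both are our assumptions for illustration.

```python
def total_dose_per_A2(counts_per_pixel_per_s, n_frames, frame_time_s, physical_pixel_A):
    """Accumulated dose per square angstrom over one movie, treating one
    detector count as one electron (an approximation)."""
    exposure_s = n_frames * frame_time_s
    return counts_per_pixel_per_s * exposure_s / physical_pixel_A ** 2


physical_pixel_A = 2 * 0.525  # super-resolution pixel is half the physical pixel
dose = total_dose_per_A2(6.1, n_frames=55, frame_time_s=0.2, physical_pixel_A=physical_pixel_A)
dose_per_frame = dose / 55
```

Under these assumptions the 11 s exposure corresponds to about 61 e−/Å² in total, or about 1.1 e−/Å² per frame.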
Cryo-EM data processing
The dose-fractionated movies were gain-normalized, aligned, and dose-weighted using MotionCor250. Defocus values were estimated and corrected using the Gctf program51. A total of 1,168,229 particles were automatically picked using Relion 3.0 software52. The auto-picked particles were extracted with a box size of 110 px (2x binning). Reference-free 2D classification was performed to select well-defined particles. After four rounds of 2D classification, a total of 403,252 particles were selected for subsequent 3D classification. The initial 3D model23 was filtered to 60 Å during 3D classification in Relion using 8 classes. The 199,758 particles from Class 6 (Supplementary Fig. 5) were re-extracted with a box size of 220 px (1.05 Å/px, unbinned) and used for subsequent 3D refinement. The 3D-refined particles were then subjected to a second round of 3D classification using 3 classes. Classes 2 and 3 were combined, yielding 196,832 particles. These particles were subjected to 3D refinement without applying any symmetry. The maps were subsequently subjected to local defocus correction and Bayesian particle polishing in Relion 3.1. Global resolution and B factor (−97.6 Å2) of the map were estimated by applying a soft mask around the protein density, using the gold-standard Fourier shell correlation (FSC) = 0.143 criterion, resulting in an overall resolution of 3.03 Å. C3 symmetry was applied to the Bayesian polished C1 map using Relion 3.1, with 590,496 particles (three times the 196,832 particles of the C1 stack) yielding a final resolution of 2.80 Å (B factor of −106.7 Å2). Local resolution maps were generated using Relion 3.1 (Supplementary Fig. 6).
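Relion reports the FSC = 0.143 resolution directly. As an illustration of what the criterion means, the sketch below takes an FSC curve sampled at increasing spatial frequencies and returns the resolution at which it first falls below the threshold, using linear interpolation between shells. The function name and the example curve are hypothetical and do not reproduce the curves of this study.

```python
def resolution_at_threshold(spatial_freqs, fsc_values, threshold=0.143):
    """Resolution (in angstroms) at which an FSC curve first drops below threshold.

    spatial_freqs: shell frequencies in 1/angstrom, in ascending order.
    fsc_values: FSC value for each shell.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc_values)):
        above, below = fsc_values[i - 1], fsc_values[i]
        if above >= threshold > below:
            # Linear interpolation between the two shells bracketing the crossing.
            frac = (above - threshold) / (above - below)
            freq = spatial_freqs[i - 1] + frac * (spatial_freqs[i] - spatial_freqs[i - 1])
            return 1.0 / freq
    return None


# Hypothetical FSC curve for illustration only.
freqs = [0.10, 0.20, 0.30, 0.40]
fsc = [1.00, 0.90, 0.243, 0.043]
res = resolution_at_threshold(freqs, fsc)
```

For this made-up curve the crossing lies halfway between the last two shells, at 0.35 Å⁻¹, that is, at a resolution of roughly 2.86 Å.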
Cryo-EM model building and analysis
For model building, UCSF Chimera53 was used to fit an atomic model of the SARS-CoV-2 spike in the locked conformation (PDB ID 6ZB5)23 into the C3-symmetrized BriSΔ cryo-EM map. The model was rebuilt using sharpened54 and unsharpened maps in Coot55 and then fitted into the C1 cryo-EM map. Namdinator56 and Coot were used to improve the fit, and N-linked glycans were built into the density for both models where visible. Restraints for non-standard ligands were generated with eLBOW57. The models for the C1 and C3-symmetrized closed conformations were real-space refined with Phenix58, and their quality was additionally analyzed using MolProbity59 and EMRinger60 to validate the stereochemistry of the components. Figures were prepared using UCSF Chimera and PyMOL (Schrödinger, Inc).
Masked 3D classification in Relion 3.1
The refined C3 particle stack was expanded 3-fold according to C3 symmetry in Relion. The symmetry-expanded particle stack was then used as input for masked 3D classification, with the focus mask corresponding to an individual single-chain subunit. Masked 3D classification61 was performed without alignment using 5 classes for the C3-expanded particles (Supplementary Fig. 8). Visual inspection of the 3D classes showed that 95% of the chains clustered in the first 3 classes, with LA bound in the hydrophobic pocket within the RBD.
LA detection was performed by multiple reaction monitoring (MRM) experiments using a Sciex QTrap 4500 system coupled to hydrophilic interaction liquid chromatography. The Analyst 1.7.0 software from Sciex was used for instrument control. For calibration, an unprocessed LA (Sigma Aldrich, Germany) sample dissolved in acetonitrile was used. For spike sample preparation, 100 µL of purified protein sample (1.09 mg/mL) was mixed with 400 µL chloroform for 2 h on a horizontal shaker in a teflon-sealed glass vial at 25 °C. Subsequently, the top organic phase was transferred to a new glass vial and placed in a desiccator for 30 min to evaporate the chloroform. After all chloroform had evaporated, 50 µL acetonitrile was added to dissolve the fatty acids. From this solution, 10 µL was injected for MRM experiments. A flow rate of 0.55 mL/min was used for the binary flow elution program with acetonitrile (solvent B). The measurements were performed in negative ionization mode. Source ionization, precursor selection and fragmentation parameters for LA monitoring using hydrophilic interaction liquid chromatography were: curtain gas = 35 psi; temperature = 600 °C; nebulizer gas = 65 psi; heater gas = 80 psi; collision gas = 9; ionization voltage = −4500 V; Q1 = 279.2 m/z; Q3 = 261.2 m/z; dwell time = 250 ms; collision energy = −24 V; declustering voltage = −100 V; cell exit potential = −11 V; entrance potential = −10 V. The recorded data were analyzed using the MultiQuant 3.0.2 software from Sciex.
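The Q1 and Q3 settings can be checked against the composition of linoleic acid (C18H32O2). The sketch below computes the monoisotopic m/z of the deprotonated ion and of the same ion after neutral loss of water. Assigning the Q3 fragment to water loss is our inference from the mass difference and is not stated in the protocol; the helper names are ours.

```python
# Monoisotopic atomic masses (Da) and the proton mass (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647


def monoisotopic_mass(composition):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def deprotonated_mz(composition):
    """m/z of the singly charged [M-H]- ion observed in negative mode."""
    return monoisotopic_mass(composition) - PROTON


linoleic_acid = {"C": 18, "H": 32, "O": 2}
water = {"H": 2, "O": 1}

q1 = deprotonated_mz(linoleic_acid)   # precursor ion
q3 = q1 - monoisotopic_mass(water)    # precursor after loss of H2O (assumed assignment)
```

Rounded to one decimal place, the two values match the Q1 and Q3 settings of 279.2 and 261.2 m/z listed above.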
Surrogate virus neutralization test assay
The SARS-CoV-2 surrogate virus neutralization test (sVNT) kit was obtained from GenScript Inc. (New Jersey, USA). Serial dilution series (from 0 to 2048 nM) of the different purified SARS-CoV-2 S proteins and RBD were prepared in PBS pH 7.5. These dilution series were treated as described23. Briefly, the dilution series were mixed with the same volume of diluted horseradish peroxidase-conjugated RBD (HRP-RBD) from the sVNT kit and incubated at 37 °C for 30 min. In all, 100 μL of each mixture was then added to the ELISA plate wells coated with ACE2, according to the manufacturer’s protocol. Each sample was measured in triplicate. The plate was then incubated at 37 °C for 15 min, sealed with tape to avoid evaporation, and subsequently washed 4 times with 1x wash buffer. The signal was developed by adding tetramethyl-benzidine (TMB) solution to each well and incubating for 15 min in the dark, followed by addition of the stop solution supplied with the kit. The absorbance at 450 nm was recorded immediately. The data were plotted using Microsoft Excel. Standard deviations from three independent replicates were added as error bars.
Surface plasmon resonance experiments
Interaction experiments using surface plasmon resonance (SPR) were carried out with a Biacore T200 system (GE Healthcare) according to the manufacturer’s protocols and recommendations. Experiments were set up as described before23. Briefly, purified biotinylated ACE223 was immobilized on a streptavidin-coated (SA) chip (GE Healthcare) at ~50 RUs. BriSΔ was injected at concentrations of 40 nM, 80 nM, 120 nM, and 160 nM. SΔRBM, SRRAR->A, SRRAR, and SRRAR* were injected at 40 nM and 160 nM. The running buffer for all measurements was PBS buffer pH 7.5. Sensorgrams were analyzed and KD, kon and koff values were determined with the Biacore Evaluation Software (GE Healthcare), fitting the raw data using a 1:1 binding model. All experiments were performed in triplicate.
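The fits were performed in the Biacore Evaluation Software. For orientation, the sketch below writes out the closed-form expressions of a 1:1 (Langmuir) binding model, in which the dissociation constant is KD = koff/kon. The rate constants and Rmax in the usage lines are placeholders, not the values measured in this study.

```python
import math


def equilibrium_response(conc, kd, rmax):
    """Steady-state response for analyte concentration conc (same units as kd)."""
    return rmax * conc / (conc + kd)


def association_response(t, conc, kon, koff, rmax):
    """Response after t seconds of injection at concentration conc (M),
    starting from an empty surface."""
    kobs = kon * conc + koff  # observed rate constant
    req = rmax * kon * conc / kobs
    return req * (1.0 - math.exp(-kobs * t))


def dissociation_response(t, r0, koff):
    """Response t seconds after the injection ends, starting from response r0."""
    return r0 * math.exp(-koff * t)


# Placeholder parameters for illustration only.
kon, koff, rmax = 1.0e5, 1.0e-3, 100.0  # 1/(M s), 1/s, RU
kd = koff / kon                         # 1e-8 M, i.e. 10 nM
r_120s = association_response(120.0, 40e-9, kon, koff, rmax)
```

At an analyte concentration equal to KD the steady-state response is half of Rmax, and the association phase approaches its plateau with the observed rate kon·C + koff.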
Molecular dynamics simulations
The starting structures comprised closed wild-type (WT, P0DTC2) and furin-site deletion (BriSΔ) spike trimers, which had been equilibrated either in the presence of linoleic acid (LA) or without it (apo). The target structure was the corresponding S trimer in which a single chain had an RBD in an extended conformation, as in EMD-1114623. Exploratory forces were applied to all four systems to find a value that resulted in a distribution of openings when subjected to a 10 ns dynamic simulation. The starting structures were then simulated for 10 ns (each system was replicated 50 times) with harmonic restraints (k = 0.2 kJ/mol/nm) applied to pull the RBD Cα atoms into alignment with the corresponding Cα atoms of the RBD in the open structure (EMD-11146)23. The BriSΔ LA-bound system was subjected to a further 50 replicate simulations at k = 0.3 kJ/mol/nm, because it failed to open at k = 0.2 kJ/mol/nm. A paired t-test was applied to test for significance when comparing the data sets.
Protein coordinates were prepared as described23. Loops for the unstructured regions of the locked (LA-bound) and open (apo) cryo-EM structures were built using Chimera (UCSF)62. Loop deletions were carried out in Chimera for the furin-site deletion (BriSΔ) and the sequences were verified by Clustal63 alignment with the published sequence11. Likely disulfide bonds were reconstructed based on experimentally observed distances and each chain sequence was used in an EBI-blast check to verify WT Spike sequence post build. PROCHECK64 was then used to check the quality of the resulting structure prior to simulation. ACPYPE65 was used to prepare the topologies for linoleic acid. For the purposes of comparison for the targeted MD the first 10 residues of the open chain were removed so that both the starting and open target structures had the same number of residues.
All simulations were performed under the Amber99SB-ildn66,67,68 forcefield in NPT ensembles at 310 K using periodic boundary conditions. We have previously shown that these protocols give structures in good agreement with experiment for the WT spike. Hydrogen atoms, consistent with pH 7, were added to the complex. Short-range electrostatic and van der Waals interactions were truncated at 1.4 nm while long-range electrostatics were treated with the particle-mesh Ewald method and a long-range dispersion correction was applied. A simulation box extending 2 nm from the protein was filled with TIP3P water molecules and 150 mM Na+ and Cl– ions were added to attain a neutral charge overall. The pressure was controlled by the Berendsen barostat and temperature by the V-rescale thermostat. The simulations were integrated with a leapfrog algorithm over a 2 fs time step, constraining bond vibrations with the P-LINCS method. Structures were saved every 0.1 ns for analysis in each of the simulations over 100 ns. Simulations were run on the Bristol supercomputer BlueCrystal4, the BrisSynBio BlueGem, and the UK supercomputer, ARCHER.
The GROMACS-2019.269 suite of software was used to set up and perform the molecular dynamics simulations and analyses. Molecular graphics manipulations and visualizations were performed using VMD-1.9.170 and Chimera-1.10.262.
To map the structural changes associated with LA removal from the FFA-binding pocket in the WT and BriSΔ spike proteins, two sets of 90 dynamical-nonequilibrium simulations were performed. In this approach, the response of a system to a perturbation is directly computed by calculating the difference of a given property between simulations with and without a perturbation. Subtracting the perturbed and unperturbed pairs of simulations at a given time, and averaging the results over many replicates, allows for the identification of the events associated with signal propagation and the determination of the statistical significance of the observation33,34,35,71. The perturbation was generated by the (instantaneous) removal of LA from the FFA-binding pocket. Note that the perturbation used here is not intended to represent the physical process of unbinding, but rather is designed to force the system out of equilibrium and drive a rapid response within the protein as it adapts to LA removal. A graphical representation of the procedure is shown in Supplementary Fig. 10a. Three equilibrium MD simulations, 200 ns each, were performed for the locked form of the unglycosylated head region of the WT and BriSΔ S proteins with LA bound. All equilibrium simulations were considered sufficiently equilibrated after 50 ns (Supplementary Fig. 10b). The simulation conditions for the equilibrium and dynamical-nonequilibrium simulations were identical to those described previously23,72. The starting conformations for the nonequilibrium simulations were obtained from the equilibrated part of the equilibrium LA-bound simulations. Conformations were taken every 5 ns, and all LA molecules were (instantaneously) removed from the FFA-binding pockets. The resulting apo system was then simulated for 5 ns (Supplementary Figs. 11 and 12). In total, 180 apo dynamical-nonequilibrium simulations were performed (90 simulations each for BriSΔ and WT S).
Three positive ions were also removed from the solvent to maintain the electroneutrality of the systems (required for the PME method73).
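The count of 90 nonequilibrium simulations per system follows from the sampling scheme: three 200 ns equilibrium runs, the first 50 ns discarded, and one starting conformation every 5 ns. The sketch below reproduces this arithmetic. Whether the frame at 50 ns or the frame at 200 ns is the one excluded is not stated; the choice made here (start included, end excluded) is an assumption that gives the reported total.

```python
def snapshot_times(total_ns, equilibration_ns, stride_ns):
    """Times (ns) at which starting conformations are taken from one
    equilibrium run: from the end of equilibration (included) up to the end
    of the run (excluded), in steps of stride_ns."""
    return list(range(equilibration_ns, total_ns, stride_ns))


times = snapshot_times(total_ns=200, equilibration_ns=50, stride_ns=5)
n_equilibrium_runs = 3
n_nonequilibrium_per_system = n_equilibrium_runs * len(times)
```

Each of the resulting 90 conformations per system then seeds one 5 ns apo simulation, paired with the continuing LA-bound equilibrium trajectory from the same time point.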
The Kubo-Onsager approach33,34,35,71 was used to extract and characterize the structural changes in the protein associated with LA removal. The response to the perturbation is measured directly by comparing a given property (in this case, the position of the Cα atoms) between the perturbed (apo) and unperturbed (LA-bound) simulations at a given time, and averaging over multiple pairs of trajectories (Supplementary Fig. 10a). For each pair of LA-bound equilibrium and perturbed apo dynamical-nonequilibrium simulations, the difference in position of each Cα atom was determined at equivalent points in time, namely after 0, 0.1, 1, 3, and 5 ns of simulation. The pairwise comparison between the positions of Cα atoms allows for the direct identification of the conformational rearrangements while reducing the noise coming from side-chain fluctuations. The Cα-positional deviations at each point in time were averaged over all 90 replicates. The statistical significance of the structural changes identified here is demonstrated by a low standard error of the averages (data not shown). Upon LA removal, the FFA pocket contracts and becomes occupied by the side chains of the residues lining the pocket (Supplementary Figs. 13–15). Additionally, no breaking of the cation-π interaction between R634 and Y837 (Supplementary Fig. 16) or unfolding of the Y837 region was observed in BriSΔ upon LA removal. Similarly, no significant changes were observed in the network of interactions formed by R1039 in the 5 ns following LA removal (Supplementary Figs. 17 and 18).
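The averaging step can be sketched as follows, assuming that the Cα coordinates of each perturbed/unperturbed pair have already been extracted at the same time point and, where needed, superimposed. The function names and the data layout (lists of coordinate tuples) are ours and are not those of the analysis scripts used in this work.

```python
import math


def ca_displacements(perturbed, unperturbed):
    """Per-atom distance between matching C-alpha positions of one
    perturbed (apo) and one unperturbed (LA-bound) frame."""
    return [math.dist(p, u) for p, u in zip(perturbed, unperturbed)]


def mean_and_sem(per_pair):
    """Average per-atom displacements over trajectory pairs.

    per_pair: one list of per-atom displacements for each trajectory pair.
    Returns (means, standard errors of the mean), one value per atom.
    """
    n_pairs = len(per_pair)
    means, sems = [], []
    for atom in range(len(per_pair[0])):
        values = [pair[atom] for pair in per_pair]
        mean = sum(values) / n_pairs
        variance = (
            sum((v - mean) ** 2 for v in values) / (n_pairs - 1) if n_pairs > 1 else 0.0
        )
        means.append(mean)
        sems.append(math.sqrt(variance / n_pairs))
    return means, sems


# Two hypothetical trajectory pairs with a single C-alpha atom each.
pair_a = ca_displacements([(0.0, 0.0, 0.0)], [(0.0, 0.0, 1.0)])
pair_b = ca_displacements([(0.0, 0.0, 0.0)], [(3.0, 0.0, 0.0)])
means, sems = mean_and_sem([pair_a, pair_b])
```

Repeating this at each sampled time (0, 0.1, 1, 3 and 5 ns) gives the time evolution of the response per residue, with the standard error indicating how consistent the response is across the replicates.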
Statistical significance was determined by calculating standard deviations following standard mathematical formulae. Standard deviations were calculated from independent triplicates unless indicated otherwise.
Further information on research design is available in the Nature Research Reporting Summary linked to this article.
The data that support this study are available from the corresponding authors upon reasonable request. Structural datasets and coordinates generated during the current study have been deposited in the Electron Microscopy Data Bank (EMDB) under accession numbers EMD-12818 (C3 structure) and EMD-12842 (C1 structure) and in the Protein Data Bank (PDB) under accession numbers: 7OD3 (C3 structure) and 7ODL (C1 structure). Reagents are available from K.G., C.S. and I.B. under a material transfer agreement with the University of Bristol. Source data are provided with this paper.
Coutard, B. et al. The spike glycoprotein of the new coronavirus 2019-nCoV contains a furin-like cleavage site absent in CoV of the same clade. Antivir. Res. 176, 104742 (2020).
Shang, J. et al. Structural basis of receptor recognition by SARS-CoV-2. Nature 581, 221–224 (2020).
Millet, J. K. & Whittaker, G. R. Host cell proteases: Critical determinants of coronavirus tropism and pathogenesis. Virus Res. 202, 120–134 (2015).
Hoffmann, M. et al. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell 181, 271–280.e8 (2020).
Matsuyama, S. et al. Enhanced isolation of SARS-CoV-2 by TMPRSS2-expressing cells. Proc. Natl Acad. Sci. USA 117, 7001–7003 (2020).
Ou, X. et al. Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV. Nat. Commun. 11, 1620 (2020).
Wrapp, D. et al. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation. Science 367, 1260–1263 (2020).
Walls, A. C. et al. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein. Cell 181, 281–292.e6 (2020).
Cai, Y. et al. Distinct conformational states of SARS-CoV-2 spike protein. Science 369, 1586–1592 (2020).
Wrobel, A. G. et al. SARS-CoV-2 and bat RaTG13 spike glycoprotein structures inform on virus evolution and furin-cleavage effects. Nat. Struct. Mol. Biol. 27, 763–767 (2020).
Davidson, A. D. et al. Characterisation of the transcriptome and proteome of SARS-CoV-2 reveals a cell passage induced in-frame deletion of the furin-like cleavage site from the spike glycoprotein. Genome Med. 12, 68 (2020).
Ogando, N. S. et al. SARS-coronavirus-2 replication in Vero E6 cells: replication kinetics, rapid adaptation and cytopathology. J. Gen. Virol. 101, 925–940 (2020).
Sasaki, M. et al. SARS-CoV-2 variants with mutations at the S1/S2 cleavage site are generated in vitro during propagation in TMPRSS2-deficient cells. PLoS Pathog. 17, e1009233 (2021).
Klimstra, W. B. et al. SARS-CoV-2 growth, furin-cleavage-site adaptation and neutralization using serum from acutely infected hospitalized COVID-19 patients. J. Gen. Virol. 101, 1156–1169 (2020).
Lau, S. Y. et al. Attenuated SARS-CoV-2 variants with deletions at the S1/S2 junction. Emerg. Microbes Infect. 9, 837–842 (2020).
Wong, Y. C. et al. Natural transmission of bat-like SARS-CoV-2PRRA variants in COVID-19 patients. Clin. Infect. Dis. 73, e437–e444 (2020).
Peacock, T. P. et al. The furin cleavage site in the SARS-CoV-2 spike protein is required for transmission in ferrets. Nat. Microbiol. 6, 899–909 (2021).
Zhu, Y. et al. A genome-wide CRISPR screen identifies host factors that regulate SARS-CoV-2 entry. Nat. Commun. 12, 961 (2021).
Johnson, B. A. et al. Loss of furin cleavage site attenuates SARS-CoV-2 pathogenesis. Nature 591, 293–299 (2021).
Shang, J. et al. Cell entry mechanisms of SARS-CoV-2. Proc. Natl Acad. Sci. USA 117, 11727–11734 (2020).
Dittmar, M. et al. Drug repurposing screens reveal cell-type-specific entry pathways and FDA-approved drugs active against SARS-Cov-2. Cell Rep. 35, 108959 (2021).
Murgolo, N. et al. SARS-CoV-2 tropism, entry, replication, and propagation: considerations for drug discovery and development. PLoS Pathog. 17, e1009225 (2021).
Toelzer, C. et al. Free fatty acid binding pocket in the locked structure of SARS-CoV-2 spike protein. Science 370, 725–730 (2020).
Zhang, S. et al. Bat and pangolin coronavirus spike glycoprotein structures provide insights into SARS-CoV-2 evolution. Nat. Commun. 12, 1607 (2021).
UniProt Consortium. UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res. 49, D480–D489 (2021).
Karathanou, K. et al. A graph-based approach identifies dynamic H-bond communication networks in spike protein S of SARS-CoV-2. J. Struct. Biol. 212, 107617 (2020).
Zhang, L. et al. Furin cleavage of the SARS-CoV-2 spike is modulated by O-glycosylation. Proc. Natl Acad. Sci. USA 118, e2109905118 (2021).
Rambaut, A. et al. Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations. (2020).
Bennett, E. P. et al. Control of mucin-type O-glycosylation: a classification of the polypeptide GalNAc-transferase gene family. Glycobiology 22, 736–756 (2012).
Saito, A. et al. Enhanced fusogenicity and pathogenicity of SARS-CoV-2 Delta P681R mutation. Nature https://doi.org/10.1038/s41586-021-04266-9 (2021) (Epub ahead of print).
Wrobel, A. G. et al. Structure and binding properties of Pangolin-CoV spike glycoprotein inform the evolution of SARS-CoV-2. Nat. Commun. 12, 837 (2021).
Gur, M. et al. Conformational transition of SARS-CoV-2 spike glycoprotein between its closed and open states. J. Chem. Phys. 153, 075101 (2020).
Ciccotti, G., Jacucci, G. & McDonald, I. R. Thought-experiments by molecular dynamics. J. Stat. Phys. 21, 1–12 (1979).
Ciccotti, G. Computer Simulation In Material Science. (ed. P. V. Meyer) 119–137 (Kluwer Academic Publishers, 1991).
Ciccotti, G. & Ferrario, M. Non-equilibrium by molecular dynamics: a dynamical approach. Mol. Simul. 42, 1385–1400 (2016).
Qu, K., Xiong, X., Ciazynska, K. A., Carter, A. P. & Briggs, J. A. G. Structures and function of locked conformations of SARS-CoV-2 spike. Preprint at https://www.biorxiv.org/content/10.1101/2021.03.10.434733v1 (2021).
Daly, J. L. et al. Neuropilin-1 is a host factor for SARS-CoV-2 infection. Science 370, 861–865 (2020).
Cantuti-Castelvetri, L. et al. Neuropilin-1 facilitates SARS-CoV-2 cell entry and infectivity. Science 370, 856–860 (2020).
Winstone, H. et al. The polybasic cleavage site in SARS-CoV-2 spike modulates viral sensitivity to type I interferon and IFITM2. J. Virol. 95, e02422–20 (2021).
Yurkovetskiy, L. et al. Structural and functional analysis of the D614G SARS-CoV-2 spike protein variant. Cell 183, 739–751.e8 (2020).
Berger, I. & Schaffitzel, C. The SARS-CoV-2 spike protein: balancing stability and infectivity. Cell Res. 30, 1059–1060 (2020).
Benton, D. J. et al. Receptor binding and priming of the spike protein of SARS-CoV-2 for membrane fusion. Nature 588, 327–330 (2020).
Perelson, A. S. et al. Decay characteristics of HIV-1-infected compartments during combination therapy. Nature 387, 188–191 (1997).
Tonkin-Hill, G. et al. Patterns of within-host genetic diversity in SARS-CoV-2. Elife. 10, e66857 (2021).
Lythgoe, K. A. et al. SARS-CoV-2 within-host diversity and transmission. Science 372, eabg0821 (2021).
Morley, V. J., Sistrom, M., Usme-Ciro, J. A., Remold, S. K. & Turner, P. E. Evolution in spatially mixed host environments increases divergence for evolved fitness and intrapopulation genetic diversity in RNA viruses. Virus Evol. 2, vev022 (2016).
Donovan-Banfield, I., Turnell, A. S., Hiscox, J. A., Leppard, K. N. & Matthews, D. A. Deep splicing plasticity of the human adenovirus type 5 transcriptome drives virus evolution. Commun. Biol. 3, 124 (2020).
Fitzgerald, D. J. et al. Protein complex expression by using multigene baculoviral vectors. Nat. Methods 3, 1021–1032 (2006).
Amanat, F. et al. A serological assay to detect SARS-CoV-2 seroconversion in humans. Nat. Med. 26, 1033–1036 (2020).
Zheng, S. Q. et al. MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat. Methods 14, 331–332 (2017).
Zhang, K. Gctf: Real-time CTF determination and correction. J. Struct. Biol. 193, 1–12 (2016).
Scheres, S. H. RELION: implementation of a Bayesian approach to cryo-EM structure determination. J. Struct. Biol. 180, 519–530 (2012).
Goddard, T. D., Huang, C. C. & Ferrin, T. E. Visualizing density maps with UCSF Chimera. J. Struct. Biol. 157, 281–287 (2007).
Terwilliger, T. C., Sobolev, O. V., Afonine, P. V. & Adams, P. D. Automated map sharpening by maximization of detail and connectivity. Acta Crystallogr. D Struct. Biol. 74, 545–559 (2018).
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D Biol. Crystallogr. 66, 486–501 (2010).
Kidmose, R. T. et al. Namdinator - automatic molecular dynamics flexible fitting of structural models into cryo-EM and crystallography experimental maps. IUCrJ 6, 526–531 (2019).
Moriarty, N. W., Grosse-Kunstleve, R. W. & Adams, P. D. electronic Ligand Builder and Optimization Workbench (eLBOW): a tool for ligand coordinate and restraint generation. Acta Crystallogr. D Biol. Crystallogr. 65, 1074–1080 (2009).
Liebschner, D. et al. Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix. Acta Crystallogr. D Struct. Biol. 75, 861–877 (2019).
Chen, V. B. et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr. D Biol. Crystallogr. 66, 12–21 (2010).
Barad, B. A. et al. EMRinger: side chain-directed model and map validation for 3D cryo-electron microscopy. Nat. Methods 12, 943–946 (2015).
Nakane, T., Kimanius, D., Lindahl, E. & Scheres, S. H. Characterisation of molecular motions in cryo-EM single-particle data by multi-body refinement in RELION. Elife 7, e36861 (2018).
Pettersen, E. F. et al. UCSF Chimera-a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
Sievers, F. et al. Fast, scalable generation of high-quality protein multiple sequence alignments using Clustal Omega. Mol. Syst. Biol. 7, 539 (2011).
Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. P. PROCHECK - a program to check the stereochemical quality of protein structures. J. Appl. Cryst. 26, 283–291 (1993).
Sousa da Silva, A. W. & Vranken, W. F. ACPYPE - AnteChamber PYthon Parser interfacE. BMC Res. Notes 5, 367 (2012).
Wang, J., Wolf, R. M., Caldwell, J. W., Kollman, P. A. & Case, D. A. Development and testing of a general amber force field. J. Comput. Chem. 25, 1157–1174 (2004).
Case, D. A. et al. The Amber biomolecular simulation programs. J. Comput. Chem. 26, 1668–1688 (2005).
Lindorff-Larsen, K. et al. Improved side-chain torsion potentials for the Amber ff99SB protein force field. Proteins 78, 1950–1958 (2010).
Van Der Spoel, D. et al. GROMACS: fast, flexible, and free. J. Comput. Chem. 26, 1701–1718 (2005).
Humphrey, W., Dalke, A. & Schulten, K. VMD: visual molecular dynamics. J. Mol. Graph 14, 33–38 (1996). 27-8.
Oliveira, A. S. F., Ciccotti, G., Haider, S. & Mulholland, A. J. Dynamical nonequilibrium molecular dynamics reveals the structural basis for allostery and signal propagation in biomolecular systems. Eur. Phys. J. B 94, 144 (2021).
Shoemark, D. K. et al. Molecular simulations suggest vitamins, retinoids and steroids as ligands of the free fatty acid pocket of the SARS-CoV-2 spike protein. Angew. Chem. Int. Ed. Engl. 60, 7098–7110 (2021).
Essmann, U., Perera, L. & Berkowitz, M. L. A smooth particle mesh Ewald method. J. Chem. Phys. 103, 8577–8593 (1995).
We thank all members of the Berger and Schaffitzel teams as well as Robin Shattock (Imperial College, UK) and Adam Finn (Bristol UNCOVER Group and Children’s Vaccine Centre, Bristol Medical School) for their assistance and advice. We thank Simon Burbidge, Thomas Batstone and Matt Williams for computation infrastructure support. We would like to thank the Advanced Computing Research Centre (ACRC) at the University of Bristol for access to BlueCryo, BlueCrystal Phase 4 and BlueGEM, and the UK HECBioSim for access to the UK supercomputer, ARCHER. We are particularly grateful to Thiru Thangarajah (Genscript) for early access to Genscript’s cPass™ SARS-CoV-2 Neutralization Antibody Detection/Surrogate Virus Neutralization Test Kit (L00847). We thank Sebastian Fabritz and the Core Facility for Mass Spectrometry at the Max Planck Institute for Medical Research for their support on MS measurements. For the purpose of Open Access, the authors have applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. This research received support from the Elizabeth Blackwell Institute for Health Research and the EPSRC Impact Acceleration Account EP/R511663/1, University of Bristol, from BrisSynBio, a BBSRC/EPSRC Research Centre for synthetic biology at the University of Bristol (BB/L01386X/1) (I.B., C.S., A.J.M., D.K.S., and A.S.F.O.), and from the BBSRC (BB/P000940/1) (C.S. and I.B.). This work received generous support from the Oracle Higher Education and Research program to enable cryo-EM data processing using Oracle’s high-performance public cloud infrastructure (https://cloud.oracle.com/en_US/cloud-infrastructure) and the EPSRC through a COVID-19 project award via HECBioSim to access ARCHER (A.J.M.). We acknowledge support and assistance by the Wolfson Bioimaging Facility and the GW4 Facility for High-Resolution Electron Cryo-Microscopy funded by the Wellcome Trust (202904/Z/16/Z and 206181/Z/17/Z) and BBSRC (BB/R000484/1).
The authors are grateful to University of Bristol’s Alumni and Friends, which funded the ImageXpress Pico Imaging System. O.S. acknowledges support from the Elisabeth Muerer Foundation, the Max Planck School Matter to Life and the Heidelberg Biosciences International Graduate School. J.S. is the Weston Visiting Professor at the Weizmann Institute of Science, part of the excellence cluster CellNetworks at Heidelberg University and acknowledges funding from the European Research Council (ERC, contract no. 294852), SynAd and the MaxSynBio Consortium, funded by the Federal Ministry of Education and Research of Germany and the Max Planck Society, from the SFB 1129 and Project 240245660-SFB1129 P15 of the German Research Foundation (DFG) and from the Volkswagen Stiftung (priority call “Life?”). A.D.D. and D.A.M. are supported by the United States Food and Drug Administration (HHSF223201510104C) and UK Research and Innovation/Medical Research Council (MRC) (MR/V027506/1). M.K.W. is supported by MRC grants MR/R020566/1 and MR/V027506/1 (awarded to A.D.D.). A.J.M. is supported by the British Society for Antimicrobial Chemotherapy (BSAC-COVID-30) and the EPSRC (EP/M022609/1, CCP-BioSim). I.B. acknowledges support from the EPSRC Innovative Future Vaccine Manufacturing and Research Hub (EP/R013764/1). C.S. and I.B. are Investigators of the Wellcome Trust (210701/Z/18/Z; 106115/Z/14/Z).
F.G. and I.B. report shareholding in Imophoron Ltd, unrelated to this Correspondence. D.F. and I.B. report shareholding in Geneva Biotech SARL, unrelated to this Correspondence. C.S., D.F., and I.B. report shareholding in Halo Therapeutics Ltd related to this Correspondence. Patent applications describing methods, material compositions and formulations based on the present observations have been filed. The remaining authors declare no competing interests.
Peer review information
Nature Communications thanks the anonymous reviewers for their contribution to the peer review of this work. Peer reviewer reports are available.
Gupta, K., Toelzer, C., Williamson, M.K. et al. Structural insights in cell-type specific evolution of intra-host diversity by SARS-CoV-2. Nat Commun 13, 222 (2022). https://doi.org/10.1038/s41467-021-27881-6