Main

COVID-19, caused by the coronavirus SARS-CoV-2 (refs. 1,2), continues to devastate the world. The viral RNA-dependent RNA polymerase (RdRp, encoded by nonstructural protein 12 or nsp12) functions as a holo-RdRp (comprising nsp7/nsp82/nsp12) in a replication–transcription complex (RTC) (holo-RdRp + RNA) to direct RNA synthesis from the viral RNA genome3,4,5. The RdRp is also a target for the clinically approved antiviral remdesivir6,7,8. In addition to the holo-RdRp, the virus encodes several nucleic acid processing enzymes, including a helicase (nsp13), an exonuclease (nsp14), an endonuclease (nsp15) and methyltransferases (nsp14 and nsp16)9. Little is known about how these enzymes coordinate to replicate and transcribe the viral genome.

Nsp13, essential for viral replication10,11,12,13, is a superfamily 1B (SF1B) helicase that can unwind DNA or RNA substrates with a 5′ → 3′ directionality14,15,16. Along with the two canonical RecA ATPase domains of SF1 helicases14,17, nsp13 contains three additional domains: an N-terminal zinc-binding domain (unique to nidoviral helicases), a stalk and a 1B domain13,18,19. Previous studies established that the nsp13 helicase forms a stable complex with the RTC, and single-particle cryogenic-electron microscopy (cryo-EM) structures of an nsp132–RTC (the RTC with two nsp13 protomers bound) have been determined20,21,22.

In the nsp132–RTC structure, two protomers of nsp13 sit on top of the RTC with each nsp13-zinc-binding domain interacting with one of the two N-terminal helical extensions of nsp8 (refs. 20,21,22). The two nsp13 protomers were originally named nsp13.1 and nsp13.2 (ref. 20), but we will use the nomenclature proposed by Hillen23, nsp13T (binds on the ‘Thumb’ side of the RdRp) and nsp13F (binds on the ‘Fingers’ side of the RdRp) (Extended Data Fig. 1). This overall architecture places the nsp13T active site directly in the path of the downstream template-RNA (t-RNA). The cryo-EM maps showed that the 5′ single-stranded overhang of the t-RNA (Extended Data Fig. 1) passed through the nucleic acid binding channel of nsp13T (ref. 24), but the low resolution of the map due to structural heterogeneity precluded detailed modeling20.

The structural analysis of the nsp132–RTC provided new perspectives into the role of the nsp13 helicase in the complex viral replication–transcription program, suggesting that nsp13 may facilitate processive elongation by the RdRp on the highly structured RNA genome25,26, but may also generate backtracked RTCs for proofreading, template-switching during subgenomic RNA transcription or both20,27. How nsp13 directs these diverse processes that regulate RdRp function remains less well understood. For instance, the structures indicate that nsp13 translocates on the t-RNA strand in the 5′ → 3′ direction16, while the RdRp would translocate on the same strand in the opposite direction (Extended Data Fig. 1). How can the RdRp rapidly replicate the roughly 30 kilobase (kb) viral genome28 if it is opposed by the helicase? Also, what is the role of the second nsp13 protomer (nsp13F), which appears capable of ATPase and translocation/helicase activity but does not appear to be engaged with nucleic acid in the structures20,27?

Here we describe an extensive structural analysis of a cryo-EM dataset of the nsp13–RTC, combined with molecular dynamics (MD) simulation analysis of the resulting structures. The results yield a cryo-EM map of the nsp132–RTC at a nominal resolution of 2.8 Å (2.1–2.5 Å in the active site core of the RdRp, Extended Data Figs. 2–4). Structural heterogeneity apparent in the nsp13 portions of the map was resolved by classification approaches, revealing four distinct conformational states of the nsp13 subunits. Analysis of these conformational states suggests solutions to the apparent contradictions regarding the role of nsp13 and provides further insight into models for nsp13 function during viral replication–transcription.

Results

An augmented cryo-EM dataset allows nsp13–RTC structural analysis

Previously we described a single-particle cryo-EM analysis of a stable SARS-CoV-2 nsp13–RTC from a curated set of 88,058 particle images20. These particles were classified into three distinct assemblies, nsp131–RTC (4.0 Å nominal resolution), nsp132–RTC (3.5 Å) and a dimer of nsp132–RTC ((nsp132–RTC)2; 7.9 Å). Here we analyzed a much larger dataset (nearly five times as many particles, Extended Data Fig. 2 and Supplementary Table 1) collected from the same sample preparation. From a consensus refinement (Extended Data Figs. 2 and 3, map 1; note that Extended Data Fig. 2 shows the details of the cryo-EM processing pipeline and Extended Data Fig. 3 is a streamlined cryo-EM processing pipeline that highlights the essential steps), the particles were classified29 into the same three assemblies observed previously (nsp131–RTC (map 2), nsp132–RTC (map 3), (nsp132–RTC)2 (map 4))20 with a very similar distribution of particles between the three assemblies (Table 1, Extended Data Figs. 2 and 3 and Supplementary Table 1), confirming the robustness of the classification procedure. We focus primarily on the nsp132–RTC because the bulk of the particles (72%) belong to this class and generated the highest resolution map (Extended Data Figs. 2–4, map 3; 2.9 Å nominal resolution).

Table 1 Cryo-EM data collection, refinement and validation statistics for nsp131–RTC, nsp132–RTC and (nsp132–RTC)2

To obtain the best possible consensus cryo-EM map of the entire complex, we generated a series of cryo-EM maps by focused refinement around subdomains of the nsp131–RTC (map 2) and nsp132–RTC (map 3) maps and combined these, generating a composite map with a nominal resolution of 2.8 Å (Extended Data Figs. 2–4, map 9). Local-resolution analysis30 suggested that the active site and the nidovirus RdRp-associated nucleotidyl transferase domain (NiRAN) ligand-binding pocket of the RdRp were resolved to between 2.1 and 2.6 Å resolution (Extended Data Fig. 4). This was supported by the excellent quality of the cryo-EM map, where the ADP-Mg2+ bound in the NiRAN domain enzymatic site could be visualized (Extended Data Fig. 5) and RNA base pairs near the RdRp active site could be identified directly from the cryo-EM density (Extended Data Fig. 5). Although not as well resolved, the ADP-AlF3-Mg2+ and surrounding residues in the nsp13 active sites could also be modeled (Extended Data Fig. 5).

Despite the excellent map quality for most of the RTC (Extended Data Figs. 1 and 4), features of the composite consensus map (map 9) suggested substantial heterogeneity in the nsp13 subunits, particularly in the RecA2 and 1B domains (Extended Data Fig. 4). Therefore, we generated a mask surrounding the nsp13T and nsp13F RecA1, RecA2 and 1B domains (of map 3, Fig. 1 and Extended Data Fig. 3) and used masked classification with signal subtraction31 to identify four distinct conformational states (Table 2, Fig. 1 and Extended Data Figs. 2, 3 and 6) with substantial differences in the dispositions of the nsp13 subunits, particularly nsp13T.

Fig. 1: Four conformational states of the nsp132–RTC.

Top, cryo-EM density (map 3, local-resolution filtered) colored according to the code on the left. A mask was constructed surrounding the nsp13T and nsp13F 1B, RecA1 and RecA2 domains (gray mesh). The 315,120 particles were divided into four distinct structures (classes I–IV) by focused classification inside the mask, followed by further refinement (Extended Data Figs. 2 and 6). Class II contained the most particles, and the nsp13T RecA domains were completely closed (Extended Data Fig. 6), entrapping the 5′ t-RNA segment in a groove between the two RecA domains and the 1B domain (Fig. 2). Therefore, class II (nsp13T-engaged) was used as a reference for comparison of the other structures. Each of the other classes was characterized by one dominant conformational change: (class I) nsp13T-apo, the RecA domains were completely open (Extended Data Fig. 6) and devoid of RNA (Fig. 3); (class III) swiveled, the nsp13T protomer as a whole was rotated 38° as shown (Fig. 5); (class IV) 1B-open, the nsp13T 1B domain was rotated open by 85° (Fig. 4) (Extended Data Fig. 6 and Supplementary Videos 1 and 2).

Table 2 Cryo-EM data collection, refinement and validation statistics for the nsp132–RTC conformational states

The class II structure (Fig. 1 and Extended Data Fig. 6) contains the most particles and the nsp13 subunits are best resolved in this map (map 12, Fig. 1 and Extended Data Figs. 3 and 6). Compared to the other structures, the nsp13T RecA domains of class II (map 12) are closed onto each other more than the other structures (Extended Data Fig. 6) and are thereby engaged most tightly with the RNA (below). We call this the ‘nsp13T-engaged’ structure and use it as a reference to give a general overview of the conformational changes in the other classes.

While each of the classes shows substantial changes in both the disposition of each nsp13 subunit as a whole as well as intramolecular domain motions within each nsp13 subunit, each structural class can be characterized by one dominant conformational change in nsp13T (compared to the nsp13T-engaged structure used as a reference):

(1) In class I, the nsp13T RecA2 domain is rotated open by 21° with respect to RecA1. Concomitantly, the RNA-binding site is empty while occupancy of the nsp13T nucleotide-binding site is ambiguous. We therefore call this the ‘nsp13T-apo’ structure (Fig. 1).

(2) In class III, the nsp13T subunit swivels as a whole by 38° away from nsp13F. We call this the ‘nsp13T-swiveled’ structure (Fig. 1).

(3) In class IV, the nsp13T domain 1B is rotated 85° away from the nsp13T RNA-binding channel, creating the ‘1B-open’ structure (Fig. 1).

The nsp13T-engaged conformation grasps the downstream RNA t-strand

In the nsp13T-engaged structure, the distance between the centers of gravity of the two nsp13T RecA domains, 27.3 Å, is the shortest of the eight nsp13 conformations (Extended Data Fig. 6). The RecA domains are thus ‘closed’ and grasp the downstream t-RNA single-stranded 5′ segment emerging from the RdRp active site, giving rise to well-resolved cryo-EM density for the RNA passing through the helicase (Fig. 2a). The RNA is corralled in a tunnel between the two RecA domains and the 1B domain, which is also in a closed conformation (Figs. 1 and 2a). The pattern of purines and pyrimidines in the cryo-EM density is clearly discernible, allowing the unique sequence register of the RNA engaged with the nsp13T helicase to be determined (Fig. 2a).
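As a rough illustration of the domain-separation metric quoted above, the following Python sketch computes the distance between the centroids of two atom selections. The function names and the toy coordinates are hypothetical, and the centroid is unweighted; the analysis reported here may have used mass weighting or a different atom selection.

```python
import math

def centroid(coords):
    """Unweighted mean position of a list of (x, y, z) tuples, in Å."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))

def centroid_distance(domain_a, domain_b):
    """Distance (Å) between the centroids of two atom selections."""
    return math.dist(centroid(domain_a), centroid(domain_b))

# Toy coordinates (hypothetical, not taken from any deposited model).
reca1 = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 2.0, 0.0), (2.0, 2.0, 0.0)]
# A copy of reca1 translated by 27.3 Å along x, mirroring the quoted value.
reca2 = [(x + 27.3, y, z) for x, y, z in reca1]

separation = centroid_distance(reca1, reca2)
print(f"{separation:.1f} Å")
```

Applied to each of the eight nsp13 protomer models, a metric of this kind gives a single number per protomer that can be ranked from most closed to most open.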

Fig. 2: In class II (nsp13T-engaged), the nsp13T RecA domains and 1B domain clamp onto the 5′ single-stranded t-RNA.

a, Right, overall view of the nsp13T-engaged structure. Proteins are shown as molecular surfaces except nsp13T is shown as a backbone ribbon, and nsp13F is removed and shown only as a dashed outline. The RNA is shown as atomic spheres. The boxed region is magnified on the left. Left, nsp13T is shown as a backbone worm but with side chains that interact with the t-RNA shown. Cryo-EM density for the downstream 5′ t-RNA segment is shown (transparent blue surface) with the t-RNA model superimposed. The pattern of purines/pyrimidines in the RNA density was clear and unique, allowing the identification of the sequence register for the nsp13T-bound RNA. b, Schematic illustrating nsp13T-RNA interactions (Supplementary Video 2).

The ordered RNA segment is 7 nucleotides (nt) in length (+9 to +15; Fig. 2), with the five central nucleotides (+10 to +14) completely enclosed within the helicase. The RNA phosphate backbone generally faces the nsp13T RecA domains and the mostly stacked bases face the 1B domain (Fig. 2). As might be expected, the helicase establishes extensive interactions with the RNA phosphate backbone, including several polar interactions. Interactions with the RNA bases are mostly van der Waals interactions and not expected to be base-specific (Fig. 2).

The nsp13T-apo state

Comparison of the nsp13T-apo and nsp13T-engaged structures revealed a striking change in the conformation of the RecA-like ATPase domains of nsp13T. Superimposition of the α-carbons of nsp13T RecA1 (residues 235–439) or RecA2 (residues 440–596) alone yielded root-mean-square deviations (r.m.s.d.s) of 0.387 and 0.673 Å, respectively, indicating the structures of the individual domains are very similar between the two states. However, superimposition via the α-carbons of only RecA1 gave an r.m.s.d. of 7.05 Å for the RecA2 α-carbons, indicating a substantial change in the relative disposition of the two domains. The movement of RecA2 with respect to RecA1 corresponds to a roughly 21° rotation about the axis shown in Fig. 3a (Supplementary Video 1), corresponding to an opening of the RecA domains; the nsp13T RecA domains of the nsp13T-apo state are the furthest open of any of the eight nsp13 protomer structures (Extended Data Fig. 6).
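The two quantities used in this comparison, the r.m.s.d. of one domain after superimposing on another and the angle of the relating rotation, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the least-squares superposition step itself is not shown (coordinates are assumed to be already expressed in the RecA1-aligned frame), the hinge is modeled as a rotation about the z axis, and all coordinates and function names are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """R.m.s.d. (Å) between paired, already-superimposed coordinates."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    total = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def rotation_angle_deg(matrix):
    """Rotation angle (degrees) of a 3x3 rotation matrix, from its trace."""
    trace = matrix[0][0] + matrix[1][1] + matrix[2][2]
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))  # guard rounding
    return math.degrees(math.acos(cos_theta))

def rotate_z(coords, angle_deg):
    """Rotate points about the z axis (toy stand-in for the hinge axis)."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]

# Hypothetical "RecA2" points in a frame where RecA1 is already aligned.
reca2_engaged = [(10.0, 0.0, 0.0), (12.0, 3.0, 1.0), (15.0, -2.0, 4.0)]
reca2_apo = rotate_z(reca2_engaged, 21.0)

# Rotation matrix for the same 21° toy hinge motion.
t = math.radians(21.0)
hinge = [[math.cos(t), -math.sin(t), 0.0],
         [math.sin(t), math.cos(t), 0.0],
         [0.0, 0.0, 1.0]]

angle = rotation_angle_deg(hinge)
deviation = rmsd(reca2_engaged, reca2_apo)
print(f"angle = {angle:.1f} deg, r.m.s.d. = {deviation:.2f} Å")
```

The sketch makes the logic of the comparison explicit: a rigid domain yields a near-zero r.m.s.d. when superimposed on itself, whereas the same domain evaluated in the frame of its neighbor reports the hinge motion between them.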

Fig. 3: In nsp13T-apo, the RecA domains are open and devoid of RNA density.

a, The RecA1 domains of class I (nsp13T-apo) and class II (nsp13T-engaged) were superimposed (r.m.s.d. of 0.387 Å over 205 α-carbons), revealing that the RecA2 domain of nsp13T-apo was rotated open by 21° about the rotation axis shown. The ADP-AlF3-Mg2+ from the nsp13T-engaged structure is shown as yellow atomic spheres. b, Right, overall view of the nsp13T-apo structure. Proteins are shown as molecular surfaces except nsp13T, which is shown as a backbone ribbon, and nsp13F, which is removed and shown only as a dashed outline. The RNA is shown as atomic spheres. The boxed region is magnified on the left. Left, cryo-EM density of nsp13T-apo reveals that the RNA path is empty (the RNA path from the nsp13T-engaged structure is denoted by cyan spheres). c, Three independent simulations of nsp13T bound to ATP-Mg2+/RNA, ADP-Mg2+/RNA, ATP-Mg2+ and ADP-Mg2+. Values of r.m.s.d. plotted represent the heavy-atom r.m.s.d. of the RecA2 lobe (after alignment on the RecA1 lobe) with respect to nsp13T-engaged (blue) and nsp13T-apo (orange) cryo-EM structures. d, The r.m.s.d. histograms represent aggregate values across all three replicates shown in c (Supplementary Video 1).

The consensus nsp132–RTC cryo-EM map (map 3, Extended Data Figs. 2 and 3) contains low-resolution density indicating that the downstream single-stranded 5′ segment of the t-RNA occupies the nsp13T RNA-binding channel (Extended Data Fig. 1). Moreover, the t-RNA 5′ segment occupying the nsp13T RNA-binding channel of the nsp13T-engaged state is well resolved (Fig. 2). By contrast, the nsp13T-apo cryo-EM density shows that the nsp13T RNA-binding path is empty (Fig. 3b). The nsp13T-apo cryo-EM density also does not support occupancy of ADP-AlF3-Mg2+ in the nucleotide-binding site of nsp13T, although the low resolution of the map in this region makes this conclusion tentative.

Spontaneous transition of nsp13T between the engaged and apo conformations

To characterize the RecA1-RecA2 interdomain movement and how a bound substrate may influence that movement, we performed MD simulations of free nsp13T (that is, without nsp13F or the RTC) under four different substrate-bound conditions (ATP-Mg2+/RNA, ADP-Mg2+/RNA, ATP-Mg2+ only and ADP-Mg2+ only). For each condition, we ran three independent 5-μs simulations, all initiated from the nsp13T-engaged conformation (Figs. 1 and 2).

In simulations of ATP-Mg2+/RNA-bound nsp13T, the RecA2 domain maintained its general orientation with respect to RecA1 throughout the simulations (Fig. 3c). The average r.m.s.d. of RecA2 between the initial nsp13T-engaged cryo-EM structure and the structures from the MD trajectories, aligned on RecA1, was low (roughly 2.9 Å; some adjustment of RecA2 from its position in the initial nsp13T-engaged cryo-EM structure was expected in these simulations, as the cryo-EM structure was determined using ADP-AlF3/RNA in place of ATP-Mg2+/RNA). Conformations resembling the nsp13T-apo structure (r.m.s.d. < 3.5 Å) were not observed (Fig. 3c,d).

In simulations of ADP-Mg2+/RNA-bound nsp13T, RecA2 rotated away from its initial position in the nsp13T-engaged conformation and nsp13T-apo-like conformations were repeatedly visited throughout the simulations (Fig. 3c,d). The ADP-Mg2+/RNA-bound nsp13T-apo-like conformations were metastable, and interconverted with the nsp13T-engaged conformations. Spontaneous and reversible conversion between the nsp13T-engaged and nsp13T-apo conformations was also observed in the simulations of ATP-Mg2+-bound and ADP-Mg2+-bound nsp13T (Fig. 3c,d). These results indicate that the presence of both ATP-Mg2+ and RNA may stabilize the nsp13T-engaged conformation and that the absence of either substrate may destabilize the nsp13T-engaged conformation and facilitate the transition to the nsp13T-apo conformation, consistent with the observations from the cryo-EM analysis.
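The threshold-based classification of trajectory frames described above (apo-like if the RecA2 r.m.s.d. to the nsp13T-apo structure is below 3.5 Å) can be sketched in Python as follows. The per-frame values are invented purely for illustration and are not simulation results; the function name is hypothetical, and only the 3.5 Å cutoff is taken from the text.

```python
def fraction_below(rmsd_series, cutoff=3.5):
    """Fraction of frames whose r.m.s.d. (Å) is below the cutoff."""
    if not rmsd_series:
        raise ValueError("empty r.m.s.d. series")
    return sum(1 for value in rmsd_series if value < cutoff) / len(rmsd_series)

# Hypothetical per-frame RecA2 r.m.s.d. values (Å) relative to the
# nsp13T-apo structure; these numbers are invented for illustration.
trajectories = {
    "ATP-Mg2+/RNA": [6.8, 7.1, 6.5, 7.0, 6.9, 7.2],
    "ADP-Mg2+/RNA": [6.9, 5.0, 3.1, 2.8, 4.4, 3.3],
}

apo_like = {name: fraction_below(series) for name, series in trajectories.items()}
for name, fraction in apo_like.items():
    print(f"{name}: {fraction:.0%} of frames apo-like")
```

A per-condition fraction of this kind is one simple way to summarize what the r.m.s.d. histograms show: whether a given substrate state ever visits the apo-like basin, and how often.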

The nsp13T ‘1B-open’ conformation explains RNA synthesis in the presence of nsp13

In the nsp13T-engaged state, the downstream single-stranded t-RNA is guided through a deep groove between the RecA1 and RecA2 domains that is completely closed off by the 1B domain (Fig. 4a). In the 1B-open structure, the nsp13T 1B domain rotates 85° about the stalk away from the nsp13T RNA-binding channel, creating an open groove rather than a closed tunnel (Fig. 4b). The cryo-EM density allows modeling of the downstream single-stranded t-RNA emerging from the RdRp active site up to the edge of the open groove proximal to the RdRp, but the RNA density disappears there, indicating that the RNA is not engaged within the active site of the helicase (Fig. 4b).

Fig. 4: 1B-open structure.

a,b, Comparison of nsp13T-engaged (a) and 1B-open (b) structures. Two views are shown, a top view (top) and a side view (bottom). In the top view, the proteins are shown as molecular surfaces and color-coded according to the key at the bottom. In the side view, nsp13F is shown only as a dashed outline. The RNA is shown as atomic spheres. In the 1B-open structure (b), the nsp13T 1B domain is rotated open by 85° (represented by thick black arrows). The 5′ t-RNA emerging from the RdRp active site approaches the nsp13T RNA-binding groove but does not enter it (Supplementary Video 2).

In the 1B-open conformation, the nsp13T 1B domain appears to be trapped open by the presence of nsp13F (Fig. 4b), which blocks the transition from the 1B-open to the 1B-closed conformation. Consistent with this, when we analyzed the nsp13 conformational states in the nsp131–RTC (nsp13F absent) by masked classification with signal subtraction around the RecA1, RecA2 and 1B domains of the single nsp13 (Extended Data Fig. 2), the 1B-open nsp13 conformation was not observed. We therefore propose that the 1B-open conformation of the nsp13T 1B domain is trapped by the presence of nsp13F.

We note that in the (nsp132–RTC)2 dimer (Extended Data Figs. 2–4), the nsp13 protomers corresponding to nsp13T are also in the ‘1B-open’ state, as was observed by Yan et al.32. Since the dimer only comprises 8% of our particle dataset (Supplementary Table 1) while the nsp132–RTC complex comprises 72% of the particles, we have focused our attention on the latter complex. We observe that the (nsp132–RTC)2 dimer forms in the absence of additional factors such as nsp10-14 (ref. 20), in contrast to what is reported in Yan et al.32.

Yan et al.32 observed the 1B-open state of nsp13T (labeled nsp13-2 in their nomenclature) in their (dimer) dCap(0)–RTC structure, curiously assigned as a backtracking-competent state. This is at odds with: (1) observations that nsp13T in the 1B-open conformation does not engage RNA in its RNA-binding groove (Fig. 4b, also observed by Yan et al.32) and so would not be useful for RNA translocation, and (2) the finding that nsp13 stimulated SARS-CoV-2 RTC backtracking27.

Spontaneous nsp13T 1B-open to 1B-closed transition

The conformations of the nsp13T 1B domain in the nsp13T-engaged and nsp13T-apo structures are related by a roughly 10° rotation around the nsp13-stalk, but the 1B domains are closed on the nsp13-RecA domains in both structures. We refer to these collectively as 1B-closed states (Fig. 1). These conformations have also been observed in crystal structures of isolated nsp13 as well as some other SF1-like helicases33. The conformation of the 1B domain in the 1B-open cryo-EM structure, in which the domain is rotated roughly 85° compared to the 1B-closed conformations, was only seen in nsp13T when it was paired with nsp13F in the RTC, suggesting that this conformation may not be stable in isolated nsp13. To test this hypothesis, we performed five independent 25-μs simulations on isolated (free) nsp13 (with ADP-Mg2+), initiated from the 1B-open conformation (Fig. 1).

In three out of the five simulations, the 1B domain underwent a roughly 90° rotation from the starting 1B-open conformation around the stalk toward the RNA-binding groove to a 1B-closed conformation (Fig. 5a). These roughly 90° rotated 1B domain conformations closely resemble the disposition of the 1B domain in the nsp13T-apo structure. The 1B domain r.m.s.d. between the simulation-generated structures from the last 2 μs of the three trajectories and the 1B domain in the nsp13T-apo cryo-EM structure (aligned on the RecA1 domain) was, on average, roughly 3.6 Å. We also observed that a small portion (<5%) of these 1B-closed structures were more similar to the 1B domain of the nsp13T-engaged conformation (r.m.s.d. < 3.5 Å). These nsp13T-engaged-like 1B conformations were short-lived, and once visited they quickly transitioned to the nsp13T-apo conformation, presumably because the nsp13T-engaged conformation was captured in the presence of RNA, which was not included in our simulations.

Fig. 5: In the nsp13T-swiveled structure, the entire nsp13T protomer is rotated.

a, Ten independent simulations of ADP-Mg2+-bound nsp13T, starting from the 1B-open cryo-EM structure, in isolation (five simulations, left-hand plot) and as part of the nsp132–RTC complex (five simulations, right-hand plot). Values plotted represent the heavy-atom r.m.s.d. of the 1B domain (nsp13 residues 150–228) compared to the 1B domain in the 1B-open cryo-EM structure (aligned on the RecA1 domain). The r.m.s.d. histograms on the right of each plot represent aggregate values across all five simulations. Representative structures of the two major conformations from simulations are shown (right, colored according to the histogram labels). b, Five independent simulations of ADP-Mg2+-bound nsp13T, starting from the nsp13T-engaged state. Values plotted represent the heavy-atom r.m.s.d. of the 1B domain compared to the 1B domain in the 1B-open cryo-EM state (aligned on the RecA1 domain). The r.m.s.d. histograms on the far right represent aggregate values across all five simulations. Representative structures of the two major conformations from the r.m.s.d. histogram from simulations are shown (right). c, Front view of the nsp13T-swiveled structure, highlighting nsp13T. The position of the nsp13T protomer in the nsp13T-engaged structure is illustrated by the dashed black outline. The nsp13T protomer of the nsp13T-swiveled structure is rotated by 38° as shown (Extended Data Fig. 7).

We next asked whether or not the 1B domain in a 1B-closed state may spontaneously transition to the 1B-open state. In each of the three simulations in which we observed a transition of the 1B domain from the 1B-open to a 1B-closed conformation, the 1B domain remained in the 1B-closed conformation through the end of the 25-μs simulation; a 1B-closed-to-open transition was not observed. We performed an additional five independent 25-μs simulations of the isolated (free) nsp13T, initiated from the 1B-closed conformation (of the nsp13T-engaged structure). We did not observe any transition events to the 1B-open conformation over the aggregated 125-μs simulation time. Instead, the 1B domain maintained its 1B-closed orientation in the initial structure, with some minor wobbling back and forth between the 1B-closed conformations of the nsp13T-engaged and nsp13T-apo structures (Fig. 5b).

Aligning the nsp13T simulation structures in the 1B-open-to-closed transition pathways with the nsp13T of the 1B-open cryo-EM structure showed that, on average, roughly 40% (53%, 22% and 45% in the three simulations) of these 1B domain intermediate conformations clashed with nsp13F, suggesting that the 1B-open-to-closed transition might be blocked by nsp13F (Extended Data Fig. 7). Consistent with this notion, in five independent 25-μs simulations of nsp13T in complex with nsp13F and the RTC (that is, the nsp132–RTC complex), with nsp13T initiated from the 1B-open conformation, we did not observe any transition events to the 1B-closed conformation over the aggregate 125 μs of simulation time (Fig. 5a). Here we envisage that 1B domain transitions are facilitated by entry into the ‘swiveled’ state. The swiveled structure is characterized by one dominant conformational change; compared to the nsp13T-engaged structure, the nsp13T protomer as a whole swivels with respect to the rest of the RTC by 38°, repositioning nsp13T with respect to nsp13F (Fig. 5c).

There are some clashes between nsp13T and nsp13F when the simulation-generated structures are aligned to the swiveled cryo-EM structure, but to a much lesser extent (9%, 0% and 2% in the three simulations, Extended Data Fig. 7). This observation is consistent with the notion that the swiveled structure may be an intermediate state that facilitates the transition between the 1B-open and 1B-closed conformations.
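The clash statistics above can be illustrated with a short Python sketch. The clash criterion is not specified in the text, so the distance cutoff below (3.0 Å) is an assumed placeholder, and the toy geometry and function names are hypothetical. Only the per-simulation percentages (53%, 22%, 45% and 9%, 0%, 2%) are taken from the text.

```python
import math
from statistics import mean

def has_clash(atoms_a, atoms_b, cutoff=3.0):
    """True if any inter-selection atom pair is closer than cutoff (Å).

    The 3.0 Å cutoff is an assumed placeholder, not the paper's criterion.
    """
    return any(math.dist(a, b) < cutoff for a in atoms_a for b in atoms_b)

def clash_percentage(frames, partner, cutoff=3.0):
    """Percentage of frames (lists of atoms) that clash with the partner."""
    hits = sum(has_clash(frame, partner, cutoff) for frame in frames)
    return 100.0 * hits / len(frames)

# Toy geometry (hypothetical): one partner atom, four single-atom frames.
partner = [(0.0, 0.0, 0.0)]
frames = [[(1.0, 0.0, 0.0)], [(5.0, 0.0, 0.0)],
          [(2.5, 0.0, 0.0)], [(9.0, 0.0, 0.0)]]
toy_percentage = clash_percentage(frames, partner)

# Per-simulation percentages quoted in the text, averaged across replicates.
mean_vs_1b_open = mean([53, 22, 45])
mean_vs_swiveled = mean([9, 0, 2])
print(toy_percentage, mean_vs_1b_open, round(mean_vs_swiveled, 1))
```

Averaging the quoted values reproduces the roughly 40% figure for the 1B-open reference and gives a mean of under 4% for the swiveled reference, which is the contrast the argument rests on.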

In summary, our simulations suggest that the conformation of the 1B domain in the 1B-open structure may only be transiently stable on its own, transitioning spontaneously into the 1B-closed conformations of the nsp13T-apo and nsp13T-engaged structures. Such transitions may be blocked by the presence of nsp13F in the 1B-open nsp132–RTC. We did not observe transitions from the 1B-closed conformations to the 1B-open conformation, and we speculate that in the presence of RNA in the nsp13T RNA-binding groove (Fig. 2), nsp13T may be further stabilized in the closed 1B domain conformation.

Nsp13 conformations in nsp132-backtracked complexes

In the nsp13T-engaged state (Fig. 1), the RdRp translocates in the 3′ → 5′ direction on the t-RNA while nsp13T grasps the single-stranded t-RNA ahead of the RdRp (Fig. 2) and translocates in the 5′ → 3′ direction (Extended Data Fig. 1). We proposed that events at the RdRp active site that would delay or stall product-RNA (p-RNA) chain elongation (such as misincorporation or incorporation of nucleotide analogs) could allow the nsp13T translocation activity to push the RdRp backward on the t-RNA20. In this process, termed backtracking, the complex moves in the 5′ → 3′ direction on the t-RNA accompanied by reverse-threading of the p-RNA through the complex, generating a single-stranded p-RNA 3′ fragment. In support of this hypothesis, structural and functional studies showed that the SARS-CoV-2 RdRp can backtrack, that the resulting single-stranded p-RNA 3′ fragment extrudes from the RdRp NTP-entry tunnel and that backtracking is stimulated by nsp13 (ref. 27).

To compare the conformational states of the nsp13 protomers in the nsp132-BTCs (backtracked complexes) with the nsp132–RTCs, we used the same masked classification with signal subtraction protocol (Extended Data Fig. 3) to classify the nsp132-BTC particles into four conformational states (Extended Data Fig. 8). Structural models were built and rigid-body refined into the cryo-EM densities for each class except for nsp132-BTC-class2 (13% of the particles), which had very poor cryo-EM density for nsp13T. To compare these structural models with the nsp132–RTC structures, we aligned each nsp132-BTC model with the nsp13T-engaged state by superimposing α-carbons of nsp12, yielding r.m.s.d.s <0.213 Å. We then calculated r.m.s.d.s for α-carbons of nsp13T and nsp13F. Both nsp132-BTC-class1 and nsp132-BTC-class4 aligned well with the nsp13T-engaged nsp132–RTC state (Supplementary Table 2) and both also had strong density for the downstream t-RNA engaged with nsp13T (Extended Data Fig. 8). Therefore, we classify both of these structures as nsp13T-engaged-BTCs. The nsp132-BTC-class3 structure had an open 1B domain of nsp13T and clearly aligned with the 1B-open–RTC structure (Extended Data Table 2). Thus, in contrast to the nsp132–RTC structures, which were equally divided between the nsp13T-engaged and 1B-open states (33% each), the nsp132-BTC structures were heavily skewed toward the nsp13T-engaged state (72%) versus the 1B-open state (15%, Extended Data Fig. 8).

Discussion

In this work, we observed distinct conformational states of the nsp13 protomers within the SARS-CoV-2 nsp132–RTC, providing functional insights into nsp13 and its complex with the RTC (Supplementary Video 2). Like other helicases, nsp13 is a molecular motor that translocates along single-stranded nucleic acid, unwinding structural elements in its path16. This process is driven by conformational changes within nsp13 resulting from NTP hydrolysis.

The conformational transition from the nsp13T-engaged to the nsp13T-apo structures, observed both by our cryo-EM (Fig. 3a) and MD (Fig. 3c,d) analyses, corresponds to a roughly 21° rotation of the RecA2 domain with respect to RecA1, opening the gap between the two domains (Fig. 3a and Supplementary Video 1). The nsp13T-engaged structure is engaged with the substrate RNA and is trapped in an ‘on-pathway’ conformation of the nucleotide hydrolysis cycle by the nonhydrolyzable ATP analog ADP-AlF3. While the nsp13T-apo structure, being devoid of RNA, is not ‘on-pathway’ per se, the 21° opening of the RecA2 domain from the nsp13T-engaged to nsp13T-apo conformations matches the disposition of the RecA2 domains in other SF1 helicases, such as human Upf1, a structural homolog of nsp13 (refs. 13,24). The disposition of the RecA domains of Upf1 with ADP-AlF3 and RNA substrate (Protein Data Bank (PDB) 2XZO)34 matches the nsp13T-engaged structure. On the other hand, the RecA domains in a structure of Upf1 with ADP (so likely on-pathway) are opened by a 24° rotation about the same axis as the 21° opening of the nsp13T-apo RecA domains (PDB 2GK6)35. We thus infer that the nsp13T-apo conformation reports on an on-pathway conformation of the RecA domains, such as in the ADP-Mg2+/RNA-bound state of the translocation cycle (Fig. 3c,d). Due to the opening of the nsp13T RecA domains, the center-of-gravity of RecA2 shifts roughly parallel with the RNA backbone by 3.4 Å, corresponding to the rise between stacked RNA bases. This observation is suggestive of an ‘inchworm’ model for translocation (Supplementary Video 1), as proposed for related SF1 helicase translocation on single-stranded nucleic acids14,17,36,37,38,39,40.

Previous structural analysis of the nsp132–RTC identified that the nsp13T helicase and the RdRp translocate on the t-RNA with opposing polarities20. In circumstances where RdRp elongation of the p-RNA is hindered (such as in the event of a misincorporation at the p-RNA 3′ end), nsp13T translocation activity could backtrack the RdRp20, as shown by follow-up structural and biochemical analyses27. The opposing polarities of the RdRp and nsp13 translocation activities (Extended Data Fig. 1) presented a conundrum that was not addressed by these previous studies: how is it possible for the RdRp to rapidly and efficiently synthesize RNA if it is constantly being opposed by nsp13? Moreover, the predominant complex present in the nsp13–RTC samples is the nsp132–RTC complex (Supplementary Table 1), but only nsp13T was seen to engage with the t-RNA. What, then, is the role of nsp13F, the second copy of nsp13 in the nsp132–RTC? The work herein suggests answers to both questions.

Maximum likelihood classification approaches revealed four distinct conformations of the nsp13 protomers in the nsp132–RTC (Figs. 1 and 6 and Supplementary Videos 1 and 2). The nsp13T-engaged state resolves nsp13T clamped onto the single-stranded downstream t-RNA, providing an atomic view of nsp13 engaged with single-stranded RNA (Fig. 2). The single-stranded t-RNA threads through a tunnel formed by a deep groove between the RecA1 and RecA2 domains and further enclosed by the 1B domain (Fig. 4a).

Fig. 6: Schematic model for RTC elongation (1B-open) versus backtracking (nsp13T-engaged) states.

Top views (top row) and side views (bottom row) of each structural class. Nsp13T-apo (17% of the particles): the nsp13T RecA domains are open, consistent with the absence of nucleotide. Nsp13T is therefore not engaged with the downstream 5′ t-RNA and the RdRp can freely translocate on the t-RNA with concurrent elongation of the p-RNA (gray arrow pointing downstream). 1B-open (33%): the nsp13T 1B domain is rotated open and sterically trapped by the presence of nsp13F. The nsp13T is therefore unable to engage with the downstream 5′ t-RNA and is inactive. The RdRp is able to elongate freely in the downstream direction. Nsp13T-swiveled (17%): the rotation of the nsp13T protomer away from nsp13F provides space for the nsp13T 1B domain to open and/or close. We therefore propose that nsp13T-swiveled represents a transition state between the 1B-open (elongating) and nsp13T-engaged (backtracking) states. Nsp13T-engaged (33%): the nsp13T 1B and RecA domains are clamped onto the downstream 5′ t-RNA. In this state, nsp13T can translocate on the t-RNA in the 5′ → 3′ direction (shown by the orange arrow). This counteracts RdRp elongation and causes backtracking (backward motion of the RdRp on the RNA, shown by the gray arrow pointing upstream) (Extended Data Fig. 8 and Supplementary Videos 1 and 2).

By contrast, the 1B-open state shows nsp13T adopting a conformation in which the 1B domain is rotated open roughly 85° about the stalk domain, leaving an open RNA-binding groove (Fig. 4b). In this state, the single-stranded downstream t-RNA does not engage with the helicase. Thus, this represents an inactive state of the helicase that would be unable to translocate on the RNA.

Our structural analysis combined with MD simulations confirmed that the conformation of the nsp13T 1B domain in the 1B-open structure is not stable on its own but is sterically trapped by the presence of nsp13F, which blocks the conformational change required for 1B domain closure (Fig. 5a). These results indicate that the 1B-open state represents a rapidly elongating state of the nsp132–RTC, where the downstream single-stranded t-RNA feeds into the RdRp active site without engaging with nsp13T. Nsp13F may trap the 1B-open (inactive) state of nsp13T, allowing RdRp elongation to proceed without opposition from the nsp13T helicase (Fig. 6). Finally, swiveling of nsp13T in the swiveled state allows space for the 1B-open to 1B-closed transition (Fig. 5c), suggesting that the swiveled state represents a transition state between the open and closed states of the 1B domain (Fig. 6). We note that the presence of nsp13F in the nsp13T-engaged state would also block the 1B-closed to 1B-open transition, suggesting how nsp13F can enhance the helicase activity of nsp13T (ref. 21).

Thus, our results indicate a mechanism for the nsp132–RTC to turn backtracking on and off: switching between rapid RNA synthesis (1B-open state, elongating RdRp, Fig. 6) and backtracking (nsp13T-engaged, backtracking RdRp, Fig. 6). In our analysis of the conformational states of the nsp132–RTC, the particles were equally divided between the nsp13T-engaged (backtracking on) and 1B-open (backtracking off) states (Extended Data Figs. 3 and 8). An identical analysis of the backtracked nsp132-BTC (ref. 27) revealed a different distribution of particles in which the nsp13T-engaged (backtracking on) state was heavily favored (Extended Data Fig. 8). This raises the possibility that the conformational switch that turns backtracking on and off is allosterically controlled.
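The particle distribution among the four conformational classes follows directly from the per-class particle counts reported in the Methods for the masked 3D classification of the nsp132–RTC RecA domains. The short Python sketch below recomputes the fractions; the counts are copied from the Methods, the classes are kept as numbered labels rather than assigned to named states, and the variable names are illustrative.

```python
# Particle counts per RecA class, copied from the Methods
# (masked 3D classification of the nsp132-RTC particles).
class_counts = {
    "class I": 52_403,
    "class II": 102_615,
    "class III": 54_830,
    "class IV": 105_272,
}

total = sum(class_counts.values())
fractions = {name: count / total for name, count in class_counts.items()}

for name, frac in fractions.items():
    print(f"{name}: {100 * frac:.1f}% of {total:,} particles")
```

Rounded to whole percentages, the two larger classes each hold about 33% of the particles and the two smaller classes about 17% each, matching the values given in the legend of Fig. 6.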

In MD simulations exploring the dynamics of the p-RNA 3′ nucleotide of a pretranslocated RTC, a mismatched p-RNA 3′ nucleotide frayed from the t-RNA toward and into the NTP-entry tunnel (which also serves as the backtracking tunnel, Supplementary Video 1), while a p-RNA 3′ nucleotide engaged in a Watson–Crick base pair with the t-RNA did not27. We thus suggest that misincorporation by the RdRp leads to fraying of the p-RNA 3′ nucleotide into the NTP-entry tunnel, which may allosterically signal the rapidly elongating 1B-open state to switch to the backtracking nsp13T-engaged state (via the nsp13T-swiveled state, Fig. 6). This facilitates a possible proofreading mechanism since backtracking would extrude the mismatched p-RNA 3′ nucleotide out of the NTP-entry tunnel (Fig. 6)27, allowing the nsp10/nsp14 3′ exonuclease proofreading activity to access and degrade the mismatched p-RNA 3′ nucleotide41,42,43. The nsp14-mediated proofreading activity is crucial for the virus to avoid mutation catastrophe while replicating its roughly 30-kb genome42, and is also an important determinant of SARS-CoV-2 susceptibility to many antiviral nucleotide analogs44.

Methods

No statistical methods were used to predetermine sample size. The experiments were not randomized, and the investigators were not blinded to allocation during experiments and outcome assessment.

Protein expression and purification

SARS-CoV-2 nsp12 was expressed and purified as described in ref. 20. Briefly, a pRSFDuet-1 plasmid containing His6-small ubiquitin-like modifier (SUMO) SARS-CoV-2 nsp12 (Addgene no. 159107) was transformed into Escherichia coli BL21-CodonPlus cells (Agilent). Cells were grown and protein expression was induced by the addition of isopropyl β-d-1-thiogalactopyranoside (IPTG). Cells were collected and lysed in a French press (Avestin). The lysate was cleared by centrifugation and purified on a HiTrap Heparin HP column (Cytiva). The fractions containing nsp12 were loaded onto a HisTrap HP column (Cytiva) for further purification. Eluted nsp12 was dialyzed, cleaved with His6-Ulp1 SUMO protease and passed through a HisTrap HP column to remove the SUMO protease. Flow-through was collected, concentrated by centrifugal filtration (Amicon) and loaded onto a Superdex 200 Hiload 16/600 (Cytiva). Glycerol was added to the purified nsp12, which was then aliquoted, flash-frozen in liquid N2 and stored at −80 °C.

SARS-CoV-2 nsp7/8 was expressed and purified as described in ref. 20. Briefly, the pCDFDuet-1 plasmid containing His6 SARS-CoV-2 nsp7/8 (Addgene no. 159092) was transformed into E. coli BL21 (DE3). Cells were grown and protein expression was induced by the addition of IPTG. Cells were collected and lysed in a French press (Avestin). The lysate was cleared by centrifugation and purified on a HisTrap HP column (Cytiva). Eluted nsp7/8 was dialyzed, cleaved with His6-Prescission Protease to remove the His6 tag and then passed through a HisTrap HP column (Cytiva) to remove the protease. Flow-through was collected, concentrated by centrifugal filtration (Amicon) and loaded onto a Superdex 75 Hiload 16/600 (Cytiva). Glycerol was added to the purified nsp7/8, which was then aliquoted, flash-frozen in liquid N2 and stored at −80 °C.

SARS-CoV-2 nsp13 was expressed and purified as described in ref. 20. Briefly, the pet28 plasmid containing His6 SARS-CoV-2 nsp13 (Addgene no. 159390) was transformed into E. coli Rosetta (DE3) (Novagen). Cells were grown and protein expression was induced by the addition of IPTG. Cells were collected and lysed in a French press (Avestin). The lysate was cleared by centrifugation and purified on a HisTrap HP column (Cytiva). Eluted nsp13 was dialyzed, cleaved with His6-Prescission Protease and then passed through a HisTrap HP column (Cytiva) to remove the protease. Flow-through was collected, concentrated by centrifugal filtration (Amicon) and loaded onto a Superdex 200 Hiload 16/600 (Cytiva). Glycerol was added to the purified nsp13, which was then aliquoted, flash-frozen in liquid N2 and stored at −80 °C.

Preparation of SARS-CoV-2 nsp13–RTC for cryo-EM

Cryo-EM samples of SARS-CoV-2 nsp13–RTC were prepared as described in ref. 20. Briefly, purified nsp12 and nsp7/8 were concentrated, mixed in a 1:3 molar ratio and incubated for 20 min at 22 °C. Annealed RNA scaffold (Horizon Discovery, Ltd) was added to the nsp7/8/12 mixture and incubated for 15 min at 22 °C. The sample was buffer-exchanged into cryo-EM buffer (20 mM HEPES pH 8.0, 150 mM K-acetate, 10 mM MgCl2, 2 mM DTT) and further incubated for 20 min at 30 °C. The sample was purified over a Superose 6 Increase 10/300 GL column (Cytiva) in cryo-EM buffer. The peak corresponding to the nsp7/8/12/RNA complex was pooled and concentrated by centrifugal filtration (Amicon). Purified nsp13 was concentrated by centrifugal filtration (Amicon) and buffer-exchanged into cryo-EM buffer. Buffer-exchanged nsp13 was mixed with ADP (1 mM final) and AlF3 (1 mM final) and then added to nsp7/8/12/RNA at a molar ratio of 1:1. The complex was then incubated for 5 min at 30 °C.

Cryo-EM grid preparation

Before grid freezing, 3-(3-cholamidopropyl-dimethylammonio)-2-hydroxy-1-propanesulfonate (CHAPSO, Anatrace) was added to the sample (8 mM final), resulting in a final complex concentration of 8 µM. The final buffer condition for the cryo-EM sample was 20 mM HEPES pH 8.0, 150 mM K-acetate, 10 mM MgCl2, 2 mM DTT, 1 mM ADP, 1 mM AlF3, 8 mM CHAPSO. C-flat holey carbon grids (CF-1.2/1.3-4Au, EMS) were glow-discharged for 20 s before the application of 3.5 μl of sample. Using a Vitrobot Mark IV (Thermo Fisher Scientific), grids were blotted and plunge-frozen into liquid ethane with 90% chamber humidity at 4 °C.

Cryo-EM data acquisition and processing

Structural biology software was accessed through the SBGrid consortium45. Grids were imaged using a 300 kV Titan Krios (Thermo Fisher Scientific) equipped with a GIF BioQuantum and K3 camera (Gatan). Images were recorded with Leginon46 with a pixel size of 1.07 Å per pixel (px) (micrograph dimensions of 5,760 × 4,092 px) over a defocus range of −0.8 to −2.5 μm with a 20-eV slit. Videos were recorded in ‘counting mode’ (native K3 camera binning 2) with roughly 30 e− per px per s in dose-fractionation mode with subframes of 50 ms over a 2.5 s exposure (50 frames) to give a total dose of roughly 66 e− per Å2. Dose-fractionated videos were gain-normalized, drift-corrected, summed and dose-weighted using MotionCor2 (ref. 47). The contrast transfer function (CTF) was estimated for each summed image using the Patch CTF module in cryoSPARC v.2.15.0 (ref. 48). Particles were picked and extracted from the dose-weighted images with a box size of 256 px using cryoSPARC Blob Picker and Particle Extraction. The entire dataset consisted of 17,806 motion-corrected images with 3,750,107 particles. Particles were sorted using two rounds of cryoSPARC two-dimensional classification (N = 100, where N equals the number of classes), resulting in 661,105 curated particles that were re-extracted with a box size of 320 px. An initial model was generated using cryoSPARC Ab Initio Reconstruction (N = 3) on a subset of the particles. Particles were further curated using this initial model as a three-dimensional (3D) template for cryoSPARC Heterogeneous Refinement (N = 3), resulting in 451,760 particles (green map, Extended Data Figs. 2 and 3). Curated particles were further classified using cryoSPARC Heterogeneous Refinement (N = 3). Each of the resulting 3D classes was further processed with cryoSPARC Ab Initio Reconstruction (N = 3), generating three distinct models that could be used to sort particles (reference 1, nsp131–RTC; reference 2, nsp132–RTC; reference 3, (nsp132–RTC)2). Using references 1, 2 and 3 as 3D templates for Heterogeneous Refinement (N = 6), multi-reference classification was performed on the 451,760 curated particles. Classification revealed three unique classes: nsp131–RTC (map 2; 85,206 particles, yellow), nsp132–RTC (map 3; 315,216 particles, red) and (nsp132–RTC)2 (map 4; 35,403 particles, blue). Particles within each class were further processed using RELION 3.1-beta Bayesian Polishing49. Polished particles were refined using cryoSPARC Local and Global CTF Refinement in combination with cryoSPARC Nonuniform Refinement50, resulting in structures with the following particle counts and nominal resolutions: nsp131–RTC (85,187 particles; 3.2 Å), nsp132–RTC (315,120 particles; 2.9 Å) and (nsp132–RTC)2 (35,392 particles; 3.3 Å). To facilitate model building of nsp132–RTC, particles from nsp131–RTC and nsp132–RTC were combined in a cryoSPARC Nonuniform Refinement, subtracted (masking the RTC), and further refined with cryoSPARC Local Refinement using a mask encompassing the RTC. The resulting map, deemed RTC (local), had a nominal resolution of 2.8 Å. Additionally, particles from the nsp132–RTC were subtracted in different regions (using separate masks for nsp12-NiRAN, nsp13T and nsp13F) and the particles from each subtraction were further refined with masked cryoSPARC Local Refinement. The resulting maps had the following nominal resolutions: nsp13T(local), 3.4 Å; nsp13F(local), 3.3 Å and nsp12-NiRAN(local), 2.7 Å. Locally refined maps were combined into an nsp132–RTC composite map using PHENIX ‘Combine Focused Maps’51,52, with a resulting nominal resolution of 2.8 Å. The nsp13-RecA domains in particles from the nsp131–RTC and nsp132–RTC classes were sorted using particle subtraction (masking around the RecA domains, shown as red mesh in Extended Data Figs. 2 and 3), followed by masked RELION 3D classification.
Classification of RecA domains in the nsp131–RTC particles (pale yellow) did not reveal discrete conformational heterogeneity in the RecA domains. However, classification of RecA domains in the nsp132–RTC particles (light red) revealed unique conformations of the RecA domains with the following particle counts and nominal resolutions: RecA class I (52,403 particles; 3.5 Å), RecA class II (102,615 particles; 3.1 Å), RecA class III (54,830 particles; 3.5 Å), RecA class IV (105,272 particles; 3.1 Å). Local-resolution calculations were generated using blocres and blocfilt from the Bsoft package30.
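The acquisition parameters given above are related by simple arithmetic: the total dose per unit area is the dose rate per pixel multiplied by the exposure time and divided by the pixel area. The Python sketch below works through this using the stated values (roughly 30 electrons per px per s, 2.5 s exposure, 50 ms subframes, 1.07 Å per px); the variable names are illustrative, and the per-frame dose is a derived quantity not quoted in the text.

```python
# Acquisition values as stated in the Methods.
dose_rate_e_per_px_per_s = 30.0   # roughly 30 electrons per pixel per second
exposure_s = 2.5                  # total exposure time
subframe_s = 0.050                # 50 ms subframes
pixel_size_A = 1.07               # Angstrom per pixel

n_frames = round(exposure_s / subframe_s)
dose_per_px = dose_rate_e_per_px_per_s * exposure_s        # electrons per pixel
total_dose_e_per_A2 = dose_per_px / pixel_size_A ** 2      # electrons per square Angstrom
dose_per_frame_e_per_A2 = total_dose_e_per_A2 / n_frames   # derived, not stated in the text

print(f"frames: {n_frames}")
print(f"total dose: {total_dose_e_per_A2:.1f} electrons per square Angstrom")
print(f"dose per frame: {dose_per_frame_e_per_A2:.2f} electrons per square Angstrom")
```

The result, about 65.5 electrons per Å2, agrees with the reported total dose of roughly 66, and the frame count agrees with the 50 frames stated.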

Model building and refinement

For an initial model of the nsp132–RTC, the initial RTC model was derived from PDB 6XEZ (ref. 20) and the initial nsp13 model from PDB 6ZSL (ref. 33). The models were manually fit into the cryo-EM density maps using Chimera53 and rigid-body and real-space refined using PHENIX real-space-refine51,52. For real-space refinement, rigid-body refinement was followed by all-atom and B factor refinement with Ramachandran and secondary structure restraints. Models were inspected and modified in COOT54.

MD simulations

General simulation setup and parameterization

Proteins, RNAs and ions were parameterized with the DES-Amber SF1.0 force field55,56. ATP and ADP were parameterized using the DES-Amber nucleobase, sugar and phosphate parameters (force field files are available on request). The systems were solvated with water parameterized with the TIP4P-D water model57 and neutralized with a 150 mM NaCl buffer. The systems of isolated nsp13 each contained roughly 160,000 atoms in a 110 × 110 × 110 Å3 cubic box, and the systems of the nsp132–RTC complex each contained roughly 887,000 atoms in a 190 × 190 × 190 Å3 cubic box.
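As a rough sanity check on system composition, the number of Na+/Cl− pairs implied by a 150 mM buffer can be estimated from the box volume. The Python sketch below does this for the two cubic boxes described above. It is an estimate under stated assumptions, not the setup procedure used here: it treats the whole box volume as solvent (so it gives an upper bound) and ignores any extra counterions needed for neutralization; the function names are illustrative.

```python
AVOGADRO = 6.02214076e23            # particles per mole
LITERS_PER_CUBIC_ANGSTROM = 1e-27   # 1 cubic Angstrom = 1e-30 m^3 = 1e-27 L

def ion_pairs(box_edge_A, molarity):
    """Approximate number of salt ion pairs in a cubic box at the given molarity.

    Assumes the entire box volume is solvent, so this is an upper bound.
    """
    volume_L = box_edge_A ** 3 * LITERS_PER_CUBIC_ANGSTROM
    return molarity * AVOGADRO * volume_L

def atom_density(n_atoms, box_edge_A):
    """Average number of atoms per cubic Angstrom in a cubic box."""
    return n_atoms / box_edge_A ** 3

# Box edges and approximate atom counts as stated in the text.
for label, edge_A, n_atoms in [("isolated nsp13", 110.0, 160_000),
                               ("nsp132-RTC", 190.0, 887_000)]:
    pairs = ion_pairs(edge_A, 0.150)
    density = atom_density(n_atoms, edge_A)
    print(f"{label}: ~{pairs:.0f} NaCl pairs, {density:.3f} atoms per cubic Angstrom")
```

Under these assumptions the smaller box holds on the order of 120 ion pairs and the larger box on the order of 620.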

Systems were first equilibrated on GPU Desmond using a mixed NVT/NPT schedule58, followed by a 1-µs relaxation simulation on Anton, a special-purpose machine for MD simulations59. All production simulations were performed on Anton and initiated from the last frame of the relaxation simulation. Production simulations were performed in the NPT ensemble60 at 310 K using the Martyna–Tobias–Klein barostat61. The simulation time step was 2.5 fs, and a modified r-RESPA integrator62 was used in which long-range electrostatic interactions were evaluated every three time steps. Electrostatic forces were calculated using the u-series method63. A 9-Å cutoff was applied for the van der Waals calculations.
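The integration settings above imply two derived quantities that are easy to verify: the interval at which long-range electrostatics are evaluated and the number of integration steps in the relaxation run. The Python sketch below computes both from the stated values; the variable names are illustrative, and the sketch assumes the 2.5 fs time step also applies to the relaxation simulation, which the text does not state explicitly.

```python
time_step_fs = 2.5              # simulation time step
long_range_every_n_steps = 3    # long-range electrostatics evaluated every three steps
relaxation_us = 1.0             # length of the relaxation simulation

# Interval between long-range electrostatics evaluations.
long_range_interval_fs = time_step_fs * long_range_every_n_steps

# Number of steps in the relaxation run (1 microsecond = 1e9 femtoseconds),
# assuming the same time step is used throughout.
relaxation_steps = round(relaxation_us * 1e9 / time_step_fs)

print(f"long-range interval: {long_range_interval_fs} fs")
print(f"relaxation steps: {relaxation_steps:,}")
```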

System preparation

The initial conformations of class II nsp13T bound to the various substrates (ATP-Mg2+/RNA, ADP-Mg2+/RNA, ATP-Mg2+ and ADP-Mg2+) were prepared based on the cryo-EM structure of the class II nsp132-BTC5. The initial conformation of the class IV, 1B-open nsp13T structure was prepared from the cryo-EM class IV nsp132-BTC5 structure. AlF3 was removed from the active site. Missing loops and termini in proteins were capped with ACE/NME capping groups. In simulations with ATP at the active site, ATP was manually placed using ADP in the cryo-EM structure as the reference. The systems were prepared for simulation using the Protein Preparation Wizard in Schrödinger Maestro (Schrödinger Release 2020-4: Maestro, Schrödinger, LLC, 2020).

Simulation analysis

The average r.m.s.d. was calculated for the RecA2 domain (residues 450–690) and 1B domain (residues 145–200) of nsp13T between the cryo-EM structures and instantaneous structures from the trajectories, after alignment on the RecA1 lobe (residues 240–440). Simulation structures shown in figures were rendered using PyMOL (The PyMOL Molecular Graphics System, v.2.0, Schrödinger, LLC).
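The r.m.s.d. measurement described above (superimpose on one domain, then measure the deviation of another) can be sketched with a standard Kabsch superposition. The Python code below is a minimal illustration with NumPy, not the analysis code used in this work. It assumes one coordinate per residue (for example Cα atoms), which the text does not specify, and uses synthetic coordinates; the function names are hypothetical. The residue ranges are those given in the text.

```python
import numpy as np

def kabsch(mobile, target):
    """Rotation and translation that best superimpose mobile onto target."""
    mob_c = mobile.mean(axis=0)
    tgt_c = target.mean(axis=0)
    h = (mobile - mob_c).T @ (target - tgt_c)     # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))        # guard against reflections
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return rot, tgt_c - rot @ mob_c

def domain_rmsd(frame, reference, resids, align_range, measure_range):
    """Align frame to reference on align_range, then return r.m.s.d. over measure_range."""
    align = (resids >= align_range[0]) & (resids <= align_range[1])
    measure = (resids >= measure_range[0]) & (resids <= measure_range[1])
    rot, trans = kabsch(frame[align], reference[align])
    moved = frame @ rot.T + trans
    diff = moved[measure] - reference[measure]
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

RECA1, RECA2, DOMAIN_1B = (240, 440), (450, 690), (145, 200)

# Synthetic data: random "reference" coordinates, one point per residue.
rng = np.random.default_rng(0)
resids = np.arange(145, 691)
reference = rng.normal(scale=20.0, size=(resids.size, 3))

# Synthetic "frame": RecA2 displaced by 3.4 Angstrom, then an arbitrary rigid-body move.
shifted = reference.copy()
shifted[(resids >= RECA2[0]) & (resids <= RECA2[1])] += np.array([3.4, 0.0, 0.0])
theta = np.radians(40.0)
rigid = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
frame = shifted @ rigid.T + np.array([5.0, -3.0, 12.0])

rmsd_reca2 = domain_rmsd(frame, reference, resids, RECA1, RECA2)
rmsd_1b = domain_rmsd(frame, reference, resids, RECA1, DOMAIN_1B)
print(f"RecA2 r.m.s.d.: {rmsd_reca2:.2f} A, 1B r.m.s.d.: {rmsd_1b:.2f} A")
```

Because the alignment uses only RecA1, the overall rigid-body motion is removed and only the displacement of the measured domain relative to RecA1 contributes to the r.m.s.d. Averaging this value over trajectory frames gives the kind of quantity reported in the text.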

Quantification and statistical analysis

The local resolution of the cryo-EM maps (Supplementary Figs. 4 and 6) was estimated using blocres30 with the following parameters: box size 15, sampling 1.1 and cutoff 0.5. Directional 3DFSCs (Supplementary Figs. 4 and 6) were calculated using 3DFSC64. The quantification and statistical analyses for model refinement and validation were generated using MolProbity65 and PHENIX52.

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.