## Introduction

Most viruses utilize the host transcription apparatus to express their genes, and viral genomes contain assorted cis- and trans-elements that manipulate the host transcription machinery1,2. Most early genes of lambdoid phages are preceded by transcription terminators; therefore, the host transcription apparatus must be converted to a terminator-resistant form to promote full expression of viral genes1,3,4. For λ, anti-termination is promoted by the virus-encoded N protein, which binds to the cis-acting nut sites and suppresses transcription termination5,6,7. By contrast, bacteriophage HK022, discovered in Hong Kong in the early 1970s, is related to λ phage but requires only cis-acting RNAs, named put (polymerase-utilization), to promote readthrough of transcription terminators without any dedicated trans-acting protein factors8,9. Put-mediated anti-termination is efficient and robust, and has been shown to suppress both intrinsic and ρ-dependent transcription termination9,10,11.

The HK022 genome contains two putRNAs—putL and putR, located downstream of the early promoters PL and PR, respectively11. Both putRNAs are ~70-nt long, share sequence similarity, and are composed of two stem-loop structures12. Whereas putL is located just downstream of the PL promoter, putR is located ~270-nt downstream of the PR transcription start site11. The anti-termination activity of the putRNAs persists even on terminators located at least 10 kb away and is not dependent on the tethering between the putRNA and the elongation complex (EC) via the nascent transcript, suggesting that putRNA itself can remain stably associated with the EC as elongation proceeds13. The activity of putRNA is blocked by the host RNA polymerase (RNAP) mutations that are located exclusively in β’ zinc-binding domain (β’ZBD)10,14,15. This genetic evidence together with biochemical evidence suggests that putRNA interacts with the RNAP via the β’ZBD16.

In addition to anti-termination activity, putRNA exhibits anti-pausing activity. PutL inhibits backtracking at a pause located 21-nt downstream of the second stem (stem II) of putL, and this activity is abrogated by the insertion or deletion of several bases between the putL and the pausing site17. The dependency on the distance between putRNA and its location of action suggests that the anti-pausing and anti-termination activities of putRNA differ mechanistically. Interestingly, putRNA reduces both backtrack and hairpin-dependent pauses like RfaH18. RfaH, a paralog of NusG, recognizes an ops (operon polarity suppressor) sequence on the non-template DNA strand loaded onto the RNAP EC, changes its C-terminal helices into a β-sheet KOW domain fold to become active, and inhibits transcriptional pausing by resisting RNAP swiveling19,20.

A structural analysis of the putRNA-associated EC is required to understand the molecular mechanism of anti-pausing and anti-termination activities of the putRNA. In this study, we synthesized the putL RNA using the Eco RNAP σ70-holoenzyme initiated from the native HK022 PL promoter, and captured the modified ECs using a transcription roadblock for cryo-EM analysis. We observed putRNA-associated EC (putEC), putRNA-absent EC (put-less EC), and σ70-bound EC that contains intact putRNA (σ70-bound putEC). Comparison between putEC and put-less EC structures revealed that the putRNA binding to the β’ZBD hinders pausing by reducing the swiveling motion of the EC. Additionally, the σ70-bound putEC structure suggested that σ70 binding to EC might facilitate RNA folding as well as play a role in transcription modulation.

## Results

### Preparation and examination of putEC

Because an active form of HK022 putRNA can be produced only by the enzymatic synthesis using host RNAP, we prepared the putRNA-associated EC by initiating RNA synthesis with Eco RNAP holoenzyme and stalling the synthesis using a roadblock protein LacI (lac repressor), as previously described with some optimization for cryo-EM study (Fig. 1)16. Briefly, we first synthesized a DNA scaffold including HK022 PL promoter, the putL sequence, and the lacO sequence (LacI binding sequence) (Fig. 1a). Hereafter, we will use ‘putRNA’ to denote putL RNA for convenience. Holoenzyme containing Eco core RNAP and σ70 was added to the DNA scaffold to form an open complex, and LacI was added to bind to the lacO sequence on the DNA as a roadblock (Fig. 1b). Upon rNTP addition, RNAP synthesized the putRNA and stalled on the DNA at the roadblock. Excess rNTP was then removed by gel filtration column to prevent further RNA synthesis, and isopropyl β-D-thiogalactopyranoside (IPTG) was added to release the LacI from the DNA scaffold. The complexes were concentrated for cryo-EM grid preparation as well as native mass spectrometry (nMS) analysis.

To stall the EC at a site where the RNAP pauses in the absence of putRNA, we generated multiple DNA scaffolds with various distances between the pausing site and the lacO sequence, and performed radiolabeled transcription assays with these scaffolds. For screening, we used a put− DNA scaffold, which allows transcriptional pausing at the pausing site, as a control (Fig. 1c, Supplementary Note 1, Supplementary Fig. 1, Supplementary Table 1)11. From the screening, we chose a scaffold that has a 7-nt spacer between the pausing site and the lacO sequence (7-nt scaffold), and then attempted to analyze the assembled putEC by nMS using the same workflow as in our previous nMS studies of bacterial ECs21,22,23. However, we were unable to observe the fully assembled putEC, most likely because of sample instability or sample heterogeneity and adduction on the long, exposed nucleic acid scaffold during nMS analysis. Nevertheless, nMS analysis of the RNAs extracted from the reconstituted putEC revealed two main populations: one RNA was synthesized up to the known pausing site (C94), and the other was extended by 1 nt (U95) (Fig. 1d)11,17. The quantity of the U95 RNA was roughly 1.5 times that of the C94 RNA; therefore, we modeled the U95 RNA, placing the U95 nucleotide at the i + 1 site. Since the C94 and U95 nucleotides are located at the i and i + 1 sites, the modeled structures would exhibit the same RNA-DNA register for both the C94 and the U95 RNA-containing ECs at the post- and pre-translocated states, respectively (see below).

### Cryo-EM structures of the putEC in three conformations

Cryo-EM analysis of the putEC prepared by promoter-dependent transcription initiation, transcription elongation, and LacI roadblocking revealed three EC populations at sub-4 Å resolution: (1) the putEC (3.2 Å-resolution, 36.9% of the EC particles) that contains well-folded putRNA, (2) the put-less EC (3.6 Å, 22.4%) that does not display any well-defined putRNA density and (3) the σ70-bound putEC (3.6 Å, 40.7%) that contains both σ70 and putRNA (Table 1, Supplementary Figs. 2 and 3). We also observed a population consisting of the RNAP holoenzyme loaded onto the template DNA. This population probably resulted from abortive initiation and generated a 3.0 Å-resolution map. This complex is not discussed here because the structures of the holoenzyme open complex have been described in previous reports24,25,26.

In the cryo-EM structure of the putEC, the putRNA was located at the opening of the RNA exit channel of the EC adjacent to the β’ZBD (Fig. 2a). This location is consistent with the put-inactivating RNAP mutations and potentially would restrict the RNA hairpin formation in the adjoining RNA exit channel via electrostatic repulsion (Supplementary Fig. 4)10,27. The quality of the cryo-EM map allowed us to build the highly structured putRNA de novo (estimated local resolution of the cryo-EM map around the putRNA was ~3.5 Å; Fig. 2b, c, Supplementary Note 2, Supplementary Figs. 5–7). The modeled putRNA from U2* to U74* contains twenty-one Watson-Crick (WC) base pairs and five non-canonical base pairs, A9*-G35* (Saenger class VIII), G12*-U32* (Saenger class XXVIII), G43*-A64* (Saenger class XI), G42*-U65* and C44*-A63* (Supplementary Fig. 8). To distinguish the nucleic acid residues of the putRNA from the amino acid residues in the RNAP, we have added an asterisk (*) to the residue number of the putRNA throughout this manuscript. Surprisingly, the cryo-EM structure of the putRNA differed from previously published data11,12 as follows (Fig. 2b–d): First, the 5’-end of the putRNA is not C10* but A3*. In the structure, the putRNA region from A3* to G7* makes an RNA duplex with the opposite strand from U19* to C15*. Interestingly, this corresponds to the result of the putL V1 RNase reaction, which suggested the presence of an RNA duplex upstream of C10*12. Furthermore, this RNA duplex interacts with another RNA strand from U21* to C25*, forming an unexpected minor groove RNA triplex structure28. The deletion of the third RNA strand, Δ20*−23*, decreased the anti-termination activity of the putRNA by ~50%12, indicating that the triple helix region has a significant effect on the function of the putRNA. Second, G35*, which was expected to be located at the bifurcation point of the two stem regions in an unpaired state, base pairs with A9*. 
This G35*-A9* base pair provides a platform for β’ZBD binding and stabilizes the overall structure of the putRNA. This base pair explains why the G35*U mutation retained 70% of the anti-termination activity while the G35*A mutation completely abolished the activity12. Meanwhile, the A9*C mutation abolished the in vivo anti-termination activity, implying complicated effects of mutations on the G35*-A9* base pair12. Third, the putRNA contains a bulged loop region (from C26* to G29*) in the middle of stem I, in contrast to the prediction that stem I has a loop region at its end ranging from C18* to C26*. This region also provides an interface for binding to the RNAP. Finally, the middle region of stem II exhibits distinct base pairings compared to the predicted structure. The middle region of stem II in the cryo-EM structure contains three non-canonical base pairs with three unpaired bases instead of one non-canonical base pair with five unpaired bases as in the predicted structure. This region has relatively high local resolution, indicating its structural stability, and makes interfaces with the RNAP and stem I of the putRNA.

### Interactions between the putRNA and the EC

In the putEC structure, the ‘V’-shaped putRNA binds to the prominent β’ZBD through the pothole formed in the center of the ‘V’ (Fig. 3a). The β’ZBD fits snugly against the putRNA surface, generating a 1130.3 Å² interface area formed by ~33% of the total putRNA residues29. At the backside of the putRNA-β’ZBD interface, the N-terminal loop of the β flap-tip helix makes significant contact with the putRNA with an interface area of 281.6 Å². Most of the potential interactions between the putRNA and the RNAP are polar interactions such as salt bridges, hydrogen bonds, cation-π interactions, and long-range ionic interactions. Although the resolution of the map is not sufficient to specify these short-range interactions, we suggest possible interactions for reference (Figs. 2c, 3b, Supplementary Table 2). At the bifurcation point of the two stem structures, β’R77 is positioned like a wedge to separate G35* and G36* and forms a cation-π interaction with G35*. This type of cation-π interaction is often found between the terminal, exposed base of a nucleic acid bound to a protein and the protein loop that confines the nucleic acid. In addition, β’L78 and β’K79 are located between G35* and G36*, stabilizing the separation of stem I and stem II of the putRNA.

A mutant named put−, or mutant G, has the sequence A43GAUC47 and does not exhibit anti-termination activity11. Our transcription assay revealed that put− also has poor anti-pausing activity (Fig. 1c). In the structure, this region does not directly interact with the RNAP; however, its counter-strand, from the 59th to the 64th residue, forms a central area of the binding interface. Therefore, the base substitutions in the put− mutant likely change the structure of the binding interface and disrupt putRNA binding to the RNAP. It is also possible that these mutations interfere with the proper folding of the putRNA.

To test the validity of the structure of the putEC modeled in the cryo-EM density, we introduced assorted mutations in the template DNA and performed in vitro radiolabeled transcription assays to examine the effects of the mutations on the anti-pausing activity (Fig. 3c, Supplementary Fig. 9). To quantify the anti-pausing activities of the mutants, the anti-pausing activities of wild-type put and put− were set to 1 and 0, respectively, and the anti-pausing activity of each mutant was placed on a linear scale accordingly (details are in the Methods section). To display the location of the mutated residues as well as their conservation, the conservation of the putRNA residues was calculated from the sequence alignment with ten known put sequences and marked by color (Fig. 3d, Supplementary Fig. 10a)30,31. Among the twenty-three mutations we generated, eleven mutants showed ≤ 20% anti-pausing activity (named ‘inactivating’ mutations) and three mutants showed ≥ 90% anti-pausing activity (named ‘inert’ mutations). The inactivating mutations, Δ3*−7*, U28*A, U28*C, G35*A, G35*U, G35*C, G45*A, A64*G, G35*C/A9*G, G35*A/A9*G, and G35*A/A9*U, suggest the following. (1) The 5’-region (from A3* to G7*) is essential for the anti-pausing activity. A3GACG7 and its base-pairing region, U19CUGC15, have relatively high conservation scores of (6,6,4,9,9) and (7,7,5,10,10), respectively. This region is the first RNA duplex formed during the putRNA synthesis and, therefore, may provide a platform for further RNA folding. (2) U28*, which protrudes toward the β’ZBD and binds to a small pocket, is essential for the function. Interestingly, while U28*A and U28*C abolish the anti-pausing activity, U28*G retained ~60% of the activity. 
In the structure, we substituted U28* with the other bases and found that G can form three hydrogen bonds with the surrounding β’ residues while A and C form two and one potential hydrogen bonds, respectively, corroborating the result of the mutational study (Supplementary Fig. 10b). Interestingly, the original put residue, U28*, forms fewer hydrogen bonds than guanine and adenine but exhibits better activity than these, implying that U28* might have additional role(s) besides binding to the RNAP, or that the mutants might have different structures from the modeled ones (discussed below). (3) All of the G35* mutations we generated abrogated the anti-pausing activity of putRNA. We expected that the double mutants G35*C/A9*G and G35*A/A9*G might have some activity because they preserve the predicted base-pairing of G35*-A9* in the structure. However, mutating G35* to any base abrogated the anti-pausing activity, and this was not recovered by the mutation of the base-pairing partner, implying that G35*, and possibly its base-pairing partner A9*, may have sequence-specific roles in the anti-pausing activity. We noticed that G35*U exhibited ~70% anti-termination activity in vivo12. This discrepancy could come from the different conditions encountered in vivo vs. in vitro. For example, G35*U might form some intact or partially active putRNA in vivo, possibly aided by an unknown cellular factor(s), whereas in vitro synthesized putRNA containing G35*U could be inactive. (4) We also found that A64* is critical for the anti-pausing activity. This result is also consistent with the structural data because A64* contacts both the stem I region of putRNA and the RNAP. All the inert mutations are of U20*, which lacks any significant interaction with other residues, supporting our structure. The remaining nine mutants exhibited moderate activities, suggesting a significant but not critical role of the residues (A8*, U21*, C25*, U32*, G43*). 
In summary, our mutagenesis study supports our cryo-EM structure of the putEC.
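The linear scaling used for the mutant activities, with wild-type put fixed at 1 and the inactive put mutant (mutant G) fixed at 0, can be sketched as below. This is only an illustrative sketch: the raw readout (`raw`, for example a pause-band fraction from a gel) and the function names are assumptions, and the actual quantification is described in the Methods section, not here. The thresholds in `classify` are the ≤ 20% and ≥ 90% cutoffs quoted in the text.

```python
def normalize_activity(raw, raw_wt, raw_neg):
    """Linearly rescale a raw readout so that wild type maps to 1
    and the negative control (inactive put mutant) maps to 0.

    `raw`, `raw_wt`, and `raw_neg` are hypothetical readouts measured
    in the same units; their direction (higher or lower = more pausing)
    does not matter because the two controls define the scale.
    """
    span = raw_wt - raw_neg
    if span == 0:
        raise ValueError("wild-type and negative-control readouts must differ")
    return (raw - raw_neg) / span


def classify(activity):
    """Label a normalized activity using the cutoffs quoted in the text."""
    if activity <= 0.2:
        return "inactivating"
    if activity >= 0.9:
        return "inert"
    return "moderate"
```

For instance, if the readout were a pause fraction of 0.1 for wild type and 0.9 for the negative control, a mutant with a pause fraction of 0.5 would land halfway along the scale and be labeled as moderate.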

### The comparison of the putEC, the put-less EC, and other ECs

To determine if putRNA binding to the EC changes the conformation of the EC to suppress transcriptional pausing, we aligned the putEC with multiple EC structures including non-paused EC (PDB 6ALF), RNA hairpin-paused EC (PDB 6ASX), backtracked PEC (paused EC) (PDB 6RIP), and the put-less EC determined here (Fig. 4a, b, Supplementary Table 3)21,32,33. We assume that the put-less EC contains a roadblocked but unfolded RNA because (1) the majority (>~70%) of the sample was roadblocked properly (Fig. 1d), and both the putEC and the put-less EC together comprise ~60% of the EC population in the cryo-EM data, (2) the third EC class, σ70-bound putEC shows extra RNA duplex density connected to the putRNA suggesting that this class was not properly roadblocked, and (3) the put-less EC map contains some weak RNA density around the RNA exit channel and the β’ZBD, implying that the RNA is present, but it is not well-structured. We suggest this put-less EC could serve as a good negative-control model as shown in a previous study32.

We first examined the swiveled states of the ECs (Fig. 4b). Swiveling refers to the rigid-body rotation of a set of domains (the clamp, dock, shelf, jaw, SI3, and the C-terminal region of the β’ subunit) about an axis parallel to the bridge helix toward the RNA exit channel, and is known to interfere with the proper folding of the trigger-loop, which is required for efficient nucleotide addition to the nascent RNA. Swiveling was first described in the structural study of the hisPEC and later observed in the backtracked PEC, implying that the swiveling motion potentially plays an important role in both RNA hairpin pause and backtrack pause32,33,34. The alignment of the EC structures according to the core module revealed that the putEC structure is most similar to the non-paused, active EC conformation, having the lowest RMSD values between Cα atoms of domains as well as the smallest swiveling angle of 1.2° (Fig. 4b, Supplementary Table 3). The put-less EC is more swiveled than the putEC, having a swivel angle of 1.8°, although the swiveling angle of the put-less EC was less than that of the hisPEC or backtracked PEC (3.1° and 2.6°, respectively; Supplementary Table 3). Interestingly, the conformational difference between the putEC and put-less EC is more noticeable in the βSI2 (or βi9) region, with a 15 Å distance between the Cα atoms of βE1006, which is located at the end of the βSI2 domain. While the swiveling motions of the aligned ECs are relatively continuous, with rotation angles from 1.2° to 3.1°, the arrangement of the βSI2 is more discrete: the βSI2 in the putEC overlaps with that of the non-paused EC, while the βSI2 of the put-less EC is in the same location as in the hisPEC. Interestingly, the βSI2 of the backtracked PEC is located between the two conformations. These conformational features suggest that the proper folding and binding of the putRNA to the EC moves the EC toward the non-swiveled, active state, aiding pause escape or omission.
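A rigid-body rotation angle of the kind quoted above can be estimated from matched atom coordinates. The sketch below is an illustration under stated assumptions, not the procedure used in this study (which is not described in this section): it assumes two N×3 NumPy arrays of matched Cα coordinates for the swivel module, taken from two structures that have already been superimposed on the core module, fits the best rotation with the Kabsch algorithm, and converts the rotation matrix to an angle via its trace. The function names are hypothetical.

```python
import numpy as np


def kabsch_rotation(P, Q):
    """Least-squares rotation matrix R such that R @ p_i ~ q_i
    after both point sets are centered on their centroids.

    P, Q: (N, 3) arrays of matched coordinates (assumed inputs).
    """
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)
    H = P0.T @ Q0                      # 3x3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T


def rotation_angle_deg(R):
    """Rotation angle (degrees) of a 3x3 rotation matrix from its trace."""
    cos_theta = (np.trace(R) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))
```

With real data, `P` and `Q` would hold the swivel-module Cα atoms of the reference EC and the EC of interest; the rotation axis itself can be recovered from the eigenvector of `R` with eigenvalue 1 if a comparison with the bridge-helix direction is needed.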

The strength of the RNA-DNA hybrid influences pausing and termination35,36. Therefore, we compared the RNA-DNA hybrid of the putEC and the put-less EC (Fig. 4c, left). In the putEC, the active site region of the RNA-DNA hybrid exhibited a post-translocated state similar to the non-paused EC at a high threshold value of the map. As the threshold value decreased, the putEC map revealed a density blob for a nucleotide base that base pairs with the template DNA base at the i + 1 site. This density became connected to the nascent RNA at the lower density threshold. As stated above, we suspect that this results from the mixed population of the nascent RNAs roadblocked at either the +94 or the +95 position, having either post- or pre-translocated states, respectively. However, we did not observe any classes having a folded trigger-loop with the SI3 domain shifted closer to the βlobe domain as in the Eco RNAP structure of the pre-translocated state24. In addition, the putEC contained 11 template DNA bases in the RNA-DNA hybrid, in contrast to other reported EC structures (Fig. 4d). To accommodate one additional nucleotide in the main channel, the lid, which is known to aid the unwinding of the RNA-DNA hybrid, is pushed by about 2.6 Å (measured at the Cα atom of β’D256) compared to the known non-paused EC (Supplementary Fig. 11a)19,21,33. However, it is not certain whether this 11-nt hybrid is just an alternative conformation of an EC or a specific conformation in the putEC.

The put-less EC showed distinct RNA-DNA base-pairing at the i + 1 site (Fig. 4c, right). In the put-less EC, the template DNA base at the i + 1 site is more tilted toward the RNA base at the i site; therefore, it is not optimally placed for substrate binding. In fact, the RNA base at the i site is more closely associated with the DNA base at the i + 1 site than that of the i site. Consequently, the base-pairing hydrogen bonds are broken between the template DNA base and the nascent RNA base at the i site. The remaining region of the RNA-DNA hybrid of the put-less EC overlaps well with that of the non-paused, active EC as in the putEC. The conformational difference of the nucleotides at the active site between the putEC and the put-less EC indicates that putRNA binding to the β’ZBD influences the active site conformation, even though the catalytic magnesium ion is ~62.5 Å away from the zinc ion in the β’ZBD. This was also shown in the hisPEC structure, where the pause hairpin placed in the RNA exit channel has an influence on the active site as well as the bridge helix32,34,37. The length of the template DNA in the RNA-DNA hybrid of the put-less EC was also 11-nt, implying that this longer RNA-DNA hybrid is not caused by the putRNA.

In addition to these changes, we also observed that the RNAP domains of the putEC have similar locations to those of the non-paused EC while the domains in the put-less EC have a similar arrangement with those of backtracked PEC (Supplementary Table 4). Although we could not find any density for the backtracked RNA in the put-less EC, the pausing site was expected to have a backtrack pause. In summary, from the structures of the putEC, put-less EC, and other ECs, we found that the putRNA binding to the EC leads to the anti-pausing activity by promoting the active, non-swiveled conformation of the EC.

### σ70-bound putEC structure

The third EC population, σ70-bound putEC, contains a σ70 bound to the clamp helices in addition to the well-folded putRNA as in the putEC (Fig. 5a, Supplementary Fig. 2). In contrast to the holoenzyme structure, the σ70-bound putEC map reveals only σ1.2, σNCR and a part of σ2, indicating that the σ2 binding to the EC is relatively stable while the other σ domains are very mobile as predicted in a prior study38. It has been reported that σ70 can remain associated with RNAP after promoter escape and the association is enhanced when the non-template DNA contains a −10 element-like sequence in the promoter-proximal region that induces σ-dependent pausing39,40,41. In particular, σ-dependent pausing provides a time and space window for the anti-termination λQ protein to bind to the EC and read through the intrinsic terminator42,43. Recently, cryo-EM structures of σ70-bound ECs were reported in the context of 21Q-, λPR’-, and Qλ-associated ECs44,45,46,47,48. While these complexes are at the paused state in that the σ2 domain interacts with a −10-like sequence, our σ70-bound putEC is not in a σ-dependent paused state and contains > 100 base-long RNA having a σ70 in a different conformation from those in other σ70-bound ECs (Fig. 1d).

In the σ70-bound putEC structure, we noticed that the RNAP contains an open clamp (79.3 Å opening), which is ~20 Å larger than the non-paused EC23. This suggests that the σ70-bound putEC is in an inactive state. We suspect that this class might represent the partial run-off EC population that appeared in the nMS analysis (Fig. 1d) because (1) the main channel of the RNAP did not contain downstream duplex DNA while the RNA-DNA hybrid was present and (2) an RNA duplex density, which is connected to both putRNA and the RNA-DNA hybrid, was observed in the RNA exit channel, indicating that the RNA was transcribed beyond the roadblock site (Fig. 5a, Supplementary Fig. 11b). We used the RNAfold Server to search potential RNA secondary structures in the template DNA and found that it contains a potential RNA hairpin sequence downstream of the roadblock site (Supplementary Fig. 11b)49. We, therefore, modeled the RNAP and nucleic acid scaffold into the map and found that the potential RNA hairpin matches well with the extra density observed in the RNA exit channel (Fig. 5a, Supplementary Fig. 11b). The location of this extra RNA duplex overlaps with the pause hairpin in the hisPEC32,34.

The putRNA density in the σ70-bound putEC was at a lower resolution than that in the putEC; however, the putRNA map region was identical to that in the putEC. The RMSD over all atoms in the putRNA region between the σ70-bound putEC and the putEC was only 0.839 Å. To compare the σ2-RNAP interaction in the initiation and the elongation stages, we aligned the RNAP clamp-σ2 domain regions from the σ70-bound putEC and the recently published RPo (RNAP-promoter open complex) structure25. For the σ70-bound putEC, we modeled only the visible part of σ70 (residues 112–151 and 214–447). Then, we compared the two structures only via the modeled σ70 regions; other σ domains were excluded from the comparison discussed below. Not surprisingly, the binding interface between the σ70 and the RNAP, in particular the β’clamp domain, was different between the RPo and the σ70-bound putEC (Fig. 5b). The binding interface between the β’ subunit and the σ70 was 812 Å² in the RPo, and the interface mostly occurs on the β’clamp helices. By contrast, in the σ70-bound putEC, the interface area was 1287 Å². This unexpected increase in the binding area results from the newly formed interface between the β’-clamp-toe domain (ranging 144–179)50 and the σ70NCR, the non-conserved σ70 region between σ1.2 and σ2.1 (ranging 274–307 and 359–374 in the structure, Fig. 5b), which does not participate in the RNAP-σ70 interface in the RPo. Since both β’-clamp-toe and σ70NCR are conserved in the γ-proteobacteria, the interaction between these two domains might be specific to this bacterial class. In addition, the shifted position of the σ2 domain in the σ70-bound putEC is more suitable for the σ70 to associate with the progressing EC because this conformation provides space for the upstream DNA to rewind and exit from the main channel of the RNAP. If the σ70 were bound to the RNAP as in the holoenzyme, σ70 would clash with the exiting upstream duplex DNA. 
However, at the moment, further investigation would be required to see whether these new interactions between the σ70 and the RNAP in the σ70-bound putEC are due to the transcription stage transition from initiation to elongation, or to the clamp opening which inactivates the transcription activity of the RNAP.

Additionally, we found a low-resolution blob in the main channel for the downstream DNA (Fig. 5a). The DNA scaffold used in the study spans to the +122 position while the RNA modeled in this map ends at +105. nMS analysis revealed three RNA populations of 110-mer, 114-mer and 116-mer (Fig. 1d). Therefore, there should be some downstream duplex DNA around the RNAP. However, the low resolution of the blob prevents us from locating any specific molecule in the density. We suspect that the blob could be either from the downstream duplex DNA, which is very mobile due to the open clamp conformation, or from σ70 region 1.1, because σ70 region 1.1 is known to bind at this position in the holoenzyme before the enzyme binds to promoter DNA. Further investigation is needed to confirm this speculation.

## Discussion

In this study, we extended prior studies on the putRNA by determining its three-dimensional structure when complexed with RNAP. Our result corroborates previous analyses suggesting a two-stem structure with multiple indents and bulges. However, cryo-EM structures also revealed new and unexpected features such as an unexpected boundary of the put transcript, a short triple RNA helix in the putL stem I, and alternative base pairs. The importance of many of these features is strengthened by the observed effect of specific put mutations11,12. The structure provides clear physical evidence that the putRNA binds to the β’ZBD, a result that is strongly supported by prior genetic and biochemical experiments on putRNA. The structure also revealed a mechanistic explanation for the anti-pausing activity promoted by putRNA-RNAP interaction. When putRNA is bound to the β’ZBD, the RNAP is held in a non-swiveled, active conformation, which is associated with anti-pausing activity as previously shown in the RfaH-associated EC19. In contrast, a put-less EC exhibited a swiveled conformation suggesting that the EC is in a paused state when transcription elongation is physically blocked at a pausing site. Together, the structures revealed that putRNA promotes RNA synthesis by resisting swiveling.

We were surprised to observe a putEC population that retained σ70 even though the EC had progressed about 100 nucleotides from the start of transcription. The occurrence of the σ70-bound putEC and its structure suggests a few intriguing points. First, the σ70-bound EC successfully folded putRNA, even more efficiently than a complex lacking σ70. We found that the ratio between the putEC and the put-less EC is roughly 3:2 from the number of particles in each class (36.9% vs. 22.4% of the EC particles), presumably reflecting the success rate of putRNA folding in vitro. Curiously, there was no put-less σ70-bound EC, suggesting that the presence of σ70 aided the proper folding and stabilization of putRNA. We found that the putL sequence contains a weak −10-like sequence (NANNAT) located at positions +23 to +28 relative to the start of the putL transcript, which lies on the third strand of the RNA triple helix and a bulge region of stem I (Figs. 1a, 2b, c). A −10-like sequence is known to induce σ-dependent pausing by engaging its non-template DNA region with the σ2 domain51. We suggest that this sequence may cause σ-dependent pausing, which facilitates putL folding by providing more time. Notably, among the ten putRNA sequences we aligned, all the putLs contain the identical −10-like sequence while none of the putRs do (Supplementary Figs. 10a, 11c). Therefore, we speculate that σ-dependent pausing may be necessary for putL folding but not for putR, which is located further downstream of its promoter. In addition, U28* is completely conserved in putL and is the critical 6th residue of the −10-like sequence. Although U28*G exhibited intermediate activity in our mutagenesis study, U28*A and U28*C nearly abolished activity, supporting the existence and importance of σ-dependent pausing at this position. Furthermore, we examined the ratio between σ70-bound EC and σ70-unbound EC in both the putEC sample and the put− EC sample to see if the presence of put affects σ70 retention (Supplementary Fig. 12). 
The put− EC was prepared in exactly the same way as the putEC except that the put− template was used as the DNA scaffold, and it did not show any well-folded putRNA density in the cryo-EM maps. From the cryo-EM data analysis, the percentages of σ70-bound EC in the putEC sample (having intact put) and the put− EC sample were ~40.7% and ~44.2%, respectively, suggesting that the presence of put does not affect σ70 retention. Second, the σ70-bound EC was resistant to the LacI roadblock. The σ70-bound putEC revealed an extra density for a duplex RNA in the RNA exit channel, suggesting that the retained σ70 modified the EC to overcome the roadblock during elongation (Figs. 1d, 5a, Supplementary Fig. 11b).
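The NANNAT consensus mentioned above for the −10-like element is simple enough to scan for with a regular expression. The following is a minimal sketch, assuming the input is a non-template-strand DNA sequence written 5’ to 3’ with the letters A, C, G, and T; the function name and the example sequence in the usage note are made up for illustration and are not the putL sequence.

```python
import re

# N-A-N-N-A-T, where N is any of A/C/G/T. The lookahead lets
# overlapping candidate hexamers be reported as separate hits.
MINUS10_LIKE = re.compile(r"(?=([ACGT]A[ACGT]{2}AT))")


def find_minus10_like(nontemplate_dna):
    """Return (1-based start position, hexamer) for every NANNAT match."""
    seq = nontemplate_dna.upper()
    return [(m.start() + 1, m.group(1)) for m in MINUS10_LIKE.finditer(seq)]
```

Calling `find_minus10_like("GGGTATAATCC")` on this toy sequence reports a single hit, the canonical `TATAAT` hexamer starting at position 4. A match to such a loose consensus is only a candidate; whether it actually engages σ2 and causes pausing has to be tested experimentally.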

Structural studies on prokaryotic anti-termination complexes including λN, Q21, Xoo P7, Qλ, and HK022 put suggest general strategies for anti-termination7,44,45,47,48. (1) The anti-termination factors inhibit RNA hairpin formation by either narrowing the channel or hindering the RNA hairpin folding (Supplementary Figs. 4, 13). The RNA exit channel is thought to aid RNA hairpin formation through the positively charged residues located inside the channel32. In the Q21, Qλ and Xoo P7 anti-termination complexes, the anti-termination factors (the Q21, Qλ, and P7 proteins) bind at the mouth of the RNA exit channel and confine the channel (Supplementary Fig. 13). The narrowed RNA exit channel only allows single-stranded RNA to move through it and restricts the nascent RNA folding needed for hairpin-dependent pausing and intrinsic termination. In the λN anti-termination complex, λN binding to the EC remodels the bound NusA and NusE to destabilize the RNA hairpin folding. In addition, the rearranged Nus factors bind to the β flap-tip, which stabilizes RNA hairpin pause, possibly preventing the flap-tip from assisting RNA hairpin pausing and termination52. Like λN, HK022 put does not narrow the RNA exit channel directly. Instead, the phosphate backbone of the putRNA is located near the RNA exit channel, prohibiting RNA hairpin formation with its negatively charged surface. Modeling an RNA duplex in the RNA exit channel of the putEC shows that the phosphate backbones of the modeled RNA duplex and the putRNA are just ~5 Å apart from each other (Supplementary Fig. 4). In addition, the β flap-tip binds to the putRNA, possibly sequestering it from assisting RNA hairpin pause as in the λN anti-termination complex. (2) In general, anti-termination factors stabilize the elongation-proficient conformation of the EC. λN traverses the RNAP hybrid cavity, stabilizing the active form of the EC, and binds to the upstream duplex DNA, enhancing the anti-backtracking and anti-swiveling activity of NusG. 
In the Q21-EC structure, Q21 binding is not compatible with swiveled conformation. Therefore, Q21 counteracts swiveling, leading to anti-pausing47. Our data suggest that putRNA also reduces swiveling. This stabilization of the active form of an EC may consolidate the RNA exit channel so that it can no longer accommodate the folding of secondary structures that promote pausing and termination53,54,55.

Komissarova et al.17 found that ΔU68* does not suppress termination but retains anti-pausing activity in vitro. U68* is located at the lower region of stem II like a wedge and forms no base pair. According to our modeling, the presence of U68* kinks stem II by ~19° (Supplementary Fig. 14). This perturbation might weaken the stability of the putRNA folding by widening the space between the two stem-loop structures. In addition, the structural change would affect the interface between the putRNA and the RNAP because the interface is composed of putRNA residues from both stems I and II. Therefore, putRNA without U68* might be well folded and reduce pausing immediately after synthesis but may unfold or dissociate from the RNAP before encountering terminators located further downstream. Alternatively, the mutant RNA may not be able to adopt an anti-terminating structure, which could be different from the anti-pausing structure in vitro.

In the λ phage paradigm, the anti-termination factor λN acts as the gatekeeper of the infection process. In other words, λN accumulation is required to transcribe the early genes of the genome. HK022 instead has the Nun protein, which competes with λN and blocks λ transcription. In addition, the HK022 genome harbors the put elements in place of the λ nut (N-utilization) sites, which are required for the action of λN. By replacing the N protein with Nun, HK022 acquired immunity against its competitor, λ. HK022 lacks a λN-like anti-termination factor and relies solely on the putRNA to promote full expression of its early genes. These differences benefit HK022 survival without increasing the complexity of transcription regulation.

In this study, we investigated the anti-pausing mechanism of putRNA. Since transcriptional pausing is a prerequisite of transcriptional termination, our results provide important insights into the mechanism of putRNA action. It remains possible that putRNA may adopt different structures and/or interactions with RNAP to promote anti-termination as prior studies indicate that anti-pausing and anti-termination activities differ. To deepen our understanding of these events, structural studies on the putRNA-associated EC at a terminator sequence would be required.

## Methods

### Protein expression and purification

Full-length Eco σ70 was expressed from a pET21-based expression vector encoding an N-terminal hexa-histidine tag followed by a PreScission protease (GE Healthcare) cleavage site. The full-length Eco σ70 plasmid was transformed into BL21(DE3) cells, which were grown at 37 °C. Protein expression was induced at an OD600 of 0.7 with 1 mM IPTG, and the cells were incubated for 4 h at 30 °C. Cells were harvested, resuspended in σ70 lysis buffer (20 mM Tris pH 8.0, 500 mM NaCl, 5% glycerol, 5 mM imidazole, home-made protease inhibitor cocktail), and lysed by French press. The supernatant was loaded onto a HiTrap IMAC HP column (Cytiva) equilibrated with 20 mM Tris pH 8.0, 500 mM NaCl, 5% glycerol. The protein was eluted with an imidazole gradient, concentrated using an Amicon Ultra centrifugal filter (Merck Millipore), and injected onto a HiLoad 16/600 Superdex 200 pg column (Cytiva) equilibrated in TGED + 500 mM NaCl. The final eluate was flash-frozen in liquid nitrogen after adding 15% glycerol.

Lac repressor (LacI) was purified as described previously57. A LacI-containing pBAD plasmid with kanamycin resistance (pBAD_Kan-LacI) was obtained from Addgene (plasmid #79826). BL21(DE3) cells transformed with the plasmid were grown overnight at 37 °C in 2× YT media containing 50 μg/mL kanamycin. The seed culture was added to 2× YT media containing 50 μg/mL kanamycin at a 1:100 ratio, grown at 32 °C for 2 h, and moved to 16 °C. Immediately after the temperature shift to 16 °C, protein expression was induced with 0.2% L-arabinose for 16 h. Cells were harvested and lysed by French press in lysis buffer (50 mM sodium phosphate buffer pH 8.0, 500 mM NaCl, 20 mM imidazole, 2.5% glycerol, 1 mM DTT, 10 mM MgCl2, 0.1% Tween-20, 1 mg/mL lysozyme, home-made protease inhibitor cocktail). DNase I (1000 U) was added to the lysate, which was then centrifuged to remove cell debris. The supernatant was loaded onto a HiTrap IMAC HP column (Cytiva) that was pre-equilibrated with 50 mM sodium phosphate buffer (pH 8.0), 500 mM NaCl, 20 mM imidazole, 2.5% glycerol, and 0.2 mM DTT. Protein was eluted with 20 mM sodium phosphate buffer (pH 7.4), 300 mM NaCl, and an imidazole gradient from 30 to 300 mM, and concentrated using a 30 K MWCO Amicon Ultra centrifugal filter (Merck Millipore). The concentrated protein was injected onto a HiLoad 16/600 Superdex 200 pg (Cytiva) gel filtration column equilibrated with 20 mM Tris-HCl (pH 8.0), 150 mM KCl, 5 mM MgCl2, and 1 mM DTT. The final eluted protein was supplemented with 15% glycerol, flash-frozen, and stored at −80 °C until use.

### Radiolabeled in vitro transcription assay

The in vitro transcription assay was performed as described previously58. Holoenzyme was reconstituted by mixing Eco RNAP and Eco σ70 at a 1:2 molar ratio and incubating for 15 min at 37 °C. Holoenzyme and DNA were mixed at a 4:1 molar ratio in glutamate-based T buffer (20 mM Tris-glutamate pH 8.0, 10 mM Mg-glutamate, 150 mM K-glutamate, 5 mM DTT) and incubated at 37 °C for 10 min to form RPo. RPo and LacI were mixed at a 1:10 molar ratio and incubated at 37 °C for 10 min. The final concentrations of holoenzyme and template DNA in the reaction mixture were 50 nM and 12 nM, respectively. Transcription was started by adding an rNTP mix to final concentrations of 200 µM ATP, 200 µM UTP, 200 µM GTP, 25 µM CTP (Cytiva), and 0.05 µM α-32P-CTP (PerkinElmer) at 37 °C, and was quenched after 2 min by adding 2× loading buffer (10 M urea, 50 mM EDTA pH 8.0, 0.05% bromophenol blue, 0.05% xylene cyanol). To show that the roadblocked EC is capable of further transcription, 2 mM IPTG was added to the roadblocked EC, which was incubated for 2 min at 37 °C to dissociate LacI, and additional rNTPs were then added to final concentrations of 162 µM ATP, 162 µM UTP, 162 µM GTP, 75 µM CTP, and 0.15 µM α-32P-CTP. The samples were loaded on a 10% urea-PAGE gel and run in 1× TBE. The gel was exposed to an imaging plate (Fujifilm) for 2 h, and the imaging plate was scanned on a Typhoon FLA 7000.

For the mutational study, 50 nM holoenzyme and 12 nM template DNA were used for the transcription assay without roadblocking. The transcription reaction was quenched at the 0-, 0.5-, and 2-min time points, and the data at 0.5 min were used to estimate the relative anti-pausing activity plotted in Fig. 3c, although the 2-min data gave similar results (data not shown). For the transcription reaction, 200 µM ATP, 200 µM UTP, 200 µM GTP, 25 µM CTP, and 0.05 µM α-32P-CTP were used. To estimate the relative anti-pausing activity, we measured the intensities of the paused and run-off transcripts of the put constructs and calculated the fraction of paused transcripts by dividing the intensity of the paused transcript by the sum of the intensities of the paused and run-off transcripts (Supplementary Fig. 9). The fractions of paused transcripts were calculated for the wild-type put, inactive put, and mutant put constructs, and their relative anti-pausing activities were calculated by the equation below and plotted:

$$\text{Relative anti-pausing activity of mutant } x = 1 - \frac{P_{x} - P_{\mathrm{WT}}}{P_{\mathit{put}^{-}} - P_{\mathrm{WT}}}, \qquad P_{x} = \text{the fraction of the paused band of mutant } x$$

For the paused fraction quantification, the intensities for the run-off and paused transcripts were calibrated according to the number of cytosines the transcripts contain. The assay was done in triplicate (n = 3 independent experiments).
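The quantification described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis script: the function names are hypothetical, the cytosine calibration is assumed to mean dividing each band intensity by the number of cytosines in that transcript (the label is α-32P-CTP), and the numbers in the usage section are made-up placeholders rather than measured data.

```python
from statistics import mean, stdev


def paused_fraction(paused_intensity, runoff_intensity,
                    paused_cytosines, runoff_cytosines):
    """Fraction of paused transcripts.

    Each band intensity is first divided by the number of cytosines in that
    transcript (assumed form of the calibration), then the paused signal is
    divided by the sum of the paused and run-off signals.
    """
    paused = paused_intensity / paused_cytosines
    runoff = runoff_intensity / runoff_cytosines
    return paused / (paused + runoff)


def relative_anti_pausing_activity(p_x, p_wt, p_put_minus):
    """Relative activity = 1 - (P_x - P_WT) / (P_put- - P_WT).

    Gives 1 for a mutant that pauses like wild-type put and 0 for a mutant
    that pauses like the inactive put construct.
    """
    span = p_put_minus - p_wt
    if span == 0:
        raise ValueError("P_put- and P_WT must differ")
    return 1.0 - (p_x - p_wt) / span


if __name__ == "__main__":
    # Hypothetical paused fractions (placeholders, not data from the paper).
    p_wt, p_put_minus = 0.20, 0.60
    mutant_replicates = [0.30, 0.32, 0.28]  # n = 3 independent experiments
    activities = [
        relative_anti_pausing_activity(p, p_wt, p_put_minus)
        for p in mutant_replicates
    ]
    print(f"relative activity: {mean(activities):.2f} +/- {stdev(activities):.2f}")
```

Normalizing to the wild-type and inactive put constructs places every mutant on the same 0 to 1 scale, so values from different gels can be compared directly.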

### Native mass spectrometry analysis

The RNA portion of the de novo reconstituted putEC was prepared by phenol/chloroform extraction, resuspended in RNase-free water, and flash-frozen in liquid nitrogen. Prior to analysis, the sample was thawed and then buffer-exchanged into nMS solution (500 mM ammonium acetate, 0.01% Tween-20, pH 7.5) using Zeba desalting microspin columns (Thermo Fisher). The buffer-exchanged sample was diluted to 5 µM with nMS solution and loaded into a gold-coated quartz capillary tip that was prepared in-house. The sample was then electrosprayed into an Exactive Plus EMR instrument (Thermo Fisher Scientific) using a modified static nanospray source59. The MS parameters used were similar to those in previous work22: spray voltage, 1.2 kV; capillary temperature, 150 °C; S-lens RF level, 200; resolving power, 8750 at m/z of 200; AGC target, 1 × 10⁶; number of microscans, 5; maximum injection time, 200 ms; in-source dissociation, 10 V; injection flatapole, 10 V; interflatapole, 7 V; bent flatapole, 6 V; high energy collision dissociation, 85 V; ultrahigh vacuum pressure, 6.6 × 10⁻¹⁰ mbar; total number of scans, 100. Mass calibration in positive EMR mode was performed using cesium iodide. Raw nMS spectra were visualized using Thermo Xcalibur Qual Browser (version 4.2.47). Data processing and spectra deconvolution were performed using UniDec version 4.2.0 (refs. 60,61). The UniDec parameters used were m/z range: 2000–7000; mass range: 25,000–45,000 Da; sample mass every 0.5 Da; smooth charge state distribution, on; peak shape function, Gaussian; and Beta softmax function setting, 20. The expected masses for the de novo synthesized RNA include 94-mer (30,630 Da), 95-mer (30,936 Da), 110-mer (35,776 Da), 114-mer (37,038 Da), and 116-mer (37,671 Da). The measured masses deviated from the expected masses by 1 Da or less.
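To see how the expected RNA masses relate to the 2000–7000 m/z window used for deconvolution, the short sketch below lists the charge states at which each species would appear. It assumes simple protonated ions, [M + zH]^z+, in positive mode; the function names and the measured value in the usage section are hypothetical, and UniDec itself is not called.

```python
PROTON_MASS_DA = 1.007276  # mass of a proton in Da

# Expected masses of the de novo synthesized RNAs, as listed in the text (Da).
EXPECTED_MASSES_DA = {
    "94-mer": 30630,
    "95-mer": 30936,
    "110-mer": 35776,
    "114-mer": 37038,
    "116-mer": 37671,
}


def mz(mass_da, charge):
    """m/z of an ion carrying `charge` protons (assumed ionization model)."""
    return (mass_da + charge * PROTON_MASS_DA) / charge


def charge_states_in_window(mass_da, mz_min=2000.0, mz_max=7000.0, max_charge=100):
    """Charge states whose m/z falls inside the deconvolution window."""
    return [z for z in range(1, max_charge + 1)
            if mz_min <= mz(mass_da, z) <= mz_max]


def within_tolerance(measured_da, expected_da, tolerance_da=1.0):
    """True if the measured mass deviates by at most `tolerance_da`."""
    return abs(measured_da - expected_da) <= tolerance_da


if __name__ == "__main__":
    for name, mass in EXPECTED_MASSES_DA.items():
        states = charge_states_in_window(mass)
        print(f"{name}: z = {states[0]}..{states[-1]}")
    # Hypothetical measured mass, for illustration only.
    print(within_tolerance(30630.4, EXPECTED_MASSES_DA["94-mer"]))
```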

### PutEC preparation and cryo-EM grid freezing

Holoenzyme was formed by mixing Eco RNAP and Eco σ70 at a 1:2 molar ratio and incubating for 15 min at 37 °C, and was purified on a Superdex 200 Increase 10/300 GL column (Cytiva). Template DNA was amplified by PCR using the pRAK31 plasmid62 as the template. The forward and reverse primer sequences (from Macrogen) for the reaction were as follows: forward primer, 5’-GCATGAATTCCTATTGGTACTTTACATTAA-3’; reverse primer, 5’-CGAATTGTGAGCGCTCACAATTCTAAAAGCAAAAAAGCCTTC-3’. Holoenzyme and template DNA were mixed and incubated for 10 min at 37 °C to form RPo. After RPo reconstitution, LacI, which was also purified by size-exclusion chromatography before use, was added and incubated for 10 min for roadblocking. To the mixture, 1 mM rNTP (Cytiva) was added and incubated for 2 min at 37 °C for transcription. The sample was loaded onto a Zeba spin desalting column (Thermo Fisher) to remove free rNTP, and 2 mM IPTG was added to the complex. After 2 min incubation at 37 °C, the mixture was concentrated to 5–10 μM using a 30 K MWCO Amicon Ultra centrifugal filter (Merck Millipore). The final buffer condition for all cryo-EM samples was 20 mM Tris-glutamate (pH 8.0), 10 mM Mg-glutamate, 150 mM K-glutamate, 5 mM DTT. CHAPSO (0.5%) was added to the sample right before grid freezing. For cryo-EM grid freezing, Quantifoil R 1.2/1.3 Cu 400 grids were glow-discharged at negative polarity, 0.26 mbar, and 15 mA for 25 s. Using a Vitrobot Mark IV (Thermo Fisher), grids were blotted and plunge-frozen into liquid ethane at 100% chamber humidity and 22 °C.

### Cryo-EM data acquisition and processing

Micrographs were taken using a 300 keV Krios G4 (Thermo Fisher Scientific) with a K3 BioQuantum direct electron detector (Gatan) and a 20 eV energy filter slit width. Images were recorded with EPU at a pixel size of 1.06 Å/pix over a defocus range of −0.8 µm to −2.6 µm. The total dose was 42.16 e⁻/Å², fractionated over 55 frames. The movies were drift-corrected, summed, and dose-weighted using MotionCor2 in RELION 3.1 (ref. 63). The contrast transfer function (CTF) was estimated using Gctf64, and the summed images were sorted based on CTF max resolution (<10 Å) and CTF figure of merit (>0.01).
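The micrograph selection criteria and the per-frame dose can be written down as a small Python sketch. It is illustrative only: the `MicrographCtf` record and function names are hypothetical, the example micrograph values are placeholders, and the per-frame dose assumes that the total dose of 42.16 is expressed per Å² and spread evenly over the 55 frames.

```python
from dataclasses import dataclass


@dataclass
class MicrographCtf:
    """CTF estimates for one summed micrograph (hypothetical record type)."""
    name: str
    max_resolution_A: float   # CTF max resolution in Å
    figure_of_merit: float    # CTF figure of merit


def passes_ctf_filter(m, max_resolution_A=10.0, min_figure_of_merit=0.01):
    """Keep micrographs with CTF max resolution < 10 Å and figure of merit > 0.01."""
    return (m.max_resolution_A < max_resolution_A
            and m.figure_of_merit > min_figure_of_merit)


def dose_per_frame(total_dose, n_frames):
    """Average dose per movie frame, assuming an even distribution."""
    return total_dose / n_frames


if __name__ == "__main__":
    # Placeholder micrographs, not values from the data set.
    micrographs = [
        MicrographCtf("mic_0001", 4.2, 0.12),
        MicrographCtf("mic_0002", 12.0, 0.10),   # fails the resolution cutoff
        MicrographCtf("mic_0003", 5.1, 0.005),   # fails the figure-of-merit cutoff
    ]
    kept = [m.name for m in micrographs if passes_ctf_filter(m)]
    print(kept)
    print(f"{dose_per_frame(42.16, 55):.3f} per frame")
```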

The sorted images were transferred to cryoSPARC v3.2.0 for further processing65. First, 411.9k particles were picked using the blob picker from 2000 movies, extracted with a box size of 320 pixels, and 2D classified to make picking templates. Then, 1447.1k particles were picked using the template picker from 8174 images. The particles were 2D classified twice, and the selected 863.1k particles from 43 classes were used as templates for the Topaz picker. From Topaz train, 1202.5k particles were picked and extracted from 8162 images. The particles were 2D classified into 100 classes, and 90 classes were selected. The selected particles were divided into five classes in heterogeneous refinement. Of the five templates, three were from a previous data set collected on a Glacios, and two were from EMDB EMD-8585, a non-paused EC map. Of the five classes, three were subjected to homogeneous refinement. Each homogeneous-refined class was further heterogeneous-refined into two classes, resulting in a total of four significant classes: RPo, putEC, put-less EC, and σ70-bound putEC.
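As a quick sanity check on the picking statistics, the sketch below converts the extraction box to ångströms and computes the average number of picks per micrograph for each picking round. The particle counts are the rounded "k" values quoted above, the pixel size of 1.06 Å is taken from the data acquisition description, and the helper names are hypothetical.

```python
PIXEL_SIZE_A = 1.06   # Å per pixel, from the data acquisition settings
BOX_SIZE_PX = 320     # extraction box size in pixels

# (picker, particles picked, micrographs or movies used), rounded values from the text
PICKING_ROUNDS = [
    ("blob picker", 411_900, 2000),
    ("template picker", 1_447_100, 8174),
    ("Topaz", 1_202_500, 8162),
]


def box_size_angstrom(box_px, pixel_size_a):
    """Physical edge length of the extraction box in Å."""
    return box_px * pixel_size_a


def particles_per_micrograph(n_particles, n_micrographs):
    """Average number of picked particles per micrograph."""
    return n_particles / n_micrographs


if __name__ == "__main__":
    print(f"box: {box_size_angstrom(BOX_SIZE_PX, PIXEL_SIZE_A):.1f} Å")
    for picker, n_particles, n_micrographs in PICKING_ROUNDS:
        density = particles_per_micrograph(n_particles, n_micrographs)
        print(f"{picker}: {density:.0f} picks per micrograph")
```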

All particles of the four classes were imported into RELION 3.1 for further refinement. The particles belonging to the holoenzyme structure were 3D auto-refined, particle-polished three times, and 3D classified into three classes. Of the three classes, the major class was 3D auto-refined and post-processed, yielding a 3.0 Å-resolution map. The putEC particles were 3D auto-refined, particle-polished three times, and subjected to focused classification on the putRNA region into three classes. Of the three classes, two were combined, 3D auto-refined, and post-processed, yielding a 3.2 Å-resolution map. The put-less particles were 3D auto-refined, particle-polished three times, and post-processed, yielding a 3.6 Å-resolution map. The σ70-bound putEC particles were 3D auto-refined, particle-polished three times, and 3D classified into three classes. Of the three classes, the best class was further refined and post-processed, yielding a 3.6 Å-resolution map.

### Model building, refinement, and validation

Local resolution estimation and filtration were done with the blocres and blocfilt commands in the bsoft package (version 2.0.5), respectively66. For the EC structures, EC coordinates including RNAP, DNA, and RNA were taken from PDB 6C6T because this model was built from a high-resolution EC map (3.5 Å). For the σ70-bound putEC, the recently published high-resolution RPo model (PDB 6OUL) was used. In model building, the models were first fitted into the final cryo-EM map using UCSF Chimera (version 1.11.12)67. Then, the RNAP domains were rigid-body refined in PHENIX (version 1.18.2)68, and the nucleic acids were mutated to the correct sequence in Coot69. The structures were then real-space refined in PHENIX and manually modified in Coot, and this process was iterated until the model was satisfactory. The putRNA was manually built into the map de novo. A .eff file that includes restraints maintaining the nucleic acid base pairing and stacking interactions was provided for each real-space refinement run. For the final refinement run, the nonbonded_weight parameter value was set to 500 (default value: 100) to improve the MolProbity and clash scores. The locally filtered map was also used for the last refinement iteration because it slightly improved the model on visual inspection. The figures were made using PyMOL (version 2.4.0).

### Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.