Pausing controls branching between productive and non-productive pathways during initial transcription

Transcription in bacteria is controlled by multiple molecular mechanisms that precisely regulate gene expression. Recently, initial RNA synthesis by the bacterial RNA polymerase (RNAP) has been shown to be interrupted by pauses; however, the pausing determinants and the relationship of pausing with productive and abortive RNA synthesis remain poorly understood. Here, we employed single-molecule FRET and biochemical analysis to disentangle the pausing-related pathways of bacterial initial transcription. We present further evidence that region σ3.2 constitutes a barrier after the initial transcribing complex synthesizes a 6-nt RNA (ITC6), halting transcription. We also show that the paused ITC6 state acts as a checkpoint that directs RNAP, in an NTP-dependent manner, to one of three competing pathways: productive transcription, abortive RNA release, or a new unscrunching/scrunching pathway that blocks transcription initiation. Our results show that abortive RNA release and DNA unscrunching are not as tightly coupled as previously thought.


Introduction
Transcription initiation by DNA-dependent RNA polymerase (RNAP) constitutes the first and often decisive step in gene expression in bacteria. To balance the output of transcription with environmental and cellular needs, an extensive set of molecular mechanisms has evolved to regulate the efficiency and specificity of transcription initiation 1 . These regulatory mechanisms may either be directly encoded in the transcribed DNA sequence or mediated by protein transcription factors or small-molecule signals. The target of transcription initiation regulators may be the function of RNAP itself, or the accessibility or affinity of promoters for RNAP. Further regulation occurs in the elongation and termination phases of transcription [2][3][4][5] .
To perform promoter-specific transcription initiation, the five-subunit bacterial RNAP core associates with housekeeping σ 70 initiation factor (or one of the alternative σ factors) to form an RNAP holoenzyme 6,7 . The RNAP holoenzyme employs sequence-specific interactions between the σ 70 and the -35 and -10 promoter elements (Fig. 1A) to form an initial RNAP-DNA closed complex (RP C ), and to isomerize to the catalytically-competent RNAP-promoter DNA open complex (RP O ) 8,9 (Fig. 1B). During initial RNA synthesis, strong interactions with the DNA hold the RNAP immobile at the promoter, resulting in the build-up of "scrunching" of downstream DNA, a conformational change that increases the size of the DNA bubble [10][11][12][13] .
The eventual break-up of RNAP-promoter contacts and the escape to elongation relax the scrunched DNA 11 . The productive promoter escape pathway competes with abortive initiation, an unproductive pathway wherein the short nascent RNA is thought to prematurely dissociate, resetting the initially transcribing complex (ITC) to RP O . The presence of the abortive pathway is firmly established by in vitro biochemistry [14][15][16] , single-molecule biophysics 11,17 and in vivo studies 18 . While conformational strain resulting from the DNA scrunching may promote abortive initiation 11 , multiple other factors also contribute, such as the presence of the σ 3.2 region (which obstructs the RNA-exit channel [19][20][21][22]), strong RNAP-promoter interactions 9,16,23 and the initially transcribed sequence 24,25 .
The step that defines the overall rate of transcription initiation varies between promoters 9,16,23 . Most σ 70 promoters in Escherichia coli are rate-limited by the stability of RP C or the rate of its isomerization to RP O ; however, in many cases, the rate-limiting step is attributed to the half-life of RP O or the rate of promoter escape. An extensively studied example of an escape-limited promoter is lacUV5 26 , which is known to produce substantial amounts of abortive products; further, transcriptional pausing has been identified in initially transcribing complexes after the synthesis of 6-nt RNA, at least partly due to the clash of the 5'-RNA end with the σ 3.2 region 27,28 .

Recent advances in structural characterization of bacterial transcription initiation complexes have created intriguing hypotheses on how specific molecular interactions and conformational changes drive holoenzyme formation, promoter recognition, isomerization to open complex 29 and initial RNA synthesis 12,20,30 . Complementing this fresh structural insight with detailed functional analysis is hampered, however, by the multi-step, asynchronous nature of transcription initiation pathways. Single-molecule techniques, which can provide a direct readout for several steps in the mechanism and resolve co-existing reaction pathways, are well-positioned to overcome the complexity of transcription initiation.
Here, we combined single-molecule and biochemical analysis of initial transcription to explore the mechanistic basis of the pause encountered by ITC6 on lac promoter 27 . We present evidence that the ITC6 pause represents a major control point where the initially transcribing complexes branch to three competing downstream reaction pathways: pause exit by productive transcription; abortive-RNA release; and slow cycling between DNA conformations with different extents of scrunching but without RNA release. The partitioning between these three paths and their kinetics depended on distinct interactions and structural elements. The rate of productive pause exit is synergistically controlled by the initial transcribed sequence and the interaction of the 5'-RNA end with σ 3.2 region, whereas weak RNAP-promoter interactions favor the entry into the scrunching/unscrunching pathway.

FRET.
To monitor the kinetics of transcription initiation at the single-molecule level, we developed a FRET sensor for real-time imaging of individual RNAP-promoter DNA complexes engaged in nascent RNA synthesis at a lac promoter derivative 27,31 . A similar approach had recently revealed the presence of a strong pause after the synthesis of 6-nt RNA by ITCs 27,31 . To allow an in-depth biophysical analysis of this rate-limiting ITC6 pause, we modified the original promoter design in two ways (Fig. 1A and Fig. S1A) [...] (Fig. 1C, Fig. S1B), suggesting that the transcription complexes synthesized 6-mer RNA and paused. After the pause, the ITCs split into two main populations. The first population comprised "productive" ITCs that resumed transcription and progressed from the PS to the FS state by synthesizing an 11-mer (Fig. 1C). The second population comprised ITCs that returned from the PS to the US state (Fig. 1D); notably, such complexes could cycle multiple times between PS and US states (e.g., at ~100 and ~200 s in Fig. 1D) until they eventually reached the FS state (e.g., at ~500 s in Fig. 1D).
Transcribed sequence and 5'-RNA end determine the lifetime of ITC6 pause.
Two elements appear to contribute to RNAP pausing at ITC6: i) the clash of 5'-RNA end with the σ 3.2 region (Fig. 1B), which blocks the RNA-exit channel of RNAP 27 , and ii) a specific sequence motif (a non-template Y +6 G +7 in the transcribed DNA strand; 31 ) akin to that causing sequence-specific pausing in elongation [33][34][35] . We dissected the contributions of these two elements on the ITC6 pause using our smFRET assay.
To explore the steric-clash hypothesis, we modified the 5'-RNA end of the nascent transcript (and thus its interaction with σ 3.2 ) by initiating transcription either using ATP or using a synthetic dinucleotide (ApA). The use of ATP as an initiating nucleotide introduces a 5'-triphosphate tail and a net charge of -4 to the 5'-RNA end; in contrast, ApA-primed reactions result in RNAs with no 5'-triphosphate tail. To evaluate the effect of the pause sequence motif on the dynamics of initial transcription, we replaced the sequence T +6 G +7 (on non-template DNA) with G +6 T +7 , creating a "ΔP promoter" (Fig. S1A); this substitution has been shown to shorten the pause by five-fold in a bulk gel assay and to reduce the total time spent in initial transcription by ~2-fold in a single-molecule assay 31 . In all experiments, the initiating ATP or ApA was held at 500 µM, a level significantly above the K M of the RP O for initiating nucleotides 22 ; we also varied the concentration of the remaining NTPs (1-500 µM).
We first analyzed the effects of the pause elements on the pause duration at ITC6 (Δt ITC6 ) by focusing on the subpopulation of molecules displaying a US→PS→FS scrunching sequence (as in Fig. 1C). The dwell-time distribution for the ITC6 pause was well described by a single exponential (Fig. 2A). The pause exit rate (k ITC6 ) towards productive synthesis (e.g. formation of ITC11) was extracted using a Maximum Likelihood Estimation (MLE) fitting routine, and the errors were evaluated by bootstrapping (for details, see Materials and Methods). We observed that in the presence of ApA, the ITC6 pause exit rate was ~1.5-fold lower for the WT promoter compared to the ΔP promoter (Fig. 2B). When we replaced ApA with ATP as the initiating nucleotide and employed the remaining NTPs at above 30 µM, the ITC6 pause exit rate increased from ~0.07 to ~0.3 s⁻¹ for the ΔP promoter and from ~0.04 to ~0.11 s⁻¹ for the WT promoter, i.e. the pause exit rate enhancement was more than 2.5-fold (Fig. 2B).
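The estimation step described above can be illustrated with a minimal Python sketch. It assumes an untruncated single-exponential dwell-time distribution, for which the maximum-likelihood rate is the inverse of the mean dwell time, and a plain resampling bootstrap for the error. The function names and the example dwell times are hypothetical; the routine actually used is the one described in Materials and Methods and may differ (for instance in how it treats the detection limit).

```python
import math
import random


def mle_exit_rate(dwell_times):
    """MLE of k for p(t) = k * exp(-k * t): number of events / total dwell time."""
    return len(dwell_times) / sum(dwell_times)


def bootstrap_rate_error(dwell_times, n_resamples=1000, seed=0):
    """Standard deviation of the MLE rate over resamples drawn with replacement."""
    rng = random.Random(seed)
    n = len(dwell_times)
    estimates = []
    for _ in range(n_resamples):
        resample = [dwell_times[rng.randrange(n)] for _ in range(n)]
        estimates.append(mle_exit_rate(resample))
    mean = sum(estimates) / n_resamples
    variance = sum((e - mean) ** 2 for e in estimates) / (n_resamples - 1)
    return math.sqrt(variance)


# Hypothetical ITC6 pause durations in seconds (illustrative, not measured data).
example_dwells = [5.0, 10.0, 15.0]
k_itc6 = mle_exit_rate(example_dwells)          # rate in s^-1
k_error = bootstrap_rate_error(example_dwells)  # bootstrap standard error in s^-1
```

With real trajectories, `example_dwells` would be replaced by the list of PS-state dwell times from molecules following the productive path.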
These experiments demonstrate that the ITC6 pause duration is controlled both by the transcribed sequence and by the 5'-RNA interaction with σ 3.2 .
We also noted that the NTP concentration did not significantly influence the ITC6 pause exit rate for the WT promoter with either ApA or ATP as the starting substrate, or for the ΔP promoter with ApA as the starting substrate (Fig. 2B).
We next characterized the probability to exit the ITC6 pause on the first attempt (Fig. 2C). For this purpose, we counted the probability of ITCs to proceed via the reaction path depicted in [...] (Fig. 2C), the probability to exit on the first attempt was high (0.6-0.8) at all substrate NTP concentrations (5-500 µM). On the contrary, the pause-exit probability for the ApA-initiated ΔP promoter, and the ATP- or ApA-initiated WT promoter decreased steeply from 0.8 towards zero at low NTP concentrations (Fig. 2C). By fitting the probability to exit the ITC6 pause on the first attempt with a descriptive model similar to a binding isotherm (Fig. 2C), we extracted a binding constant K NTP and a maximal pause-exit probability P max for each condition (Fig. 2D). Overall, the WT promoter had a higher K NTP compared to ΔP promoter complexes (~28 vs ~8 µM, ApA), while ATP-initiated complexes had a lower K NTP compared to ApA-initiated ones (~8 vs ~28 µM, WT promoter). The probability P max was relatively constant, with ~80% of the molecules reaching the FS FRET level on the first attempt at saturating NTP concentration. These results suggest that ITCs can exit a weak ITC6 pause (ΔP promoter) efficiently even at low NTP concentration, while overcoming a strong ITC6 pause (WT promoter + ApA at the 5'-RNA end) requires higher NTP concentration (Fig. 2C).
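As an illustration of what a binding-isotherm-like description implies, the following sketch assumes a simple hyperbolic dependence of the first-attempt exit probability on NTP concentration, with a maximal probability and a half-saturation constant as the only parameters. The exact functional form used for the fit is not reproduced in this section, so the hyperbola and the function name are assumptions; the parameter values are the approximate ones quoted above for ApA-initiated complexes.

```python
def first_attempt_exit_probability(ntp_uM, p_max, k_ntp_uM):
    """Assumed hyperbolic form: p_max * [NTP] / (K_NTP + [NTP])."""
    return p_max * ntp_uM / (k_ntp_uM + ntp_uM)


# Approximate values quoted in the text (ApA starting substrate).
P_MAX = 0.8          # maximal pause-exit probability at saturating NTP
K_NTP_WT = 28.0      # µM, WT promoter
K_NTP_DELTA_P = 8.0  # µM, ΔP promoter

for ntp in (5.0, 30.0, 500.0):
    wt = first_attempt_exit_probability(ntp, P_MAX, K_NTP_WT)
    dp = first_attempt_exit_probability(ntp, P_MAX, K_NTP_DELTA_P)
    print(f"{ntp:6.1f} µM NTP: WT {wt:.2f}, ΔP {dp:.2f}")
```

Under this form, the exit probability equals half of its maximum when the NTP concentration equals K NTP, which is why a lower K NTP corresponds to efficient pause exit at lower NTP concentrations.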
Interestingly, we observed that 3-20% of the ITCs did not display a pause in the PS state (plain bars, Fig. S2B), but rather a direct transition from US to FS. There are two possible origins for the apparent absence of pausing: the presence of a non-pausing population of ITCs, and inadequate temporal resolution to capture the fastest US→PS→FS transitions. We thus calculated the fraction of the pausing ITCs that we cannot technically detect (by integrating the pause-exit probability distribution from 0 to our detection limit), and subtracted this fraction from the total non-pausing ITC population (plain bars, Fig. S2B). The corrected populations (dashed bars, Fig. S2B) showed that, for the WT promoter, the non-pausing events arise mainly due to limited resolution; in contrast, for the ΔP promoter, the main reason is actually the presence of non-pausing RPs (Fig. S2B). The T +6 G +7 (ntDNA) sequence therefore enforces pausing at ITC6 for ~100% of ITCs, stabilizing the pretranslocated state arising from the clash between σ 3.2 and the 5'-RNA end 27,31 .
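The correction described above can be sketched as follows, assuming that the pause durations follow the single-exponential distribution with exit rate k ITC6 , so that the integral from 0 to the detection limit has a closed form. The function names and the example numbers are hypothetical (in particular, the detection limit is not stated in this section), and the subtraction follows the procedure as worded above.

```python
import math


def undetected_pause_fraction(exit_rate, detection_limit_s):
    """Integral of k * exp(-k * t) from 0 to the detection limit: 1 - exp(-k * t_min)."""
    return 1.0 - math.exp(-exit_rate * detection_limit_s)


def corrected_nonpausing_fraction(observed_nonpausing, exit_rate, detection_limit_s):
    """Observed non-pausing fraction minus the pauses too short to be resolved."""
    missed = undetected_pause_fraction(exit_rate, detection_limit_s)
    return max(0.0, observed_nonpausing - missed)


# Hypothetical example: exit rate 0.3 s^-1 and an assumed detection limit of 0.4 s.
missed_example = undetected_pause_fraction(0.3, 0.4)
corrected_example = corrected_nonpausing_fraction(0.20, 0.3, 0.4)
```

A faster exit rate or a coarser time resolution increases the missed fraction, so the same apparent non-pausing population can reflect either limited resolution or genuinely non-pausing complexes.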
Finally, a fully double-stranded promoter (dsWT, Fig. S1A) did not modify the ITC6 pause exit rate for either ApA or ATP starting substrates (Fig. S2C), while the probability to reach the FS state during the first attempt on this promoter was also strongly decreased in the absence of a 5'-RNA end triphosphate (~14% vs ~58%, Fig. S2D), suggesting again that the 5'-RNA end triphosphate assists in the ITC6 pause exit.

Weaker RNAP-promoter interactions promote cyclic scrunching/unscrunching
Our single-molecule reaction trajectories demonstrated (Fig. 1) that the transcription complexes paused at ITC6 may either resume RNA extension or cycle between stable paused states. The first apparent event on the cycling pathway is the isomerization of the PS promoter conformation to the US state. A major factor determining the partition of ITC6 to the productive or unproductive pathway could thus be the stability of the scrunched DNA conformation. To explore this hypothesis, we engineered several structural changes (Fig. 3AB), which alter important interactions between RNAP and nucleic acid components of the ITC (thus affecting scrunching), and characterized the effects on initial transcription.
To establish the importance of interactions of σ 3.2 with the template-strand DNA, we studied the F522A substitution in σ 70 , which eliminates an interaction between the -4 template DNA base and σ 3.2 29 ; this mutation has been shown to affect initial transcription, most notably by reducing the amount of transcripts shorter than 6-nt 22 , and could therefore affect ITC6 pausing. We observed that the F522A σ 70 derivative retained similar activity (Fig. 3B, Fig. S2E) and ITC6 pause exit rate (k ITC6 ) as the WT σ 70 (Fig. 3C, Fig. S2F). Instead, the substitution significantly decreased the fraction of complexes exiting the pause on the first attempt from ~70% to ~37%, independently of the use of ApA (Fig. 3D) or ATP (Fig. S2G) as starting substrate. The weakening of the σ 3.2 interaction with the template-strand DNA thus destabilizes the PS promoter conformation and biases the paused ITC6 towards the scrunching/unscrunching pathway.
We next studied the effect of the β D446A RNAP substitution on ITC6 pausing. This mutation impairs the 'G pocket' in the RNAP core recognition element that specifically binds a guanine at ntDNA position +1 in the post-translocated state 29 , strengthens the holoenzyme-promoter interaction 29 and helps to overcome a consensus elongation pause by stabilizing the RNAP at the post-translocated register 33 . At the same time, these interactions stabilize a hairpin-dependent pause 36 . Notably, the post-translocated ITC6 on our WT promoter has a guanine in a position optimal for interacting with the G pocket (Fig. 1A). Our results demonstrate that the G pocket (in addition to being essential for forming an active ITC; Fig. 3B, Fig. S2E) facilitates pause exit, as it does from a consensus pause during elongation. Specifically, we observed a ~2-fold reduction in the pause exit rate (~0.1 vs. ~0.055 s⁻¹ , with ATP starting substrate, Fig. S2F). We also observed an up to 4-fold reduction in the fraction of complexes escaping the pause on the first attempt (~20% vs ~80% for ApA starting substrate, WT promoter and 500 µM NTPs, see Fig. 3D and Fig. 2C, respectively). The decreased pause-exit rate for the β D446A RNAP suggests that the paused ITC6 is biased towards the pretranslocated state, similar to the consensus paused elongation complex 33 .
To probe the effects of weakened interactions between σ region 2 and the -10 promoter element, we also replaced the consensus -7 thymine in the non-template DNA by an adenine (Fig. S1A); σ specifically unstacks and inserts the thymine into a deep pocket during RP O formation 29,37,38 . Our experiments using the -7T/A promoter show only small changes in the ITC6 pause exit rate and in the fraction of complexes exiting the ITC6 pause on the first attempt (Fig. 3BC), showing that this interaction does not significantly affect this phase of initial transcription.

Complexes undergoing cyclic unscrunching/scrunching are inactive for many minutes
We then quantitatively analyzed the ITCs that first pause at ITC6, and then perform cyclic unscrunching/scrunching. As seen in Fig. 1D, these complexes may cycle multiple times between the PS and US states until they reach the FS FRET level. Since cycling often lasted tens or even hundreds of seconds, many of the analyzed trajectories were interrupted by dye bleaching before the RP reached the FS state (Fig. 4A). For the cycling population, we generated probability density distributions for the dwell times in PS (Δt PS ) and US (Δt US ) states (Fig. 4BC). Both PS and US distributions showed a similar trend, with dwell times varying from ~0.4 s to ~200 s (Fig. 4BC). Using an MLE fitting routine (Materials and Methods), we found that the distributions were fitted well by a two-exponential probability distribution (solid lines, Fig. 4BC; dashed lines depict a single-exponential function). Our fit can thus define the exit rates k 1 and k 2 for both PS and US states, and the probability P(k 1 ) to exit the US or PS state with the pause exit rate k 1 (Fig. S2E-G; the probability to exit a state with the rate k 2 is given by 1-P(k 1 )).
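For concreteness, a two-exponential dwell-time model with exit rates k 1 and k 2 and mixing probability P(k 1 ) can be written as in the sketch below, together with the negative log-likelihood that an MLE routine would minimize. The function names are hypothetical, the sketch ignores the finite observation window and detection limit that a full analysis would likely account for, and the example parameters are illustrative values of the same order as those discussed in this section.

```python
import math


def biexp_pdf(t, k1, k2, p1):
    """Mixture density: p1 * k1 * exp(-k1 * t) + (1 - p1) * k2 * exp(-k2 * t)."""
    return p1 * k1 * math.exp(-k1 * t) + (1.0 - p1) * k2 * math.exp(-k2 * t)


def neg_log_likelihood(dwell_times, k1, k2, p1):
    """Objective minimized in an MLE fit of the two-exponential model."""
    return -sum(math.log(biexp_pdf(t, k1, k2, p1)) for t in dwell_times)


def mean_dwell(k1, k2, p1):
    """Mean dwell time implied by the mixture: p1 / k1 + (1 - p1) / k2."""
    return p1 / k1 + (1.0 - p1) / k2


# Illustrative parameters: fast rate 0.15 s^-1, slow rate 0.02 s^-1, P(k1) = 0.6.
example_mean = mean_dwell(0.15, 0.02, 0.6)
```

Because the slow component dominates the mean even when it is the minority population, a modest probability of exiting with k 2 is enough to keep a complex off the productive pathway for tens of seconds per visit.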
We applied this analysis to our results from WT promoter reactions initiated with ApA or ATP, and the ΔP promoter initiated with ApA (Fig. 2B). We did not include the ATP-initiated ΔP promoter results, since most complexes exited the ITC6 pause directly to the FS state (Fig. 2C). We first noted that the exit rates k 1 and k 2 , as well as the P(k 1 ) probabilities of PS and US states, remained fairly constant at all NTP concentrations used (Fig. S2H-J). We observed a single exception with the ITC on the ApA-initiated WT promoter, which showed a decreased probability P(k 1,PS ) at higher NTP concentrations (right panel, Fig. S2H). To further improve our accuracy, we averaged the kinetic parameters over all NTP concentrations used.
The US and PS states had practically identical kinetics, with the average values being k 1 ~0.15 s⁻¹ , k 2 ~0.02 s⁻¹ and P(k 1 ) ~0.6. Notably, these values were also independent of the NTP subset used (allowing maximal transcript lengths of 7 or 11), the nature of the RNA 5'-end, the presence of the pause motif in the transcribed sequence, the presence of the σ 70 F522A and β D446A mutations, the fully double-stranded structure of the promoter, or the presence of the -7T/A substitution in the -10 element (Fig. 4D, S2KLM).
The remarkable insensitivity of the kinetics of the unscrunching/scrunching pathway to the tested parameters, and in particular to the NTP concentration, suggests the complexes that enter unscrunching/scrunching pathway are catalytically inactive, until reentering the productive pathway to produce an ITC11 transcript (Fig. 1D). Indeed, if this pathway was catalytically competent, the exit rates k 1 and k 2 would have been sensitive to the NTP concentration, and would have therefore followed a Michaelis-Menten description. As evidenced from the slow unscrunching/scrunching rate constants, any complex embarking on the unscrunching/scrunching pathway very significantly delays the clearing of the promoter for the next cycle of transcription initiation, which potentially decreases the level of expression of the downstream gene.

DNA unscrunching does not necessarily lead to abortive RNA release and re-initiation
The discovery of extensive cycles of unscrunching and scrunching during initial transcription raises intriguing questions about its relation with abortive initiation. Does each unscrunching event lead to the release of nascent RNA (Fig. 5A, left panel)? Is the subsequent reisomerization to the scrunched state driven by the synthesis of new RNA? Could the RNA be maintained in the complex upon unscrunching (Fig. 5A, right panel)? To address these questions, we performed experiments (Fig. 5B) in which we allowed RNA synthesis up to ITC11 (Fig. 5EG) or ITC7 (Fig. S3A) for ~10 s, washed the surface extensively to remove NTPs, and re-imaged the surface-bound complexes. To our surprise, we observed many complexes displaying unscrunching/scrunching activity in the absence of NTPs. The percentage of complexes cyclically unscrunching/scrunching in the absence of NTPs was ~28% and ~18% for the WT promoter initiated with ApA or ATP, respectively (Fig. 5D). These numbers should be compared to ~27% (ApA) and ~42% (ATP) of active FRET pairs in the presence of NTPs, respectively (Fig. 5D). This means that a large fraction of the cycling molecules in the presence of NTPs (potentially up to the entire population, in the case of ApA-initiated reactions) can be accounted for by cycling complexes that do not synthesize RNA. The scrunching/unscrunching cycling lasted for hundreds of seconds, being only limited by dye bleaching (Fig. 5EG). Consistent with the maximal RNA length, complexes pulsed with ITC7 NTP sampled only US and PS states (Fig. S3A), whereas complexes pulsed with ITC11 NTP could additionally occupy the FS state (Fig. 5G). Our results thus clearly establish that extended cycling between different scrunching states does not require active RNA synthesis.
We quantitatively analyzed the kinetics of cyclic unscrunching/scrunching for complexes pulsed with ITC11 NTP, and identified two subpopulations: the first cycled between US and PS FRET levels only (Fig. 5E), and the second cycled between US, PS and FS FRET levels (Fig. 5G). The US/PS subpopulation included ~50% (ApA starting substrate) or ~40% (ATP starting substrate) of all cycling molecules, respectively (Table S1). Using ATP or ApA for initiation did not significantly affect the kinetic parameters of the US/PS/FS FRET states (Fig. 5FH).
Close inspection of the trajectories belonging to the US/PS/FS subgroup revealed that the two most frequently encountered state transitions were FS→US and its reversal US→FS (Fig. 5I and Fig. S3F); this was also the case in the continuous presence of NTP (Fig. S3G). The US→PS and PS→US transitions were about 4-fold less frequent, whereas PS→FS or FS→PS transitions were only rarely observed. These data clearly indicate that RPs engaged in the unscrunching/scrunching pathway do not share the same linear US→PS→FS reaction coordinate of ITCs engaged in productive transcription (Fig. 5J). We also note the absence of any temporal correlation between two successive state dwell times (dt n and dt n+1 ), independent of the scrunching state they originate from (right hand side, Fig. S3BCD), which shows that the transition from one state to the next is memory-less, i.e., the scrunching magnitude of the preceding state has no effect on the timescale of the transition of the following state.
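The two analyses in this paragraph, counting state-to-state transitions and testing for correlation between successive dwell times, can be sketched as below. This is a minimal illustration under the assumption that each trajectory has already been idealized into a sequence of state labels and dwell durations; the function names and the example trajectory are hypothetical, and a Pearson coefficient is used here as one possible correlation measure.

```python
import math
from collections import Counter


def count_transitions(states):
    """Count ordered transitions (from_state, to_state) along one idealized trajectory."""
    return Counter(zip(states[:-1], states[1:]))


def successive_dwell_correlation(dwells):
    """Pearson correlation between each dwell time dt_n and the next one dt_(n+1)."""
    x, y = dwells[:-1], dwells[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical idealized trajectory (state labels and dwell times in seconds).
example_states = ["US", "FS", "US", "PS", "US", "FS"]
example_dwells = [12.0, 3.5, 40.0, 7.0, 1.2, 25.0]
transition_counts = count_transitions(example_states)
dwell_correlation = successive_dwell_correlation(example_dwells)
```

A correlation coefficient close to zero across many trajectories is what one would expect for a memory-less process, whereas a systematic positive or negative value would indicate that the preceding state influences the next dwell.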

Paused ITC may undergo abortive initiation or stably trap RNA
Our FRET assay monitors the conformation of the promoter DNA and thus does not provide a direct readout for the presence of RNA in the ITCs. Since pulsed RNA synthesis was required to generate ITCs that cycle for several minutes between scrunched states, we assumed that these ITCs retain the nascent RNA in the transcription bubble. The assumption generates two testable hypotheses: first, RNA is slowly released from NTP-deprived ITCs; second, any RNAs retained in ITCs are extendable upon NTP reintroduction.
To determine the profile and time-dependence of RNA release from ITCs, we immobilized biotinylated RP O complexes to streptavidin-coated magnetic beads. The complexes were pulsed for 10 s with the ITC7 NTP subset (containing α-32P-UTP), pulled down, washed and immersed into NTP-free reaction buffer; beads and supernatant were then analyzed at specified times to obtain the time-dependent profile of retained and released RNAs (Fig. S4AB). Our results showed that the RNA-release kinetics was strikingly biphasic: many ITCs released their RNA within the first 2 min, the release being almost quantitative for the shortest RNAs (~95% of 3-4-mers) and less efficient for 5-, 6-, and 7-nt RNAs (45, 80, and 80%, respectively; Fig. S4B). After the rapid initial phase, the amount of released 6- or 7-nt RNA increased only marginally. After 15 min, still ~20% of 6-7-nt RNA remained bound in the ITCs. This amount is 2-fold lower than what we measured in similar NTP-pulsed single-molecule experiments, where most of the active ITCs were sampling the unscrunching/scrunching states for several minutes (Fig. 5D).
To probe whether the stalled ITCs retaining 6-nt RNA for an extended period of time can resume active transcription, we chased the immobilized and washed ITCs with the next incoming nucleotide (GTP). We observed that the 6-nt RNA became converted quantitatively to 7-nt RNA ( Fig. 5K; longer products appear due to mis-incorporation), indicating that the

Discussion
In this study, we employed a refined single-molecule FRET assay to quantitatively dissect the reaction pathway and kinetics of the initially transcribing complexes on the lac promoter. Our unique FRET sensor helped us to observe directly and with high contrast the entrance and exit from initiation pausing and allowed us to disentangle the complex network of catalytic and non-catalytic events during initial transcription, and to examine the role of the σ 3.2 region, the nature of pausing, and pausing-related conformational changes such as scrunching/unscrunching in the presence and absence of RNA release.

σ 3.2 represents a translocation barrier during initial transcription
Two aspects of our current results and our previous work 27 (Fig. 1B, Fig. 6

Initial transcription pause involves elemental pause-like states
The finding that the Δσ 3.2 mutation reduced but did not completely abolish pausing at ITC6 indicated that σ 3.2 is not the only determinant of initial transcription pausing 27 . Recent work elucidated a consensus sequence that dramatically increased the probability of a transcription elongation complex entering the consensus pause 33,34 . The most strongly conserved part of the consensus pause motif is a pyrimidine-guanine (Y)/G at position -1/+1 relative to the 3'-end of the transcript, which may cause the template strand to isomerize in the pause complex such that the template base becomes inaccessible to the incoming NTP 2,41 . A consensus pause motif is indeed encountered at ITC6 31 and, importantly, substitution of the motif increased both the pause exit rate and the probability to exit the pause towards ITC7; these results are consistent with biochemical studies of many promoters, including the lac promoter 31 . The exit rate from the ITC6 pause (~0.3 s⁻¹ ) is similar to the exit rate from the consensus elongation pause (~0.5 s⁻¹ ) 34 .
Overall, it appears that the first events leading to a pause during the initiation and elongation phases of transcription are similar: an energetic (transcribed sequence in elongation) or physical (σ 3.2 in initial transcription) barrier to translocation delays RNAP in the pre-translocated register 31 , from where the protein can, with sequence-dependent efficiency, branch off to a catalytically inactive elemental pause state (Fig. 6).

Backtracking leads to long-lived paused states
While the entry of ITC6 into the elemental pause was nearly obligatory (80-90% of trajectories showed the pause, Fig. S2B), a significant fraction (~20% at saturating NTP concentration, Fig. 2D) of the RNAP complexes did not exit this pause on the first attempt, but instead embarked on another reaction pathway involving cyclic unscrunching/scrunching events. Provided that the initially transcribing RNAP remains tightly anchored at the promoter, an unscrunching event results in partial relaxation and reannealing of the downstream DNA that was pulled into RNAP in the scrunched state 10 [...] or non-template DNA (D446A substitution in β) favored the partitioning of ITC6 into the unscrunching/scrunching pathway (Fig. 3D, Fig. S2G). This finding may imply that the promoter and initially transcribed sequences that interact most tightly with the holoenzyme encode efficient promoter-escape kinetics because they disfavor ITC partitioning into the non-productive unscrunching/scrunching pathway. Consistently, Record and co-workers recently reported that a stronger interaction between the holoenzyme and the discriminator (the promoter sequence between the -10 element and the transcription start site) correlates with the production of longer abortive RNAs and with a higher promoter escape efficiency 43 .
Similar backtracked/unscrunched initially transcribing complexes were recently identified in a magnetic tweezers assay 28 . Both that study and our current work described the US state kinetics with a double-exponential distribution, with the longer-lived component corresponding to the backtracked complexes. Lerner et al. 28 reported backtracked complex lifetimes two orders of magnitude longer than we find in our study. However, the difference can be easily explained [...] suggest that the transition between the states described by two exponentials, i.e. described by exit rates k 1 and k 2 (Fig. 4 and Fig. S2H-M), originates from a conformational change within the holoenzyme that precludes catalysis but does not preclude the dynamics of the hybrid, i.e. the scrunching/unscrunching cycle.

RNA release and subsequent re-initiation is not obligatory upon DNA unscrunching
Previous single-molecule studies assumed a direct link between unscrunching and abortive transcription 11,17,27,28 . [...] Recent work 43 has also noted that RP O complexes on the λP R and T7A1 promoters divided into two populations upon NTP addition: a first population (30-45% of the total RP population) that rapidly (within 10 sec) synthesized long RNA, i.e. longer than 10 nucleotides; and a second population that was stalled in early ITC, i.e. shorter than ITC10, and that released RNA slowly, similarly to moribund complexes 52 . We propose that these two populations, i.e. the population quickly producing long RNAs and the moribund complexes, are consistent with the two populations we described here, i.e. the RP complexes that exited the ITC6 pause on the first attempt (Fig. 1C), and the population that entered the cyclic unscrunching/scrunching state from the ITC6 pause (Fig. 1D) 43 . In contrast, our complexes are able to extend the retained RNA (Fig. 5K), and could enter the cyclic unscrunching/scrunching state multiple times before elongating the transcript, as a function of the NTP concentration (Fig. 1D, Fig. 2C). This conclusion is further supported by the biphasic activity also observed with purified moribund complexes 52,53 , demonstrating that moribund complexes have entered a catalytically incompetent state, which is eventually exited towards productive synthesis. Supporting our conclusion on the backtracked nature of the US state, Shimamoto and co-workers have also observed that the moribund complexes were reactivated by GreA 53 . They also showed that part of the moribund complexes on the λP R promoter are converted into dead-end complexes, i.e. complexes that no longer show any catalytic activity while trapping a short transcript 52 .
Interestingly, their gels showed that a large amount of 9-mer transcript was produced, despite the presence of all NTPs, which correlated with the presence of a (YG) sequence on the non-template promoter DNA at the +9 position. We showed here that such a (YG) sequence increased the probability to enter the cyclic unscrunching/scrunching pathway (Fig. 2C). We suggest that the dead-end complexes observed on the λP R promoter are the consequence of multiple successive entries into the cyclic unscrunching/scrunching pathway, and therefore appeared inactivated. Similarly, it is likely that the complexes observed by Henderson et al. 43 have not yet recovered from the cyclic unscrunching/scrunching state. Nevertheless, our data clearly demonstrate that the unscrunching of the promoter DNA is not linked with obligatory abortive transcription and the release of RNA. Instead, off-pathway ITCs can sample different scrunched conformations and eventually resume productive RNA synthesis. Such conformational changes have also been proposed for promoter-proximal paused complexes 54 .

A model for initial transcription
We summarize our findings in a new kinetic model of the transition to productive transcription (Fig. 6). In the RP O complex,  S4) and ultimately return to the productive pathway and escape the ITC6 pause. The model we present here contains three significant molecular mechanisms: the initial barrier imposed by σ 3.2 to transcript elongation (1 in Fig. 6), the subsequent loss of the catalytic conformation (2 in Fig. 6) and the RNA-dependent reversible backtracking (3 in Fig. 6), which respectively potentiate, initiate and amplify the pause encountered by the initially transcribing bacterial RNAP.

Biological significance
The dependence of entry into and recovery from the pause states on the initially transcribed sequence 31 implies wide variation in the kinetics of initial transcription across the bacterial promoter sequence space. On the lac and similar promoters, the molecular mechanism of pausing sensitizes the efficiency of promoter escape to NTP concentration, potentially trapping the RNAP at the promoter in a "ready-to-fire" or "poised" mode until improved growth conditions replenish the cellular NTP pool [55][56][57] .

The expression and purification of the core bacterial RNA polymerase have been described previously in Ref. 60 . The expression and purification of the wild-type σ 70 have been described previously in Ref. 21 .

DNA constructs preparation.
The preparation of the DNA constructs is described in detail in the Supplementary Protocols.

Microscope and single-molecule experiments description.
The single-molecule TIRF microscope for FRET experiments has been described previously in Ref. 61 . The data were acquired after immobilization of the RP O complex on the surface. After ~200 frames (~20 s), the imaging buffer was spiked with a 12.5x NTP solution and the reaction was observed for the remaining ~5800 frames (total time: 10 minutes).
For the post-RNA synthesis rinsing experiments, the RP O was incubated with NTP in the reaction buffer for 10 sec before the reaction buffer was exchanged twice and finally replaced with imaging buffer, followed by the start of the acquisition. The buffer exchange procedure took ~40 sec to complete before the start of the acquisition.

Single-molecule data analysis.
FRET pair localization and detection. The movies recorded on the camera were analyzed offline using the home-built Matlab routine Twotone-ALEX 66 to extract the intensities of colocalized donor and acceptor dyes, i.e. FRET pairs. The following Twotone-ALEX parameters were used to select only the FRET pairs formed by a single ATTO647N acceptor dye and a single Cy3b donor dye: a channel filter set to DexDem&&AexAem&&DexAem (colocalisation of the donor dye signal upon donor laser excitation, the acceptor dye signal upon acceptor laser excitation, and the acceptor dye signal upon donor laser excitation), a width limit between the donor and the acceptor between 1 and 2 pixels, a nearest neighbor limit of 6 pixels and a maximal ellipticity of 0.6 (ellipticity is defined as the ratio of the minor and the major axis of the ellipse). The traces extracted from the Twotone-ALEX analysis were then sorted to remove all traces that displayed extensive blinking or multistep photobleaching, i.e. traces that contained more than one donor or acceptor dye in the same diffraction-limited intensity spot.  68 , where only steps longer than 2 frames and separated from the subsequent step by more than twice the Allan deviation estimated at 5 frames were conserved 69 to be assembled into dwell times.
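The step-selection criterion stated above (steps longer than 2 frames, separated from the subsequent step by more than twice the Allan deviation estimated at 5 frames) can be illustrated with a minimal Python sketch. It is not the analysis code used in this work. The function names, the `(start_frame, end_frame, fret_level)` step representation, the use of a non-overlapping Allan deviation computed on the FRET trace, and the exclusion of the final step (which has no subsequent step) are all assumptions made for illustration.

```python
import math


def allan_deviation(signal, m):
    """Non-overlapping Allan deviation of `signal` for a window of m frames.

    Assumption: the deviation is computed on the FRET trace itself.
    """
    n_bins = len(signal) // m
    if n_bins < 2:
        raise ValueError("need at least two full windows of m frames")
    # Average the trace in consecutive, non-overlapping windows of m frames.
    means = [sum(signal[i * m:(i + 1) * m]) / m for i in range(n_bins)]
    # Allan variance: half the mean squared difference of adjacent averages.
    diffs_sq = [(b - a) ** 2 for a, b in zip(means, means[1:])]
    return math.sqrt(0.5 * sum(diffs_sq) / len(diffs_sq))


def keep_steps(steps, adev, min_frames=2, n_sigma=2.0):
    """Filter idealized steps given as (start_frame, end_frame, fret_level).

    `end_frame` is exclusive. A step is kept when it lasts longer than
    `min_frames` and differs from the next step by more than
    `n_sigma * adev`. The last step is never kept because it has no
    subsequent step (hypothetical choice for this sketch).
    """
    kept = []
    for current, following in zip(steps, steps[1:]):
        start, end, level = current
        long_enough = (end - start) > min_frames
        well_separated = abs(following[2] - level) > n_sigma * adev
        if long_enough and well_separated:
            kept.append(current)
    return kept
```

With a trace recorded at a fixed frame rate, `allan_deviation(trace, 5)` would provide the noise estimate, and the durations of the steps returned by `keep_steps` could then be assembled into dwell times.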
Characterization of the Δt ITC6 , Δt US , Δt PS and Δt FS dwell time distributions. A detailed analysis of the dwell time distributions is provided in Ref. [70][71][72] . Briefly, the dwell times $\tau$ are described by a probability distribution function with $n$ exponentials:

$$p(\tau) = \sum_{i=1}^{n} p_i k_i e^{-k_i \tau},$$

where $k_i$ and $p_i$ are the characteristic rate of the $i$th exponential and its probability, respectively. The minimum number of exponentials needed to fit each distribution was determined using the Bayes-Schwarz Information Criterion (BIC) 73 . We calculated the maximum likelihood estimate (MLE) of the parameters 74 by maximizing

$$L = \sum_{j=1}^{N} \ln p(\tau_j)$$

over the parameter set. Here the $\tau_j$ are the experimentally measured dwell times and $N$ is the number of collected dwell times. The error bars for each fitting parameter are one standard deviation extracted from 1000 bootstrap procedures 75 . The ebFRET software 68 was also used to extract the peak positions of each FRET level, subsequently fitted with a Gaussian function, with the peak center and the standard deviation as free parameters (Fig. S1D, Fig. S3A-D).

WT/ApA (yellow) and ITC11 (Fig. S1A) for all others, and different NTP concentrations (Table S1). In the ATP-initiated reactions, we did not use NTP concentrations below 5 µM to prevent potential misincorporation of ATP (used at 500 µM for initiation purposes) 76 . The mean ± standard deviation of k ITC6 for each promoter/starting substrate condition is indicated on the right-hand side. (C) Probability of reaching the fully scrunched (FS) FRET level in a single attempt (Fig. 1C). The solid lines are fits to a binding isotherm of the form $P_{FS} = P_{FS,max} \times [NTP] / ([NTP] + K)$. The error bars are 95% confidence intervals. (D) $K$ and $P_{FS,max}$ extracted from (C). Error bars are one standard deviation extracted from the fit.

Fig. S2E-G) of the double exponential MLE parameters (k 1 and k 2 exit rates and P probability of being in the exponential with k 1 exit rate) for US and PS FRET states averaged over all NTP concentrations, for the conditions described in Fig. 2A and Table S1. Color scheme as in Fig. 2.

Green triangle marks the template base for the next incoming nucleotide in the active site of RNAP. Red and black strands represent the nascent RNA and template DNA, respectively. The numbering (1, 2 and 3) indicates the three significant molecular mechanisms described by the model: the initial barrier imposed by σ 3.2 to transcript elongation, the subsequent loss of the catalytic conformation and the RNA-dependent reversible backtracking, respectively.
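The dwell time characterization in the Methods (a sum of exponentials with rates k_i and probabilities p_i, fitted by maximum likelihood, with the number of exponentials chosen by BIC) can be illustrated with the following Python sketch. It is not the fitting code used in this work. The source does not state which optimizer was used; this sketch substitutes an expectation-maximization (EM) iteration, which converges to a local maximum of the same log-likelihood. All function names, the starting guess and the iteration count are hypothetical.

```python
import math


def log_likelihood(dwells, rates, probs):
    """Return sum_j ln( sum_i p_i * k_i * exp(-k_i * tau_j) )."""
    total = 0.0
    for tau in dwells:
        density = sum(p * k * math.exp(-k * tau) for k, p in zip(rates, probs))
        total += math.log(density)
    return total


def fit_exponential_mixture(dwells, n_exp, n_iter=200):
    """Fit n_exp exponentials to dwell times by EM (assumed optimizer).

    Returns (rates, probs). Assumes positive dwell times and no
    numerical underflow of the mixture density.
    """
    mean = sum(dwells) / len(dwells)
    # Hypothetical starting guess: rates spread geometrically around 1/mean.
    rates = [(1.0 / mean) * 3.0 ** (i - (n_exp - 1) / 2.0) for i in range(n_exp)]
    probs = [1.0 / n_exp] * n_exp
    for _ in range(n_iter):
        weight_sum = [0.0] * n_exp
        weighted_time = [0.0] * n_exp
        # E-step: responsibility of each exponential for each dwell time.
        for tau in dwells:
            terms = [p * k * math.exp(-k * tau) for k, p in zip(rates, probs)]
            norm = sum(terms)
            for i, term in enumerate(terms):
                resp = term / norm
                weight_sum[i] += resp
                weighted_time[i] += resp * tau
        # M-step: closed-form updates; keep the old rate if a component vanished.
        probs = [w / len(dwells) for w in weight_sum]
        rates = [w / t if t > 0.0 else k
                 for w, t, k in zip(weight_sum, weighted_time, rates)]
    return rates, probs


def bic(dwells, rates, probs):
    """BIC with 2n - 1 free parameters (n rates, n - 1 independent probabilities)."""
    n_params = 2 * len(rates) - 1
    return n_params * math.log(len(dwells)) - 2.0 * log_likelihood(dwells, rates, probs)


def select_n_exponentials(dwells, max_n=3):
    """Return (n, rates, probs) with the lowest BIC among 1..max_n exponentials."""
    best = None
    for n_exp in range(1, max_n + 1):
        rates, probs = fit_exponential_mixture(dwells, n_exp)
        score = bic(dwells, rates, probs)
        if best is None or score < best[0]:
            best = (score, n_exp, rates, probs)
    return best[1:]
```

For a list of measured dwell times `dwells`, `select_n_exponentials(dwells)` would return the preferred number of exponentials together with the fitted rates and probabilities. Bootstrap error bars of the kind described in the Methods could be estimated by refitting resampled copies of `dwells` and taking the standard deviation of each parameter.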