Introduction

Cellular processes including transcription are inherently dynamic. Currently, the dynamics of transcription and other molecular processes in the cell are poorly understood1 because of a lack of methods that measure fundamental kinetic parameters in vivo. Precise estimation of the chromatin-binding on-rates and off-rates of general transcription factors (GTFs) and other classes of transcription factors (TFs) would allow more quantitative understanding and modeling of pre-initiation complex (PIC) formation2,3, RNA polymerase recruitment and elongation, and transcription4,5. Live-cell imaging at specific multi-copy genes is capable of yielding the residence time of TF-chromatin interactions at high temporal resolution (i.e., second timescale)6 but in general does not allow these measurements at single-copy genes. Cross-linking kinetic (CLK) analysis is a high spatial and temporal resolution method that enables estimation of the in vivo TF-chromatin on-rates and off-rates at single-copy loci7,8. Two other experimental approaches used to assess TF-chromatin dynamics are anchor-away (AA)9,10 and chromatin endogenous cleavage followed by sequencing (ChEC-seq)11; however, only qualitative or semi-quantitative TF-chromatin dynamic information is determined from these approaches9,10,11. Indeed, alternative physical-modeling approaches to calculating these kinetic parameters are needed to independently verify the estimates obtained from CLK and live-cell imaging techniques12,13.

Competition ChIP is another high-spatial resolution method in which the endogenous copy of a TF contains one protein tag and an alternative copy, a competitor, is transcriptionally induced with an alternative protein tag14,15,16. We developed and applied a physical modeling approach using chemical kinetic theory that directly estimates the physical half-life or residence time of TATA-binding protein (TBP)—the general transcription factor which initiates PIC formation17—on chromatin across the yeast genome from TBP competition ChIP data16. Given that the competitor TF requires 20–30 minutes for induction15,16, competition ChIP was generally believed to be low temporal resolution (20 minutes or greater)7,9. Moreover, previous analyses of competition ChIP data have estimated relative turnover rates14,15,16 and not residence times. Lickwar et al.15 argue that they estimated the residence time of Rap1 across the yeast genome with the shortest residence time being ~30 minutes; however, we show that their estimates, while correlated with the physical Rap1 residence time, are likely much longer than the actual physical residence time. In support of this, live cell imaging13, CLK7,8 and AA9 analyses reveal that TBP-chromatin interactions range from seconds to a few minutes depending on the promoter. However, the previous estimates of residence times were made at select loci using qPCR7,9 or represented effective averages across hundreds to thousands of promoters13. Consequently, this study is the first to arrive at genome-wide estimates of physical TF-chromatin residence times for any TF (in this case TBP). Using our physical modeling approach, we are capable of estimating TBP-chromatin residence times as short as 1.3 minutes and as long as 53 minutes, demonstrating that competition ChIP is actually a relatively high temporal resolution method. 
An advantage of estimating the physical residence time as opposed to relative turnover is that comparison of physical residence times to other physical timescales including nascent RNA transcription rates inform qualitative and quantitative models of the efficiency or stochasticity of PIC formation and transcription1. Furthermore, physical residence times will lead to physical mathematical models of PIC assembly and transcription2,3 as more kinetic parameters are measured.

Comparing TBP-chromatin residence times with nascent RNA transcription rates18, we found that a median value of ~5 TBP binding events were associated with productive RNA synthesis across Pol II genes. Our results paint a highly dynamic, stochastic picture of pre-initiation complex formation with multiple rounds of partial assembly and disassembly before a single round of productive RNA polymerase elongation. We also compared TBP-chromatin residence times to Rap1 and nucleosome relative turnover14,15,16. Notably, these are the only other regulatory factors whose dynamics have been characterized at specific sites on a genomic scale. We found that TBP-chromatin residence time was correlated with Rap114,15,16 but not nucleosome14,15,16 turnover dynamics. Moreover, while TBP and Rap1 chromatin dynamics were poorly correlated with nascent RNA transcription rates18, +1 nucleosome turnover dynamics, which likely affect Pol II elongation19,20,21, showed modest but robust positive correlation with nascent RNA transcription rates. Assessment of the role that the occupancy of over 200 transcription factors22 played in modulating TBP-chromatin residence times and nascent RNA transcription rates across gene promoters revealed only a subunit of TFIIE affecting TBP residence times while a number of initiation and elongation-related TFs had a relatively strong impact on nascent RNA transcription rates. Our findings point to the dynamics and occupancy of factors that regulate the late stages of transcription initiation including Pol II elongation associating more strongly with nascent RNA transcription rates than that of factors regulating early stages including PIC formation such as TBP and Rap1.

Results

Overview of competition ChIP experiment and data analysis

Competition ChIP (schematically represented in Fig. 1a–d) enables direct measurement of TF-chromatin turnover dynamics at binding sites across a genome (e.g., the yeast genome). This is accomplished by attaching a protein tag to an endogenous TF (orange dots in Fig. 1a–d) and by expressing a competitor of that TF with a different tag (maroon dots in Fig. 1a–d). The relative occupancy of the alternatively tagged TFs is measured at binding sites across a genome using chromatin immunoprecipitation (ChIP) followed by hybridization to genomic tiling arrays (ChIP-chip) or high-throughput sequencing (ChIP-seq). Quantifying the normalized ratio of the induced competitor ChIP signal to the endogenous TF ChIP signal as a function of time after competitor induction yields estimates of TF-chromatin turnover at any given binding site15,16. The competitor concentration (labeled CB) relative to the endogenous TBP concentration (labeled CA) takes ~60–70 minutes to reach steady-state levels, as shown by the dashed line in Fig. 1e–g.

Figure 1: Illustration of competition ChIP experiment.

(a–d) In vivo induction of HA-tagged competitor TBP (maroon), in vivo stable population of Avi-tagged endogenous TBP (orange), and a depiction of fast, medium, and slow binding dynamics over induction times of 0 min, 20 min, and 60 min. (a) Induced TBP concentration going from zero at 0 min to twice the endogenous TBP concentration at 60 min of induction time, approximately following the induction in van Werven et al.16. The induction curve is also shown as the dashed brown curve in (e–g). (b–d) The “Fast”, “Medium”, and “Slow” rows depict the binding of induced and endogenous TBP at loci with TBP residence times of less than a minute, a few minutes, and tens of minutes, respectively, for induction times of 0 min, 20 min, and 60 min. (e–g) Simulated in vivo ratio of occupancy of induced to endogenous TBP with a residence time of 1 min, 10 min, and 70 min. (e) For loci with fast dynamics, the occupancy ratio follows the induction curve closely, as also depicted in (b), where the ratio of sites occupied by competitor to those occupied by endogenous TBP closely follows the ratio of concentrations of competitor to endogenous TBP shown in (a). (f) The occupancy ratio lags behind the induction curve for a TBP residence time of 10 min. At 20 min post-induction the ratio of occupancies is almost zero, as also shown by the absence of maroon dots in the middle panel of (c). Since the induction curve approaches the saturation value of 2 around 50 minutes, the ratio of occupancies starts approaching the induction curve around 60 minutes, as also shown in the last panel of (c), where the induced TBP occupancy is twice that of the endogenous TBP. (g) The rise and saturation of the ratio of occupancies are significantly delayed compared to the induction curve for a TBP residence time of 70 min. Around 60 minutes, the ratio of induced occupancy to endogenous occupancy is ~0.5, as also shown in the last panel of (d), with induced TBP bound to one locus and endogenous TBP bound to two loci.

We applied kinetic theory to model the in vivo competitive dynamics of the induced competitor and the endogenous TBP in a competition ChIP experiment16 to estimate the TBP-chromatin binding on-rate (ka) and off-rate (kd) (Supplementary Text Sec. 2) at sites across the yeast genome. We found that the simulated ratio of competitor over endogenous occupancy versus time depended strongly on the residence time and not on the on-rate (Supplementary Text Sec. 3). Additionally, we observed that the simulated ratio of occupancies using the kinetic model (solid lines in Fig. 1e–g) rose and saturated (at steady-state levels) at slower rates with increasing residence time, t1/2. As noted in the Introduction above, we were able to determine residence times as short as 1.3 minutes. Given that the competitor takes ~60–70 minutes to reach steady-state levels, how were we able to derive such short TBP residence times? TBP-chromatin interactions with short residence times (t1/2 = 1 min) yielded a simulated ratio of occupancies versus time that was mildly but noticeably displaced or shifted (i.e., on the minute timescale) to the right of the induction curve (Fig. 1e), while TBP-chromatin interactions with longer residence times were displaced roughly by the value of the residence time (Fig. 1f,g). Intuitively, this time-delayed response of the ratio of occupancies relative to the induction curve can be viewed as an additional delay, beyond that of induction itself, driven by the residence time of the TF. In fact, the simulation showed that the residence time is effectively the time it takes for turnover to affect the ratio of occupancies in response to induction of the competitor at all times post induction, including times much shorter than that required for full induction. Importantly, this delay is noticeable as soon as the induction curve rises above 0 (i.e., above the noise level), which occurs at ~10 minutes (Fig. 1e–g), and, as discussed below, enables residence times as short as ~1 to 2 minutes to be estimated.
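The qualitative behavior described above (the occupancy ratio lagging the induction curve by an amount that grows with residence time) can be reproduced with a minimal sketch. The model below is an assumption, since the actual equations are given only in the Methods and Supplementary Text: it uses a standard two-species competitive-binding scheme, a Hill-like induction curve with n = 4 and a saturation ratio of 2 as in Fig. 1, the conversion t1/2 = ln 2 / kd, and a hypothetical value for the product ka·CA. The function names are illustrative only.

```python
import math

def induction_ratio(t, r_sat=2.0, tau=22.0, n=4):
    """Assumed Hill-like competitor/endogenous concentration ratio CB(t)/CA."""
    return r_sat * t**n / (t**n + tau**n)

def occupancy_ratio(t_half, t_end=60.0, dt=0.01, ka_ca=1.0):
    """Integrate the assumed competitive-binding ODEs with RK4 and return
    theta_B / theta_A (competitor over endogenous occupancy) at t_end.

    t_half : residence time in minutes, converted with kd = ln(2) / t_half
    ka_ca  : ka * CA in 1/min (hypothetical value, not taken from the paper)
    """
    kd = math.log(2.0) / t_half

    def rhs(t, a, b):
        free = 1.0 - a - b  # fraction of unoccupied sites
        return (ka_ca * free - kd * a,
                ka_ca * induction_ratio(t) * free - kd * b)

    # Before induction only endogenous TBP is bound, at its equilibrium level.
    a, b, t = ka_ca / (ka_ca + kd), 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1a, k1b = rhs(t, a, b)
        k2a, k2b = rhs(t + dt / 2, a + dt / 2 * k1a, b + dt / 2 * k1b)
        k3a, k3b = rhs(t + dt / 2, a + dt / 2 * k2a, b + dt / 2 * k2b)
        k4a, k4b = rhs(t + dt, a + dt * k3a, b + dt * k3b)
        a += dt / 6 * (k1a + 2 * k2a + 2 * k3a + k4a)
        b += dt / 6 * (k1b + 2 * k2b + 2 * k3b + k4b)
        t += dt
    return b / a

if __name__ == "__main__":
    for t_half in (1.0, 10.0, 70.0):
        print(t_half, round(occupancy_ratio(t_half), 3))
```

Under these assumptions the ratio obeys d(ratio)/dt ∝ (CB/CA − ratio), so it never exceeds the induction curve and converges to CB/CA at steady state. The exact numbers depend on the hypothetical ka·CA and need not match Fig. 1e–g.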

Along with background subtraction and normalization (Fig. 2a,b) of TBP competition ChIP data15,16 (Supplementary Text Sec. 1), an important data processing step includes scaling of the normalized, background subtracted ChIP ratios at steady state (i.e., t → ∞) as outlined in Fig. 2c (also Supplementary Text Sec. 2). In order to fit a kinetic theory of competitive binding represented as a ratio of the competitor over the endogenous TF occupancies versus time, the processed data must satisfy the constraints on the ratio of occupancies at the start of induction (t = 0) and steady state or equilibrium (t → ∞). More specifically, the mathematical solution of the kinetic theory equations (Supplementary Text Sec. 2) shows that the ratio of the competitor over endogenous TF occupancies equals the ratio of competitor over endogenous TBP concentration at steady state (t → ∞). This is depicted in Fig. 1e–g where the ratio of simulated occupancies (solid blue lines) and the ratio of TBP competitor concentration over endogenous concentration (dashed brown line) at steady state both equal 2. Importantly, background subtraction and normalization of competition ChIP genomic tiling array or high throughput sequencing data across time points does not yield properly scaled data at steady state (as shown in Fig. 2b). There are likely multiple reasons for this discrepancy between theoretical and background-subtracted normalized ratios including differences in the affinity of the two antibodies used to tag the competitor and endogenous TF (Supplementary Text Sec. 2). Nevertheless, if a kinetic model is used to fit competition ChIP data, the data must be properly scaled to satisfy the constraints of the theory at the start of induction and at steady state—a crucial step that has not been implemented previously14,15.
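The scaling step can be illustrated with a short sketch. It assumes that the per-locus sigmoid has the form B + A·h(t), so that after subtracting the locus-specific background B the plateau equals the fitted amplitude A; this reading is an assumption. The function name and the three example ratios are hypothetical, while the background (0.078), plateau (1.01), and expected steady-state ratio (2.23) are the example values quoted for Fig. 2 and Fig. 3a.

```python
def scale_locus_ratios(ratios, background, plateau, expected_plateau=2.23):
    """Subtract the locus-specific background and rescale so that the
    steady-state plateau matches the expected concentration ratio CB/CA.

    ratios           : normalized competitor/endogenous ChIP ratios over time
    background       : fitted constant B for this locus
    plateau          : fitted amplitude A (plateau after background removal)
    expected_plateau : steady-state CB/CA from the induction fit
    """
    if plateau <= 0:
        raise ValueError("plateau must be positive")
    factor = expected_plateau / plateau
    return [(r - background) * factor for r in ratios]

# Hypothetical locus: starts at background, ends at background + plateau.
example = scale_locus_ratios([0.078, 0.583, 1.088], background=0.078, plateau=1.01)
```

After this transformation the example series starts at 0 and saturates at 2.23, which are the two boundary conditions the kinetic model requires.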

Figure 2: Schematic workflow of the quantitative analysis pipeline.

(a–d) A schematic representation of the data processing pipeline that takes the geometrically averaged ratio of HA and Avi proteins from van Werven et al.16 and outputs scaled, normalized ratios that can be fit with the in vivo kinetic model. (a) The first step was to normalize the data for each induction time to the non-specific background to take into account potentially different experimental conditions for the time points. (b) After normalization, a sigmoid with a constant was fitted to the data for each locus: the constant (B) gave the locus-specific background value, and the amplitude gave the saturation value for the ratio data. In the figure, the locus-specific background is 0.078, and the saturation value is 1.01. The expected saturation value at each locus given by the in vivo kinetic model is the ratio of the concentrations of the competitor to the endogenous TBP at long induction times (~2.23, as shown in Fig. 3a). (c) We subtracted the background (B) from the locus data and scaled the data with a multiplicative factor such that the saturation matched the expected saturation value of 2.23, without which the data and the theory would be at odds. (d) The data were fitted with the in vivo kinetic model to extract residence times. A heuristic, approximate explanation of the “lag” between the induction curve and the observed occupancy ratio is that the response time (denoted by t0), as measured by fitting a sigmoid to locus data without using the kinetic model, is approximately the sum of the protein induction time and the in vivo residence time extracted using the kinetic model. This signifies that the residence time can be qualitatively approximated as the difference between the response time and the protein induction time.

Background subtraction, normalization and scaling of competition ChIP-chip data

In order to fit TBP competition ChIP two-color Agilent tiling microarray data16 to our kinetic model, we first normalized each dataset to non-specific background (Fig. 2b, Supplementary Fig. S1, and Supplementary Text Sec. 1). We then subtracted locus-specific background and scaled the data for TBP peaks within gene promoters to theoretically expected values at the start of induction (t = 0) and steady state or equilibrium (t → ∞) (Fig. 2c and Supplementary Text Sec. 2). The kinetic theory explicitly accounts for the time dependence of the induction of the competitor. Consequently, we fit the ratio of the induced (denoted by B) over endogenous (denoted by A) TBP concentration determined from Western blots as a function of induction time16 to a function that displayed critical features of the ratio: saturation as well as positive curvature (i.e., increasing slope) at low time points and negative curvature (i.e., decreasing slope) near steady state or saturation. A Hill-like sigmoid function with Hill coefficient n = 4 (Fig. 3a and Supplementary Eqn. 3) displays all of these properties and yielded the best fit of the ratio of concentration data over time. The fit yielded a characteristic time-scale for TBP competitor induction of ~22 min and a steady-state ratio of induced over endogenous TBP concentration of ~2.23 (Fig. 3a). Not surprisingly, the normalized competition ChIP data at nearly every TBP binding site were also well approximated by an n = 4 Hill-like equation with a time-scale parameter t0 (Supplementary Eqn. 3), which quantifies the overall turnover response including induction and TF-turnover dynamics at every TBP peak. As we showed in our simulation of ratios of competitor over endogenous TF occupancies using the kinetic theory of competitive binding (Fig. 1e–g), the resulting competition ChIP ratio (after proper normalization, background subtraction and scaling) is a response curve that is delayed compared to the induction curve (with its characteristic time-scale of ~22 min) roughly by the residence time (t1/2) (i.e., crudely, t0 is the sum of the induction time-scale and t1/2) (Fig. 2d). We used this Hill-like equation to background subtract and scale the data to the theoretical in vivo (denoted by superscript i) ratio of fractional occupancy of the competitor to the endogenous TBP, which must satisfy the boundary conditions at the start of induction and at steady state (in particular, the ratio approaches CB/CA as t → ∞) as described above (Fig. 2 and Supplementary Text Sec. 2.4).
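A minimal sketch of fitting a Hill-like curve with a fixed coefficient n = 4 is given below. The functional form R·t^n/(t^n + τ^n) is an assumption consistent with the description in Fig. 3a (τ is the time at which the signal reaches half of the saturation value R); Supplementary Eqn. 3 is not reproduced here. The time points are hypothetical and the data are synthetic, generated from the reported fit values (τ = 22 min, R = 2.23) rather than taken from the Western blots. A simple grid search over τ with a closed-form least-squares amplitude stands in for whatever optimizer was actually used.

```python
def hill(t, tau, n=4):
    """Hill-like sigmoid that rises from 0 to 1 and equals 0.5 at t = tau."""
    return t**n / (t**n + tau**n)

def fit_hill(times, ratios, tau_grid, n=4):
    """Grid-search tau; for each tau the best amplitude is the linear
    least-squares solution amp = sum(y*h) / sum(h*h). Returns (tau, amp)."""
    best = None
    for tau in tau_grid:
        h = [hill(t, tau, n) for t in times]
        amp = sum(y * x for y, x in zip(ratios, h)) / sum(x * x for x in h)
        sse = sum((y - amp * x) ** 2 for y, x in zip(ratios, h))
        if best is None or sse < best[0]:
            best = (sse, tau, amp)
    return best[1], best[2]

# Synthetic, noise-free induction data at hypothetical time points (minutes).
times = [0, 10, 20, 25, 30, 40, 60, 90, 120]
ratios = [2.23 * hill(t, 22.0) for t in times]
tau_fit, amp_fit = fit_hill(times, ratios, [10 + 0.1 * i for i in range(400)])
```

The same routine applied to a locus's normalized ChIP ratios would give its response time t0 in place of τ, provided a constant background term is handled first.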

Figure 3: Estimation of TBP residence time from kinetic model fit to normalized, scaled competition ChIP data.

(a) Ratio of concentration of competitor TBP (CB) to the concentration of endogenous TBP (CA) taken from van Werven et al.16 along with a sigmoid fit to the data (dashed line). The fit gave a saturation value of 2.23 and protein induction time of 22 min (the time at which the signal reaches half the saturation value). (b) Plot of normalized, scaled competition ChIP ratio data (competitor/endogenous) versus induction time. The dashed line shows the protein induction data from (a). As shown in Fig. 2, t0 is an estimate of the overall turnover response time. Hence, the data stratified and averaged in bands of 2 minutes for t0 ranging from 24.5 minutes to greater than 40 minutes showed a progressively slower rise as t0 increased. (c) Normalized density of TBP residence times, t1/2, obtained from data in each band (same color scheme as panel (b)) showing that larger t0 leads to longer residence times as explained in Fig. 2. Here, and throughout, normalized density was calculated using the kernel density estimation algorithm implemented in R via the density function, which normalizes the area under the curve to near unity. (d) log2-log2 plot of TBP t1/2 versus response time t0 showing a monotonic relationship between t1/2 and t0 for t0 > 24.5 min. For t0 < 24.5 min, the noise in the data and the induction curve made estimates imprecise. As a consequence, estimates of residence times shorter than ~1.3 minutes are in general unreliable. (e–g) Representative fits of our kinetic-theory-based model to the normalized, scaled competition ChIP ratio data and estimates of TBP t1/2, along with the fit to the protein induction data (dashed, same as (a)). The colors of the data and the fits correspond to the appropriate bands shown in (b). Panels (e–g) once again highlight that the residence time extracted using the kinetic model increases as the response time increases.

Estimation of residence time by fitting the model of competitive binding to normalized, scaled competition ChIP data

We then simultaneously numerically solved and fitted the in vivo kinetic equations of competitive binding between species A and B (Methods Eqns 1 and 2, Supplementary Eqns 9 and 10) to normalized, scaled competition ChIP data (Fig. 2 and Supplementary Text Sec. 4, 5). We (and others14,15,16) ignored the impact of cross-linking theoretically as competition ChIP data were gathered at one cross-linking time (20 min of formaldehyde cross-linking in van Werven et al.14,15,16). We showed that the resulting off-rate, kd, could be modestly biased (Supplementary Fig. S3a–d) using a generalization of the CLK framework with crosslinking to competition ChIP (Supplementary Eqns 4–8). This framework could be used to correct the bias if data are gathered at various crosslinking times7. As noted by Lickwar et al.15, we also found that the in vivo ratio of induced over endogenous TF as a function of induction time is insensitive to the on-rate, ka, and is very sensitive to the off-rate or residence time, t1/2 (Supplementary Text Sec. 3 and Supplementary Fig. S3e–h). Consequently, we only arrived at relatively precise values of the residence time (t1/2).
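The idea of solving the kinetic equations and fitting them to scaled ratio data in one step can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the ODEs are a standard competitive-binding scheme, the induction curve is a Hill-like function with the values reported in Fig. 3a, t1/2 = ln 2 / kd, the product ka·CA is a hypothetical fixed value (the ratio is reported to be insensitive to the on-rate), and a coarse grid search over candidate residence times replaces a proper optimizer. All names are illustrative.

```python
import math

def model_ratio(t_half, times, dt=0.02, ka_ca=1.0, r_sat=2.23, tau=22.0, n=4):
    """Competitor/endogenous occupancy ratio at each time in `times`
    (minutes, ascending), from Heun integration of the assumed ODEs."""
    kd = math.log(2.0) / t_half

    def rhs(t, a, b):
        free = 1.0 - a - b
        cb = ka_ca * r_sat * t**n / (t**n + tau**n)  # ka * CB(t)
        return ka_ca * free - kd * a, cb * free - kd * b

    a, b, t = ka_ca / (ka_ca + kd), 0.0, 0.0  # pre-induction equilibrium
    out = []
    for target in times:
        while t < target - 1e-9:
            da1, db1 = rhs(t, a, b)
            da2, db2 = rhs(t + dt, a + dt * da1, b + dt * db1)
            a += 0.5 * dt * (da1 + da2)
            b += 0.5 * dt * (db1 + db2)
            t += dt
        out.append(b / a)
    return out

def fit_t_half(times, data, candidates):
    """Return the candidate residence time with the smallest squared error."""
    def sse(t_half):
        return sum((m - d) ** 2 for m, d in zip(model_ratio(t_half, times), data))
    return min(candidates, key=sse)

# Synthetic check: data generated with t1/2 = 5 min at hypothetical time points.
times = [10, 20, 25, 30, 40, 60, 90, 120]
data = model_ratio(5.0, times)
best = fit_t_half(times, data, [1.3, 2, 3, 5, 8, 13, 21, 34, 53])
```

With real data, the residuals of each fit would also need to be checked against a noise criterion before a residence time is accepted.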

TBP-chromatin residence times ranging from 1.3 to 53 minutes estimated from normalized, scaled competition ChIP data

Stratifying TBP-containing promoters in 2-minute bands of t0, we showed that the average normalized and scaled ratio of competitor over endogenous signals as a function of induction time showed a progressively slower rise as t0 increased (i.e., moved to the right) (Fig. 3b), with corresponding residence times increasing from 1.3 to 53 minutes (Fig. 3c, Supplementary Text Sec. 6), showing that residence times could be estimated from the ratio. Indeed, given that fitting the Hill-like equation and the chemical kinetic equations should yield highly correlated results, we found a smooth relationship between t1/2 and t0 (as mentioned earlier, t0 is crudely the sum of the induction time-scale and t1/2) down to the point, t0 ≈ 24.5 min, below which numerically fitting the chemical kinetic equations became unstable (Fig. 3d and Supplementary Fig. S4a–d). This numerical instability was due to the fact that for promoters with t0 < 24.5 min, the separation between the normalized, scaled data and the induction curve was well within the noise of the competition ChIP data. For t0 > 24.5 min, the normalized, scaled data yielded excellent fits to the chemical kinetic equations, the data moved progressively to the right with increasing residence time and, remarkably, allowed residence times as short as 1.3 minutes (Fig. 3e–g, Supplementary Fig. S4e–h, and Supplementary Table S1) and longer (Supplementary Fig. S5) to be estimated. So how were we able to determine residence times as short as 1.3 minutes? The shortest residence time that could be reliably estimated was determined by the noise in the induction and competition ChIP data and not by the induction time of the competitor. As soon as a reliable, robust separation (i.e., beyond their relative error or noise), driven by increasing residence times, existed between the induction and competition ChIP ratio curves (i.e., corresponding to t1/2 ~ 1.3 min), relatively reliable residence times could be estimated. The distribution of genome-wide TBP t1/2 values (Supplementary Fig. S7a) reveals highly dynamic TBP, with the majority of residence times below 5 minutes across the sites where reliable estimates could be made. Notably, a comparison of competition-ChIP-derived off-rates with those determined at select loci using the CLK method7 (Supplementary Table S2 in Supplementary Text Sec. 9) shows that the off-rates are in qualitative agreement (i.e., relatively rapid TBP dynamics).

Multiple TBP-chromatin binding events are associated with synthesis of one nascent RNA molecule at Pol II genes

Earlier estimates of relative TBP turnover, r, for 602 Pol II and 264 Pol III genes were obtained using linear regression to a subset of the data (i.e., 10, 20, 25 and 30 min time points)16. Because a physical model of competitive binding rooted in reaction-rate theory naturally follows the profiles of the normalized and scaled data as a function of induction time (as opposed to a linear fit), we were able to apply stringent noise criteria on the residuals of each fit (Supplementary Text Sec. 5 and 8) and reliably estimate TBP residence times for 794 Pol II and 205 Pol III genes (Supplementary Table S1). Given the quasi-linear relationship between t1/2 and t0, we calculated the percent error in t0 (100 times the standard error in t0 divided by t0) genome-wide (Supplementary Fig. S7b), which reflects the associated percent error in t1/2. The median percent error in t0 was 6.9%, in accord with the stringent noise criteria applied to each fit. While r and our estimates of kd are correlated (Supplementary Fig. S6a), r is also strongly correlated with the t = 0 ratio of induced over endogenous ChIP signals (Supplementary Fig. S6b), which suggests that insufficient background subtraction influenced the estimates of r. Nevertheless, in agreement with estimates of r made by van Werven et al.16 as well as the competition ChIP and AA results of Grimaldi et al.9, we found that TBP residence times were notably shorter for Pol II compared to Pol III genes (Fig. 4a) and to a lesser extent for TATA compared to TATA-less genes23 (Fig. 4b). While the presence of a strong TATA box affected TBP residence times, TBP residence times were not correlated with the AT content of the TBP binding sites (Supplementary Fig. S11g). In contrast to van Werven et al.16 but consistent with Grimaldi et al.9, we found no significant differences in TBP residence times between SAGA-containing and SAGA-free genes (Supplementary Fig. S8d) or between TFIID-containing and TFIID-free genes (Supplementary Fig. S8g). Given that Pol III genes tend to be more highly expressed24 and have longer TBP residence times than Pol II genes, we were surprised to find marginally shorter TBP residence times at highly expressed ribosomal protein (RP) genes compared to other genes (Fig. 4c, Supplementary Fig. S8j). This finding was consistent with modestly higher nascent RNA transcription rates (TRs)18 for shorter TBP residence times at Pol II genes (Fig. 4d). Shorter residence times were also associated with slightly but significantly higher levels of extrinsic transcriptional noise25 (Fig. 4e), consistent with recent findings26. Notably, this result remained significant even after applying a stringent 15% or lower percent error cutoff on t0 (Supplementary Fig. S7d). With estimates of TR and TBP t1/2, we defined transcriptional efficiency as the product of the transcription rate and TBP residence time (TRt1/2), whose inverse represents the number of TBP residence times or binding events associated with productive elongation of Pol II and transcription. Strikingly, we found low transcriptional efficiencies for Pol II genes (Fig. 4f). The median TRt1/2 across Pol II promoters was 0.2 molecules, or ~5 TBP binding events for productive RNA synthesis to proceed (Fig. 4f). This is consistent with an upper limit for this value for most Pol II genes (i.e., TRt1/2 ≤ 1 molecule) determined by the likely TBP-chromatin residence time from AA experiments and characteristic values of transcription rate across the yeast genome9. These findings are consistent with rapid, highly stochastic TBP/PIC dynamics at Pol II genes with multiple rounds of assembly and disassembly before productive Pol II elongation. Surprisingly, higher TBP turnover was associated with modestly higher levels of Pol II gene transcription. While we do not have nascent RNA data for Pol III genes, these genes tend to be much more highly expressed than Pol II genes; yet TBP residence times tended to be ~10 minutes (Fig. 4a), suggesting much more stable PIC formation27 and function for Pol III genes.
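The arithmetic behind the "~5 binding events" figure can be made explicit with a small sketch. The function names and the example inputs (a transcription rate of 0.1 molecules per minute and a residence time of 2 min) are hypothetical and chosen only so that their product equals the reported median efficiency of 0.2 molecules.

```python
def transcriptional_efficiency(tr_per_min, t_half_min):
    """Product TR * t1/2: RNA molecules made per TBP residence time."""
    return tr_per_min * t_half_min

def binding_events_per_rna(tr_per_min, t_half_min):
    """Inverse of the efficiency: TBP binding events per RNA molecule."""
    return 1.0 / transcriptional_efficiency(tr_per_min, t_half_min)

# Hypothetical promoter: 0.1 molecules/min and a 2 min residence time.
efficiency = transcriptional_efficiency(0.1, 2.0)  # 0.2 molecules
events = binding_events_per_rna(0.1, 2.0)          # about 5 binding events
```

An efficiency below 1 therefore means that, on average, more than one TBP binding event elapses per RNA molecule produced.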

Figure 4: Multiple, minute-scale TBP-chromatin binding events are associated with transcription at Pol II genes.

(a) Normalized density of TBP residence time (on log2 scale) for Pol II and Pol III promoters, which yielded a median Pol II TBP residence time (t1/2) of 3 min and a median for Pol III genes of 9 min. The difference between the two distributions is significant with a Kolmogorov-Smirnov (KS) p-value = 2.2e-16. (b) Normalized density of TBP t1/2 (on log2 scale) for TATA-containing versus TATA-less promoters. TATA-containing promoters have overall shorter residence times than TATA-less promoters (KS p-value = 0.0075). (c) Ribosomal protein (RP) genes have marginally shorter TBP residence times compared to non-RP genes (median RP t1/2 = 1.4 min and median non-RP t1/2 = 1.6 min; KS p-value = 0.25). (d) Promoters in the highest quartile of transcription rate (TR) tend to have shorter TBP t1/2 than promoters in the lowest quartile (KS p-value = 0.005). (e) Promoters with higher extrinsic transcriptional noise25 have lower TBP residence time (KS p-value = 0.048). (f) Normalized density of transcription efficiency (defined as the transcription rate multiplied by residence time, TRt1/2) showing that the median transcriptional efficiency is 0.21 molecules. In other words, for a representative Pol II promoter, ~5 TBP turnovers are required before a single molecule of RNA is successfully transcribed (inverse of transcriptional efficiency).

TBP-chromatin residence time is correlated with relative Rap1 residence time but not with +1 nucleosome residence time or nascent RNA transcription rate

To gain further insights into the upstream regulation and/or downstream impact of TBP-chromatin binding dynamics especially on regulation of gene expression, we compared TBP residence times (t1/2) to the only other regulatory factors whose dynamics have been characterized on a genomic scale (in yeast): previously derived Rap115 and nucleosome14 relative turnover rates (λ) and their inverse turnover rates (λ−1) or relative residence times. Notably, we showed that the relative turnover (λ), derived using a Poisson statistical turnover model14,15, equals the off-rate (kd) plus a time-dependent function (Supplementary Eqn. 28, Supplementary Fig. S9) and can be moderately biased. More importantly, the relative turnover rates are excessively biased because normalized ChIP ratios were not scaled to ratios of fractional occupancies before model fitting15 as described above (Fig. 2, Supplementary Text Sec. 15 and Supplementary Fig. S10). In other words, fitting a model of the ratio of occupancies to un-scaled data (Fig. 2b) as opposed to properly scaled data (Fig. 2c) yields significantly biased (i.e., 30-fold or greater) estimates of Rap1 residence time (Supplementary Text Sec. 15 and Supplementary Fig. S10). Nevertheless, we found TBP residence time (t1/2) was correlated with Rap1 relative residence time (λ−1) at non-RP Pol II genes but not at RP Pol II genes (Fig. 5a). TBP residence time showed weak negative correlation with Pol II transcription rate (corr = −0.11; Supplementary Fig. S11a). Rap1 relative residence time (λ−1) showed slight positive correlation (Fig. 5b) with transcription rate at non-RP genes, while transcriptional efficiency was modestly correlated with Rap1 relative residence time at non-RP Pol II genes (Fig. 5c). Interestingly, the majority of the sites for which Rap1 relative residence times have been determined (ranging from 30–150 min) exhibit highly dynamic TBP (t1/2 < 1.3 min or t0 < 24.5 min; Supplementary Fig. S11b). 
This further supports our findings that Rap1 relative residence times15 are 20 to 30 fold higher (or more) than, but likely correlated with, actual Rap1 residence times15 (Supplementary Text Sec. 15). While +1 nucleosome dynamics were poorly correlated with TBP residence time (Fig. 5d, Supplementary Fig. S11c,d), they were positively correlated with transcription rate (Fig. 5e, Supplementary Fig. S11e) and efficiency (Fig. 5f, Supplementary Fig. S11f). These results suggest that while the dynamics and not merely the presence (Supplementary Fig. S8m) of transcription factors like Rap1 regulate TBP/PIC dynamics, TBP and Rap1 recruitment and dynamics are not the rate-limiting step in transcription at Pol II genes. Conversely, the dynamics of factors that play a role in regulating elongation including +1 nucleosome turnover19,20,21 may play more critical roles in determining the transcription rate and efficiency.

Figure 5: TBP dynamics are correlated with Rap1 but not +1 nucleosome dynamics.

(a–c) log2-log2 scatterplot of Rap1 relative residence time (λ−1) versus (a) TBP residence time (t1/2), (b) transcription rate (TR), and (c) transcription efficiency (TRt1/2) for ribosomal protein (RP) genes in blue and non-RP genes in red. Rap1 λ−1 correlated well with TBP t1/2 and TRt1/2 at non-RP genes, but not at RP genes. λ−1 was mildly correlated with TR at non-RP genes. (d–f) Normalized density of (d) TBP residence time (t1/2), (e) transcription rate (TR), and (f) transcription efficiency (TRt1/2) at genes containing hot and cold +1 nucleosomes. Hot nucleosomes were in the top quartile of nucleosome turnover and cold nucleosomes were in the bottom quartile (see Supplementary Text Sec. 16). There is no difference in TBP t1/2 between hot and cold nucleosomes (KS p-value = 0.50) (d), but hot nucleosomes tend to have higher TR (KS p-value = 0.007) (e) and higher TRt1/2 (KS p-value = 1.3e-7) (f).

Occupancy of multiple elongation and initiation complexes at promoters tends to increase transcription efficiency and rate but does not affect TBP-chromatin residence time

To further assess the hypothesis that transcription factors associated with elongation as opposed to PIC and Pol II recruitment or initiation are the rate-limiting step in transcription, we tested the effect that the presence or absence of 202 transcription factors mapped to the yeast genome22 had on TBP residence time, transcription rate and transcription efficiency. We subdivided loci for which we had estimates of TBP residence time into quartiles of the number of bound transcription, initiation, and elongation factors based on the classification by Venters et al.22. As expected, the presence of greater numbers of transcription, initiation and elongation factors at promoters had no significant impact on TBP residence times (Fig. 6a–c) but yielded higher transcription rates (Fig. 6d–f) and efficiencies (Fig. 6g–i). Strikingly, the presence of more elongation factors had a much greater impact on both transcription rate (Fig. 6f) and efficiency (Fig. 6i) compared to that of initiation factors (Fig. 6e,h), consistent with our hypothesis.
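The quartile comparison described above can be sketched in a few lines of Python. This is an illustration only, not the analysis code used in the study: the function names `quartile_groups` and `ks_two_sample` and the toy numbers are hypothetical, ties in the factor counts are broken arbitrarily by sort order, and the p-value uses the standard asymptotic Kolmogorov series rather than whatever exact or approximate method a statistics package would choose.

```python
import math


def quartile_groups(counts, values):
    """Return `values` for loci in the bottom and top quartile of `counts`.

    `counts[i]` is the number of bound factors at locus i and `values[i]`
    is the quantity compared between groups (e.g. a residence time).
    """
    q = len(counts) // 4
    if q == 0:
        raise ValueError("need at least four loci to form quartiles")
    order = sorted(range(len(counts)), key=lambda i: counts[i])
    lower = [values[i] for i in order[:q]]
    upper = [values[i] for i in order[-q:]]
    return lower, upper


def ks_two_sample(x, y):
    """Two-sample KS statistic D and an asymptotic two-sided p-value."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(xs[i], ys[j])
        while i < n and xs[i] == v:   # step past ties in x
            i += 1
        while j < m and ys[j] == v:   # step past ties in y
            j += 1
        d = max(d, abs(i / n - j / m))
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:                     # series is unreliable here; p is ~1
        return d, 1.0
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


# Toy data: eight loci, factor counts 0..7, made-up values per locus.
toy_counts = list(range(8))
toy_values = [10.0 + i for i in range(8)]
low, high = quartile_groups(toy_counts, toy_values)
d_stat, p_value = ks_two_sample(low, high)
```

With real data, `values` would be log2 residence times, transcription rates or efficiencies, and `counts` the number of bound factors of a given class at each promoter.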

Figure 6: High numbers of elongation factors at Pol II promoters are associated with higher transcription rates and efficiencies.

(a–c) Normalized density of TBP residence time (t1/2) on the log2 scale for genes with the upper quartile numbers of bound transcription factors (TFs) and genes with the lower quartile numbers of bound TFs (out of 202 mapped TFs in Venters et al.22) showing that t1/2 is not modulated by (a) the number of total TFs, (b) initiation TFs, or (c) elongation TFs. The elongation and initiation TFs were annotated as in Venters et al.22. (d–f) Normalized density of transcription rate (TR) on the log2 scale for genes with the upper quartile numbers of bound TFs and genes with the lower quartile numbers of bound TFs showing that TR is modulated by (d) the number of total TFs (KS p-value = 8.6e-5), (e) initiation TFs (KS p-value = 9.8e-4), and (f) elongation TFs (KS p-value = 3.14e-11). (g–i) Normalized density of transcription efficiency (TRt1/2) on the log2 scale for genes with the upper quartile numbers of bound TFs and genes with the lower quartile numbers of bound TFs showing that TRt1/2 is significantly modulated by (g) the number of total TFs (KS p-value = 2.4e-4), (h) initiation TFs (KS p-value = 0.05), and (i) elongation TFs (KS p-value = 8.5e-8).

For each of the 202 factors, we also conducted permutation tests to estimate the significance of differences in TBP residence times, transcription rates and efficiencies at sites with the factor present compared to sites with that factor absent. We found only one factor, Tfa2 (a TFIIE subunit), whose presence yielded significantly shorter TBP residence times compared to its absence (Supplementary Fig. S12a). Given that TFIIE (together with TFIIH) recruitment leads to a complete PIC, which then requires ATP for formation of the transcription bubble and subsequent Pol II elongation28, higher occupancy of TFIIE could lead to more rapid rates of Pol II elongation and PIC disassembly. This could explain shorter TBP residence times for promoters with higher levels of TFIIE. In partial agreement with this, presence of Tfa2 at promoters modestly increased transcription rate (Supplementary Fig. S12b) but had no significant effect on efficiency (Supplementary Fig. S12c). In contrast, we found that 46% and 50% of all the initiation and elongation factors mapped, respectively, significantly modulated transcription rate and efficiency (Supplementary Fig. S12d,e and Supplementary Text Sec. 17). Not surprisingly, many of these factors were members of initiation and elongation complexes whose enrichment at promoters led to both increased transcription rate and efficiency (Supplementary Fig. S12f).

Discussion

We developed and applied a physical model of competitive binding using chemical kinetic theory to TBP competition ChIP-chip data and derived TBP-chromatin residence times genome-wide in yeast. While competition ChIP was believed to be a low time resolution approach given the 60–70 minutes that it takes to induce the competitor to a concentration approaching steady state levels, we found that we could reliably extract residence times as short as 1.3 minutes. Consistent with live cell imaging13, CLK7, and AA9 results, many promoters displayed highly dynamic TBP with residence times less than 1.3 minutes, which could not be accurately estimated (Supplementary Text Sec. 8).

In order to derive the physical residence times at relatively high time resolution (i.e., few minutes) and obtain biologically meaningful results, we learned a number of critical lessons. First, normalized ChIP-chip or ChIP-seq data must be scaled to the relevant in vivo occupancy variable in order to fit the associated kinetic theory using these occupancy variables. Second, this scaling requires quantifying soluble competitor and endogenous TF levels (as in Fig. 3a) as well as competition ChIP signal at “late” time points that enable steady state competitor TF induction levels to be accurately estimated. Third, increasing the precision of both the competitor induction curve and competition ChIP signal by way of either careful measurements or many replicate measurements and averaging increases the time-resolution of competition ChIP. Fourth, as noted in the Results section above and detailed below, we found that comparing the dynamics of one TF, Rap1, as opposed to static snapshots of occupancy (presence versus absence) of TFs like Rap1 (and 200 other TFs) yielded significant associations with the dynamics of TBP-chromatin binding. In addition, any significant albeit modest associations of TBP dynamics with static occupancy data (e.g., Tfa2), could indicate that the dynamic coupling between TBP and Tfa2, for example, could be strong, pointing to the necessity of measuring TF-chromatin dynamics for many more factors to gain mechanistic insights into the regulation of transcription.

Comparison of reliable TBP-chromatin residence times, which ranged from 1.3 minutes to 53 minutes, across different promoter classes revealed highly dynamic TBP at Pol II genes and less dynamic TBP at Pol III genes, similar to previous studies using competition ChIP9,16 and AA9. In contrast to the findings of van Werven et al.16, we did not find that the occupancy of SAGA or TFIID at promoters significantly modulated TBP residence time, consistent with an independent study applying both competition ChIP and AA at select loci9. We did find a significant but modest decrease in TBP residence time at TATA-containing compared to TATA-less promoters, in agreement with van Werven et al.16. We also found that the TBP relative turnover parameter (r) derived by van Werven et al.16 was biased by the HA/Avi ratio at the start of induction, with higher HA/Avi ratios yielding lower relative turnover values (Supplementary Fig. S6b). This could explain the discrepancy between our results and those of van Werven et al.16.

We also assessed the effect that the occupancies of 202 mapped TFs22 had on TBP residence time, transcription rate and transcription efficiency. We only found that the presence of one factor, Tfa2 (a subunit of TFIIE), significantly modulated TBP residence time: the presence of Tfa2 at promoters by ChIP-chip analysis22 was associated with shorter TBP residence times (Supplementary Fig. S12a). Notably, the presence of the other TFIIE subunit, Tfa1, did not have an effect on TBP residence time. Based on the analyses of Venters et al.22, Tfa1 was present at most promoters (4350 sites)—nearly twice as many as Tfa2 (2605 sites). Thus, Tfa2 site enrichment may be a surrogate for overall TFIIE enrichment at promoters. Conversely, we found that the presence of a number of factors classified as “access”, “orchestration”, “initiation” and “elongation” by Venters et al.22 significantly affected—mostly increasing—transcription rate and efficiency (Supplementary Fig. S12), with the presence of multiple factors annotated as “elongation” associated with notably higher transcription rates and efficiencies than those annotated as “initiation” (Fig. 6e,f,h,i). We note that an important caveat to these conclusions is that while these annotations are useful and may indicate a predominant role for a number of these factors, many, for example FACT, play multiple roles including both “initiation” and “elongation”19.

While the presence or absence of Rap1 did not have a significant effect on TBP residence time, Rap1 relative residence time15 (i.e., inverse turnover rate) was correlated with TBP residence time. This suggests the possibility of a number of unknown dynamic relationships between regulatory factors that require characterization of the dynamics as opposed to static snapshots of relative occupancy determined by ChIP-seq or ChIP-chip. We also found that Rap1 residence times were likely much shorter than previously reported15 and likely similar to TBP residence times, consistent with findings that Rap1 activates transcription by interacting directly with the TBP-containing TFIID complex9,29. Neither Rap1 relative residence time nor TBP residence time was correlated with nascent RNA transcription rate or +1 nucleosome inverse turnover. However, +1 nucleosome turnover rate was positively correlated with transcription rate and efficiency. Moreover, in agreement with the conclusion of Grimaldi et al.9 that at least one round of PIC assembly is required for Pol II recruitment and elongation at most Pol II genes, we found a median value of ~5 TBP residence times associated with one productive elongation of Pol II across Pol II genes (i.e., median transcription efficiency, TRt1/2, of 0.2 molecules), suggesting multiple PIC assembly and disassembly events before synthesis of one RNA molecule at Pol II genes. Taken together, these findings suggest increased dynamic coupling of TFs and GTFs at similar stages of PIC assembly, Pol II recruitment and elongation, and transcription; the dynamics of factors that are more involved in the later stages of transcription initiation including Pol II elongation (e.g., +1 nucleosome19,20,21) are likely better dynamically correlated with transcription rate.
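The arithmetic linking the two quoted medians can be written out explicitly. This assumes, as the notation TRt1/2 suggests, that transcription efficiency is the product of transcription rate and TBP residence time, i.e., the number of RNA molecules produced per TBP residence time. The symbol N is introduced here only for illustration.

```latex
N_{\text{residence times per RNA}}
  \approx \frac{1}{\mathrm{TR}\cdot t_{1/2}}
  = \frac{1}{0.2}
  = 5
```

Because taking the reciprocal preserves rank order, the median of the reciprocal equals the reciprocal of the median, so a median efficiency of 0.2 molecules corresponds to a median of about 5 residence times per transcript.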
Our study highlights the importance of developing methods that estimate TF-chromatin dynamic parameters including residence time and the resulting insights that can be gained into the inherently dynamic and stochastic process of transcription. These approaches and measurements should ultimately allow the stochastic processes of pre-initiation complex formation, Pol II recruitment and elongation, and transcription to be characterized quantitatively.

Methods

Background subtraction, normalization and scaling of competition ChIP data

The raw data generated by van Werven et al.16 (ArrayExpress E-M-TAB-58) reported the optical signal intensity for induced (SHA) and endogenous (SAvi) TBP hybridized on an Agilent whole-genome microarray. SHA and SAvi were replicated by swapping Cy3 and Cy5 dyes to account for dye-specific variations in the intensity of the optical signal. We took the geometric mean of the two dye-swapped ratios (denoted Rm), as described in Supplementary Text Sec. 1. Non-specific background probes were identified by fitting a normal curve to the right edge of the t = 0 minute log2(Rm) data as shown in Supplementary Fig. S1. We selected signal probes in the tail of the normal fit to the non-specific background with a false discovery rate (FDR) of 0.05 or less in the t = 0 minute data. Rm values were normalized at each time point, t, by dividing Rm by the background mean obtained from the normal fit to the background probes (Supplementary Fig. S1); we refer to the result as the normalized ratio. To quantify the induction of HA over time, we fitted a Hill-like sigmoid curve with n = 4 to the ratio of the in vivo concentration of HA to that of Avi, $C_B^i(t)/C_A^i$, where A and B denote Avi and HA, respectively, and the superscript i denotes “in vivo”. The fit gave an induction time of 22 minutes and a saturation value of the HA/Avi concentration ratio of 2.23 (Supplementary Eqn. 3, and Fig. 3a). We theoretically related the normalized ratio for the signal probes in our data to the ratio of the in vivo fractional occupancies of HA and Avi through two parameters: B, the locus-specific differential background between HA and Avi at t = 0 minutes, and α, a scale factor which effectively quantifies the ratio of the antibody affinities for HA and Avi (Supplementary Text Sec. 2). To determine α and B at every TBP peak, a Hill-like sigmoid curve (with n = 4) with the added term B was fitted to the normalized ratio for each peak (Supplementary Eqn. 24). B was subtracted from the normalized ratio, and α was determined as the asymptotic in vivo concentration ratio of HA/Avi (i.e., 2.23) divided by the asymptotic value of the background-subtracted normalized ratio. Hence, after scaling and background subtraction, the ratio satisfied the two boundary conditions required by the kinetic model of in vivo competitive binding: it was zero at t = 0 and approached the saturation value of the HA/Avi concentration ratio as t → ∞.
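The background subtraction and scaling step can be illustrated with a short Python sketch. It is not the Mathematica code used in the study. The exact functional form of the Hill-like curve is given in the Supplementary material, so the form used in `hill_induction` below is an assumption, as are the function names and the synthetic trace; only n = 4, the 22 minute induction time and the saturation ratio of 2.23 are taken from the text.

```python
def hill_induction(t, c_sat=2.23, t_ind=22.0, n=4):
    """Assumed Hill-like sigmoid for the in vivo HA/Avi concentration ratio."""
    if t <= 0:
        return 0.0
    return c_sat * t**n / (t_ind**n + t**n)


def scale_ratio(r_norm, background, r_asymptote, c_sat=2.23):
    """Subtract the locus-specific background B, then rescale by alpha.

    alpha is the saturation concentration ratio divided by the asymptote of
    the background-subtracted signal, so the scaled trace starts at 0 and
    tends to c_sat at late times.
    """
    alpha = c_sat / (r_asymptote - background)
    return [alpha * (r - background) for r in r_norm]


# Synthetic normalized ratio for one hypothetical peak: a background offset
# plus a sigmoid rise with an arbitrary amplitude (both values made up).
times = [0, 10, 20, 25, 30, 40, 60, 90, 120]          # minutes
B_TRUE, AMPLITUDE = 1.1, 3.0
r_norm = [B_TRUE + AMPLITUDE * hill_induction(t) / 2.23 for t in times]

scaled = scale_ratio(r_norm, background=B_TRUE,
                     r_asymptote=B_TRUE + AMPLITUDE)
```

In practice the background and asymptote would come from a nonlinear fit of the sigmoid plus B to each peak rather than being known in advance, as they are in this toy example.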

Estimation of residence time by fitting a chemical kinetic theory model of competitive binding to normalized, scaled competition ChIP data

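The model for in vivo competitive binding dynamics between endogenous Avi (subscript A) and competitor HA (subscript B) TBP is described by mass-action differential equations linear in the TBP-chromatin association rate $k_a$ and dissociation rate $k_d$:

$$\frac{d\theta_A}{dt} = k_a\,(1 - \theta_A - \theta_B) - k_d\,\theta_A \qquad (1)$$

$$\frac{d\theta_B}{dt} = k_a\,\frac{C_B^i(t)}{C_A^i}\,(1 - \theta_A - \theta_B) - k_d\,\theta_B \qquad (2)$$

Here $\theta_A$ and $\theta_B$ are the in vivo fractional occupancies of Avi and HA TBP at a locus, and $C_B^i(t)/C_A^i$ is the in vivo HA/Avi concentration ratio. In the above equations, we have assumed that the association and dissociation rates for endogenous and competitor TBP are the same, and we have absorbed the experimentally undetermined endogenous concentration $C_A^i$ into $k_a$, such that $k_a$ and $k_d$ have units of inverse minutes (Supplementary Text Sec. 2). Equations (1) and (2) could not be solved analytically due to the time dependence of $C_B^i(t)$, but an approximate solution could be derived assuming ideal induction, i.e., that the induction of HA was instantaneous: the competitor concentration is zero before induction and constant afterwards. Inserting the actual time-dependent $C_B^i(t)$ in the ideal induction solution gave an approximate solution to Equations (1) and (2) (Supplementary Eqns. 19 and 20).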
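A minimal numerical sketch of a competitive-binding model of this kind is given below. It is written in Python with a hand-rolled fourth-order Runge-Kutta integrator; it is not the authors' implementation. The specific right-hand side (two species competing for one site, with identical rates and the endogenous concentration absorbed into the association rate), the Hill-like induction curve, the starting condition (endogenous TBP at its competitor-free steady state) and all parameter values are assumptions made for illustration.

```python
def hill_induction(t, c_sat=2.23, t_ind=22.0, n=4):
    """Assumed Hill-like HA/Avi concentration ratio as a function of time."""
    return 0.0 if t <= 0 else c_sat * t**n / (t_ind**n + t**n)


def rhs(t, theta_a, theta_b, k_a, k_d):
    """Mass-action rates of change of the Avi and HA fractional occupancies."""
    free = 1.0 - theta_a - theta_b
    return (k_a * free - k_d * theta_a,
            k_a * hill_induction(t) * free - k_d * theta_b)


def rk4_step(t, a, b, h, k_a, k_d):
    """One classical Runge-Kutta step of size h."""
    k1a, k1b = rhs(t, a, b, k_a, k_d)
    k2a, k2b = rhs(t + h / 2, a + h / 2 * k1a, b + h / 2 * k1b, k_a, k_d)
    k3a, k3b = rhs(t + h / 2, a + h / 2 * k2a, b + h / 2 * k2b, k_a, k_d)
    k4a, k4b = rhs(t + h, a + h * k3a, b + h * k3b, k_a, k_d)
    return (a + h / 6 * (k1a + 2 * k2a + 2 * k3a + k4a),
            b + h / 6 * (k1b + 2 * k2b + 2 * k3b + k4b))


def simulate_ratio(k_a, k_d, times, dt=0.05):
    """Return theta_B/theta_A at each time in `times` (ascending, minutes)."""
    a, b = k_a / (k_a + k_d), 0.0   # assumed start: Avi-only steady state
    t = 0.0
    ratios = []
    for target in times:
        n_steps = max(1, round((target - t) / dt)) if target > t else 0
        for _ in range(n_steps):
            h = (target - t) / n_steps
            a, b = rk4_step(t, a, b, h, k_a, k_d)
            n_steps -= 1
            t = target if n_steps == 0 else t + h
        ratios.append(b / a)
    return ratios


times = [0, 10, 20, 30, 60, 120, 600]               # minutes
fast = simulate_ratio(k_a=1.0, k_d=0.5, times=times)   # short residence time
slow = simulate_ratio(k_a=1.0, k_d=0.02, times=times)  # long residence time
```

In this sketch a smaller dissociation rate makes the occupancy ratio lag further behind the induction curve, which is the signal that a fit to competition ChIP time courses would exploit.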

We fitted the analytical solution of ideal induction to the normalized, scaled ratio data by developing a procedure for estimating the starting values for nonlinear regression (Supplementary Text Sec. 5.1). The algorithm was implemented in Mathematica and the NonlinearModelFit function was used for fitting. The ratio is almost insensitive to the association rate $k_a$ (Supplementary Text Sec. 3), and hence we could reliably extract only the dissociation rate $k_d$. The ideal solution introduced a bias in our estimate of $k_d$, which was expected since the ideal solution was an approximate solution to Equations (1) and (2). We corrected this bias using a pre-generated look-up table (Supplementary Text Sec. 5.2, Supplementary Fig. S2). Finally, we used our bias-corrected estimates from the look-up table as the starting point for a numerical one-dimensional Newton’s method fit of Equations (1) and (2) to find the minimum of the fit residual and extract $k_d$ (Supplementary Text Sec. 5.3). To calculate the derivative of the fit residual required at each iteration of Newton’s method, we numerically solved the in vivo differential equations using NDSolve in Mathematica. Exceptions to the fitting procedure where we had to change the starting estimate of $k_d$ or the step size for Newton’s method are noted in Supplementary Text Sec. 6.
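The final refinement step can be sketched as a one-dimensional Newton iteration on the sum of squared residuals. The Python code below is a hedged illustration, not a port of the Mathematica procedure: the model equations, the Hill-like induction curve, the forward-Euler integrator, the finite-difference derivatives, the step safeguards and the synthetic noise-free data are all assumptions, and the starting value simply stands in for a bias-corrected look-up estimate.

```python
import math


def hill_induction(t, c_sat=2.23, t_ind=22.0, n=4):
    """Assumed Hill-like HA/Avi concentration ratio."""
    return 0.0 if t <= 0 else c_sat * t**n / (t_ind**n + t**n)


def model_ratio(k_d, times, k_a=1.0, dt=0.02):
    """Forward-Euler solution of an assumed competitive-binding model.

    Returns theta_B/theta_A at each requested time (minutes).
    """
    theta_a, theta_b = k_a / (k_a + k_d), 0.0   # Avi-only steady state
    wanted = {int(round(t / dt)): j for j, t in enumerate(times)}
    out = [0.0] * len(times)
    for step in range(max(wanted) + 1):
        if step in wanted:
            out[wanted[step]] = theta_b / theta_a
        free = 1.0 - theta_a - theta_b
        c = hill_induction(step * dt)
        theta_a, theta_b = (theta_a + dt * (k_a * free - k_d * theta_a),
                            theta_b + dt * (k_a * c * free - k_d * theta_b))
    return out


def sum_sq(k_d, times, data):
    """Sum of squared residuals between model and data."""
    return sum((m - d) ** 2 for m, d in zip(model_ratio(k_d, times), data))


def newton_fit(k_start, times, data, rel_h=1e-3, max_iter=30, tol=1e-8):
    """Minimize sum_sq over k_d with Newton steps and numerical derivatives."""
    k = k_start
    for _ in range(max_iter):
        h = rel_h * k
        s0, s_plus, s_minus = (sum_sq(x, times, data)
                               for x in (k, k + h, k - h))
        grad = (s_plus - s_minus) / (2 * h)
        curv = (s_plus - 2 * s0 + s_minus) / h**2
        if curv > 0:
            step = -grad / curv
        else:                        # safeguard: small move downhill
            step = -0.1 * k if grad > 0 else 0.1 * k
        step = max(-0.5 * k, min(0.5 * k, step))   # keep k positive
        k += step
        if abs(step) < tol:
            break
    return k


K_TRUE = 0.05                                      # per minute, made up
times = [10, 20, 30, 40, 60, 90, 120]              # minutes
data = model_ratio(K_TRUE, times)                  # noise-free synthetic data
k_fit = newton_fit(k_start=0.06, times=times, data=data)
t_half = math.log(2) / k_fit                       # assumed half-life relation
```

The conversion of the fitted rate to a half-life uses the usual first-order relation, ln 2 divided by the dissociation rate, which is an assumption about how the residence time is defined here. With noisy data the residual surface need not be convex far from the minimum, which is why the sketch clamps the step and falls back to a small downhill move when the curvature estimate is not positive.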

Statistical analyses of residence time, transcription rate and transcription efficiency data

Throughout the main text and the supplement, quoted correlations are Spearman correlation coefficients unless otherwise stated. Kolmogorov-Smirnov (KS) tests were conducted in R using the ks.test function to determine the p-values reported in Figs 4, 5, and 6 and Supplementary Fig. S12a–c. For Supplementary Fig. S12d, a permutation test (which is useful in particular when the test statistic does not follow a normal distribution) was used to calculate the false discovery rate (FDR) for t1/2, TR, and TRt1/2. Specifically, loci across the genome were partitioned into two sets for each transcription factor: those that showed a significant enrichment of the transcription factor above the background as determined by Venters et al.22 and those that did not. These two sets were used to conduct a permutation test on the t1/2, TR, or TRt1/2 test statistic using permTS in the perm library in R, which gave the mean difference of the test statistic between the two sets along with the p-value for the mean difference. The p-values were adjusted with the Benjamini-Hochberg correction30 using the p.adjust function in R to derive FDR estimates. In Supplementary Fig. S12d the FDR for TRt1/2 was plotted against the FDR for TR, and transcription factors were listed in descending order of TRt1/2 mean differences. The blue dots (representing TFs that affect TR more significantly than TRt1/2) were chosen with a TR FDR < 0.06 and TRt1/2 FDR > 0.1. Red dots (representing TFs that were significant in permutation tests for both TR and TRt1/2) were chosen with TR FDR < 0.1 and TRt1/2 FDR < 0.1. Finally, black dots (representing TFs that potentially affect TRt1/2 more than TR) were chosen with TR FDR > 0.1 and TRt1/2 FDR < 0.1, or TR FDR > 0.45 and TRt1/2 FDR < 0.3.
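The present-versus-absent comparison and the FDR adjustment can be sketched in Python as follows. The analysis in the paper was done in R; this sketch is a Monte Carlo permutation test on the difference of means followed by a Benjamini-Hochberg adjustment, and it is not guaranteed to match the method that permTS selects internally. The function names and toy data are hypothetical.

```python
import random
from statistics import fmean


def perm_test_mean_diff(present, absent, n_perm=2000, seed=0):
    """Two-sided Monte Carlo permutation test on the difference of means."""
    rng = random.Random(seed)
    observed = fmean(present) - fmean(absent)
    pooled = list(present) + list(absent)
    n = len(present)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = fmean(pooled[:n]) - fmean(pooled[n:])
        if abs(diff) >= abs(observed):
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    adjusted = [0.0] * m
    running = 1.0
    for offset, i in enumerate(order):
        rank = m - offset                  # rank of pvals[i], 1 = smallest
        running = min(running, pvals[i] * m / rank)
        adjusted[i] = running
    return adjusted


# Toy example: values at loci with a factor present versus absent.
with_factor = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
without_factor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
mean_diff, p_value = perm_test_mean_diff(with_factor, without_factor)

# One p-value per factor would be collected and adjusted together.
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each factor contributes one p-value per quantity tested, and the adjustment is applied across factors so that the resulting values can be thresholded as FDR estimates.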

Additional Information

How to cite this article: Zaidi, H. A. et al. RNA synthesis is associated with multiple TBP-chromatin binding events. Sci. Rep. 7, 39631; doi: 10.1038/srep39631 (2017).
