Transcription regulates the spatio-temporal dynamics of genes through micro-compartmentalization

Although our understanding of the involvement of heterochromatin architectural factors in shaping nuclear organization is improving, there is still ongoing debate regarding the role of active genes in this process. In this study, we utilize publicly-available Micro-C data from mouse embryonic stem cells to investigate the relationship between gene transcription and 3D gene folding. Our analysis uncovers a nonmonotonic - globally positive - correlation between intragenic contact density and Pol II occupancy, independent of cohesin-based loop extrusion. Through the development of a biophysical model integrating the role of transcription dynamics within a polymer model of chromosome organization, we demonstrate that Pol II-mediated attractive interactions with limited valency between transcribed regions yield quantitative predictions consistent with chromosome-conformation-capture and live-imaging experiments. Our work provides compelling evidence that transcriptional activity shapes the 4D genome through Pol II-mediated micro-compartmentalization.

High-resolution contact maps in mammalian and fly cells reveal transcription-dependent fine structures, such as loops between active gene promoters, promoters and enhancers, or transcriptional start (TSS) and termination (TTS) sites of the same gene [14,[26][27][28][29]. However, the mechanistic origins of these fine structures, despite their potential significance in gene regulation, remain controversial.
Indeed, on the one hand, some experiments in Drosophila and mice indicate higher 3D contacts within expressed gene bodies compared to repressed ones [29,30]. Remodeling of chromatin structure around genes during mouse thymocyte maturation often coincides with transcriptional changes [31]. RNA Polymerase IIs (Pol II) form also distinct foci and higher-order clusters known as transcription factories [32][33][34][35], and active genes tend to colocalize within transcriptionally-active subcompartments [27,36]. On the other hand, there are cases in mammalian cells where significant unfolding of genes occurs after strong transcriptional activation [37][38][39], and acute depletion of Pol I, II, and III has minimal effects on large-scale genome folding [40]. In budding yeast, gene activity inversely correlates with local chromatin compaction [25]. Live-cell imaging experiments highlight the relationship between gene transcription and chromatin dynamics [41][42][43][44][45], revealing enhanced gene mobility upon Pol II elongation inhibition [41,43,44] or gene activation [42] and correlated motions between active regions [44,45].
Overall, the evidence presents a complex understanding of the genome spatio-temporal dynamics in response to gene transcription, necessitating a comprehensive framework to reconcile these observations. In this study, we analyze publicly-available Micro-C data for mouse embryonic stem cells (mESCs) and develop observables to characterize transcription-dependent 3D gene folding. Our analysis reveals a nonmonotonic relationship between intragenic contact density and gene transcription, potentially reconciling contradictory data. By dissecting the contributions of loop extrusion and transcription-associated factors, we propose Pol II occupancy as a key determinant of gene folding. Using a traffic model for gene activity and a 3D polymer model [55,56], we demonstrate that transcriptionally-active subcompartments and intragenic contact enrichment may arise from Pol II-mediated phase separation. Furthermore, we suggest that Pol II-mediated condensation, coupled with transcriptional bursting, may slow down gene mobility, aligning with experimental observations. Results RNA Pol II occupancy and gene length correlate with intra-gene compaction In this study, we aimed to investigate the potential role of transcriptional activity in the local organization of genes within the genome. To accomplish this, we focused on mouse embryonic stem cells (mESC) as they provide abundant quantitative data. Specifically, we utilized publicly available high-resolution Micro-C and Pol II ChIPseq data [14]. For our analysis, Micro-C contact maps were distance-normalized to examine contact enrichment compared to a sequence-averaged null behavior, resulting in the observed over expected (obs/exp) contact map. We introduced two scores for each gene (Fig. 1A, Materials and Methods): (i) Intra-gene contact enrichment (IC), which represents the mean obs/exp values calculated for all pairs of loci within the gene, capturing the level of self-association and overall gene compaction. (ii) Intra-gene RNA Pol II enrichment (IR), which corresponds to the mean normalized Pol II ChIPseq profile within the gene and reflects gene transcriptional activity, correlating with RNAseq data (Fig.S1).

Fig. 2. Gene classification based on size and Pol II occupancy, and pileup meta-gene analysis. (A)
Pileup meta-gene analysis (PMGA, see Methods) of the obs/exp map around genes clustered based on their length (horizontal axis) and Pol II enrichment (vertical axis). The number of genes of each cluster is indicated above on each map. Maps for clusters with less than 25 representative genes were not drawn, due to lack of statistics. (B) Average IC scores for each cluster in (A). (C) PMGA of different chromatin tracks: in each subplot, all the average profiles of the different Pol II clusters for genes of the same length range are shown (from left to right: from small to large genes); different colors correspond to the different Pol II clusters, from low (blue) to high (red) IR score. Figure 1B shows a significant positive correlation (Spearman's ρ = 0.56) between IC and IR scores, indicating that increased transcriptional activity is associated with enhanced intra-gene compaction, consistent with prior research [29][30][31]. Additionally, a similar correlation between IC and intra-gene H3K36me3 (a histone mark related to Pol II elongation) content was observed (Fig.S1). Clustering genes based on similar IR scores revealed a nonlinear and non-monotonic relationship (Fig. 1C): IC generally increases with IR, except at very high Pol II levels where a slight but significant relative decrease in contact frequency occurs. Importantly, this behavior cannot be solely attributed to the inherent properties of Micro-C experiments to detect more or less contacts depending on the molecular crowding on DNA [25] since similar behavior was observed using mESC Hi-C data [19] (Fig.S2). Interestingly, this correlation between IR and IC holds true regardless of gene compartment (A or B) (Fig.S3) or the number of exons [19] (Fig.S4). However, genes with higher exon counts tend to exhibit greater compaction compared to those with fewer exons. Moreover, IC scores for A-compartment genes are generally higher than those for B-compartment genes, with this difference becoming more pronounced for genes with high IR scores, where the drop in IC is more significant for B-genes.
In mammals, highly active genes are typically smaller, and larger genes, when active, are usually lowly expressed (Fig.S5). Hence, we investigated whether gene size could also be a determining factor and classified genes based on both their IR and genomic length (Materials and Methods). Figure 2B displays the average IC score for each category, revealing a positive correlation between IC and gene length: longer genes exhibit stronger intra-gene contact frequency at a given Pol II occupancy density. To gain deeper insights into the contact patterns and profiles within and surrounding genes, we conducted a pile-up meta-gene analysis (PMGA), aggregating the rescaled obs/exp maps and ChIP-seq profiles of genes with similar size and transcriptional activity ( Fig. 2A,C, Materials and Methods). PMGA uncovered a strong correlation between Pol II profiles and certain structural features of contact maps: intra-gene contact maps were nearly uniform, consistent with the constant Pol II levels observed within genes; stripes of preferential interactions were observed between Pol II-rich promoters/TSSs and gene bodies ("stripe"); and loops were formed between Pol II-rich TSSs and TTSs ("TSS-TTS loops"). Notably, the correlation between H3K36me3 profiles (which exhibit a depletion around TSSs) and Hi-C patterns such as TSS-TTS loops and stripes was less apparent ( Fig. 2A,C). Regarding the dependency on gene size, we found that promoters of short genes are often located at the domain borders, while larger genes tend to form their own insulated domains separate from surrounding regions.
To further investigate the role of Pol II occupancy, we analyzed two publicly available datasets involving the treatment of mESC cells with transcriptional inhibitor drugs: triptolide (TRP), which inhibits Pol II initiation, and flavopiridol (FLV), which inhibits Pol II elongation [29]. Firstly, we confirmed the significant reduction in the intensity of Pol II-mediated loops after both treatments (Fig.S6). Consistent with the observed loss of intra-gene Pol II occupancy in all genes, particularly highly transcribed ones (Fig.S7,S8), intra-gene interactions were weaker in the TRP and FLV cases compared to the normal condition (Fig. 3A,S9,S10), resulting in a 12% reduction in IC for large active genes post-treatment. There exists a notable correlation between the fold-changes (treated vs untreated) in IC and IR scores (Fig. 3C): the greater the reduction in Pol II occupancy for a given gene, the more likely its intra-gene compaction is affected. Interestingly, when re-clustering genes based on their new IR scores measured in TRP-and FLV-treated cells, we still observed an average increase in IC as a function of IR similar to the untreated case (Fig. 3D), suggesting that the remaining intra-gene interactions observed after transcription inhibition may be attributed to residual Pol II occupancies.
In summary, our findings demonstrate a correlation between intra-gene compaction, interaction patterns, local transcriptional activity, gene length, and Pol II occupancy.
Cohesin-mediated loop-extrusion activity plays a minor role on intra-gene compaction Recent studies, both experimental and theoretical, have proposed that the loop extrusion mechanism, which plays a crucial role in the formation of TADs, might have an impact on the transcription machinery [23,38,46,47,54]. Interestingly, we observed a significant correlation between the occupancy of CTCF and cohesin, the main players in loop extrusion, and the intra-gene contact enrichment and Pol II occupancy (Fig. 2C,S1). This observation led us to investigate whether cohesin-mediated loop extrusion could drive the correlation between transcriptional activity and intra-gene compaction discussed earlier.
To address this question, we analyzed our original dataset from wild-type mESCs and excluded genes with high SMC1a (a cohesin subunit) occupancy (Materials and methods). The remaining genes, clustered based on IC and gene length, showed significantly lower levels of CTCF and cohesin, while Pol II profiles remained largely unchanged (Fig.S11C). Despite this subset of genes, the IC and IR scores still exhibited a strong correlation at a level similar to wild-type ( Fig. 3D,S11A,B). Additionally, PMGA revealed that the typical interaction patterns observed within genes were still visible for cohesin-poor genes, although certain features such as stripes, which are known to be footprints of loop extrusion activity near extruding barriers [38,46], were absent outside the genes (black arrows in Fig. 3B left).
Furthermore, we utilized three publicly available mESC datasets where CTCF (∆CTCF), the cohesin subunit RAD21 (∆RAD21), or the cohesin unloader WAPL (∆WAPL) were acutely depleted [57]. These treatments led to significant alterations in CTCF and cohesin occupancies throughout the genome [57], as well as changes in TAD folding and CTCF-CTCF loops [7,13], such as a strong reduction in loop intensity in ∆CTCF and ∆RAD21 and reinforcement in ∆WAPL (Fig.S6 bottom). However, most gene expressions remained unaltered [57], and the majority of loops between Pol II peaks were unaffected (Fig.S6 top). Surprisingly, despite the acute changes in intra-gene CTCF and cohesin profiles (Fig.S12C, S13C,S14C), we observed only minimal effects on intra-gene interactions (Fig.3B,D, Fig. S10,S12-14A,B). The most noticeable -yet weak -changes in IC scores occurred in highly active genes (high IR), with an average 5% reduction in ∆RAD21 (Fig.3B). However, the changes in IC between WT and ∆RAD21 conditions did not exhibit a clear correlation with changes in RAD21 occupancy (Fig.S15). Similar to cohesin-poor genes in WT, the structural features associated with loop extrusion outside genes were lost or significantly reduced in ∆CTCF and ∆RAD21 (and enhanced in ∆WAPL) (Fig.3E).
Collectively, these results indicate that cohesin-mediated loop extrusion does not significantly affect the specific organization of transcribed genes, suggesting the presence of an independent mechanism. A biophysical model to investigate the role of transcription on gene folding Our data analysis strongly suggests that Pol II occupancy drives the 3D organization of genes, independently of cohesin activity. Moreover, recent in vitro and in vivo experiments suggest that Pol IIs could form liquid-like droplets either directly through a phase-separation process mediated by weak interactions between their carboxy-terminal domains [58][59][60] [36,61] or indirectly via the formation of Mediator condensates triggered by nascent RNAs [62]. In the following, we developed a biophysical model to better characterize the phenomenology of Pol II-mediated gene folding by investigating how effective self-attractions between Pol II-occupied loci may shape the spatio-temporal dynamics of genes. First, we built a stochastic model to describe Pol II occupancy and dynamics at a gene using a standard Totally Asymmetric Simple Exclusion Process (TASEP) [63][64][65]. In this model (Fig. 4A, Materials and methods), Pol IIs can be loaded onto chromatin at the TSS with rate , transcription elongation initiates with rate , Pol IIs then progress along the gene at rate until 0 they unbind from chromatin at TTS with a rate . During this process, Pol IIs cannot overlap or bypass each other. With this simple model, we can predict the average Pol II density profile along the gene at steady-state (Fig. S16). For example, by varying model parameters (γ 0 /γ = 1 , ), we reproduced uniform average profiles of Pol II occupancy along the gene, β/γ = 1 − α/γ ranging from low (~0.02) to high (~0.80) densities (Fig. 4B,C).
Next, to assess the spatial organization of a gene, we integrated the TASEP in a 3D polymer model of chromatin fiber [55,56] (Fig. 4A, Materials and methods). Briefly, we represented a 20 Mbp-long section of chromatin as a self-avoiding chain (1 monomer = 2kbp = 50nm). We focused on a region of size L in the middle of the chain, which represents the gene of interest. Each monomer within the gene is characterized by a random binary variable indicating the local Pol II occupancy, whose dynamics is described by the TASEP. To investigate the impact of Pol IIs density and dynamics on gene folding, we assumed that monomers occupied by Pol II at a given time may self-interact at short-range with energy strength E. All the other monomers are considered non-interacting, neutral particles. The coupled stochastic spatio-temporal dynamics of the Pol II occupancies and 3D positions of the monomers are then simulated using kinetic Monte-Carlo (Materials and methods).

Self-attraction between Pol II-bound genomic regions drives the intra-gene spatial organization
The coil-to-globule transition of a gene is controlled by Pol II occupancy and gene length We quantified generic structural properties of the model and investigated the relationship between gene compaction (IC scores) and Pol II density (IR scores) with respect to model parameters. In particular, we varied IR scores (via ) while keeping other TASEP parameters α/γ constant, achieving uniform Pol II occupancies along the gene (as in Fig.4B,C), and we monitored the corresponding IC scores at steady-state. For fixed gene length L and elongation rate , IC is an increasing function of both the Pol II γ density (Fig. 4D upper panels) and the strength E of self-attraction (Fig.4E mid panel): the gene's polymeric subchain undergoes a theta-like collapse [66] towards a globular state when the Pol II occupancy reaches a critical value (Movie S1), such transition occurring at lower threshold densities for stronger interactions (|E|). Similar to standard self-interacting homopolymers [67,68], gene compaction strengthens with increasing gene length (Fig.4F, left  panel), while maintaining a fixed average Pol II level. This reflects the cooperative nature of the theta-collapse [69,70]. At a constant average Pol II density, IC is a decreasing function of the Pol II elongation rate (Fig. 4E upper panel). Indeed, the capacity of Pol II-bound monomers to stably interact depends on the out-of-equilibrium dynamics of the elongating Pol IIs: shorter residence time of Pol II on a monomer (compared to typical polymer diffusion time) results in more transient Pol II-mediated interactions between monomers. Notably, biologically-relevant elongation rates (~2kb/min, [71]) correspond to the 'slow' elongation regime, maximizing gene compaction. Overall, our model qualitatively recapitulates the global Pol II and gene length trends observed experimentally (Fig.2B). However, the predicted strengths of intra-gene contact enrichment are much stronger than expected (Fig. 4F left panel). For instance, a 128kbp-long gene shows ã 6-fold increase in IC score with a~8-fold rise in Pol II density across the theta-collapse (for γ = 2kb/min and E=-3 kT), whereas experimentally the same change in average Pol II occupancy yields only~35% increase in IC. We verified that reducing |E| does not resolve the problem as the theta-transition remains sharp and cooperative (Fig. S17 left panel).

Intra-gene compaction in mESC is consistent with a limited valency of Pol II-Pol II interactions
In our initial model, unrestricted interactions were allowed among the Pol II-occupied monomers in close proximity in the 3D space. However, such molecular interactions are mediated by only a restricted set of accessible residues and thus one monomer may have only a limited valency (number of simultaneous interactions). Reducing the valency led to a global, sharp drop in intra-gene contact enrichment (Fig. 4D, 4E lower panels, Movie S2). For instance, at high Pol II occupancy, a~11-fold reduction in IC score was observed for valency 2 compared to unlimited valency. At lower valencies (2 or 3), the levels of compaction aligned with experimental values (Fig. 4F right panel, Fig. S17 right panel) while still preserving the overall dependence on Pol II density and gene length see with unlimited valency. However, an intriguing exception emerged: the IC score now displays a non-monotonic dependency with Pol II levels ( Fig. 4E lower panel), as actually observed experimentally at high IR scores (Fig.1C). Within our framework, this behavior arises from a screening effect on long-range interactions. At high Pol II density, the neighboring Pol II-occupied monomers along the chain are likely to engage in interactions, limiting the ability of a monomer to interact with distantly located monomers and consequently reducing large-scale gene compaction.

Nonuniform Pol II profiles lead to intra-gene architectural details
We previously focused on average gene folding properties by considering flat, homogeneous Pol II densities. However, experimental Pol II profiles show distinct peaks at TSS and TTS. By adjusting the TASEP parameters, we generated qualitatively similar peaked profiles of increasing density (Fig. 4G, Materials and methods). Using interacting parameters (E=-3kT, valency=2) compatible with the experimental IC vs IR relationship, we predicted the formation of a stable loop between TSS and TTS in Hi-C map as well as promoter-gene stripes within gene body, for high Pol II occupancy (Fig. 4H lower panels). Interestingly, off-diagonal pileup analysis of mESC Micro-C datasets around TSS-TTS anchors exhibits similar patterns independent of the cohesin loop-extrusion mechanism (Fig. 4H upper panels, Fig.S18), implying that such architectural details are driven by Pol II occupancy and effective Pol II-Pol II interactions.

Modeling predicts a coupling between transcription and chromatin dynamics
Stochastic dynamics of gene folding in response to transcription bursting Most mammalian genes undergo discontinuous transcription in bursts [72][73][74]. To address the impact of such bursting kinetics on the gene spatio-temporal dynamics, we modified the TASEP model minimally: the promoter can stochastically switch between an "on" state, enabling Pol II binding and transcription, and an "off" state refractory to Pol II binding, with rates and (Fig. 5A). These rates define the effective Pol II binding rate ( ), the α = α /( + ) burst frequency ( , mean number of bursts per time unit) and the train size = /( + ) ( , mean number of Pol II binding and elongating during one burst). For simplicity, we = α/ assumed , allowing variation in burst properties from rare, long trains (k=0.01/min) = ≡ to frequent, short ones (k=0.04/min) (Fig. 5B,C), while maintaining an almost constant average Pol II density profile (Fig. 5D).
By averaging over all configurations, we observed a weak -but significant -decrease in intra-gene compaction in the presence of bursting (Fig. 5E). However, when considering the promoter's on/off states separately, the impact of bursting became apparent with overall higher compaction and more pronounced TSS-TTS loops and promoter-gene stripes in the on-state (Fig. 5G). This effect was more pronounced for low burst frequency as the difference in Pol II occupancy between the on/off-states became more prominent (Fig. 5F). Similarly, more elongating trains leads to increased compaction (Fig.S19). These findings suggest a time-correlation between transcriptional bursting and gene compaction where dynamical changes in the gene's radius of gyration (RG) are preceded by modifications in Pol II along the gene (Fig. 5H). Indeed, we observed an overall negative correlation between instantaneous Pol II density and RG, which was more pronounced for low burst frequencies (Fig. 5 I,J). Interestingly, when multiple trains are present simultaneously along the gene, the dynamic looping between could rise to the formation of 'factories' where they colocalize (Fig.  5K, Movie S3).

Transcription slows down gene mobility
Live-imaging experiments have indicated that chromatin motion is enhanced after Pol II inhibition or reduced after gene activation [41,43,44], suggesting a connection between transcription and a reduced gene mobility. To assess whether our biophysical model aligns with these observations, we computed for each monomer the mean-squared displacement (MSD), that measures the typical space explored by a locus over a time-lag Δt. We observed that , where and are diffusion constant and exponent, respectively (Fig. 6A). ∆ δ δ δ ∼ 0. 5 is independent of Pol II occupancy (Fig. 6B) and its value is consistent with standard polymer dynamics [75,76]. Conversely, depends on Pol II density and gene length (Fig. 6C) with a perfect opposite trend as the intra-gene compaction (Fig. 4F right): the more compact the gene the less mobile [76]. For example, a 40-70% increase in compaction corresponds to a 10-15% decrease for , consistent with experiments (Fig. 6C,D). A transcription-associated subcompartment emerges from Pol II-mediated phase separation Our analysis of intra-gene folding and dynamics suggests that similar mechanisms may explain the role of Pol II occupancy in distal inter-gene interactions. On the Micro-C map of mESC, we observed selective contact enrichments between distal highly active genes (Fig. 7, Fig. S20). For instance, the average contact frequency between the 811 kb-distant large active genes Ahctf1 and Parp1 is 3.2-fold higher than expected at similar genomic distance (Fig.7A). Both genes belong to the same A compartment, indicating that strongly transcribed genes may form an independent subcompartment within A. To test this hypothesis, we clustered all the 32-64 kb-long genes into three categories based on their IR score (Low, Mid and High) and performed PMGA (Materials and methods) of the inter-gene contacts for pairs of genes distant by more than 128 kb but less than 2 Mb (Fig 7B, Fig.S21). When both genes are transcribed (Mid and High clusters in Fig 7B), a strong promoter-promoter interaction is detected. Moreover, highly active genes (High-High) showed significant contact enrichment between their gene bodies compared to the surrounding background, which was transcription-dependent and independent of loop extrusion (Fig.S21). Contact enrichment between inactive genes (Low-Low) is similar to background and can be attributed to their location in the more compact B compartment [30].
To rationalize these observations with our biophysical model, we conducted simulations for two 100 kbp-long genes distant by 200 kb, exhibiting similar steady-state Pol II profiles (Fig. 7C). We observed that Pol II-mediated interactions not only affect intra-gene contacts but also drive the formation of inter-gene contacts between TSS and TTS and between gene bodies, whose strengths increase with Pol II density. Interestingly, interacting genes tend to colocalize and segregate from the rest of the simulated polymeric chain [77,78] (Fig.7D, Movie S4).

Discussion
In this study, we analyzed publicly-available Micro-C data of mESC [14,29] to investigate the relationship between transcriptional activity and chromosome organization. At the single-gene level (2kbp-1Mbp), intra-gene contact enrichment, structural patterns (gene-loops, promoter-stripes, Fig.3E) and the degree of insulation from the surrounding genomic regions correlate positively with Pol II occupancy along the gene and gene length (Fig.1,2). This is to contrast with the very local structure of the chromatin fiber (< 600bp) that is increasingly open as transcription rate increases [25]. For highly expressed genes, we observed reduced gene body compaction (Fig.1C), which aligns, although at a lesser extent, with the extended gene conformations observed for very long, highly expressed tissue-specific genes in mice [37,39]. In good agreement with recent high-precision Capture Micro-C data [27], we demonstrated that such structure-function relation between gene compaction and gene transcription does not directly associate with loop extrusion [77] (Fig.3), although the latter is known to drive loops and TADs formation [11] and to interfere with transcriptional elongation [38,46,47]. At the inter-gene level, we observed long-range contacts between active genes, not only between gene promoters as already characterized [79,80], but also between gene bodies (Fig.7), here also closely tied to Pol II profiles and independent of loop extrusion (Fig.S21).
Altogether, our findings suggest that active genes are central units of the 3D genome [25] and form a subcompartment [27,77,78], driven by gene activity, Pol II binding and elongation. This observation likely holds true for other cell types, as we recently showed that intra-gene compaction during mouse thymocyte maturation is, in average, associated with change in transcription levels [31]. The mechanisms described here are also likely to be broadly conserved in animals. Indeed, we analyzed the correlation between IC and IR (spearman's ρ=0.48) in whole-embryo Drosophila data at embryonic nuclear cycle 14 (Fig.S22) [81]. Drosophila is interesting as its chromosome organization is believed to be mainly driven by the spatial segregation of the epigenome instead of cohesin loop-extrusion processes [4]. We found a similar nonmonotonic dependence of IC to IR as well as Pol II-related intra-gene interaction patterns. One exception is the effect of gene length that is less clear. Interestingly, in the bacterium Escherichia coli, higher transcription is also associated with more intra-gene contacts [82]; in yeast and dinoflagellate, TAD-like structures are associated with (blocks of) active genes [25,83].
To better characterize the underlying mechanisms behind the correlations between Pol II activity and the transcriptionally-active subcompartment, we introduced a simple biophysical framework that accounts for the 1D dynamics of Pol II along genes coupled to the 3D polymer organization of chromosomes (Fig.4). By assuming self-attractive, short-range interactions between genomic loci bound to Pol II [35,48], the model is able to recapitulate qualitatively the overall augmentation of intra-gene compaction associated to an enrichment of Pol II density inside gene body and to longer genes, consistent with a standard cooperative coil-globule transition observed for finite-size chains [70,[84][85][86]. Our model suggests that limiting the number of possible interactions per Pol II-bound region to low values (e.g., 2 or 3) allows to align quantitatively our predictions with experiments, leading to percolated but less compact 3D domains [87,88]. Interestingly, this constraint also explains the weak decompaction observed for highly transcribed genes as interactions between distant positions along the genes (mediating the large-scale gene's compaction) are screened by (more frequent) interactions between nearest-neighbor Pol II-bound sites. This screening mechanism may also contribute to the formation of the extended transcription loops observed in long highly-transcribed genes [37], along with the potential stiffening of the chromatin fiber caused by the high density of nascent ribonucleoprotein complexes along the genes, as originally evoked. Furthermore, our model predicts a strong coupling between gene structure and dynamics: transcription bursts may regulate the stochasticity of intra-and inter-gene contacts at the single-cell scale (Fig.5) [89]; such dynamical contacts may conversely reduce locally gene mobility (Fig.6) and lead to long-range coherent motion between active regions [56,90], in good agreement with live-microscopy observations [41,43,45].
What are the molecular determinants of the putative attractive interaction between Pol II-bound loci hypothesized in our model? It is likely that several sources may directly or effectively participate in its regulation. The C-terminal domain (CTD) of Pol II can form liquid condensates in vitro under physiological conditions, which become unstable upon CTD phosphorylation [58]. This mechanism may thus promote direct attractions in vivo between non-elongating Pol II, bound at promoters for example [35]. CTDs may also interact with co-factors that can themselves phase-separate both at the transcriptional initiation [34,91,92] and elongation [59,93] stages, like FUS, BRD4, Mediator, P-TEFb or splicing factors. For example, the observed correlation between intra-gene compaction and the number of exons [19] at similar Pol II occupancy (Fig.S4) suggests a role for splicing-related condensates [93]. In addition, transcription-generated supercoiling [82,94] or specific histone marks deposited along the gene bodies (that may regulate putative nucleosome-nucleosome interactions [95]) may contribute to transcription-dependent effective interactions. The limited valency of interactions in our model aligns with a restricted number of simultaneously accessible residues involved in the aforementioned sources of Pol II-Pol II attraction. It is also possible that the screening effect observed at high transcription rates could be explained by the strength of interaction depending on local Pol II concentration and/or the length of nascent transcripts (Supplementary Notes), as RNA size and concentration can impact the stability of transcription-related condensates [62].
In conclusion, our results demonstrate the significant impact of Pol II binding and elongation on the spatiotemporal organization of the active genome through an out-of-equilibrium phase-separation process coupling the time-dependent dynamics of transcription to the formation of gene micro-domains and of transcriptionally-active subcompartment [60,77,96]. This extends the concept of transcription factories [97], typically associated with inter-gene contacts, to the internal organization of long genes having multiple trains of transcribing Pol IIs. Consistent with our findings, recent works also proposed that interactions between Pol IIs may also facilitate promoter-enhancer communications [22,23]. However, our approach provides only an "average" picture of the role of transcription on 3D chromosome organization and does not account for the various epigenetic, genomic and spatial factors that may interplay with Pol II-mediated phase separation [47] around specific genes, potentially explaining the variability of behaviors observed after transcription (de)activation [31]. Future investigations should aim to further elucidate the biological function(s) of such transcription-dependent micro-compartmentalization. Indeed, colocalization of active genomic regions may enhance the recycling of Pol II or transcription co-factors [98,99] by increasing their local concentrations. Investigating precisely such a "structure-function" coupling between the binding and assembly of transcription-associated components and condensates and the spatial folding of the genome remains an intriguing challenge and would require further developments both at the experimental and modeling levels.

Materials and Methods
Experimental data analysis Datasets The processed Micro-C data for mESCs (wild-type and mutants) and Drosophila in multi-resolution format mcool were downloaded from National Center for Biotechnology Information's Gene Expression Omnibus (GEO) through accession no: GSE130275, GSE178982 and ArrayExpress accession E-MTAB-9306.

Contact maps
We used cooltools (https://github.com/open2c/cooltools) [100] module to compute the obs/exp maps from the balanced contact maps, at various resolutions ranging from 100bp to 50kb.
To perform intra-gene PMGA, for each gene with size (>20 x resolution), we considered a domain of size around it, including the gene body and the two upstream and downstream 3 flanking regions, each of size . To insure consistency and facilitate pileup analysis, we rescaled each corresponding obs/exp matrix to a (60,60)-pseudo-sized matrix by (3 ) (3 ) averaging the original matrix elements. An example of this rescaling process can be seen in Figure S23. We then aligned all the rescaled matrices in the transcription forward direction to maintain uniformity. Finally, we aggregated all the data of genes belonging to a given cluster (clustered by gene length, IR score,...). For inter-gene PMGA, for each pair of genes, we considered the off-diagonal region of the obs/exp map of size and centered at ( ), with and the size and genomic position of the middle of gene 1 (same for gene 2). Then, similarly, we rescaled this region to a (30,30)-pseudo-matrix, aligned the genes in parallel forward direction and aggregated the pseudo-matrices belonging to the same cluster.

ChIP-seq tracks
Using pyBigWig (https://github.com/deeptools/pyBigWig), for each gene, we discretized the 3 domain (see above) into 60 bins and computed the coverage for each bin. Then, we aligned the domains in the transcription forward direction and aggregated over all genes in the same cluster.
ChIP-seq peak calling and calculation of peak contacts For each ChIP-seq track, we transformed BigWig to bedGraph, used MACS software [101] version 2 to call the peaks in the "no model" mode and merged the results from different replicates. Then, we sorted them by fold-change score (compared to input) and selected the most significant peaks (top 1/3). Finally, for every pair of peaks with a genomic distance between 160 and 320 kb, we used the off-diagonal pileup module of cooltools to compute the average peak contacts.

Insulation score and compartments analysis
For computing the insulation score, we analyzed contact maps at 800-bp, 1600-bp and 3200-bp resolutions with the dedicated module of cooltools with sliding windows 3, 5, 10 and 25 times larger than the given resolution, e.g. 2.4, 4, 8 and 20-kb windows for 800-bp resolution. For the compartment analysis, we used the eigs_cis module of cooltools to compute the first eigenvector of the Pearson's correlation matrix of contact map taking as inputs the 6.4-kbp resolution Micro-C maps and the GC coverage computing from mm10 reference genome.

Biophysical model
We previously introduced a self-avoiding semi-flexible polymer model for chromosomes [55,56].
In this study, we employed a coarse-graining approach to represent a 20-Mbp-long chromatin fiber using 10,000 monomers. Each monomer corresponds to approximately 2-kbp of the genome and has a size of 50 nm (Fig. 4A). Within this chain, we inserted a 100-kbp-long gene (composed of 50 monomers), where TSS and TTS are located at the first and last monomers of the gene, respectively.

TASEP model
Each monomer i within a gene (of total size n) is characterized by a binary state ϵ{0, 1} depending if a Pol II complex is bound to it ( ) or not ( ). We simulated the stochastic = 1 = 0 dynamics of Pol II binding, unbinding and elongation using a simple kinetic Monte-Carlo framework: each Monte Carlo step (MCS) consisted of (i) one attempt to bind a Pol II with rate at the TSS if unoccupied ( ), (ii) one attempt to unbind Pol II with rate at the TTS if 1 = 0 → 1 occupied ( ), and (iii) n-1 elongation attempts, each consisting in randomly picking = 1 → 0 one monomer i in [1:n-1] and, if occupied, to move with rate the Pol II to its adjacent upstream monomer iff it is not already occupied ( ).

Polymer model
The polymer chain undergoes local movements on a FCC lattice with periodic boundary conditions under Metropolis criterion, as described in our previous works [55]. The total Hamitonian of a given configuration can be expressed as following: The first term accounts for the stiffness of the chain with the bending rigidity and the local κ θ bending angle at monomer . The second term represents the Pol II-Pol II interaction, where denotes the attractive interaction strength, and equals 1 if monomers and occupy nearest neighboring sites on the lattice.
For simulations with a limited valency number, we defined an interaction list for each monomer with . This list stores the genomic positions of the other monomers it interacts with and is = 1 constrained not to exceed the given valency number. It is updated after any polymer or TASEP moves. Note that, due to the relatively high stall force of Pol II (~25-30 pN [102]), we assumed that Pol II-Pol II interactions do not impede Pol II elongation.

Numerical simulations
In our study, we set and a lattice volumic density of 50% to account for a chromatin κ ∼ 1. 2 fiber with a Kuhn length of 100 nm [68,103] and a typical base-pair density found in mammalian and fly genomes (~0.01 bp/nm³) [104]. Simulations were initiated by unknotted configurations [55] and performed with a kinetic Monte Carlo algorithm. In addition to the TASEP moves (see above), each MCS contains N local polymer trial moves. For each parameter set, 20 independent trajectories were conducted, discarding the first 10⁶ MCS from each trajectory to allow the system to reach a steady state. Subsequently, snapshots of the system were saved every 10³ MCS during the simulation during 10⁷ MCS and analyzed subsequently (see below).

Data analysis
The radius of gyration (RG) provides a measure of the typical spatial extent of a gene, reflecting its overall span in 3D space. In a given configuration, the position of monomer can be defined as . The RG is then calculated as follows: