Main

Mechanisms that shape 3D genome organization are thought to play important roles in controlling gene expression, particularly during development. For example, interactions between gene promoters, or gene promoters and other distal gene regulatory elements (like enhancers), have been implicated both in the maintenance of gene expression patterns and in enabling alterations in gene expression states during cell fate transitions1,2,3,4.

A number of mechanisms have been proposed to create and regulate interactions between gene regulatory elements. For example, cohesin can extrude chromatin to establish topologically associated domains (TADs), which are generally constricted by CTCF-bound insulator sites. Cohesin-mediated loop extrusion is thought to increase the frequency of interactions between gene promoters and distal regulatory elements within TADs5. However, despite the profound effects that the disruption of cohesin or CTCF has on interactions within TADs, this typically translates into modest or tissue-specific effects on gene expression6,7,8,9,10. Although loop extrusion functions across the genome, other mechanisms are thought to play more direct and specific roles at gene regulatory elements by creating physical interactions that control gene expression. For example, the Mediator complex, which is a fundamental regulator of gene transcription, has been proposed to support gene expression by functioning as a molecular bridge through binding transcription factors at active enhancers and RNA polymerase II at gene promoters11,12,13. However, recent work has questioned the extent to which the function of Mediator in gene expression relies on promoting physical interactions between regulatory elements14,15,16,17. At silent gene regulatory elements, binding of the Polycomb repressive complexes (PRCs) enables physical interactions between these inactive sites18,19,20,21,22,23,24,25,26,27,28, which is thought to maintain gene repression29,30,31 but may also poise genes for activation during cell lineage commitment32,33,34. In these contexts, whether chromosomal interactions themselves or other functions of the Polycomb system control gene expression is unknown. Thus, although it is evident that a variety of mechanisms have evolved to shape how gene regulatory elements physically interact with one another, the extent to which these interactions are required to control gene expression remains a central outstanding question35,36,37,38,39,40.

The Mediator complex alone may not play a major role in enabling interactions between gene regulatory elements14,15,16,17, but we and others have shown that a distinct form of the complex - which contains the cyclin-dependent kinase module CKM (composed of CDK8 or CDK19, CCNC, MED12 or MED12L, and MED13 or MED13L) and does not interact with RNA polymerase II41,42,43 - is associated with gene regulatory element interactions in mouse ESCs44,45,46. Unlike the Mediator complex, CKM–Mediator (also known as CDK-Mediator) has been implicated in both repression and support of gene expression, suggesting that it might work through mechanisms that are distinct from the well-characterized function of Mediator in binding to and regulating RNA polymerase II activity47,48. In line with this possibility, CKM–Mediator appears to play specialized roles in controlling inducible gene expression after exposure to extracellular stimuli or cellular differentiation cues47,49,50,51,52,53,54. We and others have previously demonstrated that CKM–Mediator is recruited to the promoters of repressed developmental genes in ESCs55,56,57,58 and this primes these genes for induction during differentiation55. In this context, CKM–Mediator binding appears to be important for creating interactions with other gene regulatory elements, suggesting that formation of 3D interactions may underpin its capacity to prime developmental genes for induction during cell lineage commitment44.

Based on these findings, we set out to determine how CKM–Mediator controls chromosomal interactions and gene expression. To achieve this, we exploit inducible genetic perturbation systems and genomic approaches to examine CKM–Mediator function in ESCs and during cellular differentiation. We discover that CKM–Mediator contributes little to overall 3D genome organization in ESCs but is essential for creating interactions between Polycomb-bound regions of the genome. We show that CKM–Mediator does not define these interactions through an intrinsic bridging mechanism. Instead, it controls canonical PRC1 (cPRC1) binding at these sites, which in turn establishes contacts between Polycomb domains. Surprisingly, through separation-of-function experiments we reveal that Polycomb-dependent chromosomal interactions regulated by CKM–Mediator are not required for the priming or poising of genes for induction during differentiation. Instead, we discover that the priming function of CKM–Mediator relies on its ability to enable core Mediator binding to gene promoters during the process of gene induction.

Results

CKM–Mediator enables Polycomb domain interactions

To examine how CKM–Mediator influences genome organization in ESCs, we carried out in situ Hi-C in a cell line in which we can inducibly disrupt CKM–Mediator complex formation by removing its MED13/MED13L structural subunits (CKM–Mediator cKO; Fig. 1a,b and Extended Data Fig. 1a)55. No major alterations to overall genome organization were observed following CKM–Mediator disruption, with TADs and loop interactions remaining largely unchanged (Fig. 1c,d). It was previously proposed that CKM–Mediator promotes super enhancer-promoter interactions in ESCs45,46. However, we observed only subtle reductions in these interactions upon disruption of CKM–Mediator (Extended Data Fig. 1c). These data suggest that CKM–Mediator does not contribute centrally to 3D genome organization in ESCs.

Fig. 1: CKM–Mediator has a limited role in 3D genome organization but is essential for Polycomb domain interactions.

a, A schematic of Med13/13lfl/fl ESCs. 4-Hydroxytamoxifen (TAM) induces conditional disruption of the CKM–Mediator complex (CKM–MED). b, A representative western blot analysis (n = 6) of nuclear extracts from Med13/13lfl/fl (wild type, WT) and Med13/13l−/− (CKM–MED KO) ESCs showing depletion of MED13 and MED13L proteins. HDAC1 is shown as a loading control. c, Hi-C contact matrices of WT and CKM–MED KO ESCs at 10 kb resolution. Genomic coordinates are indicated. d, Aggregate analysis of TADs and loops in WT and CKM–MED KO ESCs at 10 kb resolution. e, Hi-C contact matrices of WT and CKM–MED KO ESCs at 5 kb resolution. Interactions between Polycomb domains are indicated with a red circle. The blue track shows binding of PRC1 (RING1B ChIP–seq). Genomic coordinates are indicated. f, Aggregate analysis of Hi-C signal (10 kb resolution) at pairs of Polycomb domains in Med13/13lfl/fl (WT) and Med13/13l−/− (CKM–MED KO) ESCs, with 200 kb flanking regions. The difference between WT and KO is shown. g, A snapshot showing Capture-C read count signal in WT and CKM–MED KO ESCs. Interactions between the Nkx2-1 promoter bait (triangle) and surrounding Polycomb-bound sites are shown with arrowheads. PRC1 binding (RING1B ChIP–seq) is shown as a reference. h, Boxplot analysis of mean normalized read counts from WT and CKM–MED KO ESCs showing interactions between Polycomb gene promoters and other Polycomb domains (left), or non-Polycomb gene promoters and active sites (H3K27ac, right). Interactions were not distance-matched due to differences in the interaction ranges for the two promoter types. Boxes show IQRs, center lines represent the median, whiskers extend by 1.5 × IQR or the most extreme point (whichever is closer to the median), whereas notches extend by 1.58 × IQR/sqrt(n), giving a roughly 95% confidence interval for comparing medians.
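The boxplot conventions given in this legend (IQR box, whiskers of 1.5 × IQR and notches of 1.58 × IQR/sqrt(n)) can be written out as a minimal Python sketch. This is an illustration only and not the plotting code used for the figures: the function names are hypothetical, quartiles are computed by linear interpolation (an assumption, since the quantile method is not stated here) and whiskers are drawn to the most extreme observation that lies within 1.5 × IQR of the box, which is the common plotting convention.

```python
import math


def quantile(sorted_values, q):
    """Quantile by linear interpolation between neighboring observations (assumed method)."""
    position = (len(sorted_values) - 1) * q
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


def boxplot_summary(values):
    """Return the box, whisker and notch positions for one group of values."""
    data = sorted(values)
    n = len(data)
    q1, median, q3 = (quantile(data, q) for q in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    # Whiskers end at the most extreme observation within 1.5 x IQR of the box.
    lower_whisker = min(v for v in data if v >= q1 - 1.5 * iqr)
    upper_whisker = max(v for v in data if v <= q3 + 1.5 * iqr)
    # Notches span median +/- 1.58 x IQR / sqrt(n).
    half_notch = 1.58 * iqr / math.sqrt(n)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "notch_low": median - half_notch,
        "notch_high": median + half_notch,
    }
```

Under this convention, two groups whose notch intervals do not overlap are taken to have medians that differ at roughly the 95% level.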

ESCs are characterized by a unique set of extremely strong long-range interactions between regions of the genome that have high-level occupancy of PRCs, which we refer to as Polycomb domains19,20,21,23,28,59. These interactions are thought to contribute to developmental gene regulation either by maintaining repression in differentiated cell types or potentially by poising genes for induction during cell lineage commitment. Interestingly, a similar role in regulating developmental gene expression has been proposed for CKM–Mediator44,55,56,57. Given these seemingly similar functionalities, we asked whether CKM–Mediator might influence interactions between Polycomb domains. Remarkably, Hi-C analysis after CKM–Mediator disruption revealed dramatic reductions in interactions between Polycomb domains, and this effect was evident over a range of interaction distances (Fig. 1e,f and Extended Data Fig. 1d). A widespread reduction in Polycomb domain interactions was also observed using Capture-C analysis focused on promoters associated with Polycomb domains (Fig. 1g,h and Extended Data Fig. 1e,f). Therefore, CKM–Mediator is essential for interactions between Polycomb domains.
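The aggregate analysis of Polycomb domain pairs referred to here (Fig. 1f) can be thought of as a pile-up: a fixed-size window of the contact matrix is extracted around each pair of domains and the windows are averaged. The Python sketch below illustrates that idea under stated assumptions and is not the analysis pipeline used in this study. The function names are hypothetical, the matrix is a plain list of lists for a single chromosome, and no distance normalization (for example, observed over expected) is applied. At 10 kb resolution, 200 kb flanking regions correspond to `flank_bins = 20`, giving a 41 × 41 bin window.

```python
def to_bin(position_bp, resolution_bp=10_000):
    """Convert a genomic coordinate to a bin index at the given resolution."""
    return position_bp // resolution_bp


def aggregate_contacts(matrix, bin_pairs, flank_bins):
    """Average square windows of a contact matrix centered on each (row, col) bin pair."""
    size = 2 * flank_bins + 1
    total = [[0.0] * size for _ in range(size)]
    n_bins = len(matrix)
    used = 0
    for row, col in bin_pairs:
        # Skip pairs whose window would extend beyond the matrix.
        if min(row, col) - flank_bins < 0 or max(row, col) + flank_bins >= n_bins:
            continue
        for i in range(size):
            for j in range(size):
                total[i][j] += matrix[row - flank_bins + i][col - flank_bins + j]
        used += 1
    if used == 0:
        raise ValueError("no bin pair had a complete window")
    return [[value / used for value in line] for line in total]
```

Running the same function on the WT and KO matrices and subtracting the two averaged windows element by element would give a difference map analogous to the one described in the legend.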

CKM–Mediator supports cPRC1 binding to enable interactions

To understand how CKM–Mediator enables interactions between Polycomb domains, we asked whether CKM–Mediator is bound at these sites. The majority of Polycomb domains (91.12%) were enriched for the CKM–Mediator subunit CDK8, in general agreement with previous findings56,57, suggesting that the effects of CKM–Mediator on Polycomb domain interactions may be direct (Fig. 2a and Extended Data Fig. 2a). It has previously been proposed that interactions between Polycomb domains are dependent on cPRC1, which in ESCs is defined by its structural subunit PCGF2 (refs. 20,24,30,60,61,62,63,64,65). Given the profound effects on Polycomb domain interactions after the depletion of CKM–Mediator, we reasoned that CKM–Mediator may influence the function of cPRC1. To test this possibility, we examined cPRC1 occupancy after CKM–Mediator disruption by carrying out calibrated ChIP–seq (cChIP–seq) using antibodies recognizing the cPRC1 subunits RING1B, PCGF2 and CBX7. Importantly, this revealed a major reduction in cPRC1 binding at Polycomb target sites in the absence of CKM–Mediator (Fig. 2b–d and Extended Data Fig. 2b,c), despite only subtle reductions in protein levels (Extended Data Fig. 2d). cPRC1 associates with Polycomb domains via its CBX7 subunit that binds H3K27me3 deposited by PRC2 (refs. 66,67,68). Interestingly, cChIP–seq for H3K27me3 revealed only modest reductions in this modification after CKM–Mediator disruption (Fig. 2b–d). Therefore, CKM–Mediator regulates cPRC1 binding without major effects on H3K27me3.

Fig. 2: CKM–Mediator regulates canonical PRC1 binding.

a, Heatmaps showing RING1B (PRC1) and CDK8 ChIP–seq signals at Polycomb domains (n = 2,097), sorted by decreasing RING1B signal. b, A genomic snapshot of a Polycomb-bound locus, showing CDK8, RING1B, PCGF2, CBX7 and H3K27me3 ChIP–seq signal in WT (+) and CKM–MED KO (-) ESCs. c, Heatmaps showing RING1B, PCGF2, CBX7 and H3K27me3 ChIP–seq signal at Polycomb domains (n = 2,097) in WT (+) and CKM–MED KO (-) ESCs, sorted by decreasing RING1B signal. d, Metaplot analysis of RING1B, PCGF2, CBX7 and H3K27me3 enrichment at Polycomb domains (n = 2,097) in WT and CKM–MED KO ESCs.

Given that cPRC1 has been proposed to enable interactions between Polycomb domains20,24,30, and its binding is abrogated following disruption of CKM–Mediator (Fig. 2), the observed effect on Polycomb domain interactions in the absence of CKM–Mediator may be due to loss of cPRC1 occupancy. In agreement with this possibility, the effects on cPRC1 binding were related to the reductions in interactions after depletion of CKM–Mediator (Extended Data Fig. 2e) and corresponded to the level of CKM–Mediator binding (Extended Data Fig. 2f). However, CKM–Mediator has also been proposed to function as a molecular bridge to enable chromosomal interactions44,45,46. Given that both cPRC1 and CKM–Mediator binding are lost upon CKM–Mediator disruption, interactions could be defined by either cPRC1 or CKM–Mediator. To discover the molecular determinant that enables these interactions, we took advantage of a synthetic system to create a separation-of-function scenario in which either cPRC1 or CKM–Mediator could be ectopically tethered to an artificial site in the genome69 (Fig. 3a and Extended Data Fig. 3a,b). Importantly, tethering CDK8 recruited CKM–Mediator and tethering PCGF2 recruited the cPRC1 complex69 but not CKM–Mediator (Extended Data Fig. 3c). We then asked whether binding of cPRC1 or CKM–Mediator at this ectopic site was able to support interactions with nearby regions co-occupied by cPRC1 and CKM–Mediator. These data revealed that cPRC1 was sufficient to create de novo interactions with surrounding sites, in line with similar findings from PRC2 tethering70, which would lead to recruitment of cPRC1 (ref. 69). By contrast, we found no evidence for interactions with surrounding sites when CKM–Mediator was tethered (Fig. 3b and Extended Data Fig. 3d). Importantly, endogenous control sites retained interactions in both cell lines, although they were slightly weaker in the CKM–Mediator tethered line (Extended Data Fig. 3e).

Fig. 3: cPRC1 creates interactions between Polycomb domains.

a, A schematic of the integrated TetO site and experimental setup. b, A snapshot showing Capture-C read count signal from TetR-PCGF2, TetR-CDK8 and TetR-GFP lines at the TetO array. CDK8 and PCGF2 (cPRC1) ChIP–seq signal is given as a reference. The TetO bait is shown as a triangle and interactions created with surrounding cPRC1-bound sites are represented with arrowheads. c, A schematic of the cPRC1 (Pcgf4−/−Pcgf2fl/fl) conditional KO line. d, A snapshot showing Capture-C read count signal from WT and cPRC1 KO ESCs. Interactions between the Nkx2-1 promoter bait (triangle) and surrounding Polycomb domain sites are shown with arrowheads. cPRC1 binding (PCGF2 ChIP–seq) is shown as a reference. e, Boxplot analysis of normalized read counts from WT and cPRC1 KO ESCs showing interactions between Polycomb gene promoters and other Polycomb domains (left), or non-Polycomb gene promoters and active sites (H3K27ac, right). Boxes show IQR, center lines represent the median, whiskers extend by 1.5 × IQR or the most extreme point (whichever is closer to the median), whereas notches extend by 1.58 × IQR/sqrt(n), giving a roughly 95% confidence interval for comparing medians.

To further explore whether cPRC1 is the central determinant underpinning Polycomb domain interactions, we next used a cell line in which we can inducibly disrupt the cPRC1 complex by removing the core structural components PCGF2 and PCGF4 (cPRC1 cKO)59 and carried out Capture-C (Fig. 3c and Extended Data Fig. 3f). Importantly, removal of cPRC1 caused a near complete loss of interactions between Polycomb domains, while most sites retained CKM–Mediator binding (Fig. 3d,e and Extended Data Fig. 3g-i). Therefore, cPRC1 establishes long-range interactions between Polycomb domains, with CKM–Mediator playing a regulatory role in facilitating cPRC1 binding.

cPRC1 interactions are not required for CKM–Mediator priming

CKM–Mediator occupies silent developmental gene promoters in ESCs and is required for subsequent gene activation during differentiation55,56. In some cases CKM–Mediator occupancy corresponds to pre-formed long-range interactions with other gene regulatory elements, suggesting that by bringing gene regulatory elements in close proximity with each other in ESCs, CKM–Mediator may prime them for future activation44. We now show that CKM–Mediator-dependent interactions are reliant on cPRC1 (Fig. 3). Importantly, the Polycomb system has similarly been implicated in poising or priming genes for activation during differentiation by creating interactions between gene promoters and other regulatory elements, including poised enhancers32,33. Based on this functional convergence between CKM–Mediator and cPRC1 activities, we hypothesized that CKM–Mediator may enable interactions between gene promoters and regulatory elements via a cPRC1-dependent mechanism to prime genes for activation during differentiation.

To examine this possibility, we used all-trans retinoic acid to drive ESC differentiation and carried out calibrated nuclear RNA-seq (cnRNA-seq) to identify genes that rely on CKM–Mediator for their induction during differentiation (Fig. 4a and Extended Data Fig. 4a,b). Based on this analysis, 631 CKM–Mediator-dependent genes (fold change > 1.5, adjusted P value < 0.05) were identified (Fig. 4b and Extended Data Fig. 4c). Importantly, these genes also showed cPRC1 enrichment at their promoters in ESCs and PRC1 occupancy tended to be reduced following RA treatment (Extended Data Fig. 4d,e). To determine whether cPRC1 and its capacity to mediate chromosomal interactions enables gene induction by CKM–Mediator, we depleted cPRC1 and induced differentiation (Fig. 4c and Extended Data Fig. 4f,g). On average, CKM–Mediator-dependent genes were induced normally in the absence of cPRC1 (Fig. 4d,e and Extended Data Fig. 4h), with only 18 of these genes showing a significant decrease in activation (Extended Data Fig. 4h,i). Therefore, while CKM–Mediator contributes to gene induction, it does not seem to do so through a cPRC1-dependent mechanism.
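The thresholds used to define CKM–Mediator-dependent genes (fold change > 1.5, adjusted P value < 0.05) can be expressed as a simple filter. The Python sketch below is illustrative only and is not the differential expression pipeline used here. The `GeneResult` record, the function name and the orientation of the fold change (KO relative to WT after RA treatment, so that dependent genes have negative log2 values) are assumptions, and the additional requirement that genes are induced by RA is not modeled.

```python
import math
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2_fold_change: float  # KO versus WT after RA treatment (assumed orientation)
    adjusted_p: float


FOLD_CHANGE_CUTOFF = 1.5
ADJUSTED_P_CUTOFF = 0.05


def ckm_dependent(results):
    """Names of genes reduced more than 1.5-fold in the KO with adjusted P < 0.05."""
    log2_cutoff = math.log2(FOLD_CHANGE_CUTOFF)
    return [
        r.gene
        for r in results
        if r.log2_fold_change < -log2_cutoff and r.adjusted_p < ADJUSTED_P_CUTOFF
    ]
```

Applying the cutoff on the log2 scale treats the 1.5-fold threshold symmetrically, so the same constant could be reused to select genes that increase rather than decrease.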

Fig. 4: CKM–Mediator primes genes for activation during differentiation independently of cPRC1-mediated interactions.

a, A schematic of the differentiation of WT and CKM–MED KO ESCs used for cnRNA-seq. b, Boxplot analysis of the expression of CKM–MED-dependent genes (n = 631) in WT ESCs and following RA (retinoic acid) induction (WT and CKM–MED KO). Boxes show IQR, center lines represent the median, whiskers extend by 1.5 × IQR or the most extreme point (whichever is closer to the median), whereas notches extend by 1.58 × IQR/sqrt(n), giving a roughly 95% confidence interval for comparing medians. c, A schematic of the differentiation of WT and cPRC1 KO ESCs for cnRNA-seq. d, As in b but for cPRC1 cKO cells. e, A screenshot showing the expression of genes within the HoxB cluster following RA induction of CKM–MED cKO or cPRC1 KO cells. Forward strand is shown on top and reverse strand is shown at the bottom of each track. ChIP–seq tracks for CDK8 and cPRC1 (PCGF2) enrichment are shown. f, Boxplot analysis of the expression of RA-induced (RA-ind) genes from the Polycomb (PcG) network (top, n = 482) and CKM–MED-dependent genes from the PcG network (bottom, n = 184) following RA induction of CKM–MED cKO or cPRC1 KO cells. Boxes are defined as in b.

This finding prompted us to investigate more generally whether cPRC1 has a role in gene induction during differentiation, particularly of genes that engage in interactions. Therefore, our analysis was extended to include RA-induced genes that are part of a previously described Polycomb interaction network in ESCs (n = 482) (Extended Data Fig. 4j,k)19. Interactions between RA-induced genes were lost in the absence of cPRC1, including interactions with poised enhancers (Extended Data Fig. 4l,m). However, as with CKM–Mediator-dependent genes, this had minimal effect on gene induction (Fig. 4e,f and Extended Data Fig. 4n). In contrast, we identified 184 genes within the Polycomb interaction network that rely on CKM–Mediator for induction (Fig. 4f). Therefore, CKM–Mediator has an essential role in gene activation during differentiation, independent of cPRC1-mediated chromosomal interactions. Furthermore, cPRC1 does not poise genes for activation during differentiation, despite its role in enabling interactions between gene promoters and other regulatory elements in ESCs.

CKM–Mediator primes genes by enabling core Mediator binding

CKM–Mediator is essential for enabling cPRC1 to create interactions between Polycomb domain-associated gene regulatory elements, but these interactions are dispensable for gene induction during differentiation. In the absence of a pre-formed interaction mechanism for priming, we hypothesized that CKM–Mediator may prime genes for activation during differentiation by more directly influencing the function of the core Mediator71,72,73. To investigate this possibility, we engineered an epitope tag into the endogenous Med14 gene, which is a structural subunit of the core Mediator (Extended Data Fig. 5a-c). Addition of the epitope tag did not interfere with CKM–Mediator complex formation (Extended Data Fig. 5d) and, therefore, enabled us to carry out ChIP–seq analysis and examine core Mediator occupancy in ESCs and during differentiation. Despite high levels of CDK8 binding at the promoters of CKM–Mediator-dependent genes (Fig. 5) and, more broadly, over Polycomb domains in ESCs (Extended Data Fig. 5e), the occupancy of MED14 at these sites was much lower than at active sites (Extended Data Fig. 5e). This suggests that, although the CKM–Mediator can bind to inactive developmental gene promoters, binding of the core Mediator may be more dynamic at these sites. Furthermore, it raised the interesting possibility that the mechanism of core Mediator binding and its stability at activated sites could change during the process of gene activation, so that it enters into a state that relies less on the CKM for engagement, as has been suggested previously58.

Fig. 5: CKM–Mediator enables gene induction via recruitment of the Mediator complex.

a, A genomic snapshot of two CKM–Mediator-dependent genes, showing CDK8 and T7-MED14 ChIP–seq and cnRNA-seq in WT (+) and CKM–MED KO (-) ESCs (top) and following RA induction (bottom). b, Heatmaps showing CDK8 and T7-MED14 ChIP–seq signal at promoters (TSS±2.5 kb) of CKM–MED-dependent genes in ESCs and following RA induction (n = 631). T7-MED14 signals are shown for WT and CKM–Mediator KO RA-induced cells. Genes are sorted by decreasing T7-MED14 signal in RA-treated cells. Metaplots showing read density are shown on the top of each heatmap.

Based on these observations, we were keen to examine core Mediator association with these sites during differentiation. During the transition to an active state, promoters of CKM–Mediator-dependent genes showed reduced levels of CDK8 binding, and they accumulated more MED14 (Fig. 5 and Extended Data Fig. 5f,g). We then asked whether the CKM module was required for this increased association of the core Mediator during differentiation12,13,58. Indeed, following RA induction, promoters of CKM–Mediator-dependent genes did not acquire more MED14 in the absence of the CKM module (Fig. 5a,b and Extended Data Fig. 5g), consistent with these genes failing to induce appropriately (Fig. 4). Therefore, we propose that the CKM module primes genes for induction, not by pre-forming 3D gene regulatory interactions through the Polycomb system, but instead by enabling efficient engagement of the core Mediator at target gene promoters to support transcription activation during differentiation.

Discussion

Defining the extent to which interactions between gene regulatory elements are required for controlling gene expression has been challenging. This is because many of the proteins and complexes that are proposed to enable these interactions are also known to have direct roles in controlling transcription at gene promoters. Here, we show that CKM–Mediator contributes very little to 3D genome organization in ESCs but is specifically required for interactions between Polycomb-bound gene regulatory elements (Fig. 1). These interactions do not rely directly on a CKM–Mediator-based bridging mechanism (Fig. 3), but instead CKM–Mediator controls binding of the cPRC1 complex (Fig. 2) to enable interactions between Polycomb domains (Fig. 3). By removing cPRC1, we specifically disrupt these interactions and reveal that CKM–Mediator is still able to prime genes for activation during differentiation (Fig. 4) through supporting recruitment of the core Mediator to gene promoters (Fig. 5). Therefore, CKM–Mediator primes genes for activation during differentiation by supporting recruitment of the core Mediator.

Physical interactions between gene regulatory elements are thought to enable gene expression32,44,74,75. In line with this concept, it has been proposed that, through the function of Polycomb and/or CKM–Mediator complexes, pre-formed interactions that tether silent developmental genes and other regulatory elements in stem cells may render genes poised or primed for activation during differentiation32,33,34,44,76,77. Here, we demonstrate that pre-formed interactions between gene regulatory elements co-occupied by CKM–Mediator and cPRC1 rely on cPRC1, and that the binding of cPRC1 is regulated by CKM–Mediator. Although the precise mechanisms through which CKM–Mediator facilitates cPRC1 binding to create interactions remain an open question for further study, this realization allowed us to create a separation-of-function scenario whereby we could disrupt pre-formed interactions by removing cPRC1 yet leave CKM–Mediator intact. Importantly, in the context of these experiments, we find no evidence to suggest that pre-formed regulatory interactions play a prominent role in priming genes for activation during differentiation. Consistent with these findings, studies have shown that cPRC1 does not contribute to gene regulation during embryoid body formation in vitro18, and cPRC1-null mice develop normally until 8.5 dpc, by which point a host of key developmental gene expression transitions have already been completed78,79.

Instead, we find that the CKM–Mediator appears to have a more direct role in priming genes for induction during differentiation by ensuring appropriate association of the core Mediator complex during activation. This priming is likely to involve FBXL19, which physically interacts with CKM–Mediator and recruits CKM–Mediator to silent developmental gene promoters by binding to CpG-island DNA55. We speculate that pre-binding of CKM–Mediator might provide transcriptional activators with a localized pool of core Mediator that can be co-opted to support the timely induction of silent developmental genes during cellular differentiation. However, other related models could be envisaged that explain the mechanics of priming, including transcriptional activators evicting the CKM from pre-bound CKM–Mediator to enable transition of silent developmental genes into an activated state. Given the dynamic nature of these systems in vivo80,81, it is extremely difficult to distinguish between these related yet distinct biochemical models. In future work, kinetic experiments using rapid degron approaches may help to resolve these points and also provide insight into how CKM–Mediator influences cPRC1 binding. However, consistent with the requirement for pre-binding of CKM–Mediator in priming genes for induction, removal of FBXL19 causes a reduction in CKM binding at silent developmental gene promoters and, similarly to CKM–Mediator removal, renders them less competent for induction during differentiation55. Furthermore, mice deficient for CKM subunits display pre-implantation lethality, consistent with an essential role in early developmental gene expression transitions82,83,84. As such, CKM–Mediator appears to function to prime genes for induction through supporting core Mediator acquisition at gene promoters during gene induction, not through mechanisms that create pre-formed regulatory element interactions.

These new findings raise the important question of why CKM–Mediator regulates cPRC1 binding to create interactions between silent gene regulatory elements if this is not related to its role in priming genes for induction during differentiation. A hint as to why this might be important comes from genetic screens in Drosophila, in which the CKM–Mediator complex components MED12 and MED13 were identified as Polycomb group genes that enable the long-term maintenance of Hox gene repression85. In agreement with a potential repressive role for CKM–Mediator at Polycomb target genes, it was recently shown that the CDK8 component of the CKM–Mediator complex has important roles in maintaining X-chromosome inactivation in mice86 and that CDK8 absence leads to loss of Polycomb-mediated gene silencing86,87. Interestingly, in both of these scenarios, CKM–Mediator and Polycomb appear to maintain repression in more differentiated cells, whereas, in contrast, cPRC1 disruption has little effect on the maintenance of Polycomb target gene repression in ESCs18,59,88. As such, we envisage that the role that CKM–Mediator plays in regulating cPRC1 occupancy to create long-range interactions between silent regulatory elements may be particularly important in maintaining long-term gene repression in more differentiated cell types, yet contribute less to gene repression in rapidly dividing stem cells. This is consistent with the observation that cPRC1-deficient mice display inappropriate maintenance of Polycomb target gene repression and lethality in later embryonic stages78,79.

Based on its seemingly distinct roles in gene regulation, we propose that CKM–Mediator may play a ‘yin-and-yang’ role in controlling expression. We hypothesize that during early developmental stages CKM–Mediator associates with silent developmental gene promoters to support gene induction during differentiation by helping to enable core Mediator binding during the transition to an activated state. However, in the absence of an activation signal at later developmental stages, the distinct role of CKM–Mediator in enabling cPRC1 binding to create interactions with other silent Polycomb-occupied regulatory sites could predominate to help maintain long-term gene repression. As such, distinct CKM–Mediator functions could play important and complementary roles in supporting developmental gene regulation. In future work, it will be important to test these new models for CKM–Mediator function in appropriate mouse developmental model systems.

In summary, we show that CKM–Mediator is essential for regulating interactions between Polycomb domains. However, these interactions contribute little to gene activation during differentiation. Instead, we show that CKM–Mediator primes genes for induction during differentiation by supporting core Mediator binding to promoters during gene activation.

Methods

Cell culture

Mouse ESCs were cultured on gelatin-coated dishes (Sigma-Aldrich) in DMEM (Thermo Fisher Scientific) supplemented with 15% fetal bovine serum (BioSera), 2 mM L-glutamine, 0.5 mM beta-mercaptoethanol, 1× non-essential amino acids, 1× penicillin-streptomycin solution (Thermo Fisher Scientific) and 10 ng ml⁻¹ leukemia-inhibitory factor (produced in-house). Med13/13lfl/fl ERT2-Cre55 and Pcgf4−/−/Pcgf2fl/fl ERT2-Cre ESCs59 were treated with 800 nM 4-hydroxytamoxifen (Sigma-Aldrich) for 96 h and 72 h, respectively. For RA differentiation of ESCs, 4 × 10⁶ ESCs were allowed to attach to gelatinized 15 cm dishes for 6–8 h and treated with 1 µM all-trans retinoic acid (Sigma-Aldrich) in EC-10 medium (DMEM supplemented with 10% fetal bovine serum, L-glutamine, beta-mercaptoethanol, non-essential amino acids and penicillin-streptomycin) for 48 h. TOT2N E14 ESCs used for TetR targeting experiments were previously described69. To generate TetR-CDK8 TOT2N ES lines, TOT2N E14 ESCs were transfected using Lipofectamine 2000 (Thermo Fisher Scientific) following the manufacturer’s instructions. Stably transfected cells were selected for 10 days using 1 μg ml⁻¹ puromycin, and individual clones were isolated and expanded in the presence of 1 μg ml⁻¹ puromycin to maintain transgene expression. HEK293T cells, used for calibration of crosslinked cChIP–seq experiments, were cultured in EC-10 medium. All mammalian cell lines were cultured at 37 °C and 5% CO2. SG4 Drosophila cells, used for calibration of cnRNA-seq and native ChIP–seq experiments, were grown at 25 °C in Schneider’s medium (Thermo Fisher Scientific) supplemented with 10% heat-inactivated fetal bovine serum (BioSera) and penicillin-streptomycin. All cell lines generated and grown in the Klose Lab were routinely tested for mycoplasma infection.

Generation of the MED14-T7 Med13/13l fl/fl ESC line

To allow for efficient chromatin immunoprecipitation of MED14, we introduced an amino-terminal 3xT7-2xStrepII-FKBP12 tag to the endogenous Med14 gene. The tag was synthesized by GeneArt (Thermo Fisher Scientific). The targeting construct was generated by Gibson assembly (Gibson Assembly Master Mix kit, New England Biolabs) of the PCR-amplified tag sequence and roughly 520 bp homology arms surrounding the ATG start codon of the Med14 gene, amplified from mouse genomic DNA.

The pSpCas9(BB)-2A-Puro (PX459) V2.0 vector was obtained from Addgene (no. 62988) and the sgRNA was designed using the CRISPOR online tool (http://crispor.tefor.net/crispor.py). The targeting construct was designed such that the endogenous ATG sequence is absent, and the Cas9 recognition site is disrupted by the insertion of the tag. ESCs were transfected in a single well of a 6-well plate with 0.5 µg Cas9 guide plasmid and 2 µg targeting construct plasmid using Lipofectamine 3000 (Thermo Fisher Scientific) according to the manufacturer’s guidelines. The day after transfection, cells were passaged at a range of densities and subjected to puromycin selection (1 μg ml⁻¹) for 48 h. Approximately 7–10 days following transfection, individual clones were isolated, expanded and PCR-screened for the homozygous presence of the tag.

Preparation of nuclear extracts and Western blot analysis

Collected cells were resuspended in 10× pellet volume (PV) of Buffer A (10 mM Hepes pH 7.9, 1.5 mM MgCl2, 10 mM KCl, 0.5 mM DTT, 0.5 mM PMSF, cOmplete protease inhibitor cocktail (Roche)) and incubated for 10 min at 4 °C with slight agitation. After centrifugation, the cell pellet was resuspended in 3× PV Buffer A containing 0.1% NP-40 and incubated for 10 min at 4 °C with slight agitation. Nuclei were recovered by centrifugation and the soluble nuclear fraction was extracted for 1 h at 4 °C with slight agitation using 1× PV Buffer C (10 mM Hepes pH 7.9, 400 mM NaCl, 1.5 mM MgCl2, 26% glycerol, 0.2 mM EDTA, cOmplete protease inhibitor cocktail). Protein concentration was measured using the Bradford assay (BioRad).

Nuclear extract samples were mixed with 1× SDS loading buffer (2% SDS, 0.1 M Tris pH 6.8, 0.1 M DTT, 10% glycerol, 0.1% bromophenol blue) and placed at 95 °C for 5 min. Between 25 and 35 μg of nuclear extract was separated on home-made SDS-PAGE gels or NuPAGE 3–8% Tris-acetate gels (Life Technologies, for large Mediator subunits). Gels were blotted onto nitrocellulose membranes using the Trans-Blot Turbo transfer system (BioRad). Antibodies used for Western blot analysis were rabbit polyclonal anti-MED13L (A302-420A, Bethyl Laboratories), rabbit polyclonal anti-MED13 (GTX129674, GeneTex), rabbit monoclonal anti-CDK8 (ab229192, Abcam), rabbit polyclonal anti-CCNC (A301-989A, Bethyl Laboratories), rabbit polyclonal anti-MED1 (A300-793A, Bethyl Laboratories), rabbit polyclonal anti-MED15 (A302-422A, Bethyl Laboratories), rabbit polyclonal anti-MED23 (A300-425A, Bethyl Laboratories), rabbit polyclonal anti-MED17 (GTX115241, GeneTex), rabbit polyclonal anti-MED14 (A301-044A-T, Bethyl Laboratories), rabbit monoclonal anti-RING1B (5694, Cell Signaling), rabbit monoclonal anti-SUZ12 (3737, Cell Signaling), rabbit polyclonal anti-PCGF2 (sc-10744, Santa Cruz), rabbit monoclonal anti-T7-Tag (D9E1X, 13246, Cell Signaling), mouse monoclonal anti-TBP (ab818, Abcam), rabbit monoclonal anti-HDAC1 (ab109411, Abcam), and mouse monoclonal anti-Flag (F1804, Sigma). Images were analyzed using Image Studio v5.2 (LI-COR).

Co-immunoprecipitation of the CKM–Mediator complex

For purification of the CKM–Mediator complex from wild-type or tamoxifen-treated Med13/13lfl/fl ESCs, 600 µg of nuclear extract was diluted in BC150 buffer (50 mM Hepes pH 7.9, 150 mM KCl, 0.5 mM EDTA, 0.5 mM DTT, cOmplete protease inhibitor cocktail (Roche)). Samples were incubated with 5 µg CDK8 antibody (A302-500A, Bethyl Laboratories) and 25 units benzonase nuclease (Millipore) overnight at 4 °C. For purification of T7-MED14, 5 μl T7-Tag antibody (D9E1X, 13246, Cell Signaling) and 25 units benzonase nuclease were used. Protein A agarose beads (RepliGen) were blocked for 1 h at 4 °C in BC150 buffer containing 1% fish skin gelatin (Sigma) and 0.2 mg ml⁻¹ BSA (New England Biolabs). The blocked beads were added to the samples and incubated for 4 h at 4 °C. Four washes of 10 min each were performed using BC150 buffer containing 0.02% NP-40. The beads were resuspended in 2× SDS loading buffer and boiled for 5 min to elute the immunoprecipitated complexes.

Chromatin immunoprecipitation

Chromatin immunoprecipitation was performed as described previously55. In brief, 50 × 10⁶ ESCs were fixed for 45 min with 2 mM DSG (Thermo Fisher Scientific) in PBS followed by 12.5 min with 1% formaldehyde (methanol-free, Thermo Fisher Scientific). Reactions were quenched by the addition of glycine to a final concentration of 125 mM, and the fixed cells were washed in ice-cold PBS and snap frozen in liquid nitrogen. HEK293T cells (50 × 10⁶) were fixed as above, snap frozen in aliquots of 2 × 10⁶ cells and stored at −80 °C until further use.

For calibrated ChIP–seq, 2 × 10⁶ HEK293T cells were resuspended in 1 ml ice-cold lysis buffer (50 mM HEPES pH 7.9, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40, 0.25% Triton X-100) and added to 50 × 10⁶ fixed ESCs resuspended in 9 ml lysis buffer. The cell suspension was incubated for 10 min at 4 °C. The released nuclei were washed (10 mM Tris-HCl pH 8, 200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA) for 10 min at 4 °C. The chromatin pellet was resuspended in 1 ml of ice-cold sonication buffer (10 mM Tris-HCl pH 8, 100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 0.1% Na deoxycholate, 0.5% N-lauroylsarcosine) and sonicated for 25 cycles (30 s on and 30 s off) using a Bioruptor Pico sonicator (Diagenode), shearing genomic DNA to produce fragments between 300 bp and 1 kb. Following sonication, Triton X-100 was added to a final concentration of 1%. For each immunoprecipitation, 250 µg chromatin was diluted ten-fold in ChIP dilution buffer (1% Triton X-100, 1 mM EDTA, 20 mM Tris-HCl pH 8.0, 150 mM NaCl). Three reactions per treatment condition were set up to allow for maximal DNA recovery suitable for library preparation. Chromatin was pre-cleared with protein A Dynabeads (Thermo Fisher Scientific) that had been blocked with 0.2 mg ml⁻¹ BSA and 50 µg ml⁻¹ yeast tRNA, and was then incubated with the respective antibodies overnight at 4 °C. Antibody-bound chromatin was purified using blocked protein A Dynabeads for 3 h at 4 °C. ChIP washes were performed as described previously89. ChIP DNA was eluted in ChIP elution buffer (1% SDS, 100 mM NaHCO3) and reverse crosslinked overnight at 65 °C with 200 mM NaCl and RNase A (Sigma). The reverse crosslinked samples were treated with 20 μg ml⁻¹ proteinase K and purified using a ChIP DNA Clean and Concentrator kit (Zymo Research). The three reactions per treatment condition were pooled at this stage. For each sample, corresponding input DNA was also reverse crosslinked and purified.
The efficiency of the ChIP experiments was confirmed by quantitative PCR. Prior to library preparation, 5–10 ng ChIP material was diluted to 50 µl in TLE buffer (10 mM Tris-HCl pH 8.0, 0.1 mM EDTA) and sonicated with a Bioruptor Pico sonicator for 17 min (30 s on and 30 s off).

The antibodies used for ChIP–seq experiments were rabbit polyclonal anti-CDK8 (A302-500A, Bethyl Laboratories, 2.5 μl), rabbit monoclonal anti-RING1B (5694, Cell Signaling, 3 µl), rabbit polyclonal anti-PCGF2 (sc-10744, Santa Cruz, 3 µl), rabbit polyclonal anti-CBX7 (ab21873, Abcam, 4 µl) and rabbit monoclonal anti-T7-Tag (D9E1X) (13246, Cell Signaling, 3 µl). The antibodies used for ChIP-quantitative PCR for TetO targeting experiments were rabbit polyclonal anti-FS2 (produced in-house89, 33 µl), polyclonal anti-MED12 (A300-774A, Bethyl Laboratories, 3 µl), polyclonal anti-MED1 (A300-793A, Bethyl Laboratories, 3 µl), polyclonal anti-CCNC (A301-989A, Bethyl Laboratories, 3 µl) and rabbit polyclonal anti-FS2 (produced in-house, 5 µl).

Native chromatin immunoprecipitation

Native calibrated ChIP–seq for H3K27me3 was performed as described previously59,89. In brief, 50 × 10⁶ ESCs were mixed with 20 × 10⁶ SG4 Drosophila cells and washed with 1× PBS prior to chromatin isolation. Nuclei were released in ice-cold lysis buffer (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 3 mM MgCl2, 0.1% NP-40), washed and resuspended in 1 ml ice-cold digestion buffer (10 mM Tris-HCl pH 8.0, 10 mM NaCl, 3 mM MgCl2, 0.1% NP-40, 0.25 M sucrose, 3 mM CaCl2, 1× cOmplete protease inhibitor cocktail (Roche)). Chromatin was digested with 200 units MNase (Thermo Fisher Scientific) for 5 min at 37 °C, and the reaction was stopped by the addition of 4 mM EDTA pH 8.0. The samples were centrifuged at 1,500×g for 5 min at 4 °C, and the supernatant (S1) was retained. The remaining pellet was incubated with 300 μl of nucleosome release buffer (10 mM Tris-HCl pH 7.5, 10 mM NaCl, 0.2 mM EDTA, 1× protease inhibitor cocktail (Roche)) at 4 °C for 1 h, passed five times through a 27-gauge needle using a 1 ml syringe, and spun at 1,500×g for 5 min at 4 °C. The second supernatant (S2) was collected and combined with the corresponding S1 sample. Digestion to mostly mononucleosomes was confirmed on a 1.5% agarose gel. The prepared native chromatin was aliquoted, snap frozen in liquid nitrogen, and stored at −80 °C until further use. ChIPs were performed as described previously59, using 5 µl of H3K27me3 antibody prepared in-house.

Calibrated nuclear RNA-seq

Nuclear RNA sample preparation was performed using 20 × 10⁶ ESCs or RA-treated cells and 8 × 10⁶ SG4 Drosophila cells, as described previously59. RNA was isolated from purified nuclei using an RNeasy RNA extraction kit (Qiagen), and genomic DNA contamination was depleted using a TURBO DNA-free Kit (Thermo Fisher Scientific). The quality of RNA was assessed using a 2100 Bioanalyzer RNA 6000 Pico kit (Agilent). All cnRNA-seq experiments were performed in biological quadruplicates.

Library preparation and high-throughput sequencing

All cChIP–seq experiments were performed in biological triplicates. All cnRNA-seq experiments were performed in biological quadruplicates. Libraries for cChIP–seq and native cChIP–seq were prepared from 5–10 ng of ChIP and corresponding input DNA samples using a NEBNext Ultra II DNA Library Prep Kit for Illumina (New England Biolabs), following the manufacturer’s guidelines. For cnRNA-seq, RNA samples (800 ng) were depleted of ribosomal RNA using the NEBNext rRNA Depletion kit (New England Biolabs). RNA-seq libraries were prepared using the NEBNext Ultra Directional RNA Library Prep kit (New England Biolabs). Samples were indexed using NEBNext Multiplex Oligos (New England Biolabs). The average size and concentration of all libraries were analyzed using the 2100 Bioanalyzer High Sensitivity DNA Kit (Agilent) followed by qPCR using SensiMix SYBR (Bioline, UK) and KAPA Illumina DNA standards (Roche). Libraries were sequenced as 40 bp paired-end reads on the Illumina NextSeq 500 platform.

Massively parallel sequencing, data processing and normalization

For cChIP–seq, paired-end reads were aligned to concatenated mouse and spike-in genomes (mm10 + hg19 for crosslinked cChIP–seq and mm10 + dm6 for native cChIP–seq) using Bowtie 2 (ref. 90) with the ‘--no-mixed’ and ‘--no-discordant’ options specified. Reads that were mapped more than once were discarded, followed by removal of PCR duplicates using Sambamba91.

For cnRNA-seq, paired-end reads were first aligned using Bowtie 2 (with ‘--very-fast’, ‘--no-mixed’ and ‘--no-discordant’ options) against the concatenated mm10 and dm6 rRNA genomic sequences (GenBank: BK000964.3 and M21017.1) to filter out reads mapping to ribosomal RNA gene fragments. All unmapped reads were then aligned against the concatenated mm10 and dm6 genome sequences using STAR92. To improve mapping of intronic sequences of nascent transcripts, which are abundant in nuclear RNA-seq, reads failing to map using STAR were aligned against the mm10 + dm6 concatenated genome using Bowtie 2 (with ‘--sensitive-local’, ‘--no-mixed’ and ‘--no-discordant’ options). PCR duplicates were removed using SAMtools93.
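The alignment settings described above can be illustrated by assembling the corresponding Bowtie 2 command lines. This is a minimal sketch and not the pipeline used in this study: the index prefixes, FASTQ file names and output paths are hypothetical placeholders, and the commands are only built, not executed. The flags themselves (`--no-mixed`, `--no-discordant`, `--very-fast`, `--sensitive-local`, `-x`, `-1`, `-2`, `-S`) are standard Bowtie 2 options.

```python
import shlex


def bowtie2_command(index_prefix, fastq_1, fastq_2, out_sam, extra_options=()):
    """Build (but do not run) a paired-end Bowtie 2 command line.

    Every call keeps only concordantly aligned pairs by passing
    --no-mixed and --no-discordant, as stated in the text.
    """
    cmd = ["bowtie2", "--no-mixed", "--no-discordant", *extra_options]
    cmd += ["-x", index_prefix, "-1", fastq_1, "-2", fastq_2, "-S", out_sam]
    return cmd


# Hypothetical file names, for illustration only.
chip_cmd = bowtie2_command(
    "mm10_hg19_index", "chip_R1.fastq.gz", "chip_R2.fastq.gz", "chip.sam"
)
rrna_cmd = bowtie2_command(
    "rRNA_index", "rna_R1.fastq.gz", "rna_R2.fastq.gz", "rrna.sam",
    extra_options=["--very-fast"],
)
rescue_cmd = bowtie2_command(
    "mm10_dm6_index", "unmapped_R1.fastq.gz", "unmapped_R2.fastq.gz",
    "rescue.sam", extra_options=["--sensitive-local"],
)

print(shlex.join(chip_cmd))
```

Building the argument list separately from execution makes it easy to log the exact command used for each alignment step.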

For visualization and annotation of genomic regions, internal normalization of cChIP–seq and cnRNA-seq experiments was performed as described previously59. In brief, mouse reads were randomly downsampled based on the spike-in ratio (hg19 or dm6) in each sample. To account for possible variation in spike-in cell mixing, the ratios of spike-in to mouse read counts in the corresponding ChIP inputs were used as correction factors for cChIP–seq replicates. MED14-T7 ChIP–seq was performed without spike-in normalization. Individual replicates were compared using the multiBamSummary and plotCorrelation functions from deepTools (version 3.1.1)94, confirming a high degree of correlation (Pearson’s correlation coefficient > 0.9). Replicates were pooled for downstream analysis. Genome-coverage tracks for visualization on the UCSC genome browser95 were generated using the pileup function from MACS2 (ref. 96) for ChIP–seq and genomeCoverageBed from BEDtools (v2.17.0) (ref. 97) for cnRNA-seq.
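One way to picture the spike-in-based downsampling is sketched below. The text states only that mouse reads were downsampled according to the spike-in ratio, with the input spike-in to mouse ratio used as a correction; the exact formula here (input ratio divided by ChIP spike-in read count, rescaled so that the largest factor is 1) is an assumption made for illustration, as are all function names, dictionary keys and read counts.

```python
def downsampling_factors(samples):
    """Return the fraction of mouse reads to keep for each sample.

    `samples` maps a sample name to a dict with three read counts:
      chip_spike  - spike-in reads in the ChIP sample
      input_spike - spike-in reads in the matched input
      input_mouse - mouse reads in the matched input
    Assumed formula: (input_spike / input_mouse) / chip_spike,
    rescaled so the largest factor equals 1.
    """
    raw = {}
    for name, counts in samples.items():
        input_ratio = counts["input_spike"] / counts["input_mouse"]
        raw[name] = input_ratio / counts["chip_spike"]
    largest = max(raw.values())
    return {name: value / largest for name, value in raw.items()}


# Hypothetical read counts: the treated ChIP contains twice as many
# spike-in reads, so its mouse reads are downsampled twice as strongly.
example = {
    "untreated": {"chip_spike": 1_000_000, "input_spike": 500_000,
                  "input_mouse": 10_000_000},
    "treated": {"chip_spike": 2_000_000, "input_spike": 500_000,
                "input_mouse": 10_000_000},
}
factors = downsampling_factors(example)
print(factors)
```

The resulting fractions could then be passed to a read subsampling tool; that step is omitted here.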

Read count quantification and analysis

Heatmap and metaplot analysis for ChIP–seq was performed using the computeMatrix, plotProfile and plotHeatmap functions from deepTools (v.3.1.1)94 to examine read density at Polycomb domains, CDK8 peaks or transcription start sites (TSSs) of CKM–MED-dependent genes. Intervals of interest were annotated with read counts from merged replicates, using a custom-made Perl script utilizing SAMtools (v1.7) (ref. 93). Polycomb domains were defined in ref. 59. CDK8 peaks were defined in ref. 55. H3K27ac peaks were defined in ref. 44.

For differential gene expression analysis, read counts were obtained from the non-normalized mm10 BAM files for a non-redundant mouse gene set, using a custom-made Perl script utilizing SAMtools (v1.7) (ref. 93). The non-redundant mouse gene set (n = 20,633) was obtained by filtering mm10 refGenes for very short genes with poor sequence mappability and highly similar transcripts. To identify significant changes in gene expression, a custom-made R script utilizing DESeq2 (ref. 98) was used. For spike-in normalization, read counts for the spike-in genome at a unique set of dm6 refGenes were supplied to calculate DESeq2 size factors which were then used for DESeq2 normalization of raw mm10 read counts, similarly to ref. 99. For a change to be considered significant, a threshold fold change of > 1.5 and adjusted P < 0.05 was applied.
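The principle behind spike-in-derived size factors can be illustrated with a small re-implementation of the median-of-ratios estimate that DESeq2 uses by default. This is a Python sketch for illustration only: the analysis itself was run with DESeq2 in R, the function names and counts below are hypothetical, and treating the fold-change threshold as symmetric (up or down by more than 1.5-fold) is an assumption.

```python
import math
from statistics import median


def size_factors(spike_counts):
    """Median-of-ratios size factors from a genes x samples count table.

    `spike_counts` is a list of rows (one per spike-in gene), each a list
    of raw counts (one per sample). Genes with a zero count in any sample
    are skipped because their geometric mean is zero.
    """
    n_samples = len(spike_counts[0])
    usable = [row for row in spike_counts if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def is_significant(log2_fold_change, adjusted_p, min_fold=1.5, alpha=0.05):
    """Apply the fold-change and adjusted P thresholds stated in the text."""
    return abs(log2_fold_change) > math.log2(min_fold) and adjusted_p < alpha


# Hypothetical dm6 counts: sample 2 has twice the spike-in signal of sample 1.
dm6_counts = [[100, 200], [50, 100], [10, 20], [0, 7]]
sf = size_factors(dm6_counts)

# Dividing raw mouse counts by the spike-in size factors puts the samples
# on a common, spike-in-calibrated scale.
mouse_counts = [300, 300]
calibrated = [c / f for c, f in zip(mouse_counts, sf)]
print(sf, calibrated)
```

In this toy example the same raw mouse count yields a calibrated value that is two-fold lower in the sample with twice as many spike-in reads.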

The distribution of log2 fold changes and normalized read counts at different genomic intervals was visualized using custom R scripts. For boxplot analyses, boxes showing interquartile range (IQR) and whiskers extending by no more than 1.5 × IQR were used.
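The boxplot convention described above can be made concrete with a short sketch. It is written in Python for illustration, whereas the plots themselves were produced with custom R scripts; the function name and data are hypothetical, and the quartile method used here may differ slightly from the hinges used by R's boxplot function.

```python
from statistics import quantiles


def box_stats(values):
    """Quartiles and whisker ends for a box-and-whisker plot.

    Whiskers reach the most extreme data points that lie within
    1.5 x IQR of the box; anything beyond is treated as an outlier.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "lower_whisker": min(v for v in values if v >= low_fence),
        "upper_whisker": max(v for v in values if v <= high_fence),
        "outliers": [v for v in values if v < low_fence or v > high_fence],
    }


# Hypothetical values with a single extreme point.
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(stats)
```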

Hi-C library preparation and analysis

In situ Hi-C in Med13/13lfl/fl ESCs was performed in biological duplicates as described in ref. 100. Hi-C libraries were sequenced on the Illumina NextSeq 500 platform as 51 bp or 40 bp paired-end reads. Hi-C sequencing data were mapped to GRCm38.p6 and processed with HiC-Pro 2.9 (ref. 101). Further data analysis was performed with GENOVA (http://www.github.com/deWitLab/GENOVA)102.

TAD and loop coordinates of mouse ESC samples were taken from ref. 25. Aggregate peak analysis (APA) and aggregate TAD analysis (ATA) were performed on 10 kb ICE-normalized matrices with default parameters. Paired-end spatial chromatin analysis (PE-SCAn) between the 100 kb regions surrounding RING1B peaks in Polycomb domains was also performed on these matrices. Super-enhancer coordinates for GRCm38.p6 were downloaded from dbSUPER103. PE-SCAn between the 1 Mb regions surrounding super-enhancers was performed using 20 kb ICE-normalized matrices, setting the top and bottom 5% values as outliers.

Capture-C extraction protocol

Chromatin was extracted and fixed as described previously104. In brief, 10 × 10⁶ mouse ESCs were trypsinized, collected in 50 ml Falcon tubes in 9.3 ml medium, and crosslinked with 1.25 ml 16% formaldehyde (1.89% final concentration; methanol-free, Thermo Fisher Scientific) while rotating for 10 min at 25 °C. Cells were quenched with 1.5 ml 1 M cold glycine, washed with cold PBS and lysed for 20 min at 4 °C in lysis buffer (10 mM Tris pH 8, 10 mM NaCl, 0.2% NP-40, supplemented with cOmplete protease inhibitors (Roche)) prior to snap freezing in 1 ml lysis buffer on dry ice. Fixed chromatin was stored at −80 °C.
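The final formaldehyde concentration quoted above follows from a simple dilution calculation, sketched here. The function name is hypothetical, and the calculation assumes that the only volumes present during crosslinking are the 9.3 ml of medium and the 1.25 ml of 16% stock.

```python
def final_percent(stock_percent, stock_volume_ml, sample_volume_ml):
    """Final concentration after adding a stock solution to a sample."""
    total_volume_ml = stock_volume_ml + sample_volume_ml
    return stock_percent * stock_volume_ml / total_volume_ml


# 1.25 ml of 16% formaldehyde added to 9.3 ml of cell suspension.
crosslink_percent = final_percent(16.0, 1.25, 9.3)
print(f"{crosslink_percent:.3f}% formaldehyde")
```

The result is just under 1.9%, in line with the value given in the text.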

Capture-C library construction protocol

Capture-C libraries were prepared as described previously105. In brief, lysates were thawed on ice, pelleted and resuspended in 650 µl 1× DpnII buffer (New England Biolabs). Three 1.5 ml tubes with 200 µl lysate each were treated in parallel with 0.28% final concentration of SDS (Thermo Fisher Scientific) for 1 h at 37 °C in a thermomixer shaking at 500 r.p.m. (30 s on/off). Reactions were quenched with 1.67% final concentration of Triton X-100 for 1 h at 37 °C in a thermomixer shaking at 500 r.p.m. (30 s on/off) and digested for 24 h with 3 × 10 µl DpnII (produced in-house) at 37 °C in a thermomixer shaking at 500 r.p.m. (30 s on/off). An aliquot from each reaction (100 µl) was taken for use as the digestion control, reverse crosslinked and visualized on an agarose gel. The remaining chromatin was then independently ligated with 8 µl T4 ligase (240 units, Thermo Fisher Scientific) in a volume of 1,440 µl for 20 h at 16 °C. The nuclei containing ligated chromatin were pelleted to remove any non-nuclear chromatin and reverse crosslinked, and the ligated DNA was phenol-chloroform purified. The sample was resuspended in 300 µl water and sonicated for 13 cycles (30 s on/off) using a Bioruptor Pico (Diagenode) to achieve a fragment size of approximately 200 bp. Fragments were size-selected using AMPure XP beads (Beckman Coulter) and a 0.85×/0.4× selection ratio. Two reactions of 1–5 µg DNA each were adapter-ligated and indexed using the NEBNext Ultra II DNA Library Prep Kit for Illumina (New England Biolabs) and NEBNext Multiplex Oligos for Illumina Primer sets 1 and 2 (New England Biolabs). The libraries were amplified with seven PCR cycles using the Herculase II Fusion Polymerase kit (Agilent).
Libraries were hybridized in the following way: for each promoter-containing DpnII restriction fragment, we designed two 70 bp capture probes using the CapSequm online tool (http://apps.molbiol.ox.ac.uk/CaptureC/cgi-bin/CapSequm.cgi) with the following filtering parameters: duplicates, < 2; density, < 30; SRepeatLength, < 30; duplication, FALSE. For promoters for which no probes could be designed for the restriction fragment directly overlapping the TSS, probes were designed for the next-nearest DpnII fragment, if it was within 500 bp of the TSS. The probes were pooled at 2.9 nM each, and the samples were multiplexed en masse prior to hybridization (2 µg each, according to Qubit dsDNA BR Assay, Invitrogen). Hybridization was carried out using the Nimblegen SeqCap system (Roche, Nimblegen SeqCap EZ HE-oligo kit A no. 06777287001, Nimblegen SeqCap EZ HE-oligo kit B no. 06777317001, Nimblegen SeqCap EZ Accessory kit v2 no. 07145594001, Nimblegen SeqCap EZ Hybridization and wash kit no. 05634261001), according to the Roche protocol, for 72 h followed by a 24 h hybridization (double capture). Captured libraries were quantified by qPCR using SensiMix SYBR (Bioline) and KAPA Illumina DNA standards (Roche) and sequenced on an Illumina NextSeq 500 as 40 bp paired-end reads. Libraries for Capture-C in Med13/13lfl/fl and Pcgf4−/−/Pcgf2fl/fl ESCs were generated using biological triplicates (Capture set1) or biological duplicates (Capture set2, as control for captures in the TetR-fusion lines). Libraries for Capture-C in the TetR-fusion lines were generated in biological triplicates.

Capture-C analysis

Paired-end reads were aligned to mm10 (or mm10 + BAC insert for TetR-fusion cell lines) and filtered for Hi-C artifacts using HiCUP106 and Bowtie 2 (ref. 90), with the fragment filter set to 100–800 bp. Counts of reads aligning to captured gene promoters and interaction scores (=significant interactions) were then called by CHiCAGO107.

For visualization of Capture-C data, weighted, pooled read counts from CHiCAGO data files were normalized to the total read count aligning to captured gene promoters in the sample (cov) and then to the number of promoters in the respective capture experiment (nprom), and multiplied by a constant to simplify genome browser visualization, using the following formula: normCounts=1/cov*nprom*100000. Bigwig files were generated from these normalized read counts.
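The normalization formula can be sketched as a per-sample scaling factor applied to each pooled read count. Reading the printed expression this way, with 1/cov*nprom*100000 evaluated left to right and then multiplied by the count, is an assumption, as are the function names and the numbers in the example.

```python
def scaling_factor(cov, nprom, constant=100_000):
    """Per-sample factor 1/cov * nprom * constant, evaluated left to right.

    cov   - total read count aligning to captured promoters in the sample
    nprom - number of promoters in the capture experiment
    """
    return 1.0 / cov * nprom * constant


def normalize_counts(pooled_counts, cov, nprom):
    """Scale weighted, pooled read counts for genome browser display."""
    factor = scaling_factor(cov, nprom)
    return [count * factor for count in pooled_counts]


# Hypothetical sample: 2 million promoter-aligned reads, 100 captured promoters.
norm = normalize_counts([2, 10], cov=2_000_000, nprom=100)
print(norm)
```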

For comparative boxplot analysis, we first determined all interactions between promoters and a given set of intervals (that is, Polycomb domains) using a CHiCAGO score of ≥5 as a cutoff. Next, for each promoter-interval interaction, we quantified the sum of normalized read counts or CHiCAGO scores across all DpnII fragments overlapping this interval. This number was then divided by the total number of interval-overlapping DpnII fragments to obtain mean normalized read counts and scores. For boxplot analyses, boxes show the IQR and whiskers extend to the most extreme data point that is no more than 1.5 × IQR from the box.
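The per-interval averaging described above can be sketched as follows. The record layout, function name and example values are hypothetical, and the rule used to decide that a promoter-interval pair interacts (at least one overlapping DpnII fragment with a CHiCAGO score of 5 or more) is an assumption about how the cutoff was applied.

```python
from collections import defaultdict


def interval_means(fragments, score_cutoff=5.0):
    """Mean normalized count and mean score per promoter-interval pair.

    `fragments` holds one record per DpnII fragment overlapping an
    interval, with keys: promoter, interval, norm_count, score.
    """
    grouped = defaultdict(list)
    for frag in fragments:
        grouped[(frag["promoter"], frag["interval"])].append(frag)

    results = {}
    for pair, frags in grouped.items():
        # Assumed rule: keep the pair if any fragment passes the cutoff.
        if not any(f["score"] >= score_cutoff for f in frags):
            continue
        n_fragments = len(frags)
        results[pair] = {
            "mean_count": sum(f["norm_count"] for f in frags) / n_fragments,
            "mean_score": sum(f["score"] for f in frags) / n_fragments,
        }
    return results


# Hypothetical fragments for two promoter-interval pairs.
example_fragments = [
    {"promoter": "geneA", "interval": "domain1", "norm_count": 4.0, "score": 6.0},
    {"promoter": "geneA", "interval": "domain1", "norm_count": 6.0, "score": 2.0},
    {"promoter": "geneB", "interval": "domain2", "norm_count": 3.0, "score": 1.0},
]
means = interval_means(example_fragments)
print(means)
```

The resulting per-pair means are the values that would then be compared between conditions in the boxplots.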

Statistics and reproducibility

Details of the individual statistical analyses and tests, as well as the number of biological replicates, can be found in the respective figure legends and in the detailed methods description. No statistical method was used to predetermine sample size. No data were excluded from the analyses. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.