A Mediator-cohesin axis controls heterochromatin domain formation

The genome consists of regions of transcriptionally active euchromatin and more silent heterochromatin. We reveal that the formation of heterochromatin domains requires cohesin turnover on DNA. Stabilization of cohesin on DNA through depletion of its release factor WAPL leads to a near-complete loss of heterochromatin domains. We observe the opposite phenotype in cells deficient for subunits of the Mediator-CDK module, with an almost binary partition of the genome into dense H3K9me3 domains, and regions devoid of H3K9me3 spanning the rest of the genome. We suggest that the Mediator-CDK module might contribute to gene expression by limiting the formation of dense heterochromatin domains. WAPL deficiency prevents the formation of heterochromatin domains, and allows for gene expression even in the absence of the Mediator-CDK subunit MED12. We propose that cohesin and Mediator affect heterochromatin in different ways to enable the correct distribution of epigenetic marks, and thus to ensure proper gene expression. The link between 3D genome architecture and gene expression is still far from resolved. Here the authors show that loss of the CDK catalytic subunit of the Mediator complex results in heterochromatic silencing, which can be rescued by stabilization of cohesin on chromatin.

W ithin the nucleus active and inactive genomic regions segregate into euchromatin and heterochromatin. The latter is molecularly defined by the post-translational methylation of the ninth lysine of histone H3 (H3K9me) 1,2 . This histone modification creates a binding site for heterochromatin protein 1 (HP1). Recent results suggest that heterochromatin forms a phase-separated compartment inside the nucleus 3 . Heterochromatin has the propensity to self-amplify through the interaction of HP1 with the histone methyltransferase Suv39h, which methylates H3K9 and creates binding site for HP1 4 . How this amplification is kept in check remains unknown.
The distribution of heterochromatic histone marks shows a remarkable correlation with many genomic features. Examples include late replication 5 , lamina association 6 , and inactive compartments. Chromosomes are organized into multi-megabase domains called A and B compartments into which active and inactive chromatin is physically separated 7,8 . Cohesin-dependent chromatin loops counteract compartmentalization 9,10 , and loss of cohesin is associated with an increase in compartmentalization 9,11 .
The 3D genome has been heralded as an important factor in the regulation of gene expression. However, the total ablation of cohesin-dependent loops shows only mild overall effects on gene expression in cancer cell lines 12,13 . On the other hand, the loss of individual CTCF sites, through mutation or DNA methylation, can have severe pathological consequences in limb development 14 and cancer progression 15 . Changes in the 3D genome may therefore have unexpected and cell-type-specific phenotypic effects.
In this work, we explore the role of chromatin loop formation and compartmentalization in the control of the epigenome and the expression of genes. We reveal a surprising link between the three-dimensional organization of the genome and the linear distribution of chromatin modifications. Stabilization of cohesin on chromatin, besides reducing compartmentalization, also counteracts heterochromatin domain formation. Loss of the Mediator complex CDK-module subunits MED12 and CCNC on the other hand increases both compartmentalization and heterochromatin domain formation. We conclude that a Mediatorcohesin axis controls both the spatial and the linear epigenome.

Results and discussion
We have previously shown that the stabilization of cohesin on chromatin has a major effect on the 3D genome. Loss of the cohesin-release factor WAPL leads to a genome-wide increase in the length and number of CTCF-anchored loops 9,16 . This in turn leads to a severe decrease in compartmentalization (Fig. 1A). We wondered whether the change in the 3D genome was associated with a change in the epigenome. To this end, we mapped core histone modifications H3K4me1 and H3K9me3 in wild-type and WAPL knock-out (ΔWAPL) HAP1 cells. In wild-type cells, H3K9me3 shows broad domain-like distributions that overlap almost exclusively with B compartments (Supplementary Fig. 1a).
In A compartments, H3K9me3 shows a more focal distribution, covering transposable elements ( Supplementary Fig. 1b). In ΔWAPL cells, we see a near-complete loss of H3K9me3 domains ( Fig. 1B-D) and a doubling of H3K9me3 peaks (Fig. 1E). We also see a marked increase of H3K4me1 peaks in genomic regions that are H3K9me3 domains in wild-type cells (3414 vs. 8310, Fig. 1F). WAPL-mediated cohesin-release apparently maintains a balance between eu-and heterochromatin domains.
As ΔWAPL cells almost completely lack heterochromatin domains, we reasoned that this could be an interesting setting to identify factors that restrict the amplification of heterochromatin domains. While wild-type cells presumably are dependent on such suppressors of heterochromatin, these factors may no longer be needed in ΔWAPL cells. If so, loss of suppressors of heterochromatin should be detrimental to wild-type cells, but not to ΔWAPL cells. To this end we revisited a haploid genetic screen we  integrations with respect to a protein coding gene result in truncating deletion. Anti-sense integrations are largely inconsequential and serve as a control for accessibility of the gene. Right panel shows percentage of sense integrations (over a total of sense and anti-sense) shown for WT and ΔWAPL cells for the Mediator subunits Cyclin-C (CCNC) and MED12. Boxplot shows the result of n = 3 haploid genetic gene trap assays for WT and ΔWAPL cells. Boxes indicate the interquartile range (IQR) of the data (25-75%) and box center line indicates the median. Whiskers extend to the minimum or maximum value that lies no further than 1.5 times the IQR from the bottom or top of the box, respectively. B Western blot analysis confirming knock-out of MED12 and Cyclin-C. HSP90 and actin represent loading controls for MED12 and Cyclin-C panels directly above. Representative images are shown from experiments that have been performed twice. C ChromHMM analysis segmenting the genome in 15 chromatin states shown for WT and MED12 cells. Table shows  performed in wild-type and ΔWAPL cells 9 . The screen is based on a viral gene trap assay that knocks out a gene when it lands in the sense orientation, whereas anti-sense integration leaves the gene unaffected ( Fig. 2A). A decrease in the ratio of sense over antisense integrations is an indication that the gene is required for cell viability 17 . We found that disruption of genes encoding two subunits of the CDK-module of the Mediator complex, MED12 and Cyclin C (CCNC), affected the proliferation of wild-type cells, but this growth defect was mitigated in ΔWAPL cells ( Fig. 2A).
To study the role of MED12 and Cyclin C, we generated knockout cells using CRISPR-Cas9 genome editing. Western blot analysis (Fig. 2B), Sanger sequencing, and Targeted Locus Amplification (TLA) 18 confirmed the perturbation of the genes (Supplementary Fig. 2A, B).
As MED12 and Cyclin C both are part of the same module of the Mediator complex 19 , loss of either of these factors presumably affects the same cellular pathway. To characterize the effect of mutating this module, we charted the epigenomic landscape by performing ChIPseq of 5 histone modifications 20 : H3K4me3, H3K4me1, H3K36me3, H3K27me3 and H3K9me3 in wild-type and ΔMED12 (Supplementary Fig. 2c). We performed ChromHMM analysis 21 to segment the genome into 15 different chromatin states. The most obvious difference is that loss of MED12 leads to a more than three-fold increase in the heterochromatin state (Fig. 2C). When we inspected the genomewide coverage of H3K9me3 domains in ΔMED12 cells, we saw only a mild increase (Fig. 2D). Surprisingly, we even saw a 5.5fold decrease in the amount of H3K9me3 peaks (27403 vs. 4976, Fig. 2E, Supplementary Fig. 1C). Upon further inspection of the ChIPseq track we found that for both for ΔMED12 and ΔCCNC there was a marked increase in the H3K9me3 signal at wild-type H3K9me3 domains (Fig. 2F), leading to more pronounced heterochromatin domains. The H3K4me1 signal, on the other hand, was diminished in heterochromatin domains in ΔMED12 and ΔCCNC cells (Fig. 2F, Supplementary Fig. 2d). Whole-genome analysis of the H3K9me3 domain signals corroborated these results (Fig. 2G). To further assess the role of MED12 and Cyclin C in heterochromatin distribution, we performed immunofluorescence (IF) analysis of H3K9me3 and HP1α. Whereas in wild-type cells a low level of H3K9me3 can be found throughout the entire nucleus, in ΔMED12 and ΔCNCC cells H3K9me3 levels are reduced to baseline levels in much of the nucleus, and seems to be accumulated at specific sites and the nuclear periphery (Fig. 2H). Quantification of the IF data shows a decrease in the median H3K9me3 signal strength confirming the redistribution of H3K9me3 in the nucleus ( Supplementary Fig. 2E). The IF quantification for HP1α showed a similar result ( Supplementary  Fig. 2f). To assess whether canonical heterochromatin proteins were deregulated we performed RNAseq analysis. We did not observe differential expression for genes encoding classical heterochromatin proteins such as HP1-isoforms ( Supplementary   Fig. 3A), nor could we pinpoint any other gene expression differences that provide an indirect explanation for changes in heterochromatinization.
MED12 has been suggested to be involved in the bridging of promoters to enhancers 22,23 , although this role of Mediator has been questioned in two recent reports 24,25 . To understand the role of MED12 in 3D genome organization in our system, we generated high-resolution in situ Hi-C maps for ΔMED12 cells. When we zoom in to a single chromosome arm, we see that there is a clear increase in compartmentalization in ΔMED12 compared to wild-type cells (Fig. 3A, Supplementary Fig. 4a). By quantifying the degree of compartmentalization (i.e. compartment strength 26 ), we find that compartmentalization increases for almost every chromosome arm (Fig. 3B, Supplementary Fig. 4b).
Upon studying the Hi-C matrices, we noted that the compartmentalization seemed to increase only for the B compartment. Comparison of wild-type and ΔMED12 cells overall indeed showed stronger compartment scores for the B compartment, whereas regions in the A-compartment remained largely unchanged (Fig. 3C). We devised a hidden Markov model to identify regions of increased compartmentalization. We refer to these regions as hypercompartmentalized domains (HCDs), which together cover 633 Mb (22.1% of the genome). As expected, HCDs showed considerable overlap with H3K9me3 domains (Fig. 3D). Collectively, these results would suggest that the Mediator-CDK-module somehow restricts heterochromatinization and prevents over-compartmentalization. Compartments can be subdivided into submegabase-sized topologically associating domains (TADs). These are formed by cohesin-mediated loop extrusion 10,12,27 . We analyzed what happens to TAD borders (i.e. genomic regions separating two TADs) that are located in HCDs. We find that TAD borders that become stronger (i.e. showing stronger segregation) are located closer to the boundaries of HCDs (Fig. 3E), consistent with the observed increase in compartmentalization. Conversely, TAD boundaries that decrease in strength are found closer to the center of HCDs (Fig. 3E). This suggests that TAD-border separation, which is controlled by cohesin and CTCF, can be disrupted in ΔMED12 cells. Figure 3F shows an example of an HCD where there is a clear loss of a TAD border inside an HCD. This region also reveals a loss of CTCF-anchored chromatin loops within the region. Note that when we calculate the aggregate signal of all chromatin loops identified in wild-type cells, we do not see a clear difference ( Supplementary Fig. 4c). Together these results indicate that MED12 deficient cells display less clear TAD boundaries and a loss of CTCF-anchored loops, but that both these phenotypes are particularly pronounced within heterochromatin domains.
To better understand the role of the CDK-module in 3D genome organization, we mapped the binding sites of the architectural proteins CTCF and SCC1 (also known as RAD21, a subunit of the cohesin complex) using ChIPseq in wild-type, ΔMED12 and ΔCCNC cells. The binding sites that were identified in wild-type were comparable to our previously published ChIPseq tracks for these factors (Supplementary Fig. 4e). Inspection of the binding tracks revealed that CTCF binding was diminished at numerous sites in HCDs (Fig. 4A). The identified binding sites in HCDs of ΔMED12 and ΔCCNC cells decreased in number compared to wild-type in HCDs, as well as in signal strength (Fig. 4B, C), while CTCF protein levels were unaffected ( Supplementary Fig. 4d). To link these changes to genome organization, we selected the CTCF/SCC1 binding sites that show the greatest relative decline in binding signal for CTCF and SCC1. By performing a pile-up of Hi-C signal on pairwise combinations of these sites, we find that wild-type cells display chromatin loops at these sites, and that these loops are ablated in ΔMED12 and ΔCCNC cells (Fig. 4D). The CTCF/SCC1 sites with the strongest decline are heavily skewed towards H3K9me3 domains (Fig. 4E). Notably, the vast majority of chromatin loops identified using HICCUPs 28 in wild-type do not colocalize with MED12-peaks ( Supplementary Fig. 4F).
Our genetic screening data showed that mutations in CDKmodule subunits were better tolerated in ΔWAPL cells. We therefore mutated MED12 in a ΔWAPL background to create double knock-out lines. When we perform immunofluorescence analysis of H3K9me3, we find that the pronounced heterochromatin domains of the ΔMED12 cells are absent in the ΔWAPL/ΔMED12 cells (Fig. 5A, Supplementary Fig. 5b). Consistent with the immunofluorescence results, we find by ChIPseq that H3K9me3 domains are strongly diminished (Fig. 5B, Supplementary Fig. 5c-f), and by Hi-C we find that the hypercompartmentalization seen in ΔMED12 is also completely rescued. These results indicate that the stabilization of chromatin loops by WAPL-ablation can counteract the increased compartmentalization and heterochromatinization that is conferred by loss of MED12.
An important remaining question is how these changes in the 3D genome and epigenome affect gene expression. To this end, we performed RNA-seq analysis on the wild-type, ΔWAPL, ΔMED12, and ΔWAPL/ΔMED12 cell lines. When we analyze the genes located in H3K9me3 domains, we find a large cluster of genes whose expression is downregulated in ΔMED12 cells, consistent with the increased heterochromatinization (Fig. 5C-E). However, when MED12 is knocked out in the ΔWAPL background, these genes are not downregulated. These results show that these genes are not dependent on MED12 per se for their expression, and would suggest that the 3D genome organization rather dictates the expression of these genes. MED12 therefore may contribute to the maintenance of a 3D genome organization that is permissible to gene expression.
Our functional analyses show that the 3D genome is intimately linked to the epigenome and gene expression. How then does loss of the Mediator-CDK-module subunits lead to such major changes in the epigenome, and in particular to the amplification of heterochromatin domains? Our analyses show that loss of this module results in reduced CTCF and cohesin binding, and correspondingly in a loss of chromatin loops. Conversely, we find that stabilization of such loops by the loss of WAPL results in the near-ablation of heterochromatin domains. We therefore suggest that chromatin loops counteract the formation of heterochromatin domains.
An explanation for the over-heterochromatinization upon the loss of loops may lie in the microphase separation model that has been proposed for the organization of the 3D genome into compartments 29 . Heterochromatin has the characteristics of a phase-separated compartment 3 and its constituent protein HP1 can phase-separated in vitro 30 . We propose that chromatin loops counteract the phase separation of heterochromatin components. On one hand, WAPL, by driving cohesin's release from DNA, allows the formation of heterochromatin domains. On the other hand, a fully intact Mediator-CDK-module counteracts the formation of heterochromatin domains (Fig. 6). It should be noted that in our analyses we have used clonal knock-out lines of MED12 and CCNC and therefore we cannot exclude that the observed effects occur through an, as of yet unidentified, indirect mechanism. Given the crucial role of the Mediator complex in gene expression, we cannot rule out that a regulator of heterochromatin is misexpressed. We have scrutinized our RNAseq data for indications that epigenetic modifiers are changed in gene expression, and found no obvious candidate. However, we cannot exclude the possibility of changes in the expression of an unknown regulator of heterochromatin domains.
While we cannot rule out an indirect regulatory mechanism, the Mediator-CDK-module may in fact be directly involved in regulating heterochromatin domains. Below we discuss two, not necessarily mutually exclusive mechanisms, for how the Mediator-CDK-module might restrain heterochromatin domain formation. One possibility is through the direct action of the Mediator-CDK-subunit CDK8, which has been shown to be a histone H3 kinase 31 . Phosphorylation of S10 of H3 blocks the binding of HP1 to H3K9me3 32,33 . Loss of MED12 may lead to a decrease in the activity of the module and hence a decrease in H3S10P, which in turn could allow for increased heterochromatin domain formation. Alternatively, MED12 may play a role in the formation and/or stabilization of loops, as has been suggested This would also explain the loss of heterochromatin domains in the absence of WAPL: a stabilization of loops would then counteract the phase-separation of heterochromatin domains, and hereby prevent their self-amplification. Indeed, highresolution DNA FISH experiments have shown that loss of WAPL results in increased interactions between neighboring domains 35 , consistent with our Hi-C data. This domain intermingling, particularly between heterochromatic (H3K9me3 high) and euchromatic (H3K9me3 low) domains may explain the redistribution of this epigenetic mark. An increase in heterochromatin density, as observed in the Mediator-CDK-module mutants, could in turn limit the binding of CTCF and cohesin, which then could allow for yet further heterochromatinization. The absence of a counterforce could therefore result in a shift in the epigenome, in which all heterochromatin that would normally reside at dispersed focal H3K9me3 peaks now ends up in monolithic heterochromatin domains. We propose that heterochromatin domains could thus act as a sink for heterochromatin components. By preventing the formation of such dense heterochromatin structures, MED12 might contribute to the maintenance of a 3D genome organization that is permissible to gene expression. As such we have uncovered a previously unappreciated link between nuclear organization and the epigenome, and we suggest that this inter-connected network could control gene regulation in an unexpected way.
Hi-C. We performed in-nucleus or in situ Hi-C 36-38 as described in ref. 9 . Hi-C libraries were sequenced on a HiSeq X. Hi-C sequencing data was processed with Hi-C-Pro 2.9 39 . We performed loop calling with HICCUPs 0.9 28 . Further data analysis was performed with GENOVA 40 (github.com/deWitLab/GENOVA), see below for details.
To ensure biological reproducibility, we analyzed different genetic clones for every knock-out line, and multiple replicates per clone. For ChIPseq samples were crosslinked for 10 min in 1% formaldehyde. For cohesin and CTCF ChIPs we used 15 milion cells and for the histone modifications 3 million cells. Crosslinking was stopped by adding Glycin (0.1 M end concentration). Cells were washed twice with PBS by centrifugation for 5 min at 1350 G. The cells were lysed in 10 ml lysis buffer 1 (50 mM HEPES pH7.5, 140 mM NaCl, 1 mM EDTA, 10% Glycerol, 0.5% NP40, 0.25% Triton X-100) at 4°C. Cells were centrifuged for 5 min at 1350 G and the cell pellet was incubated for 10 min in lysis buffer 2 (200 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 10 mM Tris-HCl). After centrifugation the pellet was resuspended in lysis buffer 3 (100 mM NaCl, 1 mM EDTA, 0.5 mM EGTA, 10 mM Tris-HCl ph8, 0.1% DOC, 0.5% N-Lauroyl sarcosine). To shear the DNA we used for the histone modifications a covaris ultrasonicator in covaris microtubes AFA Hi-C analysis. The compartment score was calculated as described in refs. 11,41 . Briefly, per chromosome arm we calculate an observed over expected (O/E) matrix with a resolution of 100 kb. We subtract 1 from the O/E matrix and extract the eigen vectors. The first eigen vector is multiplied by the square root of the eigen value of this vector resulting in the compartment score. As the compartment score is a dimensionless score, we use the H3K4me1 signal to correctly orient the compartment score. To calculate the compartment strength we selected the bins with top 20% and bottom 20% of the compartment score as A and B compartments, respectively. The compartment strength is then calculated as the log2 of product of the interactions between two A compartments (AA) and two B compartments (BB) divided by the square of the interactions between an A and a B compartment (AB): log 2 (AAxBB)/AB 2 .
To identify hyper-compartmentalized domains, a hidden Markov model (HMM) was used. The model was specified as having two hidden states, stable vs. differential compartment score, and two gaussian responses: the ΔMED12 compartment scores and the absolute difference in compartment scores between wild-type and ΔMED12 on the 40 kb resolution matrices. Parameters were optimized by the default EM algorithm of depmixS4 42 . This model was used to classify states on all chromosome arms separately with 40 kb resolution compartment scores as input. Afterwards, two filtering steps were applied to call HCDs: (i) we selected only regions with negative compartment scores in ΔMED12 Hi-C data and (ii) only regions with five consecutive differential states were called as HCDs.
Insulation scores were calculated as described in ref. 43 . We compared the insulation score of TAD borders called in wild-type 9 for wild-type and ΔMED12 lines and ordered the borders on the difference in the insulation score. We stratified the TAD borders located in an HCD into three equal-sized categories based on the difference in insulation score: weakened borders (higher insulation score), stronger borders (lower insulation score) and equal border (little to no difference in insulation score). We determined the position of the TAD borders in all three categories with respect to the edges of HCDs.
ChIPseq data analysis. Mapping of ChIPseq data was performed with bowtie 2.3.4.1 44 to hg19. We performed peak calling with MACS2 2.1.1 45 for SCC1, CTCF, H3K4me3, H3K4me1, and H3K36me3 with standard settings. For H3K9me3 we performed peak calling in advanced mode using the following settings -l 500 -g 65 -cutoff-analysis 2.25. ChromHMM was performed as described in ref. 21 . H3K9me3 domains were also called with MACS2 in advanced mode using the following settings -l 500 -g 65 -G 2000. Where indicated we filtered out H3K9me3 peaks that overlapped with H3K9me3 domains. ChIPseq alignment plots were created with deeptools 3.0.0.
RNAseq data analysis. RNAseq data were mapped with TopHat 2.1.1 46 and count-tables were generated with HTseq 47 with the stranded=reverse setting using the Gencode v27lift37 gene-build. TPMs were calculated with DESeq2 1.18.1 48 by dividing the counts by the normalization factors. Differential genes were called using DESeq2 using the Wald test. Profiles were generated with RPKM-normalized tracks in triplicate. Our RNAseq data offered the opportunity to check whether the changes in 3D genome organization and chromatin landscape were due to a change in the RNA levels of obvious chromatin regulatory or architectural proteins. No obvious genes or overarching functional categories were found enriched among upor downregulated genes ( Supplementary Fig. 3b, c). Furthermore, the expression of CTCF, RAD21, HP1-isoforms, and DNA-methyltransferases and -demethylases was unaltered (Supplementary Fig. 3a). We selected genes that overlapped with wild-type H3K9me3 domains. We performed k-means clustering on the genes with k = 2.
Immunofluorescence. Cells were grown on poly-l-lysine (Sigma-Aldrich) coated coverslips and samples were taken at 30% confluency. Non-chromatin bound proteins were removed using a pre-extraction procedure of 0.1% Triton X-100 in PBS for 1 min, followed by fixation using 3.7% formaldehyde in PBS for 7 min. Samples were blocked using 4% BSA in PBS for 1 h at room temperature and incubated with primary H3K9Me3 (ab8898, Abcam) or HP1alpha (Clone 15 19s2, Upstate/MilliporeSigma) antibody at a 1:1000 dilution overnight at 4°C. Samples were washed (0.1% Tween in PBS) and incubated using secondary antibody (either goat anti-Mouse Alexafluor 488 or 568 (ThermoFisher A-11029 and A-11004), and goat anti-Rabbit Alexafluor 488 or 568 (ThermoFisher A-11008 and A-11011), used at a 1:600 dilution) and DAPI (Sigma-Aldrich). Coverslips were mounted onto glass slides using Prolong Antifade Gold (Invitrogen). Representative images were acquired using a Leica Confocal microscope 63x/1.32 oil lens using LAS-AF Software (Leica), DAPI levels were adjusted for visualization purposes. Analysis of the intensity of H3K9me3 or HP1 in the DAPI region was performed using an inhouse written macro (ImageJ).
TLA. TLA was performed as previously described 18 , using the default restriction enzyme combination (i.e NlaIII as a first restriction enzyme and NspI as the second). We designed TLA primers that on the resistance marker sequence (i.e. Blasticidin or Puromycin) to amplify from the inserted cassette. Blasticidin primers were: forward: GCTAGTTCAAACCTTGGGAAAA, reverse: ATGAGCA-CAAAGCAGTCAGG. Puromycin primers were: forward: GCAACCTCCCCTTCTACGAG, reverse: AGGCCTTCCATCTGTTGCT. Sequencing libraries were generated using in-house produced Tn5 enzyme 49 and sequenced on HiSeq 2500 single end 65 nt.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
The data that support this study are available from the corresponding authors upon reasonable request. The RNAseq, ChIPseq and Hi-C data generated in this study have been deposited in the NCBI GEO database under accession code GSE125672. The public datasets used in this study are available in the GEO database under accession code GSE95015. Source data are provided with this paper.