The ubiquitin ligase Cullin-1 associates with chromatin and regulates transcription of specific c-MYC target genes

Transcription is regulated through a dynamic interplay of DNA-associated proteins, and the composition of gene-regulatory complexes is subject to continuous adjustments. Protein alterations include post-translational modifications and elimination of individual polypeptides. Spatially and temporally controlled protein removal is, therefore, essential for gene regulation and accounts for the short half-life of many transcription factors. The ubiquitin–proteasome system is responsible for site- and target-specific ubiquitination and protein degradation. Specificity of ubiquitination is conferred by ubiquitin ligases. Cullin-RING complexes, the largest family of ligases, require multi-unit assembly around one of seven cullin proteins. To investigate the direct role of cullins in ubiquitination of DNA-bound proteins and in gene regulation, we analyzed their subcellular locations and DNA affinities. We found CUL4A and CUL7 to be largely excluded from the nucleus, whereas CUL4B was primarily nuclear. CUL1, 2, 3, and 5 showed mixed cytosolic and nuclear expression. When analyzing chromatin affinity of individual cullins, we discovered that CUL1 preferentially associated with active promoter sequences and co-localized with 23% of all DNA-associated protein degradation sites. CUL1 co-distributed with c-MYC and specifically repressed nuclear-encoded mitochondrial and splicing-associated genes. These studies underscore the relevance of spatial control in chromatin-associated protein ubiquitination and define a novel role for CUL1 in gene repression.


Results
Expression of cullins. To better assess the relative contribution of the ubiquitously expressed cullin backbone proteins to cellular function, we analyzed the individual cullin transcripts in 62 different human tissues from The Protein Atlas 22 . CUL1 and CUL4B are the two most highly transcribed cullin genes in primary human tissues (Fig. 1A). When comparing 64 human cell lines, CUL1 and CUL3 are the two most highly expressed cullins (Fig. 1B). Both results indicate that CUL1 is a major cullin in primary and tissue culture cells.
Despite the presence of seven paralogs, cullins still have unique functions. For instance, CUL1 has been well studied as part of the SCF complex (SKP1, CDC53/Cullin, and F-box proteins) that regulates cell cycle progression and signaling 23,24 . Notwithstanding the fact that CUL1 and its most divergent paralog CUL7 feature a combined 53.4% amino acid sequence similarity and identity (Fig. 1C), both are essential for viability and cannot be rescued by the presence of any other cullin [25][26][27] . One explanation is that these gene paralogs inhabit exclusive subcellular locations.
To determine the sites of individual cullin activities, we analyzed their intracellular expression patterns. It is noteworthy that there have been conflicting reports regarding, for instance, the cytoplasmic versus nuclear location of the closely related homologs CUL4A and CUL4B, respectively 17,[28][29][30][31] . These discrepancies are likely caused by the use of antibodies that cannot differentiate between all cullins, especially when they were raised against protein regions with a high degree of homology. To avoid cross-reactivity of primary antibodies, we individually tagged the N-termini of the seven cullins with a 3xFLAG domain. In previous structural and functional studies, N-terminal tags did not interfere with cullin biology [32][33][34] . The 3xFLAG domain allows for high fidelity isolation and visualization of tagged proteins using biochemical assays, microscopy, or chromatin-immunoprecipitation. We optimized DNA transfection with plasmids encoding the tagged constructs for similar expression levels of all seven cullins in human HeLa cells (Fig. 2A and Suppl. Figure 1) and identified their subcellular locations using immunofluorescence detection. CUL1, 2, 3, and 5 were expressed in the cytoplasm as well as the nucleus. CUL4A and CUL7 were predominantly excluded from the nucleus, while CUL4B was entirely nuclear (Fig. 2B).

DNA association of cullins.
To identify ubiquitin ligases that catalyze the ubiquitination and possibly degradation of DNA-associated transcription factors, we first assessed how the seven cullins might directly affect gene regulation through chromatin association. We used Chromatin Immunoprecipitation (ChIP) to determine the DNA affinity of all seven cullins in HeLa cells. Unsurprisingly, we found that cytoplasmic CUL4A and CUL7 did not interact with DNA. When testing cullins with exclusive or partial nuclear expression, only CUL1 and CUL4B substantially associated with DNA (Fig. 3A). In particular, CUL1 displayed reproducible, genome-wide peaks by ChIP-sequencing (ChIP-seq) (Fig. 3B). These findings are confirmed by earlier work that showed nuclear expression of CUL1 in human cell lines 32 and chromatin association in S. cerevisiae 35 . CUL1 peaks were significantly enriched at promoter regions upstream of transcription start sites (Fig. 3C,D). Cullins do not possess DNA binding domains and CUL1 is likely indirectly tethered to chromatin. Such indirect binding increases the functional space CUL1 controls. Further, CUL1 can bridge substrate proteins and ubiquitinating enzymes over distances of 100 Å 14,36 . On the compacted DNA solenoid, this distance translates to a linear DNA length of approximately 3,000 bp 37 . Thus, to identify potential DNA regions under control of CUL1, we extended the CUL1 peaks by 3 kb in either direction. These CUL1 regions strongly overlapped with the active chromatin mark H3K27ac and excluded the repressive mark H3K27me3 (Fig. 3E).
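The peak extension described above can be sketched in a few lines. This is a minimal illustration, not the pipeline used in the study: the function names, the BED-style tuple layout, the optional `chrom_sizes` lookup, and the decision to merge extended regions that overlap are all assumptions made for the example.

```python
from typing import Dict, List, Optional, Tuple

# (chromosome, start, end) in BED-style 0-based, half-open coordinates.
Interval = Tuple[str, int, int]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals on the same chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


def extend_peaks(
    peaks: List[Interval],
    flank: int = 3000,
    chrom_sizes: Optional[Dict[str, int]] = None,
) -> List[Interval]:
    """Extend each peak by `flank` bp in both directions.

    Starts are clipped at 0; ends are clipped at the chromosome length
    when `chrom_sizes` is supplied. Merging the result is an assumption
    of this sketch, not a step stated in the text.
    """
    extended: List[Interval] = []
    for chrom, start, end in peaks:
        new_start = max(0, start - flank)
        new_end = end + flank
        if chrom_sizes is not None and chrom in chrom_sizes:
            new_end = min(new_end, chrom_sizes[chrom])
        extended.append((chrom, new_start, new_end))
    return merge_intervals(extended)


if __name__ == "__main__":
    # Hypothetical peaks, for illustration only.
    toy_peaks = [("chr1", 10000, 10200), ("chr1", 15000, 15300)]
    print(extend_peaks(toy_peaks))
```

In this toy case the two extended regions overlap and collapse into a single region; whether adjacent regions should be merged or kept separate depends on how overlaps are counted downstream.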
We previously defined DNA sites associated with high protein turnover by performing ChIP against ubiquitin. Our studies demonstrated that the addition of a proteasome inhibitor further increases the signal by enhancing degradation-prone ubiquitination in contrast to non-degradative ubiquitination 8 . Overlaying such a nuclear degradation map in HeLa cells with DNA regions under CUL1 control revealed a 59.17% overlap, suggesting that the majority of CUL1-associated sites are also sites of detectable protein degradation (Fig. 3F). Conversely, the CUL1-associated sites represent 23.16% of all genomic protein degradation sites in HeLa cells.
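The two percentages differ because they share a numerator (overlapping sites) but use different denominators (all CUL1 regions versus all degradation sites). A minimal sketch of this reciprocal calculation follows; the interval coordinates are invented for illustration and the helper names are hypothetical, not taken from the study's analysis code.

```python
from typing import List, Tuple

# (chromosome, start, end) in BED-style 0-based, half-open coordinates.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(query: List[Interval], reference: List[Interval]) -> float:
    """Fraction of query intervals that overlap at least one reference interval."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)


if __name__ == "__main__":
    # Toy data, not the published peak sets.
    cul1_regions = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
    degradation_sites = [
        ("chr1", 150, 250),
        ("chr1", 900, 950),
        ("chr2", 0, 60),
        ("chr2", 400, 500),
        ("chr3", 10, 20),
    ]
    # Share of CUL1 regions that are also degradation sites.
    print(fraction_overlapping(cul1_regions, degradation_sites))
    # Share of degradation sites that fall in CUL1 regions.
    print(fraction_overlapping(degradation_sites, cul1_regions))
```

The quadratic scan is adequate for a toy example; for genome-scale peak sets a sorted sweep or an interval tree would be the usual choice.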
To identify the potential degradation targets or protein networks controlled by CUL1 activity, we examined CUL1-associated areas for known DNA binding motifs. We found the E-Box, a hallmark motif of the c-MYC/MAX heterodimer, to be highly enriched within CUL1-associated sites (Fig. 4A) 38 . Further, 67.3% of CUL1 target genes showed c-MYC occupancy at their promoters, suggesting that genes controlled by c-MYC may also be regulated by CUL1 39 (Fig. 4B).
Proximal gene regulation by CUL1. To functionally assess how CUL1 affects the expression of potential target genes, we performed unbiased RNA-sequencing in control HeLa cells or cells in which CUL1 was stably knocked down. Probable CUL1 target genes can be divided into seven main gene ontologies based on CUL1 affinity close to the transcription start site (< 1 kb): cell cycle genes (146 genes, p < 5.7E−15 for gene ontology enrichment), genes involved in RNA splicing (74 genes, p < 2.5E−14), nuclear-encoded mitochondrial genes (210 genes, p < 2.2E−13), ribonucleoprotein complexes (75 genes, p < 3.4E−10), transcriptional regulators (379 genes, p < 1.3E−6), genes of the ubiquitin-proteasome pathway (123 genes, p < 1.6E−6), and genes involved in cilium biology (37 genes, p < 5.5E−4) 40,41 . Of these seven gene ontologies with CUL1 affinity, two subsets were significantly altered in their expression upon knockdown of CUL1: nuclear-encoded mitochondrial genes and genes encoding splicing factors were upregulated in CUL1-deficient HeLa cells (Fig. 4C). CUL1 has a prominent role in cell cycle regulation. However, little is known about its function in transcriptional or metabolic control 42 . To validate whether CUL1 depletion alters expression of nuclear-encoded mitochondrial and splicing-associated genes, we used shRNA to generate two independent CUL1-deficient HeLa cell lines (Fig. 5A and Suppl. Figure 2). We then specifically analyzed genes for which we had established close affinity of CUL1 and c-MYC to the promoter regions by ChIP-seq. CUL1 knockdown cells displayed a significant increase in mRNA transcripts of these splicing-associated target genes and nuclear-encoded mitochondrial genes compared to cells expressing the control shRNA vector (Figs. 5B, 6A). Given that c-MYC-addicted cancer cells depend upon the spliceosome and that c-MYC drives mitochondrial biogenesis 43 , these data suggest an antagonistic relationship between c-MYC and CUL1. We performed RT-qPCR on select splicing-associated target genes and nuclear-encoded mitochondrial genes as a function of CUL1 expression. Our studies confirmed the increased transcription of most target genes we tested in CUL1-deficient cells (Figs. 5C, 6B). Overexpression of CUL1 had the opposite effect and reduced expression of these target genes, suggesting the ubiquitin ligase has a repressor-like function on transcription from these c-MYC-associated gene promoters (Figs. 5D, 6C). Genome browser tracks show the close proximity of CUL1 affinity, c-MYC binding, and protein degradation at active (H3K27ac-positive) target promoters (Figs. 5E, 6D).

Figure 4 (caption fragment). Confirming the in silico motif enrichment, we also found significant overlap of CUL1 target genes with previously reported c-MYC target genes in HeLa cells (p < 2.06E−19, Chi-squared test with Yates' correction). (C) Nuclear-encoded mitochondrial genes (p < 6.67E−7, Wilcoxon rank-sum test) and splicing-associated genes (p < 4.20E−6) are significantly upregulated in CUL1-deficient HeLa cells, as shown in this box plot. RNP refers to ribonucleoprotein complex genes; UPS refers to ubiquitin-proteasome system genes. Asterisks denote statistical significance. Gene ontologies were defined with DAVID 41 .
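For readers unfamiliar with the Chi-squared test with Yates' correction used for the gene-set overlap, the following sketch computes it for a 2×2 contingency table using only the Python standard library. The function name and the example counts are hypothetical; the counts do not come from the study and will not reproduce the reported p-value.

```python
import math
from typing import Tuple


def chi2_yates_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Chi-squared test with Yates' continuity correction for a 2x2 table.

    Table layout (illustrative):
                        c-MYC target   not c-MYC target
        CUL1 target          a                b
        not CUL1 target      c                d

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a marginal total is zero; the test is undefined")
    # Yates' correction shrinks |ad - bc| by n/2, never below zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2.0)
    statistic = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    stat, p = chi2_yates_2x2(20, 10, 10, 20)
    print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

The closed form for the p-value holds only for one degree of freedom, which is the case for any 2×2 table.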
To further investigate how CUL1-regulated transcription of metabolic genes affects cellular function, we analyzed the mitochondrial oxygen consumption in cells with normal or reduced CUL1 expression. Basal respiration was increased by an average of 60% in cells in which CUL1 was knocked down (Fig. 7A). In addition to increased respiration, we found evidence for elevated mitochondrial stress in the absence of CUL1. The morphology of mitochondrial networks showed significantly enhanced levels of fusion, which is consistent with damaged mitochondria that are attempting to repair and restore metabolic function 44,45 (Fig. 7B,C). Overall, our results indicate that CUL1 is associated with the promoters of approximately 210 nuclear-encoded mitochondrial genes and a significant number of these genes are repressed by CUL1. De-repression increases mitochondrial activity, but also leads to morphological changes in mitochondria that are consistent with stress.

Discussion
We here identify a novel role of the ubiquitin ligase CUL1 as a transcriptional repressor. A substantial number of genes controlled by c-MYC also show promoter association with CUL1. The promoters of these genes feature distinct ubiquitin peaks upon proteasome inhibition, indicating high levels of protein turnover. Our data suggest that CUL1 directly represses a subset of these genes involved in mitochondrial biology and splicing.
CUL1 and c-MYC both show synergistic function in cancers and can act as oncogenes 46,47 . While this seemingly contradicts the antagonistic function between CUL1 and c-MYC we describe here, a key role of CUL1 is, notably, to promote cell cycle progression. CUL1 contributes to this progression through bulk degradation of cell cycle regulators, a process fundamentally different from the DNA site-selective ubiquitination of proteins observed in our study. Further, c-MYC increases CUL1 expression and the repressive function of the ubiquitin ligase may act as a partial negative feedback to limit some c-MYC target genes 46 . The involvement of metabolic genes is of particular interest, given that the synchronization of mitochondrial biogenesis with cell cycle regulation is an emerging field. More research will be necessary to consolidate the synergistic and antagonistic roles of c-MYC and CUL1 and to parse out how CUL1's transcriptional function correlates with the cell cycle status. The CUL1 knockdown cells in our study grew slightly slower than control cells, but showed no significant differences in cell cycle distribution (Suppl. Figure 3). Ubiquitination occurs in various forms and does not necessarily lead to the degradation of a protein. Our study does not directly address whether CUL1 engages in degradative or non-degradative ubiquitination at promoter sites. However, we found evidence for protein degradation at the majority of CUL1 target DNA (Fig. 3F). Protein turnover was examined by quantifying the levels of DNA-associated ubiquitination upon proteasome inhibition 8 . Such treatment leads to a massive redistribution of ubiquitin, shuttling the limiting amounts of this protein from non-degradative to degradative use. Further evidence that CUL1 is specifically engaged in protein degradation can be found in numerous publications 10,46,48 .
Interestingly, CUL1 has been described as a ubiquitin ligase that targets c-MYC for degradation through the substrate receptor FBXW7 49 . We have not found evidence of bulk changes in c-MYC protein levels after CUL1 knockdown or after the introduction of dominant-negative CUL1 (not shown). However, it is possible that c-MYC degradation by CUL1 occurs in a site-selective manner at specific promoters, in which case there may only be a negligible change to total c-MYC levels. We have previously observed such spatially selective degradation for other transcriptional regulators 8 . In summary, the identities of target proteins of chromatin-associated ubiquitination by CUL1 and their fates remain unsolved and are subjects of ongoing studies by our laboratory.
Previous reports on the subcellular locations of cullins were inconsistent, especially concerning the clinically relevant proteins CUL4A and CUL4B. Both cullins bind to cereblon 50 , a substrate-binding protein that triggers ubiquitination of the transcription factors IKZF1 and IKZF3 upon treatment with thalidomide or its derivatives. Degradation of IKZF1 and IKZF3 is therapeutically exploited in the treatment of hematological malignancies. This clinically relevant degradation through cereblon occurs in the nucleus 51 . Our results argue that this activity is mediated by CUL4B, not CUL4A. In support of our findings, an earlier report identified a nuclear localization sequence in CUL4B 29 . Except for CUL4A and CUL7, which are mostly excluded from the nucleus, all other cullins show some or specific expression in the nucleus. It is, therefore, possible that CUL2, 3, 4B, and 5 participate in the bulk ubiquitination of nuclear proteins and that CUL1 further engages in the site-selective ubiquitination of proteins at specific genomic regions.
Cullins represent the largest family of ubiquitin ligases. Here, we show a surprising variability in intracellular distribution of the seven cullins. Our data suggest that both CUL1 and CUL4B have the capacity to ubiquitinate DNA-bound proteins. In particular, CUL1 demonstrated the strongest association with chromatin and regulated the expression of genes that are under control of the transcription factor c-MYC. These results underscore that the specificity of ubiquitin ligases encompasses multiple dimensions: both the specificity for target proteins and the spatial specificity of where protein ubiquitination occurs are critical to the activity of the seven cullins. These features are of particular importance when it comes to targeting DNA-bound proteins, for which location dictates function.

Chromatin immunoprecipitation (ChIP).
ChIP experiments with the 3xFLAG tag were performed as previously published 8 . In short, HeLa cells were grown in T175 flasks and harvested at 90% confluency. Each flask contained approximately 5 million cells and at least 10 million cells were harvested for each experimental condition. 3F-Ubiquitin ChIP was performed with stably transduced HeLa cells 8 . 3F-Cullin ChIP was performed with HeLa cells that were transfected with Lipofectamine 2000 (Thermo Fisher, #11668019) 48 h prior to harvest with 25 µg of 3F-Cullin and 5 µg of a GFP spike-in control vector to validate consistent transfection efficiency across different cullin constructs at > 80%. For 3F-Ubiquitin ChIP, proteasome inhibition was performed for 3 h prior to ChIP with 25 µM lactacystin or 0.1% v/v DMSO control (Cayman Chemical, #70980). Cells were washed and fixed in 1% paraformaldehyde in PBS (Thermo Scientific, #28906) at room temperature for 10 min, followed by quenching with glycine. Cells were manually detached by scraping and washed prior to lysis. Five million cells were lysed per 5 mL dilution buffer (150 mM NaCl, 20 mM Tris pH 7.4, 2 mM EDTA) with the addition of Triton X-100 (1%, VWR, #IB07100), protease inhibitor cocktail (1%, Gendepot, #P3100-010), and RNase cocktail (1%, Thermo Fisher, #AM2288) for 10 min at 4 °C with constant mixing. Nuclei were isolated through centrifugation (350×g, 5 min, 4 °C) and immediately sonicated in dilution buffer containing 0.04% SDS, RNase, and protease inhibitor cocktail using a Bioruptor Pico water bath sonicator (Diagenode) at 4 °C. Shearing was optimized to yield DNA fragments of 200-500 bp. After removal of insoluble material through centrifugation, the nuclear lysate was aliquoted for input material or diluted to 0.01% SDS and immunoprecipitated overnight with monoclonal anti-FLAG M2 antibody produced in mouse (see above) and protein G beads (Thermo Fisher, #10003D) that were blocked with DNA-free BSA (Thermo Fisher, #15561020).
The following day, beads were washed twice with Tris-based buffer (see above) and eluted with 3xFLAG peptide for 15 min at room temperature (Sigma Aldrich, #F4799). Input and ChIP material was then de-crosslinked overnight at 65 °C in the presence of 5% proteinase K (Thermo Fisher, #AM2546). Finally, DNA was recovered with Qiagen's MinElute PCR kit (#28006). Size selection was performed prior to library preparation using AMPure XP beads (Beckman Coulter, #A63880).

Scientific Reports | (2020) 10:13942 | https://doi.org/10.1038/s41598-020-70610-0 | www.nature.com/scientificreports/
Next generation sequencing. The Genomic and RNA Profiling Core at Baylor College of Medicine performed next generation sequencing as previously described [54][55][56][57] . Libraries for ChIP-seq were synthesized and prepared for multiplexing according to New England BioLabs' protocol for Illumina sequencing (Ultra Next DNA library prep kit I and II, #E7370S and #E7645S). As indexing primers, we used NEBNext Multiplex oligos (#E7335S and #E7500S). Libraries for RNA-seq were synthesized and prepared for sequencing with the KAPA stranded RNA-seq kit with RiboErase (HMR) (Roche, #KK8483) with ERCC ExFold RNA spike-in mixes (Thermo Fisher, #4456739). Indexing primers for RNA-seq were custom-synthesized by IDT.
ChIP-seq: The Genomic and RNA Profiling Core first conducted sample quality checks using the NanoDrop spectrophotometer and Agilent Bioanalyzer 2100 (High Sensitivity DNA Chip, #5067-4626). To quantitate the adapter ligated library and confirm successful P5 and P7 adapter incorporations, we used the Applied Biosystems ViiA7 real-time PCR system and a KAPA Illumina/universal library quantification kit (#KK4824). We then sequenced the libraries on the NextSeq 500 system using the high output v2.5 flowcell.
Library quantification by qPCR and Bioanalyzer: A qPCR assay was performed on the libraries to determine the concentration of adapter ligated fragments using the Applied Biosystems ViiA7 quantitative PCR instrument and a KAPA library quant kit (#KK4824). All samples were pooled equimolarly and re-quantitated by qPCR, and also re-assessed on the Bioanalyzer.
Cluster Generation by Bridge Amplification: Using the concentration from the ViiA7 qPCR machine above, 1.8 pM of equimolarly pooled library was loaded onto a NextSeq 500 high output v2.5 flowcell (Illumina, #20024906) and amplified by bridge amplification using the Illumina NextSeq 500 sequencing instrument. PhiX Control v3 adapter-ligated library (Illumina, #FC-1103001) was spiked in at 1% by weight to ensure balanced diversity and to monitor clustering and sequencing performance. A single-end 75 cycle run was used to sequence the flowcell on a NextSeq 500 sequencing system to achieve a minimum of 25 million reads per sample. Fastq file generation and data delivery were achieved using Illumina's Basespace sequence hub.
RNA-seq: The Genomic and RNA Profiling Core first conducted sample quality checks using the NanoDrop spectrophotometer and Agilent Bioanalyzer 2100 (High Sensitivity DNA Chip, #5067-4626). To quantitate the adapter ligated library and confirm successful P5 and P7 adapter incorporations, we used the Applied Biosystems ViiA7 real-time PCR system and a KAPA Illumina/universal library quantification kit (#KK4824). We then sequenced the libraries on the NextSeq 500 system using the high output v2.5 flowcell.
Library quantification by qPCR and Bioanalyzer: A qPCR assay was performed on the libraries to determine the concentration of adapter ligated fragments using the Applied Biosystems ViiA7 quantitative PCR instrument and a KAPA library quant kit (#KK4824). All samples were pooled equimolarly and re-quantitated by qPCR, and also re-assessed on the Bioanalyzer.
Cluster Generation by Bridge Amplification: Using the concentration from the ViiA7 qPCR machine above, 1.8 pM of equimolarly pooled library was loaded onto a NextSeq 500 high output v2.5 flowcell (Illumina, #20024907) and amplified by bridge amplification using the Illumina NextSeq 500 sequencing instrument. PhiX Control v3 adapter-ligated library (Illumina, #FC-1103001) was spiked in at 1% by weight to ensure balanced diversity and to monitor clustering and sequencing performance. A paired-end 75 cycle run was used to sequence the flowcell on a NextSeq 500 sequencing system to achieve a minimum of 50 million reads per sample. Fastq file generation and data delivery were achieved using Illumina's Basespace sequence hub.
Data processing. ChIP-seq fastq files were processed with Cutadapt ver. 1.12 58 and mapped to the HG19 genome with Bowtie ver. 1.0 59 . Peak calling was performed with MACS2 ver. 2.1.0.20140616 60 with a false discovery rate < 0.05. Peaks were compared to input DNA as well as ChIP DNA from cells transfected with the 3xFLAG/pCMV7.1 control vector (Sigma Aldrich, #E7533). Mapping to functional genomic sites and target genes was performed with CEAS ver. 1.0.2 61 . Gene ontologies were defined with DAVID (https://david.ncifcrf.gov) ver. 6.8 40 . Target site and peak overlaps were analyzed with Bedtools ver. 2.23.0 62 and fold enrichment was calculated based on randomized peaks of equal number and size and intra-chromosomal permutation. Wig files were created from MACS2 output with Samtools ver. 0.1.19-96b5f2294a 63 . Wig and bigwig files were visualized using the Integrative Genomics Viewer (IGV) version 2.3 from the Broad Institute 64-66 . The following ENCODE data were utilized: c-MYC (ENCFF045UZK, ENCFF224GZD), H3K27ac (ENCFF388WMD), and H3K27me3 (ENCFF252BLX) 67,68 . The c-MYC reference file used in this study is based on common peaks between both entries.
Similarly, Venn diagram comparisons for CUL1 binding are based on common peaks between two biological ChIP replicates. As outlined in the manuscript, domains under CUL1 control were estimated by extending peak regions 3,000 bp in both directions. The analysis of gene expression changes by RNA-seq in Figs. 5B and 6A was performed using the bona fide peaks (not extended regions) of both CUL1 replicates and c-MYC. ChIP bed files were subjected to motif analysis using the SeqPos module in Cistrome 69 . Parameters were defined as sequencing positions p < 0.05, peak size 600 bp, using fold enrichment.
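The fold-enrichment calculation against randomized peaks can be illustrated as follows. This is a sketch under stated assumptions rather than the study's procedure: the function names are hypothetical, random placement is uniform along each chromosome with no excluded regions, and the number of permutations is an arbitrary choice.

```python
import math
import random
from typing import Dict, List, Tuple

# (chromosome, start, end) in BED-style 0-based, half-open coordinates.
Interval = Tuple[str, int, int]


def count_overlapping(query: List[Interval], reference: List[Interval]) -> int:
    """Number of query intervals overlapping at least one reference interval."""
    return sum(
        1
        for qc, qs, qe in query
        if any(qc == rc and qs < re and rs < qe for rc, rs, re in reference)
    )


def shuffle_within_chromosomes(
    peaks: List[Interval], chrom_sizes: Dict[str, int], rng: random.Random
) -> List[Interval]:
    """Place each peak at a random position on its own chromosome, keeping its size."""
    shuffled = []
    for chrom, start, end in peaks:
        length = end - start
        max_start = max(0, chrom_sizes[chrom] - length)
        new_start = rng.randint(0, max_start)
        shuffled.append((chrom, new_start, new_start + length))
    return shuffled


def fold_enrichment(
    peaks: List[Interval],
    reference: List[Interval],
    chrom_sizes: Dict[str, int],
    n_permutations: int = 100,
    seed: int = 0,
) -> float:
    """Observed overlap count divided by the mean count over shuffled peak sets."""
    rng = random.Random(seed)
    observed = count_overlapping(peaks, reference)
    null_counts = [
        count_overlapping(shuffle_within_chromosomes(peaks, chrom_sizes, rng), reference)
        for _ in range(n_permutations)
    ]
    expected = sum(null_counts) / n_permutations
    if expected == 0:
        # No overlap in any permutation: enrichment is unbounded or undefined.
        return math.inf if observed > 0 else math.nan
    return observed / expected
```

Because each shuffled set preserves the number, size, and chromosome of the original peaks, differences in chromosome length and peak width do not inflate the enrichment estimate.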
RNA-seq fastq files were processed with Cutadapt ver. 1.12 and mapped to the HG19 genome with TopHat2/Bowtie2 ver. [...]

[...] with Cell-Tak cell and tissue adhesive (0.024 mg/mL; Corning, #354240) according to manufacturer specifications and 30,000 cells/well were plated. Agilent Seahorse XF base medium (#103193-100) was supplemented with 25 mM glucose, 2 mM sodium pyruvate, and 2 mM L-glutamine. Basal respiration was normalized by cell count.
Mitotracker and MiNA analysis. HeLa cells were incubated with 500 nM Mitotracker Red CMXRos (Invitrogen, #M7512) for 30 min, then placed on coverslips, permeabilized, and mounted as described. Images were taken as z-stacks at 100× with the Zeiss CellDiscoverer7 and processed with Zeiss ZEN 3.1 (blue edition) "Deconvolution (Defaults-Excellent)". Using ImageJ, images were converted to RGB, auto-thresholding was applied (Yen algorithm), and pictures were subjected to MiNA analysis 44 .

Data availability
Raw and processed ChIP-seq files are available at the Gene Expression Omnibus under GSE147426.