Nucleoporin 153 links nuclear pore complex to chromatin architecture by mediating CTCF and cohesin binding

Nucleoporin proteins (Nups) have been proposed to mediate spatial and temporal chromatin organization during gene regulation. Nevertheless, the molecular mechanisms in mammalian cells are not well understood. Here, we report that Nucleoporin 153 (NUP153) interacts with the chromatin architectural proteins, CTCF and cohesin, and mediates their binding across cis-regulatory elements and TAD boundaries in mouse embryonic stem (ES) cells. NUP153 depletion results in altered CTCF and cohesin binding and differential gene expression — specifically at the bivalent developmental genes. To investigate the molecular mechanism, we utilize epidermal growth factor (EGF)-inducible immediate early genes (IEGs). We find that NUP153 controls CTCF and cohesin binding at the cis-regulatory elements and POL II pausing during the basal state. Furthermore, efficient IEG transcription relies on NUP153. We propose that NUP153 links the nuclear pore complex (NPC) to chromatin architecture allowing genes that are poised to respond rapidly to developmental cues to be properly modulated.

Establishment of cell lineage specification, maintenance of cellular states, and cellular responses to developmental cues rely on gene regulation and spatial genome organization during development 1,2 . Emerging data point to highly coordinated activity between epigenetic mechanisms that involve nuclear architecture, chromatin structure, and chromatin organization 3,4 . However, our understanding of how nuclear architectural proteins are causally linked to chromatin organization and impact gene regulation has been limited, underscoring the importance of defining the molecular determinants.
Nuclear architecture is in part organized by the nuclear lamina composed of lamin proteins and the nuclear pore complex (NPC). Nucleoporin proteins (Nups) are the building blocks of the NPC, which forms a ~60-120 megadalton (MDa) macromolecular channel at the nuclear envelope mediating nucleocytoplasmic trafficking of proteins and RNA molecules during key cellular processes such as cell signal transduction and cell growth 5 . Beyond its role in nuclear transport, the NPC has been one of the nuclear structural sites of interest for its potential role in gene regulation by directly associating with genes 6 . Studies in budding yeast and metazoans have shown that the NPC provides a scaffold for chromatin modifying complexes and transcription factors, and mediates chromatin organization. In metazoans, such compartmentalization supports nucleoporin-chromatin interactions that influence transcription [7][8][9] . In yeast, inducible genes including GAL, INO1, and HXK1 localize to the NPC upon transcription activation, a process that has been proposed to be critical for the establishment of transcription memory [10][11][12] . For several of these loci, NPC association facilitates chromatin looping between distal regulatory elements and promoters 13,14 . A similar mechanism applies to the developmentally regulated ecdysone responsive genes in Drosophila melanogaster. Upon activation, ecdysone responsive genes exhibit NUP98-mediated enhancer-promoter chromatin looping at the NPC 15 . Notably, NUP98 has been shown to interact with several chromatin architectural proteins, including the CCCTC-binding factor, CTCF. These findings collectively suggest that Nups can facilitate chromatin structure in a direct manner by regulating transcription and in an indirect manner whereby Nup-mediated gene regulation relies on architectural proteins. 
Nevertheless, the functional relevance of Nup-architectural protein interactions in transcription regulation and chromatin structure is not well understood.
Chromatin architectural proteins, CTCF and cohesin, facilitate interactions between cis-regulatory elements 16,17 . These interactions influence the formation and maintenance of long-range chromatin loops that underlie higher-order chromatin organization 18,19 . Long-range loops of preferential chromatin interactions, referred to as "topologically associating domains" (TADs), are stable, conserved across species, and exhibit dynamicity during development 17,20 . Importantly, TADs segregate into transcriptionally distinct sub-compartments 21,22 and exhibit spatial positioning 23 . Current models argue that lamina-chromatin interactions may provide sequestration of specific loci inside the peripheral heterochromatin and promote the formation of a silent nuclear compartment 24,25 . Despite the close interaction between the nuclear lamina and the NPC, we still know very little about how NPC-chromatin interactions influence transcription and chromatin organization at the nuclear periphery.
In mammals, Nups show variable expression across different cell types and their chromatin binding has been attributed to cell-type-specific gene expression programs 6 . NUP153 is among the chromatin-binding Nups that have been proposed to impact transcription programs associated with pluripotency and self-renewal of mammalian stem cells [26][27][28] . NUP153 binding sites have been detected at promoters, across gene bodies, and at enhancers [26][27][28] . Nevertheless, the molecular basis for how NUP153 association at enhancers or promoters impacts chromatin structure and transcription remains an open question.
Here, we directly test the relationship between NUP153-chromatin interactions and gene regulation in pluripotent mouse ES cells. Towards elucidating NUP153-mediated mechanisms of transcription, we further utilize immediate early genes (IEGs) at which transcription can be efficiently and transiently induced using growth hormones such as the epidermal growth factor (EGF) in HeLa cells 29 . We report that NUP153 interacts with cohesin and CTCF, and mediates their binding at enhancers, transcription start sites (TSS), and TAD boundaries in mouse ES cells. NUP153 depletion results in differential gene expression that is most prevalent at bivalent genes 4 . At the IEGs, NUP153 binding at the cis-regulatory elements is critical for CTCF and cohesin binding and subsequent POL II pausing. This function of NUP153 is essential for efficient transcription initiation of IEGs. Notably, IEGs exhibit a NUP153-dependent positioning to the nuclear periphery during the basal state and reposition even closer to the periphery upon transcriptional activation. Our findings reveal that IEG-NUP153 contacts are essential for IEG transcription via the establishment of a chromatin structure that is permissive for POL II pausing at the basal state. We propose that NUP153 is a key regulator of chromatin structure by mediating binding of CTCF and cohesin at cis-regulatory elements and TAD boundaries in mammalian cells. Through this function, NUP153 links NPCs to chromatin architecture allowing developmental genes and IEGs that are poised to respond rapidly to developmental cues to be properly modulated.

Results
Identification of CTCF and cohesin as NUP153 interacting proteins. To understand the functional relevance of NUP153 in transcriptional regulation and chromatin structure, we utilized an unbiased proteomics screen using mouse NUP153 as bait in an affinity purification assay. We expressed FLAG-tagged mouse NUP153 (FLAG-mNUP153) in HEK293T cells and carried out immunoprecipitation (IP) followed by mass spectrometry (MS) ( Fig. 1a and Supplementary Fig. 1a, b). We identified several known NUP153 interacting proteins including TPR 30 , NXF1 (ref. 31 ), SENP1 (ref. 32 ), and RAN 33 . In addition, IP-MS revealed that NUP153 interacts with chromatin interacting proteins including the cohesin complex components, SMC1A, SMC3, and RAD21.
NUP153 has been mapped to enhancers and promoters in mammalian cells and has been implicated in transcription regulation [26][27][28] . Nevertheless, the mechanisms are not well understood. We thus focused on the NUP153-cohesin interactions, as cohesin mediates higher-order chromatin organization and regulates gene expression by facilitating and stabilizing enhancer-promoter interactions together with CTCF 34 . We performed FLAG-NUP153 IP followed by western blotting and determined that NUP153 interacts with CTCF and cohesin subunits (Fig. 1b). To define the nuclear fraction at which NUP153 spatially interacts with CTCF and cohesin, we performed a biochemical chromatin fractionation assay using HeLa cells as previously described 35 (Fig. 1c). Micrococcal nuclease (MNase) treatment of the nuclear fraction (P1) resulted in the elution of chromatin binding proteins into the soluble nuclear fraction (S3) (Fig. 1c, d). We detected NUP62 in the insoluble nuclear fraction (P2, +/− MNase), suggesting that P2 contains the intact nuclear membrane including the nuclear envelope and the NPC. In accordance with earlier cell biological reports 36 , we detected NUP153 in both the insoluble (P2) and the soluble (S3) nuclear fractions (Fig. 1d). These data suggested that the NUP153-chromatin interactions might be established at the nuclear periphery or in the nucleoplasm. Similar to NUP153, we detected a proportion of CTCF and cohesin in the insoluble nuclear fraction (P2) even in the presence of MNase (Fig. 1d). The insoluble fraction has been shown to contain nuclear matrix-associated proteins, including CTCF 37 . These findings argue that NUP153 may interact with CTCF and cohesin at the nuclear periphery, nuclear matrix, or within the nucleoplasm.
NUP153 enrichment at the cis-regulatory elements and TAD boundaries. NUP153 mediates transcription regulation of developmental genes in mouse ES cells 26 . Such function has been attributed to the transcriptional silencing role of NUP153 together with the Polycomb Repressive Complex 1 (PRC1). Nevertheless, only ~10% of NUP153 binding sites overlap with PRC1 interaction sites, explaining only a small proportion of NUP153-mediated gene regulation in pluripotent mouse ES cells. To define NUP153-mediated chromatin structure and gene regulation, we mapped NUP153 binding sites using female mouse ES cell lines (EL16.7) 38 by DamID-Seq 39 (Supplementary Fig. 1c-e). A Dam-only expressing cell line was used to normalize NUP153-DamID-Seq data. We identified 73,018 high confidence NUP153 binding sites (greater than 2-fold enrichment over Dam-only control and FDR < 0.05) (Supplementary Data 1). In agreement with an earlier report 26 , we detected 32.2% of the NUP153 peaks at intergenic sites, 14.2% of peaks at promoters, and 53.5% of peaks across gene bodies (Fig. 2a).
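As an illustration only, the peak filter and the genomic categories described here can be sketched in Python. The thresholds (greater than 2-fold enrichment over Dam-only control, FDR < 0.05) come from the text, and the promoter, gene body, and intergenic windows come from the Fig. 2a legend. The `Gene` record, the function names, and the single-gene coordinate model are hypothetical simplifications and do not represent the analysis pipeline used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    """Hypothetical gene model: TSS/TTS coordinates in bp, plus strand."""
    tss: int
    tts: int
    strand: str  # "+" or "-"


def is_high_confidence(fold_enrichment: float, fdr: float) -> bool:
    """Filter stated in the text: >2-fold over Dam-only control and FDR < 0.05."""
    return fold_enrichment > 2.0 and fdr < 0.05


def classify_peak(peak_pos: int, gene: Gene) -> str:
    """Assign a peak position to a category relative to a single gene.

    Windows follow the Fig. 2a legend:
      promoter   : -2 kb to +100 bp from the TSS
      gene body  : +100 bp from the TSS to +1 kb from the TTS
      intergenic : everything else
    """
    direction = 1 if gene.strand == "+" else -1
    # Signed distance from the TSS, measured in the direction of transcription.
    offset = (peak_pos - gene.tss) * direction
    gene_length = (gene.tts - gene.tss) * direction
    if -2000 <= offset <= 100:
        return "promoter"
    if 100 < offset <= gene_length + 1000:
        return "gene_body"
    return "intergenic"
```

For a plus-strand gene with TSS at 10,000 bp and TTS at 20,000 bp, a peak at 9,000 bp would be called a promoter peak and a peak at 25,000 bp an intergenic peak. A genome-wide annotation would additionally need a rule for peaks that fall near more than one gene, which this sketch does not address.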
We next examined NUP153 distribution across various genetic elements (Fig. 2b). We found that NUP153 is enriched at the TSS (Supplementary Fig. 2a) and 31.5% of TSS are NUP153-positive (7721/24,513) (Fig. 2b and Supplementary Data 2a). To define the transcriptional state of NUP153-positive TSS, we performed RNA-Seq and utilized previously published Histone 3 Lysine 4 trimethylation (H3K4me3) and H3K27me3 ChIP-Seq data 40 (Supplementary Fig. 2b). We found that NUP153 occupied both transcriptionally active and inactive TSS with a bias towards the active genes (Supplementary Fig. 2b). To evaluate NUP153 binding across enhancers, we mapped enhancers (n = 16,242) using previously published ChIP-Seq against enhancer-specific histone marks, H3K4me1 (ref. 41 ), Histone 3 Lysine 27 acetylation (H3K27Ac) 42 , and CREB-binding protein (CBP)/P300 (ref. 43 ) (Supplementary Fig. 2c). We detected NUP153 enrichment at the enhancers (Supplementary Fig. 2c) and identified 17.5% NUP153-positive enhancers (2849/16,242) (Fig. 2b). Compared to NUP153-negative enhancers, NUP153-positive enhancers exhibited higher H3K4me1, H3K27Ac, and CBP/P300 occupancy (Supplementary Fig. 2d). TSS-and  . These results pointed to a potential crosstalk between NUP153, CTCF, and cohesin during the regulation of gene expression and/or chromatin architecture. We next investigated the regulatory role of NUP153 in CTCF and cohesin binding by performing ChIP-Seq in control and NUP153-deficient mouse ES cells. To generate NUP153-deficient ES cells, we transduced cells with two different mouse NUP153-specific shRNA lentiviruses (Supplementary Fig. 2f). NUP153 knockdown (KD) cells showed typical pluripotent ES cell characteristics with normal morphology and the presence of alkaline phosphatase activity, suggesting that NUP153 depletion did not interfere with the pluripotent state of ES cells (Supplementary Fig. 2g). By utilizing an oligo (dT)50-mer probe and performing RNA fluorescent in situ hybridization (FISH) 46 , we further validated that the Poly(A)+ RNA export function of the NPCs was intact in NUP153 KD ES cells (Supplementary Fig. 2g).
Given that cohesin binding has been suggested to rely on CTCF 47 , we focused on CTCF binding sites and showed that NUP153 is enriched at the CTCF-positive TSS (n = 2164; p = 0, hypergeometric test), enhancers (n = 2272; p = 0, hypergeometric test), and TAD boundaries (n = 2238; p = 8.66e−103, hypergeometric test) (Supplementary Fig. 3b). Notably, NUP153 depletion resulted in a reduction in CTCF and cohesin binding across the CTCF-positive genetic elements (Fig. 2d). To determine how NUP153 binding influences CTCF distribution, we calculated the mean CTCF binding in control and NUP153 KD cells and divided the CTCF binding sites into two groups. Group I contained CTCF sites that showed greater mean CTCF binding in control cells than in NUP153 KD cells. Group II contained CTCF sites that showed equal or lesser mean CTCF binding in control cells than in NUP153 KD cells (Fig. 2e, Supplementary Fig. 3c-e, and Supplementary Data 4). Group I TSS sites constituted ~10% (1123/11,726) of the total CTCF binding sites and half of these sites (~5%, 558/11,726) were NUP153 positive. Notably, metagene profiles across TSS, enhancers, and TAD boundaries at Group I sites showed higher NUP153 binding compared to Group II sites (Fig. 2e and Supplementary Data 5). These data suggested that the degree of NUP153 binding correlates with differential change in CTCF binding at each genetic element. We concluded that NUP153 mediates CTCF and cohesin binding at TSS, enhancers, and TAD boundaries. These findings raised the possibility that NUP153 may be critical for enhancer-promoter functions or chromatin organization functions of CTCF and cohesin during gene expression.
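The two computations named in this paragraph, the Group I/Group II split and the hypergeometric enrichment test, can be illustrated with a short Python sketch. The grouping rule follows the text directly. The hypergeometric test is written here as an upper-tail probability, which is an assumption, since the text does not state how the test was parameterized. All function and argument names are hypothetical.

```python
from math import comb


def assign_ctcf_group(mean_control: float, mean_kd: float) -> str:
    """Group I: greater mean CTCF binding in control than in NUP153 KD cells.
    Group II: equal or lesser mean CTCF binding in control than in KD cells."""
    return "Group I" if mean_control > mean_kd else "Group II"


def hypergeom_upper_tail(overlap: int, population: int, marked: int, drawn: int) -> float:
    """P(X >= overlap) for X ~ Hypergeometric(population, marked, drawn).

    population : total number of elements considered (e.g. all TSS)
    marked     : elements carrying feature A (e.g. NUP153-positive)
    drawn      : elements carrying feature B (e.g. CTCF-positive)
    overlap    : elements carrying both features
    """
    total = comb(population, drawn)
    upper = min(marked, drawn)
    # Sum exact integer counts for every overlap size from `overlap` upward.
    favorable = sum(
        comb(marked, i) * comb(population - marked, drawn - i)
        for i in range(max(overlap, 0), upper + 1)
    )
    return favorable / total
```

Exact integer arithmetic keeps the sketch dependency-free, at the cost of speed for genome-scale counts; a statistics library would normally be used for those. Very small probabilities can underflow to zero in floating point.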
Fig. 2 NUP153 mediates CTCF and cohesin binding at cis-regulatory elements and TAD boundaries in mouse ES cells. a Distribution of NUP153 peaks in mouse ES cells. Peaks are categorized as promoters (−2 kb from TSS to +100 bp from TSS), gene body (+100 bp from TSS to +1 kb from transcription termination site (TTS)), intergenic sites (<−2 kb from TSS and >+1 kb from TTS). See Supplementary Data 1 for a list of NUP153 binding sites. b Metagene profiles of mean NUP153 binding at NUP153-positive and NUP153-negative TSS and enhancer (±5 kb), and TAD boundaries (±250 kb) (top). Number and percentage of NUP153 binding sites are presented as a table for the indicated genetic elements (bottom). (See Supplementary Data 2 for NUP153 binding sites at different genetic elements.) c Genome-wide CTCF and SMC3 binding sites were compared in control and NUP153 deficient (KD-1, KD-2) mouse ES cells. d Metagene profiles showing mean CTCF and SMC3 binding across CTCF-positive TSS (n = 2164), enhancers (n = 2272), and TAD boundaries (n = 2238) in control and NUP153 deficient mouse ES cells. e Mean CTCF binding in control and NUP153 KD mouse ES cells were compared and CTCF sites were grouped into two. Group I contained CTCF sites that showed greater mean CTCF binding in control cells over NUP153 KD cells. Group II contained CTCF sites that showed equal or lesser mean CTCF binding in control cells over NUP153 KD cells. Number of CTCF-positive sites across TSS and enhancer (±2.5 kb) and TAD boundaries (±250 kb), and the number of NUP153 target genes that associate with each group are shown as a table (top) (see also Supplementary Data 4 and Supplementary Fig. 3d). Metagene profiles showing mean NUP153 binding across CTCF-positive Group I and Group II TSS, enhancer (±2.5 kb) and TAD boundaries (±250 kb) (bottom).
NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-16394-3 ARTICLE NATURE COMMUNICATIONS | (2020) 11:2606 | https://doi.org/10.1038/s41467-020-16394-3 | www.nature.com/naturecommunications
We next investigated how NUP153-dependent changes in CTCF binding impact transcription. We found that ~34.4% (245/711) of the differentially regulated genes associated with CTCF-positive TSS (Supplementary Data 8). The majority of these genes (~61%) showed transcriptional upregulation in NUP153 KD mouse ES cells (Fig. 3b and Supplementary Data 8). GO analysis revealed that these genes associate with important cellular processes such as cell migration (e.g., Ptk2b, Tcaf2, Wnt11), cell adhesion (e.g., Alcam, App, Itga3, Itga8, PLCb1), and cell differentiation (e.g., Foxa3, Flnb, Zfp423, Tnk2) (Supplementary Data 7). Because CTCF-positive Group I sites were enriched for NUP153 compared to Group II sites, and showed drastic change in CTCF binding in NUP153 KD mouse ES cells, we evaluated the number of differentially regulated genes between the two groups. Group I sites associated with 19.4% (138/711) and Group II sites associated with 15% (107/711) of the differentially regulated genes in NUP153 KD-1 mouse ES cells (Fig. 3b). These data supported a key role for NUP153 in transcription of bivalent genes and suggest that expression of a small proportion of bivalent genes is mediated through NUP153-mediated CTCF binding at TSS. This analysis also revealed that NUP153 associates with several Hox genes, which are characterized as bivalent genes in mouse ES cells 1,48 . Genomic organization of the Hox loci relies on TADs with enriched CTCF binding 44 and influences developmental expression of Hox genes 49 . As presented in the representative tracks shown for the HoxA and HoxC clusters, we found that NUP153 depletion resulted in altered CTCF and/or cohesin binding at specific Hox genes (Fig. 3d, arrows). Importantly, three of these CTCF-binding sites (Fig. 3d, asterisks) have been reported to be critical in facilitating the formation of TADs and providing an insulator function during Hox gene transcription in mouse 48 . 
Based on these data, we propose that NUP153 may contribute to the higher-order chromatin organization by regulating CTCF and cohesin binding at specific developmental genes, such as the Hox loci, and mediates their gene expression.
NUP153-mediated POL II recruitment is critical for timely IEG transcription. To provide a mechanistic understanding on NUP153-mediated gene expression and the interplay between NUP153, and CTCF and cohesin, we utilized EGF-inducible IEGs 50 . Several characteristics of these loci suggested that they would provide a powerful in vivo model for our studies. First, we identified that IEGs, Egr1, c-Fos, and Jun, are NUP153 targets (Supplementary Data 1). Second, TSS and distal regulatory elements of IEGs showed CTCF and cohesin occupancy in mouse ES cells ( Supplementary Fig. 4). Third, during the preparation of this manuscript, it was shown that CTCF-mediated higher-order chromatin structure impacts transcription of IEGs 51,52 . Lastly, due to their inducible nature, the IEG loci can be utilized for mechanistic studies to elucidate NUP153-mediated chromatin structure during transcriptional silencing and activation. To test the regulatory role for NUP153 in IEG transcription, we could not use mouse ES cells because IEG transcription kinetics show variability in these cells and thus could not be stably measured 53 . We thus utilized HeLa cells in which IEG transcription can be reduced to a silent state by serum starvation and transcription initiation can be reproducibly induced by EGF treatment 29 . We generated NUP153 KD HeLa cells by shRNA lentivirus (Fig. 4a) and validated that NUP153 knockdown did not alter the nucleocytoplasmic trafficking at the NPCs by quantitating the nuclear import and export of the dexamethasone (Dex) responsive GFP-tagged glucocorticoid receptor (GR) 54 ( Supplementary  Fig. 5a, b). Furthermore, as in mouse ES cells, NUP153 KD HeLa cells did not present any defects in Poly(A) + RNA export (Supplementary Fig. 5c).
To evaluate NUP153-dependent changes in IEG transcription, we utilized EGR1, JUN, c-FOS genes, and assessed transcription induction in response to EGF treatment in control and NUP153 deficient HeLa cells in a time course dependent manner. We found that NUP153 depletion led to a significant reduction in IEG mRNA and pre-mRNA levels upon 15 min of EGF treatment compared to control cells ( Fig. 4b and Supplementary Fig. 6a). This effect was NUP153-specific, as expression of FLAG-NUP153 in NUP153-deficient HeLa cells led to the recovery of transcription initiation (Fig. 4c). At 30 min EGF treatment, EGR1 and c-FOS pre-mRNA levels were significantly upregulated in NUP153 deficient cells ( Fig. 4b and Supplementary Fig. 6a). This data suggested that the suppression of IEG transcription during the initiation step may lead to a delay in transcription or trigger a passive negative feedback on IEG transcription 29 . These data collectively indicated that NUP153 acts as an activator of IEG transcription.
Based on our findings, we reasoned that NUP153 may control POL II occupancy during IEG transcription. To investigate, we performed POL II ChIP and quantitatively measured POL II occupancy at the TSS and across gene bodies (GB) of JUN and EGR1 using gene-specific primers ( Fig. 4d and Supplementary Data 10). We found that POL II binding at the TSS of IEGs was significantly reduced in NUP153 KD HeLa cells during the paused state (minus EGF). Furthermore, the expected POL II enrichment across the TSS and the gene bodies was significantly altered in NUP153 KD HeLa cells upon induction of transcription (15 min EGF). By contrast, POL II binding across the IEGs was comparable between NUP153 KD and control HeLa cells at 30 min of EGF induction. These results were in line with data showing that NUP153 is critical for timely IEG transcription initiation (Fig. 4b). We concluded that NUP153 regulates IEG transcription initiation by controlling POL II occupancy at the TSS during the paused state.
Fig. 3, panels c and d. c NUP153 DamID-Seq, CTCF, cohesin, H3K4me3, and H3K27me3 ChIP-Seq, and RNA-Seq tracks are shown for two NUP153-positive Group I genes, Rtn4rl1 (left panel) and Calb2 (right panel) in control (WT) and NUP153 KD ES cells. Rtn4rl1 shows transcriptional upregulation and Calb2 shows transcriptional downregulation. d NUP153 DamID-Seq, CTCF, cohesin, H3K4me3, and H3K27me3 ChIP-Seq tracks are shown for a 145-150 kb region for the HoxA and HoxC loci in control (WT) and NUP153 KD mouse ES cells as indicated. Arrows point to regions where CTCF or SMC3 binding are altered in NUP153 KD mouse ES cells. CTCF sites labeled with asterisk (*) denote CTCF sites that have been reported to regulate transcription at the Hox loci by mediating the formation of TADs 48,70 . The 2D heat map shows the interaction frequency in mouse ES cells 44 . Hi-C data was aligned to the mm9 genome showing HoxA cluster residing in a TAD boundary and HoxC cluster in a TAD as published 44 . H3K4me3 and H3K27me3 (ref. 40 ) and CBP/P300 (ref. 43 ) ChIP-Seq data were previously published. CPM, counts per million.
NUP153 controls CTCF and cohesin binding at the IEG cis-regulatory elements. To define how NUP153 influences IEG-specific changes in CTCF and cohesin binding, we examined CTCF, cohesin, and POL II binding along with chromatin structure at the EGR1 and JUN loci using ENCODE ChIP-Seq data in HeLa cells 55 (Supplementary Fig. 6b). This analysis allowed us to design IEG-specific primers (Supplementary Data 10) across distal enhancers, TSS, GB, and transcription termination sites (TTS), and quantitatively determine NUP153, cohesin, and CTCF binding in a time-course dependent manner.
NUP153 ChIP revealed that NUP153 associates with the EGR1 and JUN distal enhancers, TSS and TTS at the paused state (minus EGF) (Fig. 5). Interestingly, we found that NUP153 associates across these loci in a transcription-dependent manner (  HeLa cells at the paused state and upon transcriptional activation with EGF, both proteins dynamically dissociated from these sites. In NUP153-deficient HeLa cells, CTCF and cohesin binding at the enhancers were significantly reduced at the paused state, but did not change upon transcriptional activation (15 min EGF) (Fig. 5). These results suggested that enhancer-specific binding of both proteins relies on NUP153. Given that EGR1 transcription relies on CTCF-mediated higher-order chromatin 52 , it is likely that NUP153 influences IEG chromatin organization by mediating CTCF and cohesin binding during the paused state.
Co-regulatory function of NUP153 and CTCF during the IEG paused state. Based on these results, we hypothesized that NUP153-mediated CTCF and cohesin binding at the IEG enhancers might be necessary for the proximal-promoter binding of POL II during the IEG paused state. We specifically focused on the functional relationship between NUP153 and CTCF because cohesin distribution depends on CTCF binding and POL II elongation 56 . We generated CTCF knockdown HeLa cells by using shRNA (Fig. 6a). Similar to what we detected in NUP153 KD HeLa cells (Fig. 4b), CTCF depletion resulted in a significant decrease in the TSS- and promoter-specific POL II binding during the IEG paused state and a reduction in IEG transcription initiation (Fig. 6b, c). Importantly, targeting both NUP153 and CTCF by shRNA (NUP153/CTCF KD) in HeLa cells did not cause an additive effect in downregulation of IEG transcription (Fig. 6d). These data suggest that NUP153 and CTCF mediate IEG transcription through the same regulatory mechanism.
NUP153-dependent spatial positioning of IEGs during transcription regulation. Here, we investigated the spatial positioning of the c-FOS locus and its dependency on NUP153 in a time course dependent manner. To this end, we performed c-FOS DNA FISH in combination with LAMIN B1 immunofluorescence and examined the sub-nuclear position of c-FOS DNA with respect to the nuclear periphery in control and NUP153 KD HeLa cells (Fig. 7a). Analysis of cumulative frequency graphs revealed that the c-FOS locus is closely positioned (ND ≤ 0.12) to the nuclear periphery in ~30% of the control cells at the paused state (minus EGF) and that the locus moved even closer to the periphery (ND ≤ 0.10) upon transcription induction (Fig. 7b). By contrast, the locus remained distal to the periphery independent of the transcriptional state in NUP153 KD HeLa cells (Fig. 7b and Supplementary Fig. 7). These results argue that NUP153-dependent positioning of IEGs to the NPC is critical during transcription regulation and suggest that NUP153 mediates spatial positioning of CTCF and cohesin to the NPC during IEG transcription.
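A cumulative frequency readout of this kind can be sketched in Python as follows. This section does not define ND, so the sketch assumes it is a normalized distance obtained by dividing the distance between the FISH signal and the LAMIN B1 signal by the nuclear diameter. That definition, the function names, and the units are all hypothetical.

```python
from typing import Sequence


def normalized_distance(distance_to_periphery: float, nuclear_diameter: float) -> float:
    """Assumed definition of ND: locus-to-periphery distance / nuclear diameter.

    Both arguments must be in the same length unit (e.g. micrometers).
    """
    if nuclear_diameter <= 0:
        raise ValueError("nuclear_diameter must be positive")
    return distance_to_periphery / nuclear_diameter


def fraction_within(nd_values: Sequence[float], threshold: float) -> float:
    """Cumulative frequency at `threshold`: fraction of loci with ND <= threshold."""
    if not nd_values:
        raise ValueError("nd_values must not be empty")
    return sum(1 for nd in nd_values if nd <= threshold) / len(nd_values)
```

Evaluating `fraction_within` at a cutoff such as 0.12 for control and knockdown measurements gives the kind of per-condition percentage quoted in the text. Plotting it over a range of thresholds gives a cumulative frequency curve.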

Discussion
In this study, we aimed to provide a mechanistic understanding of how NUP153 mediates chromatin structure and influences transcription. We identified NUP153 association with the chromatin architectural proteins, CTCF and cohesin, and revealed that NUP153 is a critical regulator of chromatin structure and transcription by affecting CTCF and cohesin binding across cis-regulatory elements and TAD boundaries. Although we cannot exclude nucleoplasmic or nuclear matrix association, our findings suggest that the co-regulatory function of NUP153 and architectural proteins likely occurs around the NPC (Figs. 1d, 7 and Supplementary Fig. 7). Our findings are in line with earlier reports in yeast and Drosophila showing that inducible genes associate with the NPC and are mediated through chromatin looping between distal regulatory elements and promoters [13][14][15] . NUP153 has been associated with cell-type-specific transcription 27 and implicated in chromatin accessibility 28 . Similarly, CTCF exhibits variable binding patterns influencing cell-type-specific transcription 57 . It is thus plausible to speculate that NUP153 might cooperate with CTCF in higher-order chromatin organization in the regulation of cell-type-specific transcription. We propose that bivalent genes might be under such control. This is because bivalent genes are mediated through the simultaneous catalytic activity of MLL and Polycomb Repressive Complex 2 (PRC2). Recent work suggests that regulation of chromatin organization is equally important 4 . Specifically, MLL2 deficiency in mouse ES cells results in increased Polycomb binding and loss of chromatin accessibility at promoters, coupled with alterations in long-range chromatin interactions 4 . Investigating the co-regulatory function of NUP153 and CTCF in bivalency and cell-type-specific gene expression using ES cells is thus of interest for future studies. 
CTCF and cohesin have been shown to mediate insulation of TADs 16,17 . Nevertheless, depletion of either protein does not result in disappearance of all TADs, pointing to a hierarchical control of higher-order chromatin organization 16,17,58 . In addition to TADs, chromatin compartments can also be established based on specific chromatin interactions with the lamina or the NPC 23 . NUP153 is among the NPC components that participate in such regulation. For example, the yeast NUP153 homologue Nup2 acts as an insulator at the nuclear basket 59 and the mammalian NUP153 impacts establishment of heterochromatin domains in interphase cells 60 . Furthermore, NUP153 has been recently implicated in the compartmentalization of transcription factors at the NPC in response to the activation of signal transduction pathways during cellular senescence, cell migration, and cell proliferation 60,61 . Our results suggest that NUP153 may have a role in multistep organization and/or insulation of site-specific higher-order chromatin around the NPCs, providing spatial and/or temporal organization of transcription in response to cellular cues (e.g., EGF signaling). The Hox loci and IEGs may be subjected to such regulation. We have determined that transcription of human IEGs and a subset (~5%) of the mouse ES cell genes rely on NUP153-mediated CTCF and/or cohesin binding at TSS. Our results are in accordance with earlier findings showing that ~10% of all TSS-bound CTCF associated with promoter activity 16 . Thus, future studies focusing on the role of NUP153 in chromatin structure and chromatin organization are critical.
Several genome-wide studies in metazoa have shown that the distribution of paused POL II shows a positive correlation with CTCF and cohesin binding 62 . CTCF is thought to induce POL II pausing by creating "roadblocks" on the DNA template obstructing transcription elongation 63 . Here, we provide new evidence that NUP153 cooperates with CTCF in the regulation of POL II occupancy at the IEGs during the paused state and that NUP153 and CTCF mediate IEG transcription through the same regulatory mechanism. We propose that NUP153 interacts with CTCF and mediates its binding at the cis-regulatory elements, which subsequently leads to cohesin recruitment and chromatin looping between gene regulatory elements and/or TADs at the NPC. This state is essential for the establishment of a poised chromatin environment at which efficient transcription initiation can be rapidly induced through a POL II pause-release mechanism in response to stimuli (Fig. 7c). Two recent reports showed that CTCF-mediated chromatin organization impacts IEG transcription 51 model. NUP153-dependent localization to the NPC might thus provide an advantageous spatial position to genes that are poised to respond rapidly to developmental cues during ES cell pluripotency and/or differentiation. Furthermore, by examining NUP153 binding dynamics during transcription, we showed that NUP153 spreads across the IEG promoter and the gene bodies during transcriptional activation (Fig. 5). These data suggest that there might be a tight functional correlation between NUP153 and POL II activity during transcription. Chromatin sites that are engaged with stalled or active POL II might therefore allow for differential NUP153 binding and can provide its selectivity towards transcriptionally silent or active chromatin domains. We found that CTCF and cohesin binding sites were on average ~5 kb from the nearest NUP153 binding sites (Supplementary Fig. 2e). 
NUP153 may influence CTCF and cohesin binding directly or indirectly. One possible mechanism is through the scaffold feature of the NPCs 6 . A second possible mechanism might be through the establishment of an optimal chromatin environment at the putative CTCF binding sites by NUP153. CTCF-binding sites display a characteristic chromatin structure showing DNase I hypersensitivity and enrichment of H3K4me3, H3K4me2, H3K4me1, and H2A.Z 64,65 . Thus, defining NUP153-interacting proteins and dissecting their co-regulatory function with NUP153 in chromatin structure can provide valuable insights into the underlying mechanisms of NUP153-mediated CTCF binding.
Our findings are also relevant to the understanding of cancers that arise from defects in the chromatin-associated functions of Nups. Several Nups, including NUP153 and NUP98, contain unstructured phenylalanine-glycine (FG)-repeats 5 . Structural chromosomal rearrangements or translocations of the FG-Nup genes result in the formation of FG-Nup fusion proteins (e.g., NUP98-HOXD13, NUP98-HOXA9, NUP98-MLL), which have been implicated in several hematologic malignancies 66 . A recent report in Drosophila suggests that NUP98 forms a complex with several architectural proteins including CTCF 15 . Thus, we propose that enhancer-specific regulation of chromatin structure and organization by mammalian NUP153 may apply to other FG-Nups and contribute to the gene regulatory mechanisms that underlie FG-Nup fusion protein-associated cancers.

Methods
Cell culture, plasmids, virus preparation, and viral transduction. The EL16.7 female mouse ES cell line (gift from J.T. Lee (Harvard)) and cell culture conditions have been described previously 38 . Mouse ES cells were cultured on γ-irradiated mouse embryonic fibroblasts (MEFs) that were isolated from Tg(DR4)1Jae/J mice (The Jackson Laboratory). To transduce ES cells, control (scramble) or mouse NUP153-specific shRNA lentivirus particles (~10⁷-10⁸ TU/ml) were added into 0.5 ml of complete ES cell medium containing dissociated ES cells (5 × 10⁵), LIF (500 units/ml; ESGRO, Sigma-Aldrich) and Polybrene (4 μg/ml; Sigma-Aldrich) and incubated overnight at 37°C. The next day, ES cells were dissociated and plated onto a 60-mm tissue culture dish (BD) containing γ-irradiated DR4 MEFs (1 × 10⁶), cultured for 24 h in regular ES cell media followed by 2 days of selection using 2 μg/ml puromycin (Puro) (Sigma-Aldrich) and collected for subsequent analyses. HEK293T and HeLa cells were obtained from the American Type Culture Collection (ATCC, Manassas, VA, USA) through the Duke University Cancer Center Facilities and were maintained in high glucose Dulbecco's modified Eagle's medium (DMEM) GlutaMAX supplemented with 10% fetal bovine serum (FBS; Sigma-Aldrich), 1% penicillin/streptomycin, 1 mM sodium pyruvate, 1% non-essential amino acids, and 3% HEPES. To generate FLAG-NUP153 overexpressing cells, HEK293T cells were transfected with FLAG-mNUP153 or FLAG-hNUP153 cDNA vectors using Xfect reagent (Clontech) according to the manufacturer's instructions. FLAG-hNUP153 (human) or FLAG-mNUP153 (mouse) expression vectors were constructed by amplifying full-length human NUP153 or mouse NUP153 cDNA using human NUP153 cDNA (Origene, SC116943) or mouse NUP153 cDNA (ATCC, IMAGE clone ID: 6516328) clones, respectively. Amplified cDNA sequences were modified and cloned into the BamHI and XhoI sites of the pCMV-3FLAG-6 vector (Agilent, 240200).
To produce shRNA lentivirus particles, HEK293T cells were transfected with pMD2.G (Addgene #12259) and psPAX2 (Addgene #12260) vectors along with each shRNA lentiviral vector. Viral supernatants were concentrated 100-fold using Lenti-X Concentrator (Clontech) according to the manufacturer's instructions, aliquoted and stored at −80°C. All reagents were from Thermo Fisher Scientific, unless noted otherwise. All cells were cultured at 37°C with 5% CO₂. Mouse husbandry and experiments were conducted in accordance with an approved protocol (A238-17-10) for the ethical use of animals in research by the Duke University Institutional Animal Care and Use Committee (IACUC).
LC-MS/MS proteomics analysis. Samples in 1× Laemmli Sample buffer (Bio-Rad, 1610737) were run on a NuPAGE 4-12% Bis-Tris Protein gel (Invitrogen, NP0336PK2) in NuPAGE MES SDS Running Buffer (Invitrogen, NP0002) for ~5 min. The entire molecular weight range was excised and subjected to standardized in-gel trypsin digestion (http://www.genome.duke.edu/cores/proteomics/samplepreparation/documents/IngelDigestionProtocolrevised.pdf). Extracted peptides were lyophilized to dryness and resuspended in 12 µL of sample buffer (0.2% formic acid, 2% acetonitrile). Each sample was subjected to chromatographic separation on a nanoACQUITY UPLC (Waters) equipped with an ACQUITY UPLC BEH130 C18 column at 400 nL/min. The analytical column was connected to a SilicaTip emitter (New Objective) with a 10 µm tip orifice and coupled to a Q Exactive Plus mass spectrometer (Thermo Fisher Scientific) through an electrospray interface operating in a data-dependent mode of acquisition. The instrument was set to acquire a precursor MS scan from m/z 375-1600 at R = 70,000 (target AGC 1e6, max IT 60 ms) with MS/MS spectra acquired for the 10 most abundant precursor ions at R = 17,500 (target AGC 5e4, max IT 60 ms). For all experiments, HCD energy settings were 27 V and a 20 s dynamic exclusion was employed for previously fragmented precursor ions. Raw LC-MS/MS data files were processed in Proteome Discoverer (Thermo Fisher Scientific) and then submitted to an independent Mascot search (Matrix Science) against a SwissProt database (Human taxonomy) containing both forward and reverse entries of each protein (20,322 forward entries). Search tolerances were 5 ppm for precursor ions and 0.02 Da for product ions using trypsin specificity with up to two missed cleavages. Carbamidomethylation (+57.0214 Da on C) was set as a fixed modification, whereas oxidation (+15.9949 Da on M) and deamidation (+0.98 Da on NQ) were considered dynamic mass modifications.
All searched spectra were imported into Scaffold (v4.4, Proteome Software) and scoring thresholds were set to achieve a peptide false discovery rate of 1% using the PeptideProphet algorithm.
Total RNA extraction, reverse transcription, and real-time PCR. Total RNA was extracted from cells using TRIzol reagent (Invitrogen) according to the manufacturer's instructions. For reverse transcription, cDNA was prepared using M-MLV Reverse Transcriptase (Thermo Fisher Scientific) with random hexamers (Sigma-Aldrich). Real-time PCR (qPCR) was performed using iTaq Universal SYBR Green Supermix (Bio-Rad) with specific primer sets indicated in Supplementary Data 10. Relative gene expression was calculated by the relative standard curve method. GAPDH expression was used to normalize data.
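The relative standard curve calculation can be sketched as follows. This is a minimal illustration, assuming a dilution series is run for both the target gene and GAPDH; the function names, dilution values, and Ct values are hypothetical and are not taken from the study.

```python
import math

def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def quantity_from_ct(ct, slope, intercept):
    """Interpolate a relative quantity from a measured Ct using the fitted curve."""
    return 10 ** ((ct - intercept) / slope)

def relative_expression(target_ct, ref_ct, target_curve, ref_curve):
    """Target quantity divided by reference (e.g., GAPDH) quantity."""
    target_q = quantity_from_ct(target_ct, *target_curve)
    ref_q = quantity_from_ct(ref_ct, *ref_curve)
    return target_q / ref_q

# Hypothetical dilution series: relative input amounts and their Ct values.
dilutions = [1, 10, 100, 1000]
example_cts = [30 - 3.32 * math.log10(q) for q in dilutions]
curve = fit_standard_curve(dilutions, example_cts)
```

Each gene gets its own curve in practice; the same curve is reused here only to keep the example short.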
IEG transcription induction in HeLa cells. HeLa cells (1 × 10⁶) were transduced with the control (scramble) or human NUP153-specific shRNA lentivirus particles overnight at 37°C followed by selection for 48 h in medium containing Puromycin (Puro, 2 μg/ml) (Sigma-Aldrich). To collect cells at the basal (minus EGF) IEG state, cells were pre-cultured in DMEM supplemented with 0.1% FBS (Sigma-Aldrich) for 24 h, followed by treatment with EGF (50 ng/ml) (Sigma-Aldrich, E9644) for 15, 30, 60, 90, and 120 min. For the rescue experiments, HeLa cells were transfected with control (scramble) or NUP153-specific shRNA vectors along with FLAG-hNUP153 expression vector using Xfect transfection reagent (Clontech) according to the manufacturer's instructions. At the 16 h time point, culture medium was replaced with Puro (2 μg/ml) containing medium and cells were incubated in this medium for 24 h, followed by incubation in Puro-free medium for another 24 h. To induce IEG transcription, cells were subjected to EGF treatment as described above.
Immunostaining and DNA FISH. For sequential LAMIN B1 immunostaining and c-FOS DNA FISH, HeLa cells (5.5 × 10³) were grown on 12-well glass slides (Invitrogen) overnight at 37°C, and IEG transcription was induced as described above. Immunostaining was performed as previously described 67 . Briefly, fixed cells were subjected to immunostaining using anti-LAMIN B1 (Abcam, ab16048) antibody (1:450) at 4°C overnight, washed three times in wash buffer (1× phosphate-buffered saline (PBS)/0.2% Tween-20 buffer) at room temperature for 5 min each, and incubated with goat polyclonal anti-IgG(H+L)-Alexa488 secondary antibody (1:500) for 1 h at room temperature. To remove excess secondary antibody, cells were washed three times in wash buffer for 5 min each. Slides were mounted using Vectashield mounting medium containing DAPI (Vector Labs). Slides were imaged using a Leica DM5500B microscope and a Leica DFC365 FX CCD camera, and image positions were recorded. Slides were then washed in 1× PBS/0.2% Tween-20 to remove the mounting medium, and cells were re-fixed in 4% paraformaldehyde (Electron Microscopy Sciences) prior to the DNA FISH experiment. To detect DNA signal at the c-FOS locus by FISH, a BAC clone (RP11-293M10) (CHORI) was fluorescently labeled using Cy3-dUTP (ENZO) and a nick translation kit (Sigma-Aldrich) according to the manufacturer's instructions. Human Cot-1 DNA (Thermo Fisher Scientific) (10 μg per 2 μg of nick-translated BAC vector) was included in the reaction containing the nick-translated vector to block the background DNA signal. The probe was precipitated by NaOAc-EtOH precipitation and the pellet was re-suspended in 50 μl of hybridization buffer (50% formamide, 2× saline sodium citrate (SSC), 2 mg/ml bovine serum albumin (Sigma-Aldrich), 10% dextran sulfate-500K (Millipore)), generating ~40 ng/μl labeled DNA probe. DNA FISH was performed as previously described 67 .
Hybridization was performed using ~200 ng DNA probe per slide at 37°C overnight in a humidified chamber. DNA FISH images at the recorded positions were obtained with a Leica DM5500B microscope and a Leica DFC365 FX CCD camera, and analyzed using ImageJ software (v.2.0.0). Distribution of c-FOS locus distance to nuclear periphery was measured in control and NUP153 KD HeLa cells at the indicated time points. Cumulative frequencies at a normalized distance (ND) of 0.0-0.12 are shown (Fig. 7). Frequency of c-FOS distribution at ND 0.0-0.45 is shown in Supplementary Fig. 7. ND = (c-FOS locus to periphery distance)/(cell diameter (d)), where d = (2 × nuclear area/π)^0.5. *p < 0.05; ***p < 0.001; the Kolmogorov-Smirnov (KS)-test was applied to calculate significance. To determine the cellular distribution of FLAG-NUP153 in HEK293T cells, FLAG-NUP153 transfected HEK293T cells (5.5 × 10³) were cultured on glass coverslips, and immunostaining was performed using anti-FLAG M2 (Sigma-Aldrich, F1804) antibody (1:250) as described above.
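The normalized distance calculation can be illustrated with a short Python sketch. It uses the expression for d as printed in the text, and adds a two-sample KS statistic (the maximum difference between empirical cumulative distributions) without a p-value. The function names and example values are hypothetical; the measurements in the study were made in ImageJ.

```python
import bisect
import math

def nuclear_diameter(nuclear_area):
    """Diameter term d, using the expression as printed in the text."""
    return (2.0 * nuclear_area / math.pi) ** 0.5

def normalized_distance(locus_to_periphery, nuclear_area):
    """ND = (locus-to-periphery distance) / d."""
    return locus_to_periphery / nuclear_diameter(nuclear_area)

def cumulative_frequency(nd_values, cutoff=0.12):
    """Fraction of loci with ND between 0 and the cutoff (inclusive)."""
    if not nd_values:
        raise ValueError("nd_values must not be empty")
    return sum(1 for nd in nd_values if 0.0 <= nd <= cutoff) / len(nd_values)

def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    d_max = 0.0
    for x in sorted(set(a_sorted) | set(b_sorted)):
        cdf_a = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max
```

A significance level for the KS statistic would come from a statistics package; it is omitted here to keep the sketch dependency-free.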
Poly(A) + RNA FISH and alkaline phosphatase staining. Poly(A) + RNA FISH was performed by using 5′ Cy3-labeled oligo-dT 50mer (Sigma-Aldrich) as previously described 46 . Briefly, hybridization was performed using 0.5 μg 5′ Cy3-labeled oligo-dT 50mer per slide at 37°C in a humidified chamber overnight. Following hybridization, cells were washed twice for 15 min at 42°C with 2× SSC, and once for 15 min at 42°C in 0.5× SSC. Slides were mounted using mounting medium containing DAPI (Vector Labs) and cells were imaged by fluorescence microscopy. Alkaline phosphatase staining was performed using the Red Alkaline Phosphatase Substrate kit (Vector Labs, SK-5100) according to the manufacturer's instructions. Bright-field images were taken using a Leica EC3 color camera attached to a Leica DM5500B microscope.
Nuclear transport assay. HeLa cells were co-transfected with Rev-Glucocorticoid Receptor-GFP (RGG) expression vector (gift from K. Ullman (University of Utah)) and control (scrambled) or hNUP153-specific shRNA vectors using Xfect reagent. Import and export assays were performed as previously described 54 . Briefly, for the import assay, transfected HeLa cells were grown overnight on 12-well glass slides at 37°C and treated with 250 nM dexamethasone (Dex) (Sigma-Aldrich, D4902) to induce RGG nuclear import for the indicated times. For the export assay, 120 min Dex-treated cells were washed with 1× PBS pH 7.2 and cultured in fresh culture medium for the indicated times. At the end of each time point, cells were fixed using 4% paraformaldehyde and mounted using DAPI-containing mounting medium (Vector Labs). Images were obtained with a Leica DM5500B microscope and a Leica DFC365 FX CCD camera, and examined to calculate the percentage of cells with nuclear GFP-RGG signal.
RNA-Seq. Total RNA quality and concentration were assessed on a 2100 Bioanalyzer (Agilent Technologies) and Qubit 2.0 (Thermo Fisher Scientific), respectively. Total RNA (RIN value ≥ 8) from control and two NUP153 KD mouse ES cells was depleted of ribosomal RNA using the Illumina Ribo-zero Gold kit and converted into RNA-seq libraries using the Illumina Total RNA-seq kit. Libraries were indexed using a dual indexing approach allowing for multiple libraries to be pooled and sequenced on the same sequencing flow cell of an Illumina HiSeq 4000 sequencing platform. Before pooling and sequencing, fragment length distribution and library quality were first assessed on a Fragment Analyzer (Agilent). All libraries were pooled in equimolar ratio and sequenced. Libraries were sequenced at 50 bp single-end on the Illumina HiSeq 4000 instrument. About 110 × 10⁶ reads per sample were generated. Once generated, sequence data were demultiplexed and Fastq files generated using Illumina's Bcl2Fastq v2 conversion software.
ChIP-Seq. ChIP DNA samples were quantified using the fluorometric quantitation Qubit 2.0 system (Thermo Fisher Scientific). ChIP-Seq libraries were prepared using the Roche Kapa BioSystem HyperPrep Library Kit to generate Illumina-compatible libraries. During adapter ligation, dual unique indexes were added to each sample. Resulting libraries were cleaned using SPRI beads and quantified using Qubit 2.0. Fragment length distribution of the final libraries was assessed on a Fragment Analyzer (Agilent). Libraries were then pooled into equimolar concentration and sequenced on an Illumina HiSeq 4000 instrument. Sequencing was done at 50 bp single-end and generated about 110 × 10⁶ reads per sample. Sequence data was demultiplexed and Fastq files generated using Illumina's Bcl2Fastq v2 conversion software.
RNA-Seq data analysis. RNA-Seq reads were trimmed by Trim Galore (v.0.4.1, with -q 15) and then mapped with TopHat (v 2.1.1, with parameters --b2-very-sensitive --no-coverage-search and supplying the UCSC mm10 known gene annotation). The ERCC spike-in sequences were mapped separately. Gene-level read counts were obtained using featureCounts (v1.6.1), counting reads with MAPQ greater than 30. The Bioconductor package RUVseq (v 1.16.0) was used to normalize the read counts and edgeR (v 3.24.0) was employed for differential expression analysis. A fold change greater than 1.5 and a false discovery rate (FDR) less than 0.05 were used to filter the significant differentially expressed genes.
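The final filtering step can be sketched in Python as follows. This is an illustration only, assuming the differential expression results are available as log2 fold changes with FDR values and that the 1.5-fold cutoff applies in both directions (up- and downregulated genes); the field names and example values are hypothetical.

```python
import math

def is_significant(log2_fc, fdr, fc_cutoff=1.5, fdr_cutoff=0.05):
    """True if |fold change| exceeds the cutoff and FDR is below the cutoff.

    Assumption: the fold-change cutoff is applied symmetrically, so both
    up- and downregulated genes can pass.
    """
    return abs(log2_fc) > math.log2(fc_cutoff) and fdr < fdr_cutoff

def filter_de_genes(results, fc_cutoff=1.5, fdr_cutoff=0.05):
    """Keep rows (dicts with 'gene', 'log2_fc', 'fdr') that pass both cutoffs."""
    return [
        row for row in results
        if is_significant(row["log2_fc"], row["fdr"], fc_cutoff, fdr_cutoff)
    ]

# Hypothetical differential expression table.
example_results = [
    {"gene": "geneA", "log2_fc": 1.0, "fdr": 0.01},   # passes (2-fold up)
    {"gene": "geneB", "log2_fc": 0.5, "fdr": 0.01},   # fails (about 1.41-fold)
    {"gene": "geneC", "log2_fc": -1.0, "fdr": 0.01},  # passes (2-fold down)
    {"gene": "geneD", "log2_fc": 2.0, "fdr": 0.10},   # fails (FDR too high)
]
significant = filter_de_genes(example_results)
```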
ChIP-Seq data analysis. ChIP-Seq reads were trimmed by Trim Galore (0.4.1, with -q 15) and then mapped with bowtie2 (2.2.5, with parameters --very-sensitive) to the mouse genome (UCSC mm10). The mapped reads were filtered by MAPQ greater than 30 by samtools (v 1.5) and duplicated reads were removed by picard (v 1.91). The peaks were called by MACS2 (v 2.1.0, with --pvalue 1e-5). The read coverages were quantified by the signal in reads per million per base pair (rpm/bp) using https://github.com/BradnerLab/pipeline/blob/master/bamToGFF.py with parameters -m 500 -r -d. Metagene plots were used to display the average ChIP-seq signal across related regions of interest for enhancers and TSS separately. The average profile (metagene) was calculated by the mean of ChIP-seq signal profiles across the related regions of interest. For each metagene plot, the profile is displayed in rpm/bp in a ±2.5 kb or 5 kb region centered on the regions of interest. The number of enhancers or TSS was noted in the title of plots.
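The rpm/bp scaling and the metagene average can be illustrated with a minimal Python sketch. It is not the bamToGFF.py implementation; it assumes binned read counts per region are already available, and the function names and numbers are hypothetical.

```python
def to_rpm_per_bp(bin_counts, total_mapped_reads, bin_width_bp):
    """Scale raw per-bin read counts to reads per million per base pair."""
    millions = total_mapped_reads / 1e6
    return [count / millions / bin_width_bp for count in bin_counts]

def metagene(profiles):
    """Position-wise mean across equally binned signal profiles."""
    if not profiles:
        raise ValueError("at least one profile is required")
    n_bins = len(profiles[0])
    if any(len(profile) != n_bins for profile in profiles):
        raise ValueError("all profiles must have the same number of bins")
    return [sum(column) / len(profiles) for column in zip(*profiles)]

# Two hypothetical regions, each summarized in three bins.
average_profile = metagene([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
```

Each region would be binned over the same window (for example ±2.5 kb around its center) before averaging, so that bin i refers to the same relative position in every profile.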
DamID-Seq data analysis. DamID-Seq reads were mapped with bowtie2 (2.2.5, with parameters --very-sensitive) to the mouse genome (UCSC mm10). The mapped reads were filtered by MAPQ greater than 30 by samtools (v 1.5) and filtered by GATC at the 5′ ends. The peaks were called by MACS2 (v 2.1.0, with -q 0.05). To determine the distribution of NUP153-DamID peaks across the genetic elements in mouse ES cells, we used the following criteria. Promoters (−2 kb from TSS to +100 bp from TSS); GB (+100 bp from TSS to +1 kb from TTS); Intergenic sites (<−2 kb from TSS and >+1 kb from TTS). TSS, transcription start site; GB, gene body; TTS, transcription termination site.
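The assignment of a peak to a genetic element can be sketched as below. This is a simplified illustration, assuming each peak is represented by a single coordinate (for example its midpoint) and that, when several genes are nearby, promoter takes priority over gene body and gene body over intergenic; both assumptions and all names are hypothetical rather than taken from the study.

```python
def classify_position(pos, tss, tts, strand):
    """Classify one coordinate relative to one gene, strand-aware."""
    if strand == "+":
        rel = pos - tss          # positive values are downstream of the TSS
    elif strand == "-":
        rel = tss - pos
    else:
        raise ValueError("strand must be '+' or '-'")
    gene_length = abs(tts - tss)
    if -2000 <= rel <= 100:                  # -2 kb to +100 bp from TSS
        return "promoter"
    if 100 < rel <= gene_length + 1000:      # +100 bp from TSS to +1 kb from TTS
        return "gene_body"
    return "intergenic"

# Assumed priority when a position falls near more than one gene.
PRIORITY = {"promoter": 0, "gene_body": 1, "intergenic": 2}

def annotate_peak(pos, genes):
    """Return the highest-priority label over all genes on the chromosome."""
    labels = [
        classify_position(pos, gene["tss"], gene["tts"], gene["strand"])
        for gene in genes
    ]
    return min(labels, key=PRIORITY.__getitem__, default="intergenic")
```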
Definition of regulatory regions for the analyses of ChIP- and DamID-Seq data. Several analyses in the manuscript rely on ChIP- or DamID-Seq analyses across different regulatory regions, namely enhancers, promoters, and TAD boundaries. These regulatory regions were defined as follows. (A) Promoters were defined by gene start sites downloaded from UCSC Genome Browser goldenPath/mm10/database/knownGene. Active promoters were defined by Fragments Per Kilobase of transcript per Million mapped reads (FPKM), calculated by cufflinks (v 2.1.1), greater than 1 in control RNA-seq. Inactive promoters were defined by FPKM no greater than 1. Chromatin structure at the transcriptionally active vs inactive TSS was validated using previously published H3K4me3 and H3K27me3 ChIP-Seq, respectively (GEO: GSE36905) 40 . (B) Enhancers were defined by utilizing the previously published ChIP-Seq data sets and determining the overlapping region of peaks with at least two enhancer-specific markers including CBP/P300 (GEO: GSE29184), H3K4me1 (GEO: GSE25409), or H3K27Ac (GEO: GSE42152). (C) TAD boundaries were defined by utilizing the previously published Hi-C data and TAD boundary coordinates reported 44 . (D) The overlap between NUP153 DamID peaks and CTCF or SMC3 ChIP peaks was defined using control (scramble shRNA) samples and was called by utilizing the Bioconductor package ChIPpeakAnno (v. 3.19.5) with a maximal gap of 5 kb. The overlapping sites are referred to as co-occupied sites.
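The idea behind calling co-occupied sites with a maximal gap can be sketched in Python. This is not the ChIPpeakAnno implementation; it is a minimal illustration assuming peaks are given as (chromosome, start, end) tuples and that two peaks count as co-occupied when the distance between their nearest edges is at most the maximal gap. All names and coordinates are hypothetical.

```python
from collections import defaultdict

def co_occupied_sites(peaks_a, peaks_b, max_gap=5000):
    """Pair peaks from two sets that overlap or lie within max_gap bp.

    Returns a list of (peak_a, peak_b) tuples. The gap is the distance
    between the nearest edges and is negative when the peaks overlap.
    """
    b_by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        b_by_chrom[chrom].append((start, end))

    pairs = []
    for chrom, a_start, a_end in peaks_a:
        for b_start, b_end in b_by_chrom.get(chrom, []):
            gap = max(a_start, b_start) - min(a_end, b_end)
            if gap <= max_gap:
                pairs.append(((chrom, a_start, a_end), (chrom, b_start, b_end)))
    return pairs

# Hypothetical NUP153 DamID and CTCF ChIP peaks.
damid_peaks = [("chr1", 1000, 2000), ("chr2", 1000, 2000)]
ctcf_peaks = [("chr1", 6000, 6500), ("chr1", 8000, 8500), ("chr3", 1000, 2000)]
example_pairs = co_occupied_sites(damid_peaks, ctcf_peaks)
```

The nested loop is adequate for illustration; for genome-scale peak sets a sorted sweep or interval tree would be the usual choice.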