U1 snRNP regulates chromatin retention of noncoding RNAs


Long noncoding RNAs (lncRNAs) and promoter- or enhancer-associated unstable transcripts locate preferentially to chromatin, where some regulate chromatin structure, transcription and RNA processing1,2,3,4,5,6,7,8,9,10,11,12,13. Although several RNA sequences responsible for nuclear localization have been identified—such as repeats in the lncRNA Xist and Alu-like elements in long RNAs14,15,16—how lncRNAs as a class are enriched at chromatin remains unknown. Here we describe a random, mutagenesis-coupled, high-throughput method that we name ‘RNA elements for subcellular localization by sequencing’ (mutREL-seq). Using this method, we discovered an RNA motif that recognizes the U1 small nuclear ribonucleoprotein (snRNP) and is essential for the localization of reporter RNAs to chromatin. Across the genome, chromatin-bound lncRNAs are enriched with 5′ splice sites and depleted of 3′ splice sites, and exhibit high levels of U1 snRNA binding compared with cytoplasm-localized messenger RNAs. Acute depletion of U1 snRNA or of the U1 snRNP protein component SNRNP70 markedly reduces the chromatin association of hundreds of lncRNAs and unstable transcripts, without altering the overall transcription rate in cells. In addition, rapid degradation of SNRNP70 reduces the localization of both nascent and polyadenylated lncRNA transcripts to chromatin, and disrupts the nuclear and genome-wide localization of the lncRNA Malat1. Moreover, U1 snRNP interacts with transcriptionally engaged RNA polymerase II. These results show that U1 snRNP acts widely to tether and mobilize lncRNAs to chromatin in a transcription-dependent manner. Our findings have uncovered a previously unknown role of U1 snRNP beyond the processing of precursor mRNA, and provide molecular insight into how lncRNAs are recruited to regulatory sites to carry out chromatin-associated functions.


Fig. 1: MutREL-seq identifies a U1-recognition site that contributes to RNA–chromatin tethering.
Fig. 2: U1 snRNP regulates lncRNA-chromatin retention.
Fig. 3: U1 tethers and mobilizes lncRNAs to chromatin by interacting with transcriptionally engaged Pol II.

Data availability

All sequencing data are available in the Sequence Read Archive database under accession numbers SRP214639 and SRP125289. For gel source data, see Supplementary Fig. 1. Source data for Fig. 2a–c are provided with the paper (Supplementary Tables 4–7). All other data are available from the corresponding author upon reasonable request. Sequencing data have been deposited in the Gene Expression Omnibus under accession numbers GSE107131 and GSE134287.


  1. Hon, C. C. et al. An atlas of human long non-coding RNAs with accurate 5′ ends. Nature 543, 199–204 (2017).
  2. Kim, T. K. et al. Widespread transcription at neuronal activity-regulated enhancers. Nature 465, 182–187 (2010).
  3. Core, L. J., Waterfall, J. J. & Lis, J. T. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science 322, 1845–1848 (2008).
  4. Preker, P. et al. RNA exosome depletion reveals transcription upstream of active human promoters. Science 322, 1851–1854 (2008).
  5. Seila, A. C. et al. Divergent transcription from active promoters. Science 322, 1849–1851 (2008).
  6. Schlackow, M. et al. Distinctive patterns of transcription and RNA processing for human lincRNAs. Mol. Cell 65, 25–38 (2017).
  7. Li, W., Notani, D. & Rosenfeld, M. G. Enhancers as non-coding RNA transcription units: recent insights and future perspectives. Nat. Rev. Genet. 17, 207–223 (2016).
  8. Derrien, T. et al. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 22, 1775–1789 (2012).
  9. Luo, S. et al. Divergent lncRNAs regulate gene expression and lineage differentiation in pluripotent cells. Cell Stem Cell 18, 637–652 (2016).
  10. Yin, Y. et al. Opposing roles for the lncRNA haunt and its genomic locus in regulating HOXA gene activation during embryonic stem cell differentiation. Cell Stem Cell 16, 504–516 (2015).
  11. Yan, P., Luo, S., Lu, J. Y. & Shen, X. Cis- and trans-acting lncRNAs in pluripotency and reprogramming. Curr. Opin. Genet. Dev. 46, 170–178 (2017).
  12. Engreitz, J. M. et al. Local regulation of gene expression by lncRNA promoters, transcription and splicing. Nature 539, 452–455 (2016).
  13. Kopp, F. & Mendell, J. T. Functional classification and experimental dissection of long noncoding RNAs. Cell 172, 393–407 (2018).
  14. Ridings-Figueroa, R. et al. The nuclear matrix protein CIZ1 facilitates localization of Xist RNA to the inactive X-chromosome territory. Genes Dev. 31, 876–888 (2017).
  15. Sunwoo, H., Colognori, D., Froberg, J. E., Jeon, Y. & Lee, J. T. Repeat E anchors Xist RNA to the inactive X chromosomal compartment through CDKN1A-interacting protein (CIZ1). Proc. Natl Acad. Sci. USA 114, 10654–10659 (2017).
  16. Lubelsky, Y. & Ulitsky, I. Sequences enriched in Alu repeats drive nuclear localization of long RNAs in human cells. Nature 555, 107–111 (2018).
  17. Engreitz, J. M. et al. RNA-RNA interactions enable specific targeting of noncoding RNAs to nascent Pre-mRNAs and chromatin sites. Cell 159, 188–199 (2014).
  18. Will, C. L. & Lührmann, R. Spliceosome structure and function. Cold Spring Harb. Perspect. Biol. 3, a003707 (2011).
  19. Huang, Y. & Carmichael, G. G. A suboptimal 5′ splice site is a cis-acting determinant of nuclear export of polyomavirus late mRNAs. Mol. Cell. Biol. 16, 6046–6054 (1996).
  20. Lee, E. S., Akef, A., Mahadevan, K. & Palazzo, A. F. The consensus 5′ splice site motif inhibits mRNA nuclear export. PLoS ONE 10, e0122743 (2015).
  21. Brown, J. A., Valenstein, M. L., Yario, T. A., Tycowski, K. T. & Steitz, J. A. Formation of triple-helical structures by the 3′-end sequences of MALAT1 and MENβ noncoding RNAs. Proc. Natl Acad. Sci. USA 109, 19202–19207 (2012).
  22. Wilusz, J. E. et al. A triple helix stabilizes the 3′ ends of long noncoding RNAs that lack poly(A) tails. Genes Dev. 26, 2392–2407 (2012).
  23. Boelens, W. C. et al. The human U1 snRNP-specific U1A protein inhibits polyadenylation of its own pre-mRNA. Cell 72, 881–892 (1993).
  24. Herzog, V. A. et al. Thiol-linked alkylation of RNA to assess expression dynamics. Nat. Methods 14, 1198–1204 (2017).
  25. Kaida, D. et al. U1 snRNP protects pre-mRNAs from premature cleavage and polyadenylation. Nature 468, 664–668 (2010).
  26. Folco, E. G., Coil, K. E. & Reed, R. The anti-tumor drug E7107 reveals an essential role for SF3b in remodeling U2 snRNP to expose the branch point-binding region. Genes Dev. 25, 440–444 (2011).
  27. O’Brien, K., Matlin, A. J., Lowell, A. M. & Moore, M. J. The biflavonoid isoginkgetin is a general inhibitor of Pre-mRNA splicing. J. Biol. Chem. 283, 33147–33154 (2008).
  28. Tripathi, V. et al. The nuclear-retained noncoding RNA MALAT1 regulates alternative splicing by modulating SR splicing factor phosphorylation. Mol. Cell 39, 925–938 (2010).
  29. Chillón, I. & Pyle, A. M. Inverted repeat Alu elements in the human lincRNA-p21 adopt a conserved secondary structure that regulates RNA function. Nucleic Acids Res. 44, 9462–9471 (2016).
  30. Almada, A. E., Wu, X., Kriz, A. J., Burge, C. B. & Sharp, P. A. Promoter directionality is controlled by U1 snRNP and polyadenylation signals. Nature 499, 360–363 (2013).
  31. Shen, X. et al. EZH1 mediates methylation on histone H3 lysine 27 and complements EZH2 in maintaining stem cell identity and executing pluripotency. Mol. Cell 32, 491–502 (2008).
  32. Chao, S. H. & Price, D. H. Flavopiridol inactivates P-TEFb and blocks most RNA polymerase II transcription in vivo. J. Biol. Chem. 276, 31793–31799 (2001).
  33. Titov, D. V. et al. XPB, a subunit of TFIIH, is a target of the natural product triptolide. Nat. Chem. Biol. 7, 182–188 (2011).
  34. Nishimura, K., Fukagawa, T., Takisawa, H., Kakimoto, T. & Kanemaki, M. An auxin-based degron system for the rapid depletion of proteins in nonplant cells. Nat. Methods 6, 917–922 (2009).
  35. Bhatt, D. M. et al. Transcript dynamics of proinflammatory genes revealed by sequence analysis of subcellular RNA fractions. Cell 150, 279–290 (2012).
  36. O’Hagan, H. M. et al. Oxidative damage targets complexes containing DNA methyltransferases, SIRT1, and polycomb members to promoter CpG Islands. Cancer Cell 20, 606–619 (2011).
  37. Wutz, A. & Jaenisch, R. A shift from reversible to irreversible X inactivation is triggered during ES cell differentiation. Mol. Cell 5, 695–705 (2000).
  38. McCullum, E. O., Williams, B. A., Zhang, J. & Chaput, J. C. Random mutagenesis by error-prone PCR. Methods Mol. Biol. 634, 103–109 (2010).
  39. Tsai, M. C. et al. Long noncoding RNA as modular scaffold of histone modification complexes. Science 329, 689–693 (2010).
  40. Li, Z. et al. Exon-intron circular RNAs regulate transcription in the nucleus. Nat. Struct. Mol. Biol. 22, 256–264 (2015); corrigendum 24, 194 (2017).
  41. Berg, M. G. et al. U1 snRNP determines mRNA length and regulates isoform expression. Cell 150, 53–64 (2012).
  42. Clark, M. B. et al. Genome-wide analysis of long noncoding RNA stability. Genome Res. 22, 885–898 (2012).
  43. Shen, X. et al. Jumonji modulates polycomb activity and self-renewal versus differentiation of stem cells. Cell 139, 1303–1314 (2009).
  44. Schofield, J. A., Duffy, E. E., Kiefer, L., Sullivan, M. C. & Simon, M. D. TimeLapse-seq: adding a temporal dimension to RNA sequencing through nucleoside recoding. Nat. Methods 15, 221–225 (2018).
  45. Neumann, T. et al. Quantification of experimentally induced nucleotide conversions in high-throughput sequencing datasets. BMC Bioinformatics 20, 258 (2019).
  46. Schwalb, B. et al. TT-seq maps the human transient transcriptome. Science 352, 1225–1228 (2016).
  47. Dölken, L. et al. High-resolution gene expression profiling for simultaneous kinetic parameter analysis of RNA synthesis and decay. RNA 14, 1959–1972 (2008).
  48. Tsanov, N. et al. smiFISH and FISH-quant—a flexible single RNA detection approach with super-resolution capability. Nucleic Acids Res. 44, e165 (2016).
  49. Chu, C., Qu, K., Zhong, F. L., Artandi, S. E. & Chang, H. Y. Genomic maps of long noncoding RNA occupancy reveal principles of RNA-chromatin interactions. Mol. Cell 44, 667–678 (2011).
  50. Quinlan, A. R. & Hall, I. M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
  51. Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).
  52. Harrow, J. et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760–1774 (2012).
  53. Tilgner, H. et al. Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs. Genome Res. 22, 1616–1625 (2012).
  54. Trapnell, C. et al. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nat. Protoc. 7, 562–578 (2012); erratum 9, 2513 (2014).
  55. de Dieuleveult, M. et al. Genome-wide nucleosome specificity and function of chromatin remodellers in ES cells. Nature 530, 113–116 (2016).
  56. Yue, F. et al. A comparative encyclopedia of DNA elements in the mouse genome. Nature 515, 355–364 (2014).
  57. Zhang, T. et al. A variant NuRD complex containing PWWP2A/B excludes MBD2/3 to regulate transcription at active genes. Nat. Commun. 9, 3798 (2018).
  58. Yeo, G. & Burge, C. B. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J. Comput. Biol. 11, 377–394 (2004).
  59. Gruber, A. R., Lorenz, R., Bernhart, S. H., Neuböck, R. & Hofacker, I. L. The Vienna RNA websuite. Nucleic Acids Res. 36, W70–W74 (2008).
  60. Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
  61. Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
  62. Shen, L., Shao, N., Liu, X. & Nestler, E. ngs.plot: quick mining and visualization of next-generation sequencing data by integrating genomic databases. BMC Genomics 15, 284 (2014).
  63. Wang, R., Zheng, D., Yehia, G. & Tian, B. A compendium of conserved cleavage and polyadenylation events in mammalian genes. Genome Res. 28, 1427–1441 (2018).
  64. Core, L. J. et al. Analysis of nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers. Nat. Genet. 46, 1311–1320 (2014).
  65. Ntini, E. et al. Polyadenylation site-induced decay of upstream transcripts enforces promoter directionality. Nat. Struct. Mol. Biol. 20, 923–928 (2013).


We thank G. Dreyfuss, P. Sharp, J. Steitz, N. Proudfoot, Y. Shi, J. Zhu, Z. Wang and members of the Shen Laboratory for suggestions. Grant support is from the National Natural Science Foundation of China (31630095 and 31925015 to X.S.; 31900439 to Y.Y.; and 91740204 to Q.C.Z.), the National Basic Research Program of China (2018YFA0107604 and 2017YFA0504204 to X.S.; and 2018YFA0107603 and 2019YFA0110002 to Q.C.Z.), the US National Institutes of Health (R01GM084089 to B.T.), and the Center for Life Sciences and Beijing Advanced Innovation Center for Structural Biology at Tsinghua University (to X.S. and Q.C.Z.). Y.Y. is a recipient of the Outstanding Postdoctoral Program of the Tsinghua Center for Life Sciences.

Author information




X.S. and Y.Y. conceived the project and wrote the manuscript. Y.Y. performed most experiments and bioinformatics analyses. X.Z. and W.S. helped with subcellular fractionation; Y.H. and L.C. helped to construct SNRNP70AID mESCs; J.Y.L. and Y.X. helped with correlation analysis of U1–RNA localization; P.L. carried out prediction of U1-recognition sites under the guidance of Q.C.Z.; G.S. provided initial batches of AMOs; B.T. helped with 3′ss analysis.

Corresponding authors

Correspondence to Yafei Yin or Xiaohua Shen.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature thanks Zhaohui Qin, Claes Wahlestedt and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data figures and tables

Extended Data Fig. 1 REL-seq and mutREL-seq for identification of cis elements that contribute to the subcellular localization of RNA.

a, Comparisons of chromatin enrichment between lncRNAs and protein-coding mRNA genes in human and mouse. We fractionated chromatin-bound RNA in mESCs and compared the RNA-seq profiles of chromatin RNA to total RNA from whole cells. We also analysed a previously published RNA-seq data set (chromatin and cell) from human K562 cells53. LncRNAs with FPKM values of greater than 1 and protein-coding genes with FPKM values of greater than 5 were used for the analysis. Data for samples connected by brackets were compared with two-sided t-tests. Consistent with previous reports6,8, lncRNAs as a class are significantly enriched in the chromatin fraction in both mouse and human cells. Box plots show 5th, 25th, 50th, 75th and 95th percentiles, with median values labelled beside the box plots and sample sizes (n ≥ 1,114) labelled on the axis. b, Detailed pipelines for REL-seq and mutREL-seq. For REL-seq, DNA fragments from candidate genes are randomly fragmented, ligated with adaptors, amplified, and inserted into three types of reporters in the PiggyBac transposon vector (including 5AI, 3AI and GFP reporters). The reporters are co-transfected with PiggyBac transposases (pBASE) and stably integrated into the genome. RNA from the cytoplasm (cyto) and chromatin (chr) fractions is reverse-transcribed (RT) with primers indicated by purple arrows, and amplified with primers P1 and P2 for subsequent high-throughput sequencing. Sequences that are enriched in different subcellular fractions are identified by comparing the read intensities or insert abundance in the chromatin fraction with that of the cytoplasmic fraction. For mutREL-seq, a candidate fragment (NXF1-enChr) is randomly mutagenized through error-prone PCR, and the products are further inserted into the 5AI reporter vector and subjected to downstream procedures similar to those described for REL-seq. Asterisks represent mutation sites. ITR, inverted terminal repeat sequences of the PiggyBac transposon system. 
See also Supplementary Note 1. c, Box plots showing the length of inserts of RTM (RNA transcript mixture, without Xist) and Xist REL-seq libraries in 5AI, 3AI and GFP reporters (i) and 5AI-short reporters (ii). Box plots show 5th, 25th, 50th, 75th and 95th percentiles. In i, n = 20,000 randomly selected inserts for each group. In ii, n = 19,748 for 5AI-short-1; n = 15,091 for 5AI-short-2. d, Western blot analysis of marker proteins in subcellular fractions of mESCs or HEK 293T cells. Tubulin and histone H3 are used as marker proteins for the cytoplasmic and chromatin fractions, respectively. n = 3 independent experiments. e, f, RT–qPCR analysis of the relative abundance of marker genes and candidate genes for REL-seq in subcellular fractions of mESCs (e) or human HEK 293T cells (f). ACTB and Xist are used as markers of mouse cytoplasmic and chromatin fractions, respectively; hACTB and hACTB-intron (intronic region of hACTB) are used as markers of human cytoplasmic and chromatin fractions, respectively. g, Numbers of mutation or deletion events expected and identified by mutREL-seq. A total of 459 mutation events, including 374 mutations (coverage 99.7%) and 85 deletions (coverage 68%), were identified over the 125-base-pair (bp) length of the NXF1-enChr DNA, indicating saturated mutagenesis. h, Analysis of the mutation rate. PCR mutations are spread across the NXF1-enChr sequence, ruling out a PCR bias towards the core 7-nucleotide mutations at positions 37–45. Notably, the two binding sites for PCR primers at the 5′ (1–18-bp) and 3′ (144–162-bp) ends of NXF1-enChr were less likely to be mutated than the middle region, which had an average mutation rate of 0.2–3% at each nucleotide position. This excludes the possibility that the sequences were misread. i, Predicted secondary structure of NXF1-enChr RNA. The U1-recognition site at positions 37–45 is highlighted with a thick blue line. 
A weak U1-recognition site at positions 53–59 is highlighted with a thin blue line. The coloured bar represents the probability of base-pairing or being unpaired (red, high probability of pairing (or lack of pairing); blue, low probability). j, Comparison of predicted secondary structures of the wild-type (i) and mutant (ii, iii) NXF1-enChr (partial sequences are shown). Purple triangles (ii, iii) highlight mutation sites. See also Supplementary Note 2. k, The percentage of enChrs or all target sequences (‘total’ = (sum of length)/(median length of enChr)) used for REL-seq analysis that overlap with predicted U1-recognition sites. P-values (one-sided Fisher’s exact test) and sample sizes are shown at the top.
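The coverage figures quoted in panel g can be checked with a few lines of arithmetic. The sketch below assumes that "expected" events means three alternative bases at each of the 125 mutagenized positions for substitutions, and one single-nucleotide deletion per position; the legend does not spell out this definition, so treat both counts as assumptions that happen to reproduce the stated percentages.

```python
# Sketch: coverage of a saturation-mutagenesis library (assumed definitions).
MUTAGENIZED_LENGTH = 125      # bp of NXF1-enChr between the primer sites (from the legend)
SUBSTITUTIONS_PER_SITE = 3    # assumption: each base can change to three others


def coverage(observed: int, possible: int) -> float:
    """Return the percentage of possible events that were observed."""
    return 100.0 * observed / possible


# Assumed universe of possible events.
possible_substitutions = MUTAGENIZED_LENGTH * SUBSTITUTIONS_PER_SITE
possible_deletions = MUTAGENIZED_LENGTH  # assumption: single-nucleotide deletions only

# Observed counts as reported in panel g.
observed_substitutions = 374
observed_deletions = 85

substitution_coverage = coverage(observed_substitutions, possible_substitutions)
deletion_coverage = coverage(observed_deletions, possible_deletions)

print(f"substitution coverage: {substitution_coverage:.1f}%")
print(f"deletion coverage: {deletion_coverage:.1f}%")
```

Under these assumptions the substitution library is missing a single possible change, which is what "saturated mutagenesis" implies.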

Extended Data Fig. 2 Genome-browser views of representative genes showing REL-seq results.

The enChrs (shown as thick black bars, also highlighted by dashed boxes) indicate regions with significant chromatin enrichment that were identified by AI/GFP or AI reporter screens (P < 0.05, fold change greater than 1.5). Representative chromatin-enriched inserts from respective reporters (5AI-short or GFP) are shown at the bottom of each panel. a, b, No enChrs were identified by REL-seq in the cytoplasm-localized protein-coding ACTB (a) and NANOG (b) transcripts. In these transcripts, U1 signals are mainly confined to the intronic regions and appear to be depleted in exons. Scales are shown in square brackets at the right of each track. c–g, REL-seq identified multiple enChrs in mouse Malat1 (c), NR_028425 (d), Neat1 (e), Xist (f), and NXF1-IR (g) transcripts. In c, two mouse regions homologous to regions E and M in human Malat1 are shown with thick blue lines. In f, only minus-strand tracks are shown. The plus-strand tracks for Tsix, which is transcribed in the antisense direction to Xist RNA, are shown in Extended Data Fig. 6c. The locations of different repeats in Xist are shown as thick blue lines. In g, multiple strong U1-recognition sites are clustered in the 162-nucleotide NXF1-enChr (highlighted by dashed boxes) and nearby sequences. U1 snRNP also binds strongly (roughly 70-fold maximal enrichment versus the input) to this region, with peak signals that are centred at the 7-nucleotide U1 motif revealed by mutREL-seq (Fig. 1a), providing evidence for direct interactions of NXF1-enChr RNA and U1 snRNP in vivo. The predicted strong or medium U1-recognition sites are shown underneath the NXF1 gene annotation. RNA-seq signals of U1 RAP–RNA with formaldehyde (FA) or 4′-aminomethyltrioxalen (AMT) crosslinking and respective input controls are also shown. AMT generates interstrand crosslinks between uridine bases to detect the direct RNA–RNA interactions of highly expressed transcripts17. 
FA stabilizes both direct and indirect interactions of proteins and nucleotides. Also shown is the average fold change (log2) of the read intensity of the chromatin fraction compared with that of the cytoplasmic fraction in inserts from AI/GFP reporters or AI reporters. Red lines represent chromatin enrichment, while green lines represent chromatin depletion, of signals.
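The enChr criterion stated in this legend (P < 0.05, fold change greater than 1.5) amounts to a two-threshold filter. The following is a minimal sketch of that filter only; the `Region` class, the region names and the example values are hypothetical, and the sketch does not reproduce how the authors computed fold changes or P-values.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Hypothetical summary of one candidate region from a REL-seq screen."""
    name: str
    fold_change: float  # chromatin / cytoplasm read intensity, linear scale
    p_value: float


def is_enchr(region: Region, min_fold_change: float = 1.5, max_p: float = 0.05) -> bool:
    """Apply the two thresholds quoted in the legend."""
    return region.p_value < max_p and region.fold_change > min_fold_change


# Made-up example values, for illustration only.
regions = [
    Region("region_a", fold_change=2.3, p_value=0.01),   # passes both thresholds
    Region("region_b", fold_change=1.2, p_value=0.001),  # significant but weak enrichment
    Region("region_c", fold_change=3.0, p_value=0.20),   # enriched but not significant
]

enchrs = [r.name for r in regions if is_enchr(r)]
print(enchrs)
```

Requiring both conditions keeps regions that are enriched and statistically supported, rather than either alone.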

Extended Data Fig. 3 Reporter assays reveal a key role of the U1 motif and U1 snRNP in RNA–chromatin association.

a, Diagrams showing the GFP reporters that we constructed and analysed here. EGFP, enhanced GFP; EV, empty vector; PAS, bovine growth hormone (BGH) polyadenylation signal; 3′ Malat1, 3′ termination sequence of Malat1; U1 (28-nt), a short 28-nucleotide NXF1-enChr sequence that encompasses two U1-recognition motifs (one strong and one weak site from position 32 to 59 in NXF1-enChr). The NXF1-enChr GFP reporters (PAS and 3′ Malat1) contain the 162-nucleotide NXF1-enChr sequences with either a wild-type U1-recognition site or a mutated site (mutU1, vertical black line). To rule out an indirect effect of RNA degradation on its chromatin localization, we replaced the PAS with the 3′ termination sequence of Malat1. The 3′ end of Malat1 possesses a triple-helix structure, which resembles the viral expression and nuclear retention element (ENE) and stabilizes the Malat1 transcripts21,22. To assess the specificity of U1-mediated chromatin association, we also constructed GFP reporters carrying RepA alone and RepA with NXF1-enChr. We used two primer pairs (q1F/q1R and t1F/t1R; red arrows) to analyse the expression and potential splicing of the insert. b, The insertion of NXF1-enChr (U1) did not elicit splicing. The agarose gel picture shows PCR products amplified by primers t1F and t1R in mESCs expressing the NXF1-enChr (PAS) or NXF1-enChr-mutU1 (PAS) constructs. cDNAs were used as templates and the corresponding plasmids were used as control templates. n = 2 independent experiments. c, Subcellular fractionation and RT–qPCR analysis of GFP RNA in various reporters. Endogenous expression of ACTB and Malat1 serves as internal controls. d, GFP fluorescence imaging of mESCs expressing the EV, NXF1-enChr (PAS) or NXF1-enChr-mutU1 (PAS) constructs shown in a. GFP fluorescence is much weaker in cells expressing the NXF1-enChr-PAS construct. n = 4 independent experiments. e, RT–qPCR analysis of chromatin/cytosol ratios (top) and relative expression (bottom) of GFP RNA in mESCs. 
The relative chromatin/cytosol ratio was normalized to a spike-in RNA that was prepared by in vitro transcription. Data shown as mean ± s.e.m.; n = 2 biological replicates. f, RT–qPCR analysis of the chromatin/non-chromatin ratio (i) and relative expression (ii) of GFP RNA in human HEK 293T and HeLa cells. Similar results to those shown in e were observed in human cells, suggesting a conserved U1-based mechanism in humans and mice. g, Additive effect of multiple enChrs in promoting RNA–chromatin association. The RepA-NXF1-enChr reporter exhibited significantly higher chromatin enrichment than RepA or NXF1-enChr alone. In f, g, data are shown as mean ± s.e.m.; P-values obtained by two-sided t-test with three biological replicates. h, Representative proteins identified by NXF1-enChr RNA pull-down assay. The mass-spectrometry scores of proteins identified by NXF1-enChr and NXF1-enChr-as RNA pull-down are shown, together with their fold enrichment in the NXF1-enChr sample relative to the antisense control (enChr-as). Sm proteins are general components of snRNPs that bind snRNAs. i, Western blot confirming the specific interaction between SNRNP70 and NXF1-enChr. Controls were the antisense sequence and the sequence with a deletion of the strong U1-recognition site (ΔU1). n = 3 independent experiments. j, RNAi in mESCs harbouring the NXF1-enChr (PAS) GFP reporter. Depletion, using short hairpin RNA (shRNA) knockdown (KD) of the three core components of U1 snRNP (SNRNP70, SNRPA and SNRPC), but not of SPEN or splicing regulators (RBM6, U2AF1/2 and SF3A2), led to 2.5–4.5-fold increases in GFP signals analysed by FACS. k, Knockdown efficiency of SNRNP70 using two shRNAs in mESCs. Top, RT–qPCR, mean ± s.e.m.; bottom, western blot. n = 2 independent experiments. 
l, m, RT–qPCR analysis of the chromatin/cytosol ratio (top) and relative expression (bottom panel) of GFP RNA in mESCs after knocking down SNRNP70 (l) or SPEN (m) in mESCs expressing the NXF1-enChr-U1 reporter or RepA-GFP reporter. Means ± s.e.m. are shown. n = 2 biological replicates. Scr, scrambled control shRNA. n, Mutation analysis of the 5′ and 3′ splice sites using a GFP reporter carrying the intron-3 sequence of ACTB. i, The mutation scheme. ii, RT–qPCR analysis of relative chromatin/non-chromatin ratios (left) and expression (right) of GFP RNA in mESCs expressing different constructs. iii, PCR bands of spliced and unspliced transcripts. Splicing was abolished in the 3′ss and 5′+3′ss mutants. Splicing was detected in the 5′ss mutant reporter (red asterisk) owing to the presence of an alternative 5′ss downstream of the mutated site. iv, Results of sequencing PCR fragments of the 5′ss mutant reporter. Data are shown as mean ± s.e.m., and include three biological replicates. P-values obtained from two-sided t-test. See also Supplementary Notes 3 and 4. For gel source data, see Supplementary Fig. 1.
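A spike-in-normalized chromatin/cytosol ratio of the kind reported in panels e, l and m can be illustrated as follows. This is a sketch under stated assumptions: it takes qPCR threshold cycle (Ct) values as input and assumes perfect doubling per cycle, which is the common delta-Ct convention and not necessarily the exact calculation used for these figures. The function names and Ct values are hypothetical.

```python
def spike_normalized_abundance(ct_target: float, ct_spike: float) -> float:
    """Abundance of a target relative to the spike-in RNA in the same fraction.

    Assumes 100% amplification efficiency, so one cycle equals a twofold change.
    """
    return 2.0 ** -(ct_target - ct_spike)


def chromatin_to_cytosol_ratio(
    chr_target: float, chr_spike: float, cyto_target: float, cyto_spike: float
) -> float:
    """Ratio of spike-in-normalized abundance in chromatin versus cytosol."""
    chromatin = spike_normalized_abundance(chr_target, chr_spike)
    cytosol = spike_normalized_abundance(cyto_target, cyto_spike)
    return chromatin / cytosol


# Made-up Ct values: the target sits 2 cycles behind the spike-in on chromatin
# and 5 cycles behind it in the cytosol.
ratio = chromatin_to_cytosol_ratio(
    chr_target=22.0, chr_spike=20.0, cyto_target=25.0, cyto_spike=20.0
)
print(ratio)
```

Normalizing each fraction to the same spike-in corrects for differences in RNA recovery between fractions before the ratio is taken.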

Extended Data Fig. 4 Differential distributions of U1-recognition sites and 3′ splice sites in mRNA and lncRNA genes.

a, Average density of predicted strong U1-recognition sites in genic regions (i) and repeat elements (ii), and average density of predicted 3′ss in genic regions (iii), in humans (top panels) and mice (bottom panels). Random sequences, random intergenic sequences, and reverse-complementary sequences serve as controls for the background. LINE and SINE, long and short interspersed nuclear elements; LTR, long terminal repeat. b, Comparison of the density of predicted strong U1-recognition sites in the reverse-complementary strand of exons (top) and of 3′ss in the gene-body region (bottom) of lncRNAs (U1 site, n = 4,731; 3′ss, n = 21,512) and mRNA genes (U1 site, n = 68,881; 3′ss, n = 139,458) in humans. Only genes with transcript lengths (for U1 site) or genomic lengths (for 3′ss) larger than 1 kilobase were analysed. c, Densities of predicted medium-strength U1-recognition sites in exons of lncRNAs and mRNA genes in humans (top; lncRNA, n = 4,731; mRNA, n = 68,881) and mice (bottom; lncRNA, n = 3,385; mRNA, n = 47,298). Transcripts with total exon lengths of more than 1,000 nucleotides were analysed. d, Densities of predicted strong U1-recognition sites in introns of lncRNA and mRNA genes in humans (top; lncRNA, n = 15,885; mRNA, n = 130,431) and mice (bottom; lncRNA, n = 6,467; mRNA, n = 72,688). e, Densities of predicted strong U1-recognition sites in the sense strand (top) and reverse-complementary strand (bottom) of mRNA with different levels of chromatin-binding activity (n = 7,773 each) and lncRNA (n = 1,038) transcripts (from low to high: green, low; orange, moderate; red, high; dark red, lncRNA). f, Comparison of expression (top) and fold enrichment of U1 RAP–RNA signals in introns (bottom) in mRNA genes (n = 2,262 each group) and lncRNAs (n = 97) with different levels of chromatin binding. LncRNAs that show detectable expression (FPKM values greater than 1) and no overlap with protein-coding genes were used. In b–f, P-values are based on two-sided t-tests. 
Box plots show 5th, 25th, 50th, 75th and 95th percentiles, with median values labelled by the box plots.
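
As an illustration of the box-plot convention stated above (5th, 25th, 50th, 75th and 95th percentiles rather than the more common quartiles plus 1.5 × IQR whiskers), the following minimal Python sketch computes those five summary values. The function name `box_plot_summary`, the example densities and the choice of the "inclusive" interpolation method are assumptions made for illustration; the legend does not state which interpolation rule or software was used.

```python
from statistics import quantiles

def box_plot_summary(values):
    """Return the 5th, 25th, 50th, 75th and 95th percentiles of `values`."""
    # quantiles(..., n=20) returns 19 cut points at 5%, 10%, ..., 95%.
    # The "inclusive" interpolation rule is an assumption, not taken from the paper.
    cuts = quantiles(values, n=20, method="inclusive")
    # The cut point for percentile p sits at index p/5 - 1.
    return {p: cuts[p // 5 - 1] for p in (5, 25, 50, 75, 95)}

# Hypothetical site densities (sites per kilobase), for illustration only.
densities = [0.2, 0.4, 0.5, 0.7, 0.9, 1.1, 1.3, 1.6, 2.0, 2.8]
summary = box_plot_summary(densities)
for percentile, value in summary.items():
    print(f"{percentile}th percentile: {value:.3f}")
```

With this convention the whiskers span the central 90% of the data, so the most extreme 5% on each side is not drawn.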

Extended Data Fig. 5 U1 snRNP regulates the chromatin retention of lncRNAs.

a, Scheme illustrating inhibition of U1 and/or U2 snRNA by AMO nucleofection followed by strand-specific RNA-seq analysis of total RNAs isolated from whole cells and from three subcellular fractions: cytoplasm (cyto), nucleoplasm (nuc) and chromatin (chr). CT, scramble control AMO. b, RT–qPCR of the chromatin/non-chromatin ratios (i) and relative expression (ii) of representative lncRNAs in mESCs after 2-h treatments with the control (CT), U1, U2 or U1/2 AMOs (upper panels), or after 72 h of knockdown with scramble (Scr), SNRNP70 or SPEN shRNAs (bottom panels). (Non-chromatin refers to cytoplasm plus nucleoplasm.) Means ± s.e.m. are shown. n = 2 biological replicates. Consistent with inhibition of U1 snRNA, knockdown of SNRNP70—a core component of U1 snRNP—also led to decreased chromatin signals of individual lncRNAs. c, Venn diagram showing lncRNAs with decreased chromatin associations upon U1, U2 or U1/2 inhibition. Compared with U1 AMO (337), inhibition of U2 snRNA affected a smaller number (91) of lncRNAs, most of which (69) belong to U1-regulated lncRNAs. Inhibition of both U1/2 snRNA did not elicit a stronger effect than inhibition of U1 alone. Note that inhibition of both U1 and U2 used a lower concentration of U1 and U2 AMOs (50 μM each) than the 75 μM used to inhibit U1 alone. d, Metaplots of whole-cell RNA-seq reads in intron–exon junctions upon inhibition of U1 and/or U2 snRNA (i) or upon SNRNP70AID degradation (ii). RPM, read count per million mapped reads. e, Volcano plot showing the fold change (log2) of the chromatin/non-chromatin ratio of lncRNAs (n = 1,282) upon inhibition of U1 snRNA (using AMOs, 2 h). P-values obtained by two-sided t-test with three biological replicates. 
f, Heat maps showing the ratio of chromatin/non-chromatin, the ratio of each fraction versus the normalized total RNA contents, and the expression level of 337 lncRNAs with decreased chromatin association upon U1 AMO inhibition (i); or 346 chromatin-downregulated lncRNAs in total RNA-seq upon SNRNP70AID degradation (ii). Upon U1 AMO inhibition or SNRNP70AID degradation, these lncRNAs show decreased chromatin association, while their relative abundance in cytoplasmic and nucleoplasm fractions even increased. g, Diagram showing the construction of the SNRNP70AID mESC line, which expresses an AID- and FLAG-tagged SNRNP70 (SNRNP70AID) in a transgene with the two endogenous SNRNP70 alleles inactivated. TIR1, DNA expressing E3 ligase for the AID system. h, i, Time-course expression analysis by western blot (h) and by RT–qPCR (i) in SNRNP70AID mESCs upon addition of auxin for 0–12 h. n = 2 independent experiments. For panel i, means ± s.e.m. are shown. RNA expression was normalized by a lacZ spike-in that was added into the same numbers of cells. At 4 h of auxin treatment, the expression of p53 protein and the RNA transcripts analysed exhibited modest changes compared to the changes after 8–12 h. j, Auxin-induced rapid degradation of SNRNP70AID did not affect the phosphorylation of Pol II at Ser 2 (S2P). ACTB serves as a loading control. n = 3 independent experiments. k, RT–qPCR analysis showing enrichments of candidate lncRNAs captured by U1 ChIRP before (0 h) and after (4 h) SNRNP70AID depletion. U1 binding to its target candidate lncRNAs was severely impaired in auxin-treated mESCs (4 h), indicating that the function of U1 snRNA requires an intact snRNP complex. Data shown as mean ± s.e.m., from three biological replicates. P-values obtained by two-sided t-test. l, The overlap of a total of 531 U1-snRNP-regulated, chromatin-downregulated lncRNAs from U1 AMO inhibition (red) or SNRNP70AID degradation (blue). P-values obtained by exact hypergeometric probability. 
m, Heat map showing changes of the chromatin/non-chromatin ratio upon U1 AMO inhibition (i) or degradation of SNRNP70AID (ii) for the set of 531 U1-snRNP-regulated, chromatin-downregulated lncRNAs shown in l. The patterns of chromatin/non-chromatin ratio changes are highly similar for both treatments. n, Analysis of U1 RAP–RNA signals (i), expression levels (FPKM) (ii) and transcript lengths (iii–v) in the various sets of lncRNAs shown in l. Box plots show 5th, 25th, 50th, 75th and 95th percentiles, with median values labelled by the box plots and sample sizes (n ≥ 337) labelled on the axis. We further divided mESC-expressed lncRNAs into three groups on the basis of their transcript length (from shortest to longest). iv, v, The chromatin/non-chromatin ratio of total RNA-seq (iv) and the numbers of U1-snRNP-regulated and unregulated lncRNAs (v) in each group (n = 427). Longer lncRNAs appear to exhibit stronger chromatin retention and to be preferentially affected upon SNRNP70AID degradation. P-values obtained by two-sided Mann–Whitney test for i–iv, one-sided Fisher’s exact test for v. For gel source data, see Supplementary Fig. 1.
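
The overlap test in panel l can be sketched as an exact hypergeometric upper-tail probability. This Python example is an illustration under stated assumptions, not the authors' code: the universe of 1,282 mESC-expressed lncRNAs is borrowed from panel e, and the overlap of 152 is derived as 337 + 346 − 531 on the assumption that 531 is the union of the two sets. The function name `hypergeom_upper_tail` is hypothetical.

```python
from math import comb

def hypergeom_upper_tail(universe, set_a, set_b, overlap):
    """P(X >= overlap) when `set_b` items are drawn without replacement
    from `universe` items, of which `set_a` are marked."""
    max_overlap = min(set_a, set_b)
    total = comb(universe, set_b)
    # Sum the exact probability mass over all overlaps at least as large
    # as the observed one; Python integers keep the sum exact.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total

# Assumed inputs (see the paragraph above): universe and overlap are
# not stated for panel l and are inferred for illustration only.
universe = 1282          # mESC-expressed lncRNAs (from panel e)
u1_amo = 337             # chromatin-downregulated upon U1 AMO
snrnp70_aid = 346        # chromatin-downregulated upon SNRNP70AID degradation
overlap = u1_amo + snrnp70_aid - 531   # 152 if 531 is the union

p_overlap = hypergeom_upper_tail(universe, u1_amo, snrnp70_aid, overlap)
print(f"overlap = {overlap}, P(X >= overlap) = {p_overlap:.3g}")
```

Under random sampling the expected overlap would be about 337 × 346 / 1,282 ≈ 91, so an overlap of 152 lies far in the upper tail.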

Extended Data Fig. 6 Sequencing tracks of representative lncRNA and mRNA genes.

a, Malat1. b, Kcnq1ot1. c, Tsix. d, Pvt1 and its upstream protein-coding gene, MYC. e, Meg3. f, Rian. g, Lncenc1. In the Tsix locus (c), the top three sets of tracks show the predicted U1-recognition sites (strong or medium, indicated by blue vertical lines), U1 RAP–RNA-seq with formaldehyde (FA) crosslinking (including enrichment ratio, read signals of U1 RAP and the input control), and the REL-seq result. The lower two sets of tracks show total RNA-seq after AMO treatments, and total and polyA RNA-seq and TT-seq of SNRNP70AID mESCs at 0 h or 4 h of auxin treatment. Total RNA-seq analysis revealed decreased chromatin levels of Tsix transcripts after inhibition of U1 snRNA or degradation of SNRNP70AID. Intriguingly, polyA-seq showed a more dramatic increase of polyadenylated Tsix transcripts in the whole-cell sample compared with that in the chromatin fraction upon SNRNP70AID degradation. Coincidentally, a U1-associated enChr (black vertical line) was identified by REL-seq at roughly 2.2 kb upstream of the annotated transcription end site (TES) of Tsix. In addition, strong binding of U1 snRNA was detected extensively across the whole exon at the 3′ end of Tsix. Thus, U1 snRNP may inhibit the PAS of Tsix to promote its degradation and chromatin retention. Inhibition of U1 snRNP by degradation of SNRNP70AID thus appears to enhance polyadenylation, stability and nuclear export of Tsix, leading to the observed increases of polyA RNA in both whole-cell and chromatin fractions. Nevertheless, the chromatin/whole-cell ratio of polyA Tsix still decreases after SNRNP70AID degradation. Two well-spliced lncRNAs, Meg3 (e) and Rian (f), show few intronic signals in both total and polyA RNA-seq. In d–g, U1 RAP–RNA-seq (FA), total RNA-seq after AMO treatments, and total and polyA RNA-seq and TT-seq tracks of SNRNP70AID mESCs at 0 h or 4 h auxin treatment are shown.
Some very large lncRNAs—such as Kcnq1ot1 (b, 83.4 kb in the genome sequence), Tsix (c, 53.4 kb) and Pvt1 (d, 213 kb)—show decreased RNA signals in the downstream gene body upon U1 AMO treatment; however, this effect was less obvious in SNRNP70AID mESCs.

Extended Data Fig. 7 Analysis of the direct causality of U1 snRNP in regulating lncRNA–chromatin retention.

a, Summary of experiments that we carried out to systematically investigate the effects of acute inhibition of U1 snRNP on lncRNA–chromatin associations, transcription dynamics, and RNA processing and decay. b, Volcano plot of polyA RNA-seq showing the fold change (log2) in the chromatin/whole-cell ratio of lncRNAs upon SNRNP70AID degradation. Red dots and deep grey dots indicate lncRNAs that show a significant decrease or increase, respectively, in their chromatin/cell ratio by comparing SNRNP70AID (4-h auxin treatment) versus control (0 h) (P < 0.05; two-sided t-test with three biological replicates). c, Correlation plot of polyA and total RNA-seq analysis of SNRNP70AID mESCs. The set of chromatin-downregulated lncRNAs shows significantly correlated changes in chromatin localization upon degradation of SNRNP70AID (n = 346). d, SLAM-seq analysis of chromatin and non-chromatin (cytoplasm and nucleoplasm) fractions in SNRNP70AID mESCs. SNRNP70AID mESCs were treated with or without auxin for 2 h and then labelled with 4sU for 3 h. After chemical conversion of the incorporated 4sU nucleotides to cytidine, RNA from various subcellular fractions was isolated for 3′-end polyA-seq library construction. e, Box plots showing the conversion rate detected by SLAM-seq of chromatin fractions in the last exon of genes with detectable new transcript (n = 24,097) before (0 h) and after (4 h) SNRNP70AID degradation. Box plots show 5th, 25th, 50th, 75th and 95th percentiles. f, Pearson correlation analysis of the change in chromatin/non-chromatin ratio for new versus all transcripts of lncRNAs with detectable new transcripts (n = 492) identified by SLAM-seq. g, Volcano plots showing expression changes of mESC-expressed lncRNAs (n = 1,282) after treatment with U1 AMO (i) or degradation of SNRNP70AID (ii). 
Chromatin-downregulated or non-downregulated lncRNAs were further classified into ‘downregulated (down-)’, ‘upregulated (up-)’ or ‘unchanged’ according to their expression changes in whole-cell samples. LncRNAs with reduced chromatin association upon inhibition of U1 snRNA or SNRNP70AID do not show greater downregulated expression by comparison with all lncRNAs. Only a small proportion of them (84 of 337 U1-regulated and 76 of 346 SNRNP70-regulated) show decreased transcript levels. P-values obtained by two-sided t-test; n = 3 biological replicates. h, Metagene analysis of whole-cell RNA-seq reads for the set of U1-snRNP-regulated, chromatin-downregulated lncRNAs in mESCs. Only lncRNAs that do not overlap with any protein-coding gene on the same strand were analysed (n = 239). Similar read-distribution patterns were observed in control cells and in cells treated with U1, U2 or U1/U2 AMOs (i) or in cells subjected to auxin-induced degradation of SNRNP70AID for 0 h or 4 h (ii). Thus, rapid inhibition of U1 snRNP did not cause global transcription termination, although we did observe decreased downstream RNA signals in a few very long lncRNAs, such as Kcnq1ot1 (83 kb), in agreement with the proposed role of U1 telescripting in protecting the transcription integrity of very large transcripts25. In addition, we conjecture that the slight decreases in total transcript levels are likely to be post-transcriptionally mediated by RNA degradation instead of an effect of U1 inhibition on nascent transcription (see panels i–k below). i, Metagene analysis of TT-seq signals in all mESC-expressed genes (n = 10,675) (i) and chromatin-downregulated (n = 239) and non-downregulated lncRNAs (n = 151) (ii) upon SNRNP70AID degradation. Only lncRNAs that do not overlap with any protein-coding gene on the same strand were analysed.
j, k, Metagene analysis of ChIP-seq signals of Pol II S5P (‘paused’ Pol II; i) and S2P (‘elongating’ Pol II; ii) across the gene body and upstream/downstream 5-kb region of all mESC-expressed genes (n = 10,675) and unexpressed genes (n = 7,933) (j) or chromatin-downregulated (n = 155) and non-downregulated (n = 66) lncRNAs upon SNRNP70AID degradation (k). Only lncRNAs that do not overlap with any protein-coding gene on either strand were analysed. For h–k, shadings represent 95% confidence intervals for the average enrichment. l, RT–qPCR analysis of the knockdown efficiency and lncRNA expression level change (i), chromatin/non-chromatin ratio (ii) and relative expression (iii) in SNRNP70AID mESCs depleted of EXOSC3 by RNA inhibition. Knockdown was analysed at 72 h after shRNA viral infection. The observation of increased expression for most lncRNAs analysed is consistent with a role of EXOSC3 in mediating RNA degradation (i). Knockdown of EXOSC3 blocked RNA degradation for most lncRNAs analysed (iii), but failed to rescue their decreased chromatin associations induced by auxin (ii). Thus, the effect of U1 snRNP in promoting lncRNA-chromatin binding is not caused by increased RNA degradation. Means ± s.e.m. are shown; P-values obtained by two-sided t-test for three biological replicates.
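
The quantity plotted in the volcano plots (panel b here, and Extended Data Fig. 5e) is a log2 fold change in a chromatin/non-chromatin (or chromatin/whole-cell) ratio across three biological replicates. The Python sketch below shows one way such a value and its test statistic might be computed for a single lncRNA. Every name and number in it is hypothetical, and it assumes the test is applied to log2-transformed ratios using Welch's unequal-variance form; the legend specifies only a two-sided t-test.

```python
from math import log2, sqrt
from statistics import mean, variance

def ratio_change(control, treated):
    """Return (log2 fold change, Welch t statistic) for one transcript.

    `control` and `treated` are lists of (chromatin, non_chromatin)
    abundance pairs, one pair per biological replicate.
    """
    c = [log2(chrom / non_chrom) for chrom, non_chrom in control]
    t = [log2(chrom / non_chrom) for chrom, non_chrom in treated]
    log2_fc = mean(t) - mean(c)
    # Welch standard error; an assumption, since the legend does not say
    # whether equal variances were assumed.
    se = sqrt(variance(c) / len(c) + variance(t) / len(t))
    t_stat = log2_fc / se if se > 0 else float("inf")
    return log2_fc, t_stat

# Hypothetical normalized abundances for one lncRNA, three replicates each.
control = [(8.0, 2.0), (16.0, 2.0), (4.0, 2.0)]   # 0 h auxin
treated = [(2.0, 2.0), (4.0, 2.0), (1.0, 2.0)]    # 4 h auxin
log2_fc, t_stat = ratio_change(control, treated)
print(f"log2 fold change = {log2_fc:.2f}, t = {t_stat:.2f}")
```

Converting the statistic to a two-sided P-value requires the t distribution, which is not in the Python standard library; a statistics package such as SciPy (`scipy.stats.ttest_ind`) would be one option.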

Extended Data Fig. 8 U1 snRNP regulates lncRNA–chromatin association through its interaction with transcriptionally engaged Pol II.

a, Western blots of SNRNP70, SNRPC, Pol II and marker proteins with no treatment (i), or treated with DMSO, triptolide (TPL), or flavopiridol (Flav) for 1 h (ii), or in nuclear fractions sequentially extracted with increasing concentrations of salt (NaCl) (iii). For i, iii, n = 3 independent experiments, for ii, n = 2 independent experiments. b, Scheme for co-immunoprecipitation of the chromatin fraction. Benzonase was used to digest all DNA and RNA and to elute proteins from chromatin. c, Proteins captured by SNRNP70 coIP-SILAC. Native chromatin extracts that were released by benzonase were subjected to anti-FLAG co-immunoprecipitation of SNRNP70AID protein (FLAG-tagged) coupled with stable-isotope labelling by amino acids in cell culture, followed by mass spectrometry. SNRNP70 purification captured the U1 and U2 snRNPs, as well as several components of the U4/6-U5 snRNPs and other splicing factors. Notably, the SNRNP70 interactome also identified proteins involved in transcription regulation, such as POLR2A, SPT5 and SPT6. d, Chromatin-based co-immunoprecipitation (in the presence of benzonase) and western blot analysis confirming the interactions between U1 snRNP and the proteins identified in c. Panels i, ii show co-immunoprecipitation of FLAG-tagged SNRNP70AID protein. SNRNP70AID mESCs treated with auxin for 4 h serve as the negative control. Panel iii shows that co-immunoprecipitation of endogenous SNRPC captured the total Pol II. Panel iv shows that co-immunoprecipitation of Pol II S2P captured SNRNP70 under a physiological salt condition (150 mM) but not in high-salt conditions, suggesting dynamic associations between U1 snRNP and engaged Pol II. n = 3 independent experiments. 
e, RT–qPCR analysis of the expression of representative lncRNAs and U1 snRNA upon treatment with DMSO (control), 1 μM of E7107 (inhibitor of U2 snRNP), 1 μM of flavopiridol (transcription inhibitor) or 1 μM of triptolide (transcription inhibitor) for 1 h, or with 100 μM of isoginkgetin (inhibitor of U4/6-U5 snRNP) for 2 h. Means ± s.e.m. are shown; P-values obtained by two-sided t-test with three biological replicates. f, Metagene analysis of U1 RAP–DNA-seq signals for all Ensembl genes in mESCs treated with DMSO or with flavopiridol (Flav) for 1 h. Flavopiridol treatment led to reduced U1 RAP–DNA signals downstream of the TSS across the gene body. Red shading represents 95% confidence intervals for the average enrichment. g, Genome-browser views of U1 RAP–DNA-seq and Pol II ChIP-seq at the Malat1 and Tsix loci. 8WG16, hypophosphorylated Pol II; NTD, N-terminal domain of Pol II, representing the total Pol II. h, Relative chromatin/non-chromatin ratio (i), expression-level change of representative lncRNAs (ii) and knockdown efficiency of targeted genes (iii) after mESCs were treated with PRPF8 or SNRNP200 shRNAs for 72 h. Means ± s.e.m. are shown; P-values obtained by two-sided t-test with three biological replicates. For gel source data, see Supplementary Fig. 1.

Extended Data Fig. 9 Inhibition of U1 and U2 snRNPs downregulates the chromatin association of uaRNAs and eRNAs.

a, Metaplots of U1 RAP–DNA-seq, showing enrichment of U1 snRNA in the chromatin proximity of regulatory DNA sequences. The top panel shows a ± 5-kb window flanking TSSs of Ensembl genes that do not overlap with any other gene within 2 kb (n = 18,972), and the bottom panel shows a ± 5-kb window flanking enhancers that do not overlap with a gene within 2 kb (n = 3,767). b, c, Sequencing tracks showing chromatin and cytoplasmic RNA-seq signals of uaRNAs/eRNAs in the PHC1 promoter (b) and Haunt enhancer (c). d, e, Metaplots of RNA-seq reads of uaRNAs from whole cells and subcellular fractions in AMO-treated samples (d) and SNRNP70AID-degraded samples (e) in a ± 2-kb window flanking the TSSs of Ensembl genes that do not overlap with any other gene within 2 kb (n = 18,972). The uaRNAs show upregulated overall expression and more dramatic increases in the cytoplasmic and nucleoplasmic fractions after U1 or U1/2 inhibition, while there are comparable (U1 AMO) or slightly decreased (U1/2 AMO) uaRNA signals in the chromatin fraction at the TSS-to-1-kb upstream region. f, g, Metaplots of RNA-seq reads of eRNAs from whole cells and subcellular fractions in AMO-treated samples (f) and SNRNP70AID-degraded samples (g) in a ± 2-kb window flanking enhancers that do not overlap with any gene within 2 kb (n = 3,767).

Extended Data Fig. 10 U1 snRNP tethers and mobilizes lncRNAs to chromatin.

a, b, Immunofluorescence analysis of SNRNP70 (a) and SC35 (b) in auxin-treated SNRNP70AID and wild-type mESCs. SNRNP70AID ESCs marked by stably integrated GFP transgenes (GFP+, highlighted with dashed white lines) and GFP-negative wild-type mESCs (GFP−) were mixed and treated with auxin for 4 h. n = 3 independent experiments. c, Quantification of the numbers of Malat1 speckles (equivalent diameter greater than 0.5 μm) identified by RNA FISH (Fig. 3c). Box plots show 5th, 25th, 50th, 75th and 95th percentiles, with median values labelled by the plots and sample sizes (wild type, n = 52; SNRNP70AID, n = 88) labelled on the axis. P-values obtained by two-sided Mann–Whitney test. d, Sequencing tracks of Malat1 ChIRP-DNA-seq and TT-seq in Malat1 (i) and representative loci that are targeted by Malat1 (ii). In both panels, the top set of tracks shows Malat1 ChIRP-seq upon SNRNP70AID degradation (at 0 h and 4 h) or upon treatment with DMSO control or triptolide (TPL). The bottom set of tracks shows TT-seq upon SNRNP70AID degradation (at 0 h and 4 h). In panel ii, TT-seq signals on both plus and minus strands are shown. We used the mm9 mouse genome assembly. e, qPCR analysis of Malat1 ChIRP-DNA of SNRNP70AID mESCs before (0 h) and after (4 h) treatment with auxin. Data are shown as mean ± s.e.m., for three biological replicates. P-values obtained by two-sided t-test. f, Mechanistic representation of U1 snRNP and its interplay with Pol II and PASs in regulating the tethering and mobilization of noncoding RNA on chromatin. Notably, lncRNAs, uaRNAs and eRNAs share many features, including chromatin association, inefficient or absent splicing and polyadenylation, low-level expression and short half-lives7,8,42,53,63. LncRNAs in general are enriched with 5′ U1-recognition sites but depleted of 3′ splice sites.
For uaRNAs and eRNAs, U1 binding on chromatin is enriched at enhancer DNA sequences and TSSs (the 5′ end of uaRNAs), even though U1-recognition sites are depleted in uaRNA DNA sequences17,30,64,65. U1 snRNP may bind uaRNAs and eRNAs through co-transcriptional U1–Pol II interactions. Splicing releases the U1 snRNP from pre-mRNAs18. However, lncRNAs, uaRNAs and eRNAs remain associated with U1 snRNPs because of inefficient or absent splicing (the lack of 3′ss could be one reason)8,53. Through its interaction with transcriptionally engaged Pol II, U1 snRNP is tethered to chromatin and subsequently retains its associated lncRNAs and ncRNAs on chromatin. Meanwhile, the inhibitory function of U1 snRNP on polyadenylation promotes transcription elongation at cryptic PASs and RNA decay at authentic PASs23,25,30,41,65. Rapid RNA turnover renders these transcripts less likely to leave the chromatin, contributing in part to their enrichment on chromatin and lack of nuclear export. Although the properties of chromatin binding and instability appear to be intrinsically coupled for lncRNAs and chromatin-bound unstable ncRNAs, U1 snRNP and the RNA-degradation machinery appear to play independent yet synergistic roles in facilitating RNA–chromatin association. Coupled chromatin association and instability of many lncRNA transcripts may contribute to the observed cis-targeting and regulatory functions in their chromatin neighbourhoods. Most short-lived lncRNA transcripts spread locally within their neighbourhoods, while a few stable and abundant lncRNAs, such as Malat1, exist long enough to be trans-targeted to other genomic sites. For stable lncRNAs, persistent binding with U1 snRNP, and perhaps engaged Pol II, may drive lncRNA mobilization to distinct nuclear compartments (such as nuclear speckles) or to thousands of trans genomic sites (in the case of Malat1).
Possibly, these highly expressed lncRNAs have developed evolutionarily to take advantage of the U1-tethering mechanism to achieve trans functions. In addition to U1 snRNP, U2 snRNP (but not the splicing reaction), and perhaps other factors, contributes to this process.

Supplementary information

Supplementary Information

The Supplementary Notes and Discussion file contains: Supplementary Notes 1–4, which provide additional information on the results in the main text; Supplementary Discussion 1, which discusses the chromatin retention of Xist; Supplementary Discussion 2, which disputes the possibility of direct base-pairing between U1 snRNA and genomic DNA; and Supplementary References.

Reporting Summary

Supplementary Figure 1:

Original gel image data.

Supplementary Table 1:

Sample information for REL-seq libraries.

Supplementary Table 2:

REL-seq identification of differentially localized regions.

Supplementary Table 3:

Proteins enriched in the NXF1-enChr RNA pull-down assay.

Supplementary Table 4:

Normalized abundance of lncRNAs in whole cell or subcellular fractions upon AMO treatment.

Supplementary Table 5:

Normalized abundance of lncRNAs in whole cell or subcellular fractions upon SNRNP70AID depletion.

Supplementary Table 6:

Normalized abundance of lncRNAs identified by polyA RNA-seq in whole cell or chromatin fraction upon SNRNP70AID depletion.

Supplementary Table 7:

Normalized abundance of lncRNAs identified by SLAM-seq in subcellular fractions upon SNRNP70AID depletion.

Supplementary Table 8:

Proteins identified from SNRNP70 FLAG coIP-SILAC.

Supplementary Table 9:

Sequences of primers, adaptors, AMOs, shRNAs, sgRNAs, and probes.

About this article

Cite this article

Yin, Y., Lu, J.Y., Zhang, X. et al. U1 snRNP regulates chromatin retention of noncoding RNAs. Nature 580, 147–150 (2020). https://doi.org/10.1038/s41586-020-2105-3
