Pre-mRNA splicing order is predetermined and maintains splicing fidelity across multi-intronic transcripts

Abstract

Combinatorially, intron excision within a given nascent transcript could proceed down any of thousands of paths, each of which would expose different dynamic landscapes of cis-elements and contribute to alternative splicing. In this study, we found that post-transcriptional multi-intron splicing order in human cells is largely predetermined, with most genes spliced in one or a few predominant orders. Strikingly, these orders were conserved across cell types and stages of motor neuron differentiation. Introns flanking alternatively spliced exons were frequently excised last, after their neighboring introns. Perturbations to the spliceosomal U2 snRNA altered the preferred splicing order of many genes, and these alterations were associated with the retention of other introns in the same transcript. In one gene, early removal of specific introns was sufficient to induce delayed excision of three proximal introns, and this delay was caused by two distinct cis-regulatory mechanisms. Together, our results demonstrate that multi-intron splicing order in human cells is predetermined, is influenced by a component of the spliceosome and ensures splicing fidelity across long pre-mRNAs.

Fig. 1: Widespread post-transcriptional splicing in human cells.
Fig. 2: Post-transcriptional splicing follows a defined order.
Fig. 3: Splicing order is consistent across cell types.
Fig. 4: Splicing order is consistent between alternative isoforms and displays later removal of AS introns.
Fig. 5: Splicing order changes upon U2-snRNA-mediated exon skipping.
Fig. 6: Perturbation of one intron disrupts excision of proximal introns.

Data availability

Raw and processed sequencing data are available from the Gene Expression Omnibus under accession number GSE232455. Source data are provided with this paper.

Code availability

Code for analysis of all nanopore sequencing data is available at https://github.com/churchmanlab/splicing_order.

References

  1. Pan, Q., Shai, O., Lee, L. J., Frey, B. J. & Blencowe, B. J. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 40, 1413–1415 (2008).

  2. Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008).

  3. Yeom, K.-H. et al. Tracking pre-mRNA maturation across subcellular compartments identifies developmental gene regulation through intron retention and nuclear anchoring. Genome Res. 31, 1106–1119 (2021).

  4. Tilgner, H. et al. Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs. Genome Res. 22, 1616–1625 (2012).

  5. Pandya-Jones, A. et al. Splicing kinetics and transcript release from the chromatin compartment limit the rate of Lipid A-induced gene expression. RNA 19, 811–827 (2013).

  6. Bhatt, D. M. et al. Transcript dynamics of proinflammatory genes revealed by sequence analysis of subcellular RNA fractions. Cell 150, 279–290 (2012).

  7. Melé, M. et al. The human transcriptome across tissues and individuals. Science 348, 660–665 (2015).

  8. Glinos, D. A. et al. Transcriptome variation in human tissues revealed by long-read sequencing. Nature 608, 353–359 (2022).

  9. Zhang, X. H.-F. & Chasin, L. A. Computational definition of sequence motifs governing constitutive exon splicing. Genes Dev. 18, 1241–1250 (2004).

  10. Fairbrother, W. G. & Chasin, L. A. Human genomic sequences that inhibit splicing. Mol. Cell. Biol. 20, 6816–6825 (2000).

  11. Barash, Y. et al. Deciphering the splicing code. Nature 465, 53–59 (2010).

  12. Blencowe, B. J. An exon-centric perspective. Biochem. Cell Biol. 90, 603–612 (2012).

  13. Kim, S. W. et al. Widespread intra-dependencies in the removal of introns from human transcripts. Nucleic Acids Res. 45, 9503–9513 (2017).

  14. Drexler, H. L., Choquet, K. & Churchman, L. S. Splicing kinetics and coordination revealed by direct nascent RNA sequencing through nanopores. Mol. Cell 77, 985–998 (2020).

  15. Takahara, K. et al. Order of intron removal influences multiple splice outcomes, including a two-exon skip, in a COL5A1 acceptor-site mutation that results in abnormal pro-α1(V) N-propeptides and Ehlers–Danlos syndrome type I. Am. J. Hum. Genet. 71, 451–465 (2002).

  16. Schwarze, U., Starman, B. J. & Byers, P. H. Redefinition of exon 7 in the COL1A1 gene of type I collagen by an intron 8 splice-donor–site mutation in a form of osteogenesis imperfecta: influence of intron splice order on outcome of splice-site mutation. Am. J. Hum. Genet. 65, 336–344 (1999).

  17. Ham, K. A., Aung-Htut, M. T., Fletcher, S. & Wilton, S. D. Nonsequential splicing events alter antisense-mediated exon skipping outcome in COL7A1. Int. J. Mol. Sci. 21, 7705 (2020).

  18. Gazzoli, I. et al. Non-sequential and multi-step splicing of the dystrophin transcript. RNA Biol. 13, 290–305 (2016).

  19. Sousa-Luís, R. et al. POINT technology illuminates the processing of polymerase-associated intact nascent transcripts. Mol. Cell 81, 1935–1950.e6 (2021).

  20. Reimer, K. A., Mimoso, C. A., Adelman, K. & Neugebauer, K. M. Co-transcriptional splicing regulates 3′ end cleavage during mammalian erythropoiesis. Mol. Cell 81, 998–1012 (2021).

  21. Will, C. L. & Lührmann, R. Spliceosome structure and function. Cold Spring Harb. Perspect. Biol. 3, a003707 (2011).

  22. Dvinge, H., Guenthoer, J., Porter, P. L. & Bradley, R. K. RNA components of the spliceosome regulate tissue- and cancer-specific alternative splicing. Genome Res. 29, 1591–1604 (2019).

  23. Jia, Y., Mu, J. C. & Ackerman, S. L. Mutation of a U2 snRNA gene causes global disruption of alternative splicing and neurodegeneration. Cell 148, 296–308 (2012).

  24. Braun, J. E., Friedman, L. J., Gelles, J. & Moore, M. J. Synergistic assembly of human pre-spliceosomes across introns and exons. eLife 7, e37751 (2018).

  25. Wachutka, L., Caizzi, L., Gagneur, J. & Cramer, P. Global donor and acceptor splicing site kinetics in human cells. eLife 8, e45056 (2019).

  26. Yap, K., Lim, Z. Q., Khandelia, P., Friedman, B. & Makeyev, E. V. Coordinated regulation of neuronal mRNA steady-state levels through developmentally controlled intron retention. Genes Dev. 26, 1209–1223 (2012).

  27. Pendleton, K. E., Park, S.-K., Hunter, O. V., Bresson, S. M. & Conrad, N. K. Balance between MAT2A intron detention and splicing is determined cotranscriptionally. RNA 24, 778–786 (2018).

  28. Bresson, S. M., Hunter, O. V., Hunter, A. C. & Conrad, N. K. Canonical poly(A) polymerase activity promotes the decay of a wide variety of mammalian nuclear RNAs. PLoS Genet. 11, e1005610 (2015).

  29. Bresson, S. M. & Conrad, N. K. The human nuclear poly(A)-binding protein promotes RNA hyperadenylation and decay. PLoS Genet. 9, e1003893 (2013).

  30. Workman, R. E. et al. Nanopore native RNA sequencing of a human poly(A) transcriptome. Nat. Methods 16, 1297–1305 (2019).

  31. Boutz, P. L., Bhutkar, A. & Sharp, P. A. Detained introns are a novel, widespread class of post-transcriptionally spliced introns. Genes Dev. 29, 63–80 (2015).

  32. Peet, R. K. The measurement of species diversity. Annu. Rev. Ecol. Syst. 5, 285–307 (1974).

  33. Sherwin, W. B. & Prat I Fornells, N. The introduction of entropy and information methods to ecology by Ramon Margalef. Entropy 21, 794 (2019).

  34. Pai, A. A. et al. The kinetics of pre-mRNA splicing in the Drosophila genome and the influence of gene architecture. eLife 6, e32537 (2017).

  35. Khodor, Y. L. et al. Nascent-seq indicates widespread cotranscriptional pre-mRNA splicing in Drosophila. Genes Dev. 25, 2502–2512 (2011).

  36. Weyn-Vanhentenryck, S. M. et al. Precise temporal regulation of alternative splicing during neural development. Nat. Commun. 9, 2189 (2018).

  37. Ruggiu, M. et al. Rescuing Z+ agrin splicing in Nova null mice restores synapse formation and unmasks a physiologic defect in motor neuron firing. Proc. Natl Acad. Sci. USA 106, 3513–3518 (2009).

  38. Yuan, Y. et al. Cell type-specific CLIP reveals that NOVA regulates cytoskeleton interactions in motoneurons. Genome Biol. 19, 117 (2018).

  39. Jacko, M. et al. Rbfox splicing factors promote neuronal maturation and axon initial segment assembly. Neuron 97, 853–868 (2018).

  40. Maury, Y. et al. Combinatorial analysis of developmental cues efficiently converts human pluripotent stem cells into multiple neuronal subtypes. Nat. Biotechnol. 33, 89–96 (2015).

  41. Hatje, K. et al. The landscape of human mutually exclusive splicing. Mol. Syst. Biol. 13, 959 (2017).

  42. Gooding, C. & Smith, C. W. J. Tropomyosin exons as models for alternative splicing. Adv. Exp. Med. Biol. 644, 27–42 (2008).

  43. Caizzi, L. et al. Efficient RNA polymerase II pause release requires U2 snRNP function. Mol. Cell 81, 1920–1934 (2021).

  44. Zubradt, M. et al. DMS-MaPseq for genome-wide or targeted RNA structure probing in vivo. Nat. Methods 14, 75–82 (2017).

  45. Pineda, J. M. B. & Bradley, R. K. Most human introns are recognized via multiple and tissue-specific branchpoints. Genes Dev. 32, 577–591 (2018).

  46. Cartegni, L., Wang, J., Zhu, Z., Zhang, M. Q. & Krainer, A. R. ESEfinder: a web resource to identify exonic splicing enhancers. Nucleic Acids Res. 31, 3568–3571 (2003).

  47. Wang, X. et al. Predicting sequence and structural specificities of RNA binding regions recognized by splicing factor SRSF1. BMC Genomics 12, S8 (2011).

  48. Buratti, E. & Baralle, F. E. Influence of RNA secondary structure on the pre-mRNA splicing process. Mol. Cell. Biol. 24, 10505–10514 (2004).

  49. Muro, A. F. et al. Regulation of fibronectin EDA exon alternative splicing: possible role of RNA secondary structure for enhancer display. Mol. Cell. Biol. 19, 2657–2671 (1999).

  50. Gohr, A., Iñiguez, L. P., Torres-Méndez, A., Bonnal, S. & Irimia, M. Insplico: effective computational tool for studying splicing order of adjacent introns genome-wide with short and long RNA-seq reads. Nucleic Acids Res. 51, e56 (2023).

  51. Wan, Y. et al. Dynamic imaging of nascent RNA reveals general principles of transcription dynamics and stochastic splice site selection. Cell 184, 2878–2895 (2021).

  52. Zeng, Y. et al. Profiling lariat intermediates reveals genetic determinants of early and late co-transcriptional splicing. Mol. Cell 82, 4681–4699 (2022).

  53. Meng, D., Zheng, Q., Zhang, X., Luo, L. & Jia, Y. A molecular brake that modulates spliceosome pausing at detained introns contributes to neurodegeneration. Protein Cell 14, 318–336 (2022).

  54. Taliaferro, J. M. et al. RNA sequence context effects measured in vitro predict in vivo protein binding and regulation. Mol. Cell 64, 294–306 (2016).

  55. Dominguez, D. et al. Sequence, structure, and context preferences of human RNA binding proteins. Mol. Cell 70, 854–867 (2018).

  56. Brannan, K. W. et al. Robust single-cell discovery of RNA targets of RNA-binding proteins and ribosomes. Nat. Methods 18, 507–519 (2021).

  57. Aw, J. G. A. et al. Determination of isoform-specific RNA structure with nanopore long reads. Nat. Biotechnol. 39, 336–346 (2021).

  58. Berget, S. M. Exon recognition in vertebrate splicing. J. Biol. Chem. 270, 2411–2414 (1995).

  59. De Conti, L., Baralle, M. & Buratti, E. Exon and intron definition in pre-mRNA splicing. Wiley Interdiscip. Rev. RNA 4, 49–60 (2013).

  60. Paggi, J. M. & Bejerano, G. A sequence-based, deep learning model accurately predicts RNA splicing branchpoints. RNA 24, 1647–1658 (2018).

  61. Smalec, B. M. et al. Genome-wide quantification of RNA flow across subcellular compartments reveals determinants of the mammalian transcript life cycle. Preprint at bioRxiv https://doi.org/10.1101/2022.08.21.504696 (2022).

  62. Mercatante, D. R., Mohler, J. L. & Kole, R. Cellular response to an antisense-mediated shift of Bcl-x pre-mRNA splicing and antineoplastic agents. J. Biol. Chem. 277, 49374–49382 (2002).

  63. Sierakowska, H., Sambade, M. J., Schümperli, D. & Kole, R. Sensitivity of splice sites to antisense oligonucleotides in vivo. RNA 5, 369–377 (1999).

  64. Haeussler, M. et al. Evaluation of off-target and on-target scoring algorithms and integration into the guide RNA selection tool CRISPOR. Genome Biol. 17, 148 (2016).

  65. Ran, F. A. et al. Genome engineering using the CRISPR–Cas9 system. Nat. Protoc. 8, 2281–2308 (2013).

  66. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  67. Drexler, H. L. et al. Revealing nascent RNA processing dynamics with nano-COP. Nat. Protoc. 16, 1343–1375 (2021).

  68. Lopez-Delisle, L. et al. pyGenomeTracks: reproducible plots for multivariate genomic datasets. Bioinformatics 37, 422–423 (2021).

Acknowledgements

We thank members of the Churchman lab, W. Timp, D. Whye, and D. Wood for helpful discussions, advice, and assistance; C. Patil, H. Merens, R.S. Isaac, and N. Kramer for critical reading of the manuscript; D. Meng and Y. Jia (Tsinghua University) for raw nanopore sequencing files from Nmf291−/− mice; B. Battersby (Institute of Biotechnology, University of Helsinki) for human myoblasts; and the Biopolymers facility at Harvard Medical School and the Harvard University Bauer Core Facility for sequencing services. This work was supported by the NIH (R01-GM136794, R21-HG011682 and R01-HG010538 to L.S.C.), the Burroughs Wellcome Fund (S.R.), the Fonds de Recherche du Québec - Santé, and the Canadian Institutes of Health Research (post-doctoral fellowship awards to K.C.). This research was conducted with support from the Human Neuron Core within the Rosamund Stone Zander Translational Neuroscience Center, Boston Children’s Hospital, which is also supported by the IDDRC (NIH P50HD105351).

Author information

Contributions

Conceptualization, K.C. and L.S.C.; Methodology, K.C. (lead), B.M.S., S.-L.D., S.R. and L.S.C.; Investigation, K.C. (lead), A.R.B.-K., S.-L.D., B.M.S.; Software/Formal Analysis, K.C. (lead) and S.-L.D.; Writing – Original Draft, K.C. and L.S.C.; Writing – Review & Editing, K.C., A.R.B.-K., S.-L.D., B.M.S., S.R. and L.S.C.; Funding Acquisition, K.C., S.R. and L.S.C.; Supervision, S.R. and L.S.C.

Corresponding author

Correspondence to L. Stirling Churchman.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Structural & Molecular Biology thanks the anonymous reviewers for their contribution to the peer review of this work. Carolina Perdigoto and Dimitris Typas were the primary editors on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Intermediate isoforms are undergoing active processing. (related to Fig. 1).

a) Distribution of RNA 3′ ends based on read splicing status in nano-COP data (ref. 14) from K562 cells. 3′ end features are defined as in ref. 14. b) UpSet plot showing the position of introns present in partially spliced reads from dRNA-seq of poly(A)-selected chromatin-associated RNA. ‘Middle’ includes introns that are not the first or last intron in a transcript. Two biological replicates are displayed. c) Scatter plot of the fraction of unspliced reads per intron between two biological replicates. Pearson’s correlation coefficient is shown on the plot (p-value < 0.0001, 95% confidence interval 0.948–0.950). Only reads spanning two introns or more are included. d) Coverage tracks from dRNA-seq of poly(A)-selected chromatin-associated RNA for three genes displaying various levels of post-transcriptional splicing. Introns with fraction of unspliced reads > 0.1 are shaded in grey. e) EXOSC10 mRNA levels and f) RNA levels of two promoter upstream transcripts (PROMPT) following shRNA-mediated knockdown (KD) of EXOSC10, as measured by qRT-PCR of total RNA. Dots represent biological replicates. g) Proportion of reads mapping to each intermediate isoform for intron groups used in splicing order analyses in WT K562 cells (Fig. 2). Biological duplicates for each shRNA treatment are shown side by side. h) Proportion of intermediate isoforms and intron groups that show a significant change (FDR < 0.1 and odds ratio > 1 or < −1) in abundance upon EXOSC10 KD compared to a scrambled control. i) Distribution of median poly(A) tail lengths as a function of splicing level in K562 replicate B. All reads are classified into groups of 3-4 introns based on the number of proximal post-transcriptionally excised introns that the read covers (splicing level). The median poly(A) tail length across reads is calculated for each number of excised introns in each intron group. Splicing levels were compared using a two-sided Wilcoxon rank-sum test (p-value < 0.001 for all comparisons). In G) and I), boxplot elements are shown as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points, outliers.
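Several captions report a Pearson correlation coefficient together with a 95% confidence interval. The captions do not say how the intervals were obtained, so the sketch below only illustrates one common approach, the Fisher z-transformation, on made-up unspliced fractions. The function names and the example values are hypothetical and are not taken from the authors' code.

```python
import math


def pearson_r(x, y):
    """Plain Pearson correlation coefficient for two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pearson_with_ci(x, y, z_crit=1.959964):
    """Return (r, lower, upper); the interval uses the Fisher z-transformation.

    Assumption: this is a generic textbook interval, not necessarily the
    procedure used for the values quoted in the captions.
    """
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two equal-length samples with at least 4 points")
    r = pearson_r(x, y)
    # Clamp so that atanh stays finite for perfectly correlated toy data.
    r_clamped = max(min(r, 1 - 1e-12), -1 + 1e-12)
    z = math.atanh(r_clamped)
    half_width = z_crit / math.sqrt(n - 3)
    return r, math.tanh(z - half_width), math.tanh(z + half_width)


# Hypothetical fraction of unspliced reads for five introns in two replicates.
rep1 = [0.05, 0.20, 0.40, 0.65, 0.90]
rep2 = [0.07, 0.18, 0.45, 0.60, 0.88]
r, lower, upper = pearson_with_ci(rep1, rep2)
print(f"r = {r:.3f}, 95% CI = [{lower:.3f}, {upper:.3f}]")
```

With only five points the interval is wide; the narrow intervals quoted in the captions are consistent with correlations computed over many introns or isoforms.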

Extended Data Fig. 2 Intron groups coverage, reproducibility and overlap with detained introns. (related to Fig. 2).

a) Features of introns that were included or not in splicing order analyses in K562 cells. Groups were compared using a two-sided Wilcoxon rank-sum test (***: p-value < 0.001). b) Expression level in K562 cells for genes (RPKM > 1) used in splicing order analyses. c) Read splicing status for the intron groups used in splicing order analyses. Two K562 biological replicates are shown. Each dot represents one intron group. d) Cumulative distribution function (CDF) of the number of reads at each splicing level for the intron groups used in splicing order analyses. e) Correlation in number of intermediate isoform reads or f) splicing order scores between K562 biological replicates prior to and after merging into replicates A and B. Each dot represents one intermediate isoform (E) or one splicing order (F). Pearson’s correlation coefficients are shown on the plots [p-value < 0.0001, intermediate isoforms: 95% confidence interval 0.88–0.90 (reps 3 vs. 4), 0.90–0.91 (reps 1 vs. 2), 0.91–0.92 (reps A vs. B), splicing order scores: 0.95–0.96 (reps 3 vs. 4), 0.95–0.96 (reps 1 vs. 2), 0.96–0.97 (reps A vs. B)]. g) Scores for splicing orders with a rank lower than 5. h) Overlap between detained introns (DIs) from ref. 31 and introns included in splicing order analyses, for introns present in genes expressed (RPKM or FPKM > 1) in K562 or HeLa cells and in at least one cell type from ref. 31. i) CDF of number of introns that are DIs or not for genes in which at least one DI was identified in ref. 31. j) Proportion of introns that were previously identified as DIs as a function of their position in the top ranked splicing order in K562 cells. Groups were compared using a two-sided Fisher’s exact test. *: p-value < 0.05, **: p-value < 0.01, ***: p-value < 0.001. For A) and C), boxplot elements are shown as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range. In A), points represent outliers.
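The caption refers to intermediate isoform read counts and to splicing order scores without restating how one is turned into the other. As an illustration only, the sketch below assumes one plausible scheme: each possible order is a path through partially spliced isoforms, its raw score is the product of the relative frequencies of the isoforms along that path at each splicing level, and scores are normalized to sum to 1. This scheme, the function name and the toy counts are assumptions made here; the repository listed under Code availability is the authoritative source for the actual calculation.

```python
from itertools import permutations
from math import prod


def splicing_order_scores(isoform_counts, introns):
    """Score every possible excision order for a small intron group.

    isoform_counts maps a frozenset of already-excised introns to the number
    of reads supporting that intermediate isoform (hypothetical input format).
    """
    n = len(introns)
    # Total reads at each splicing level (level = number of excised introns).
    level_totals = {}
    for excised, count in isoform_counts.items():
        level_totals[len(excised)] = level_totals.get(len(excised), 0) + count

    raw = {}
    for order in permutations(introns):
        freqs = []
        # Only partially spliced levels (1 .. n-1) distinguish between orders.
        for level in range(1, n):
            excised = frozenset(order[:level])
            total = level_totals.get(level, 0)
            freqs.append(isoform_counts.get(excised, 0) / total if total else 0.0)
        raw[order] = prod(freqs)

    norm = sum(raw.values())
    return {order: (score / norm if norm else 0.0) for order, score in raw.items()}


# Toy counts for a group of three introns (made-up numbers).
example_counts = {
    frozenset({"i2"}): 80,
    frozenset({"i1"}): 15,
    frozenset({"i3"}): 5,
    frozenset({"i1", "i2"}): 70,
    frozenset({"i2", "i3"}): 20,
    frozenset({"i1", "i3"}): 10,
}
scores = splicing_order_scores(example_counts, ("i1", "i2", "i3"))
for order, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(" -> ".join(order), round(score, 3))
```

A group of three introns has 3! = 6 possible orders and a group of four has 24, which is why enumerating every permutation is cheap for the group sizes mentioned in the captions.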

Extended Data Fig. 3 Splicing order evenness and selection of coverage thresholds. (related to Fig. 2).

a) Shannon diversity index of splicing order across all analyzed intron groups, separated by the number of introns in each group (3 or 4). Shannon diversity index is compared for measured (orange) or random (grey) splicing order. Boxplot elements are shown as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points: outliers. b) Splicing order plots for the intron groups with the highest evenness for groups of 3 or 4 introns, respectively. The evenness (E) is shown in red font for each intron group. c) Cumulative distribution function plot of the number of reads at each splicing level for the intron groups used in splicing order analyses as a function of evenness. Evenness was binned into quartiles separately for groups of 3 or 4 introns. d) and e) Splicing order simulations with varying levels of read coverage and evenness values for 3 (D) or 4 (E) analyzed introns. Each possible splicing order is shown in a different color and the mean splicing order score for 100 simulations is displayed with 95% confidence interval.
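To make the diversity and evenness terms concrete, the sketch below computes the Shannon index H = −Σ p ln p over a set of splicing order scores and a Pielou-style evenness E = H / ln(S), where S is the number of possible orders. Normalizing by ln(S) is an assumption made here for illustration; the caption does not state which normalization defines E, and the score vectors are made up.

```python
import math


def shannon_diversity(scores):
    """Shannon index H = -sum(p * ln p) over the non-zero scores."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must contain at least one positive value")
    probs = [s / total for s in scores if s > 0]
    return -sum(p * math.log(p) for p in probs)


def evenness(scores):
    """Pielou-style evenness: H divided by its maximum, ln(number of orders).

    Assumption: every possible order is listed in `scores`, including those
    with a score of zero, so len(scores) equals n! for a group of n introns.
    """
    if len(scores) < 2:
        return 0.0
    return shannon_diversity(scores) / math.log(len(scores))


# Hypothetical score vectors for a group of three introns (3! = 6 orders).
one_dominant_order = [0.90, 0.04, 0.03, 0.02, 0.01, 0.00]
all_orders_equal = [1 / 6] * 6
print(round(evenness(one_dominant_order), 3))
print(round(evenness(all_orders_equal), 3))
```

Under this definition, an evenness near 0 means a single order dominates, and an evenness near 1 means all orders are used about equally, as expected for random splicing order.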

Extended Data Fig. 4 Features associated with splicing order. (related to Fig. 2).

a) Splicing index from short-read RNA-seq of rRNA-depleted nascent RNA, as a function of the position of introns in the top ranked splicing order from chromatin polyA+ dRNA-seq. b) Correlation between splicing index from A) and the number of intermediate isoform reads with a given intron excised from chromatin polyA+ dRNA-seq. Each dot represents one intron included in splicing order analyses in K562 cells. Pearson’s correlation coefficient is shown on the plot [p-value < 0.0001, 95% confidence interval 0.50–0.57 (repA chromatin), 0.50–0.56 (repB chromatin), 0.24–0.34 (repA 4sU-chromatin and repB 4sU-chromatin)]. c) Acceptor and donor bond half-lives (ref. 25) as a function of the position of introns in the top ranked splicing order from chromatin polyA+ dRNA-seq. d) Splicing status of reads mapping to the intron pairs (N = 69) compared between nano-COP (ref. 14) and polyA+ dRNA-seq in E. e) Splicing order for pairs of consecutive introns in nano-COP vs. polyA+ dRNA-seq. Vertical bars show the range between the two biological replicates for chromatin polyA+ dRNA-seq. Pearson’s correlation coefficient is shown on the plot (p-value < 0.0001, 95% confidence interval 0.88–0.95). f) Intron length as a function of the position of introns in the top ranked splicing order per intron group. g) Proportion of intermediate isoform reads with a given intron excised for introns with strong, moderate or weak splice sites. h) Number of intron groups used in splicing order analyses that include a first or last intron in a transcript or both. i) Number of introns that are first or last in a transcript as a function of their position in the top ranked splicing order. For A), C), F) and G), introns or groups were compared using a two-sided Wilcoxon rank-sum test. For I), groups were compared using a two-sided Fisher’s exact test. *: p-value < 0.05, **: p-value < 0.01, ***: p-value < 0.001. In A), C), D), F) and G), boxplot elements are shown as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range. For A), C) and G), points represent outliers.

Extended Data Fig. 5 Splicing order is consistent across cell types. (related to Figs. 3 and 4).

a) Western blot showing the cytoplasm, nucleoplasm and chromatin fractions obtained from cellular fractionation at each sMN differentiation timepoint. The exposure time used for each blot is noted at the bottom. More cells were used at day 9, resulting in higher abundance of the markers. b) Expression of differentiation markers during sMN differentiation, as measured by short-read RNA-seq of chromatin-associated RNA. Dots represent biological replicates. c) Principal component analysis of gene expression or exon inclusion in short-read RNA-seq across all timepoints of sMN differentiation. d) Correlation in number of intermediate isoform reads or splicing order scores between biological replicates for the intron groups used in splicing order analyses in HeLa cells and differentiating sMN at Days 9 and 14. Each dot represents one intermediate isoform (left) or one splicing order (right). For Day 14, replicates 1 and 2 were merged to achieve similar coverage as replicate 3. Pearson’s correlation coefficient is shown on the plot [p-value < 0.0001, 95% confidence interval intermediate isoform counts: 0.87–0.89 (HeLa), 0.89–0.92 (sMN Day 9), 0.86–0.88 (sMN Day 14), splicing order scores: 0.95–0.96 (HeLa), 0.96–0.97 (sMN Day 9), 0.94–0.95 (sMN Day 14)]. e) Proportion of intron groups with the same top splicing order for each pairwise comparison of cell types indicated on the x-axis. The total number of intron groups per comparison is shown on each bar. f) Evenness of splicing order for intron groups analyzed across cell types, separated by whether they share the same top splicing order across cell types or not. Boxplot elements are shown as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points: outliers. g) Top: dRNA-seq coverage tracks of an example alternative cassette exon (in yellow) with differential inclusion during sMN differentiation. AS introns are shown in yellow font. Bottom: Splicing order plot for SLC37A4, shown separately for inclusion and exclusion of the alternative exon and for each timepoint that had sufficient coverage of a specific isoform.

Extended Data Fig. 6 Splicing order is consistent between alternative isoforms and displays later removal of AS introns. (related to Fig. 4).

a) Number of intron groups as a function of the order in which AS introns are removed in the top ranked splicing order per intron group. Each x-axis label compares the splicing order in inclusion vs. exclusion isoforms (top) or Day 9 vs. Day 14 (bottom). b) Correlation in splicing order scores for AS introns between exclusion and inclusion isoforms (top) or between Days 9 and 14 (bottom). Each dot represents one possible splicing order for one intron group. Pearson’s correlation coefficient is shown on the plot [p-value < 0.001, 95% confidence interval 0.82–0.95 (Day 9), 0.62–0.88 (Day 14), 0.72–0.91 (exclusion), 0.48–0.87 (inclusion)]. c) Evenness for inclusion and exclusion isoforms. Isoforms were compared using a two-sided Wilcoxon rank-sum test (p-value > 0.05). d) Splicing index from short-read RNA-seq of chromatin-associated RNA for all intron groups that contain a differentially included exon between Days 9 and 14 (right) or for only those that had sufficient coverage to be analyzed by dRNA-seq (left). ‘Inclusion intron 1’, ‘inclusion intron 2’ and ‘exclusion intron’ refer to the three possible AS introns that flank or overlap an alternative exon. ‘Upstream’ and ‘downstream’ intron refer to the next proximal introns upstream or downstream of the AS introns. e) Splicing order for four mutually exclusive exons (MXE) AS events. Splicing order plots are shown for each isoform and time point that has sufficient coverage. The percent spliced in (PSI) for isoform #2 is shown for each timepoint. In C) and D), boxplot elements are shown as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range. In C), points represent intron groups. In D), points represent outliers.
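For readers unfamiliar with the PSI values mentioned in panel e, the sketch below shows the standard definition: the share of informative reads that support the isoform of interest, expressed as a percentage. The function name and the read counts are hypothetical, and the caption does not specify how reads were assigned to each isoform.

```python
def percent_spliced_in(isoform_reads, alternative_reads):
    """PSI (%) = reads supporting the isoform / all informative reads * 100.

    For a mutually exclusive exon event, `isoform_reads` would be reads
    containing the exon of interest and `alternative_reads` those containing
    the other exon (illustrative assumption).
    """
    total = isoform_reads + alternative_reads
    if total == 0:
        raise ValueError("no informative reads for this event")
    return 100.0 * isoform_reads / total


# Made-up counts: 30 reads support isoform #2 and 70 support isoform #1.
print(percent_spliced_in(30, 70))
```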

Extended Data Fig. 7 Splicing order changes upon U2-snRNA-mediated exon skipping. (related to Fig. 5).

a) NET-seq pause index upon control or U2 snRNA KD. b) Percent of excised introns as a function of the distance transcribed past the 3′ SS for nano-COP reads from control and U2 snRNA KD. Two biological replicates per condition are displayed. c) Splicing order in nano-COP data from control and U2 snRNA KD for pairs of consecutive introns. Dots represent biological replicates. d) Total RNA-seq splicing index for introns that are sensitive to U2 snRNA KD compared to those that are stable. Three biological replicates (‘rep’) per condition are shown. e) Summed splicing order score for introns that are sensitive or stable upon U2 snRNA knockdown as a function of their position in splicing orders in WT HeLa cells. f) Number of reads mapping to each intron group as a function of the number of excised introns and their status upon U2 snRNA KD. g) Proportion of genes with and without intron retention in short-read RNA-seq upon U2 snRNA KD, as a function of the exon skipping level difference between control and U2 snRNA KD in the same genes. h) Number of genes showing co-occurrence of skipped exons (SE) and retained introns (RI) in dRNA-seq, as depicted in Fig. 5f. i) Summed splicing order score for RIs upon U2 snRNA knockdown as a function of their position in splicing orders in WT HeLa cells. Intron groups are composed of one intron involved in RI and two introns involved in SE. j) Number of genes showing co-occurrence of SE(s) and RI(s) in cDNA-PCR nanopore sequencing data from WT and NMF291−/− (U2 snRNA mutant) mice (ref. 53). k) Example of a SE and RI co-occurrence in Puf60 in NMF291−/− mice. ***: p-value < 0.001 in the one-sided binomial test assessing co-occurrence of SE and RI. In A), D), E) and I), boxplot elements are shown as follows: center line, median; box limits, upper and lower quartiles; whiskers, 1.5x interquartile range; points, outliers. In E) and I), groups were compared using a two-sided Wilcoxon rank-sum test; *: p-value < 0.05; ***: p-value < 0.001.

Source data

Extended Data Fig. 8 Splicing order changes in IFRD2 upon U2 snRNA KD. (related to Fig. 5).

a) Working model to explain splicing order changes upon U2 snRNA KD. b) RT-PCR products from amplification of exons 2 to 5 and 9 to 12 of IFRD2, showing increased retention of introns 2 to 4 and 9 to 11 upon U2 snRNA KD compared to control. The identity of the spliced and unspliced products is shown on the right of the gel. Two biological replicates per knockdown condition are displayed. c) Scatter plots showing the correlation in frequency of intermediate isoform reads per splicing level for cDNA-PCR sequencing of IFRD2 exons 4–10 from chromatin-associated RNA with 12, 16 or 20 PCR cycles. Pearson’s R correlation coefficients are shown on each plot (p-value < 0.0001, 95% confidence interval: control KD, 12 cycles: 0.91–0.97; control KD, 16 cycles: 0.95–0.99; U2 snRNA KD, 12 cycles: 0.91–0.98; U2 snRNA KD, 16 cycles: 0.97–0.99). The data obtained from 20 cycles was used for subsequent analyses. d) Representation of reads mapping to introns 4 to 9 of IFRD2 upon cDNA-PCR sequencing of chromatin-associated RNA from control or U2 snRNA KD. Each line represents one read and each column represents one intron. For each number of excised introns, 100 reads were randomly subsampled from the total dataset. A black rectangle highlights the splicing order reversal upon U2 snRNA KD. e) Splicing order plots for IFRD2 in the absence (top) or presence (bottom) of SEs. On the y-axis, introns that are removed together to result in SE are shown separated by an underscore. The top 10 splicing orders per category (with or without SE) are shown. Experiments shown in C), D) and E) were performed on biological duplicates and data is displayed for one representative replicate per condition.
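As a rough illustration of the statistics reported for panel c, the sketch below computes a Pearson correlation coefficient and an approximate 95% confidence interval. The legend does not state how the intervals were derived; the Fisher z-transformation used here is an assumption, as are the function names and the isoform frequencies, which are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transformation (needs n > 3, |r| < 1)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical frequencies of the same intermediate isoforms measured at
# two different PCR cycle numbers (not data from the study).
freq_low_cycles = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60]
freq_high_cycles = [0.12, 0.18, 0.33, 0.41, 0.47, 0.62]

r = pearson_r(freq_low_cycles, freq_high_cycles)
ci_low, ci_high = fisher_ci(r, len(freq_low_cycles))
print(f"R = {r:.3f}, 95% CI: {ci_low:.2f} to {ci_high:.2f}")
```

A high coefficient with a narrow interval across cycle numbers would suggest that isoform frequencies are not strongly distorted by additional amplification, which is the point the panel is making.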

Source data

Extended Data Fig. 9 Perturbation of one intron disrupts excision of proximal introns. (related to Fig. 6).

a) RT-PCR following treatment with a positive control ASO resulting in the use of an alternative 5′SS in BCL2L1 (red asterisk). b) RT-PCR following treatment with ASOs targeting the 3′SS of introns 4, 5 and 9 of IFRD2. Asterisks indicate the alternative splicing events in which exons 5 to 8 are skipped, without (orange) or with (red) intron 9 retention. c) Proportion of reads with different alternative splicing events observed in cDNA-PCR nanopore sequencing of total RNA following treatment with ASOs targeting IFRD2. RI: retained intron, SE: skipped exon(s), int: intron, ex: exon. d) Deletion tiling of IFRD2 using CRISPR-Cas9 editing. The resulting deletions are shown as black horizontal lines delimited by vertical lines aligned to the sgRNA(s) used (blue triangles). Left: Percent of reads with the expected deletion in cDNA-PCR nanopore sequencing of chromatin-associated RNA. Deletions made with only one sgRNA and non-targeting controls were not assessed (N/A). Right: Percent of reads with exon skipping in each deletion. ‘Undetermined splicing pattern’ refers to reads in which a splicing event does not map to annotated intron-exon junctions and results from detection of the deletion and/or from the use of a cryptic splice site as a result of the deletion. The reads shown on the left and right plots are not mutually exclusive: reads containing the deletion (left) can also be classified as ‘undetermined splicing pattern’ (right) if the deletion overlaps with exon(s). e) Top: Chromatograms from Sanger sequencing of pools of cells edited with a non-targeting sgRNA and one targeting the end of exon 5. The bottom bar plot shows the predicted type of indel introduced. The x-axis indicates the position relative to the expected cut site. f) RT-PCR of introns 2 to 4 of IFRD2 for the deletions shown in D) that are not displayed in Fig. 6f. g) Expected outcomes for intron 4 excision based on the intron and exon definition models of splice site recognition58,59. The observation of intron 4 retention (Fig. 6) is consistent with exon definition.

Source data

Extended Data Fig. 10 DMS-MaPseq analysis of IFRD2 intron 9. (related to Fig. 6).

Top: DMS-MaPseq reactivity for IFRD2 intron 9 (grey shade) +/− 10 nucleotides in the in vitro transcribed RNAs shown above. Individual dots represent two biological replicates. Bottom: Heatmap showing the log2 fold change of DMS reactivity in U2_SE or del_int4-8 relative to WT. Positions corresponding to G’s and T’s, which do not react with DMS, are shown in light grey. Positions where the largest fold change is observed in both mutated constructs are shown in orange font. Predicted branch points45,60 are shown as brown rectangles and the two predicted SRSF1 binding sites (ESEFinder46) with the highest scores are shown as beige rectangles.
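The heatmap quantity described in this legend, a per-position log2 fold change of DMS reactivity relative to WT with G and T positions masked, can be sketched as follows. The function name, the pseudocount and the example sequence and reactivities are all hypothetical; the legend does not say whether a pseudocount was used, so it is included here only as an assumption to avoid division by zero.

```python
import math


def dms_log2_fold_change(sequence, wt, mutant, pseudocount=1e-3):
    """Per-position log2(mutant / WT) reactivity; None at G and T positions."""
    result = []
    for base, wt_value, mut_value in zip(sequence, wt, mutant):
        if base in "GT":
            # G and T do not react with DMS, so these positions are masked.
            result.append(None)
        else:
            ratio = (mut_value + pseudocount) / (wt_value + pseudocount)
            result.append(math.log2(ratio))
    return result


# Hypothetical five-nucleotide window and reactivities (not data from the study).
sequence = "ACGTA"
wt_reactivity = [0.10, 0.20, 0.00, 0.00, 0.05]
mutant_reactivity = [0.20, 0.10, 0.00, 0.00, 0.05]

for base, value in zip(sequence, dms_log2_fold_change(sequence, wt_reactivity, mutant_reactivity)):
    print(base, "masked" if value is None else f"{value:+.2f}")
```

Positive values indicate positions that become more DMS-reactive in the mutated construct than in WT, and negative values indicate positions that become less reactive.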

Source data

Supplementary information

Supplementary Information: Supplementary Note 1

Reporting Summary

Supplementary Tables: Supplementary Tables 1–10

Source data

Source Data Fig. 1: Statistical Source Data

Source Data Fig. 2: Statistical Source Data

Source Data Fig. 3: Statistical Source Data

Source Data Fig. 4: Statistical Source Data

Source Data Fig. 4: Unprocessed gel

Source Data Fig. 5: Statistical Source Data

Source Data Fig. 5: Unprocessed gel

Source Data Fig. 6: Statistical Source Data

Source Data Fig. 6: Unprocessed gel

Source Data Extended Data Fig. 1: Statistical Source Data

Source Data Extended Data Fig. 2: Statistical Source Data

Source Data Extended Data Fig. 3: Statistical Source Data

Source Data Extended Data Fig. 4: Statistical Source Data

Source Data Extended Data Fig. 5: Statistical Source Data

Source Data Extended Data Fig. 5: Unprocessed western blot

Source Data Extended Data Fig. 6: Statistical Source Data

Source Data Extended Data Fig. 7: Statistical Source Data

Source Data Extended Data Fig. 8: Statistical Source Data

Source Data Extended Data Fig. 8: Unprocessed gel

Source Data Extended Data Fig. 9: Statistical Source Data

Source Data Extended Data Fig. 9: Unprocessed gel

Source Data Extended Data Fig. 10: Statistical Source Data

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Choquet, K., Baxter-Koenigs, A.R., Dülk, SL. et al. Pre-mRNA splicing order is predetermined and maintains splicing fidelity across multi-intronic transcripts. Nat Struct Mol Biol 30, 1064–1076 (2023). https://doi.org/10.1038/s41594-023-01035-2

  • DOI: https://doi.org/10.1038/s41594-023-01035-2
