A systems view of spliceosomal assembly and branchpoints with iCLIP

Abstract

Studies of spliceosomal interactions are challenging due to their dynamic nature. Here we used spliceosome iCLIP, which immunoprecipitates SmB along with small nuclear ribonucleoprotein particles and auxiliary RNA binding proteins, to map spliceosome engagement with pre-messenger RNAs in human cell lines. This revealed seven peaks of spliceosomal crosslinking around branchpoints (BPs) and splice sites. We identified RNA binding proteins that crosslink to each peak, including known and candidate splicing factors. Moreover, we detected the use of over 40,000 BPs with strong sequence consensus and structural accessibility, which align well to nearby crosslinking peaks. We show how the position and strength of BPs affect the crosslinking patterns of spliceosomal factors, which bind more efficiently upstream of strong or proximally located BPs and downstream of weak or distally located BPs. These insights exemplify spliceosome iCLIP as a broadly applicable method for transcriptomic studies of splicing mechanisms.

Fig. 1: Spliceosome iCLIP identifies protein interactions with snRNAs and splicing substrates.
Fig. 2: Analysis of spliceosomal interactions with pre-mRNAs in vitro and in vivo.
Fig. 3: Identification of RBPs overlapping with spliceosomal peaks at BPs and 3′SS.
Fig. 4: Comparison of BPs identified by spliceosome iCLIP, RNA-seq lariat reads or computational prediction.
Fig. 5: Spliceosome assembly at BPs identified by spliceosome iCLIP, RNA-seq lariat reads or computational prediction.
Fig. 6: BP position defines the binding patterns of splicing factors at 3′SS.
Fig. 7: BP strength correlates with the binding of splicing factors.

Data availability

The spliceosome iCLIP data generated and analyzed during the current study are available on EBI ArrayExpress under the accession number E-MTAB-8182 and are also available in raw and processed format on https://imaps.genialis.com/iclip. Additional datasets used in this study are listed in Supplementary Dataset 4. Source data for Fig. 1c are available online. Other data are available upon request.

Code availability

The code to identify BPs from spliceosome iCLIP reads is publicly available on GitHub (https://github.com/nebo56/branch-point-detection-2).

References

  1. Fica, S. M. & Nagai, K. Cryo-electron microscopy snapshots of the spliceosome: structural insights into a dynamic ribonucleoprotein machine. Nat. Struct. Mol. Biol. 24, 791–799 (2017).
  2. Wahl, M. C., Will, C. L. & Lührmann, R. The spliceosome: design principles of a dynamic RNP machine. Cell 136, 701–718 (2009).
  3. Chen, W. et al. Transcriptome-wide interrogation of the functional intronome by spliceosome profiling. Cell 173, 1031–1044.e13 (2018).
  4. Burke, J. E. et al. Spliceosome profiling visualizes operations of a dynamic RNP at nucleotide resolution. Cell 173, 1014–1030.e17 (2018).
  5. Wickramasinghe, V. O. et al. Regulation of constitutive and alternative mRNA splicing across the human transcriptome by PRPF8 is determined by 5′ splice site strength. Genome Biol. 16, 201 (2015).
  6. König, J. et al. iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat. Struct. Mol. Biol. 17, 909–915 (2010).
  7. Taggart, A. J. et al. Large-scale analysis of branchpoint usage across species and cell lines. Genome Res. 27, 639–649 (2017).
  8. Pineda, J. M. B. & Bradley, R. K. Most human introns are recognized via multiple and tissue-specific branchpoints. Genes Dev. 32, 577–591 (2018).
  9. Mercer, T. R. et al. Genome-wide discovery of human splicing branchpoints. Genome Res. 25, 290–303 (2015).
  10. Carissimi, C., Saieva, L., Gabanella, F. & Pellizzoni, L. Gemin8 is required for the architecture and function of the survival motor neuron complex. J. Biol. Chem. 281, 37009–37016 (2006).
  11. Huppertz, I. et al. iCLIP: protein-RNA interactions at nucleotide resolution. Methods 65, 274–287 (2014).
  12. Paggi, J. M. & Bejerano, G. A sequence-based, deep learning model accurately predicts RNA splicing branchpoints. RNA 24, 1647–1658 (2018).
  13. Lee, F. C. Y. & Ule, J. Advances in CLIP technologies for studies of protein-RNA interactions. Mol. Cell 69, 354–369 (2018).
  14. Sugimoto, Y. et al. Analysis of CLIP and iCLIP methods for nucleotide-resolution studies of protein-RNA interactions. Genome Biol. 13, R67 (2012).
  15. Haberman, N. et al. Insights into the design and interpretation of iCLIP experiments. Genome Biol. 18, 7 (2017).
  16. Van Nostrand, E. L. et al. A large-scale binding and functional map of human RNA binding proteins. Preprint at bioRxiv https://doi.org/10.1101/179648 (2017).
  17. Bessonov, S., Anokhina, M., Will, C. L., Urlaub, H. & Lührmann, R. Isolation of an active step I spliceosome and composition of its RNP core. Nature 452, 846–850 (2008).
  18. Gozani, O., Feld, R. & Reed, R. Evidence that sequence-independent binding of highly conserved U2 snRNP proteins upstream of the branch site is required for assembly of spliceosomal complex A. Genes Dev. 10, 233–243 (1996).
  19. Zarnack, K. et al. Direct competition between hnRNP C and U2AF65 protects the transcriptome from the exonization of Alu elements. Cell 152, 453–466 (2013).
  20. Zhang, X. et al. Structure of the human activated spliceosome in three conformational states. Cell Res. 28, 307–322 (2018).
  21. Jacquier, A. & Rosbash, M. RNA splicing and intron turnover are greatly diminished by a mutant yeast branch point. Proc. Natl Acad. Sci. USA 83, 5835–5839 (1986).
  22. Hesselberth, J. R. Lives that introns lead after splicing. Wiley Inter. Rev. RNA 4, 677–691 (2013).
  23. Talhouarne, G. J. S. & Gall, J. G. Lariat intronic RNAs in the cytoplasm of vertebrate cells. Proc. Natl Acad. Sci. USA 115, E7970–E7977 (2018).
  24. Scotti, M. M. & Swanson, M. S. RNA mis-splicing in disease. Nat. Rev. Genet. 17, 19–32 (2016).
  25. Lorenz, R. et al. ViennaRNA Package 2.0. Algorithms Mol. Biol. 6, 26 (2011).
  26. Gioanni, J. et al. Establishment and characterisation of a new tumorigenic cell line with a normal karyotype derived from a human breast adenocarcinoma. Br. J. Cancer 62, 8–13 (1990).
  27. Blazquez, L. et al. Exon junction complex shapes the transcriptome by repressing recursive splicing. Mol. Cell 72, 496–509.e9 (2018).
  28. Chakrabarti, A., Haberman, N., Praznik, A., Luscombe, N. M. & Ule, J. Data science issues in studying protein–RNA interactions with CLIP technologies. Annu. Rev. Biomed. Data Sci. 1, 235–261 (2018).

Acknowledgements

We thank M. Llorian for help with the in vitro splicing reactions, K. Zarnack and G. Rot for help with the data analyses and L. Strittmatter and members of the Ule lab for helpful discussions and comments on the manuscript. This work was supported primarily by the European Research Council (grant nos. 206726-CLIP and 617837-Translate) and the Slovenian Research Agency (grant nos. P2-0209, Z7-3665 and J7-5460). C.R.S. was supported by an Edmond Lily Safra Fellowship and a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (grant no. 215454/Z/19/Z). A.S.E. is supported by the Biotechnology and Biological Sciences Research Council (grant no. BB/M009513/1). A.M.C. is supported by a Wellcome Trust PhD Training Fellowship for Clinicians (no. 110292/Z/15/Z). D.P. and V.O.W. were supported by Medical Research Council grants (nos. MC_UU_12022/1 and MC_UU_12022/8 to A.R.V.). L.P. was supported by the National Institute of Neurological Disorders and Stroke of the National Institutes of Health (NIH-NINDS) (grant no. R01 NS102451). The Francis Crick Institute receives its core funding from Cancer Research UK (grant no. FC001002), the UK Medical Research Council (grant no. FC001002) and the Wellcome Trust (grant no. FC001002).

Author information

M.B., C.R.S. and J.U. conceived the project, designed the experiments and wrote the manuscript with the assistance of all co-authors. M.B., C.R.S., Z.W., R.F. and A.S.E. performed experiments with assistance from J.U., J.K. and C.W.S. N.H. performed most of the computational analyses with assistance from C.R.S., T.C., R.F., A.M.C. and N.M.L. V.O.W., D.P. and A.R.V. provided crosslinked pellets from wild-type and PRPF8-depleted Cal51 cells. L.S. and L.P. developed and characterized the monoclonal antibody 18F6.

Correspondence to Jernej Ule.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information: Anke Sparmann was the primary editor on this article and managed its editorial process and peer review in collaboration with the rest of the editorial team.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Quality control of spliceosome iCLIP with the anti-SmB/B′ antibodies.

(a) Western blot analysis of total HeLa cell extract with 18F6 antibody reveals a single band of 28 kDa. (b) Analysis of HeLa cells by immunostaining with 18F6 and epifluorescence microscopy shows expected localization of SmB/B′ (Cajal bodies and a speckled nuclear pattern excluding nucleoli). Scale bar, 5 μm. (c) UV-crosslinked HEK FLP-in cells with stably integrated 3×Flag-SmB were lysed under stringent conditions and subjected to partial RNase I digestion (+, final dilution 1:100,000; ++, final dilution 1:5,000). Spliceosomal RNPs were immunopurified with anti-Flag M2 antibody, RNA was 5′ end-radiolabeled, and RNPs were subjected to denaturing gel electrophoresis and nitrocellulose transfer, an autoradiogram of which is shown. The interrupted line indicates the area on the nitrocellulose membrane cut out for purification of crosslinked RNP complexes. (d) Autoradiogram of crosslinked RNPs after immunopurification with the anti-SmB/B′ antibodies 18F6, 12F5 or Y12 (ab3138, Abcam). HeLa cell pellet was lysed in medium lysis buffer and subjected to high (++, final dilution 1:10,000) or low (+, final dilution 1:100,000) concentrations of RNase I. Lysates were split evenly between beads for immunopurification. RNAs of immunopurified RNP complexes were radiolabeled at the 5′ end followed by size-separation on denaturing gels and nitrocellulose transfer. The time below each panel indicates length of exposure during autoradiography.

Supplementary Figure 2 Analysis of spliceosome iCLIP from cell extracts and in vitro splicing reactions.

(a) RNA maps of summarized crosslinking around the exon-intron, intron-exon junctions and computationally top-scoring BPs in human introns (Paggi, J. M. and Bejerano, G., RNA 24, 1647-1658, 2018) for spliceosome iCLIP performed under medium or mild conditions from HEK293, K562 and HepG2 cells. (b) Immunoblot (IB) analysis of PRPF8 knockdown (KD) efficiency in Cal51 cells. (c) RNAs transcribed in vitro from a C6orf10 minigene construct were incubated with HeLa nuclear extracts (NE) as part of in vitro splicing reactions in the presence or absence of ATP. Resulting splicing products and intermediates were resolved by denaturing gel electrophoresis and visualized by autoradiography. (d) In vitro splicing reactions were diluted in mild lysis buffer, subjected to low RNase I treatment (final dilution 1:200,000) and used for spliceosome iCLIP. The autoradiogram of crosslinked size-separated RNP complexes shows the radiolabeled RNA that is crosslinked to RBPs. The interrupted line indicates the area cut out from the nitrocellulose membrane for extraction of crosslinked RNAs, which were used as a template for generating iCLIP cDNA libraries. (e) Normalized spliceosome iCLIP cDNA counts on the C6orf10 in vitro splicing product. Exons are marked by grey boxes. As expected, junction reads are present almost exclusively in the +ATP library.

Supplementary Figure 3 Comparison of BPs determined by spliceosome iCLIP to other methods.

(a) Enrichment of mismatches at the first nucleotide of spliceosome iCLIP reads that overlap with ends of introns, compared to remaining iCLIP reads. (b) Reads that start in the region 40..20 nt upstream of 3′SS and do not align with intron ends. (c) Reads that start in the region 40..20 nt upstream of 3′SS and align with intron ends. (d) A table providing the number of BPs identified by spliceosome iCLIP (iCLIP BPs) in introns that also contain a computationally identified BP (Paggi, J. M. and Bejerano, G., RNA 24, 1647-1658, 2018). They are divided into three categories based on the distance between the iCLIP BP and the top-scoring computational BP in each intron. (e) Weblogo of four categories of non-overlapping BPs that are <5 nt away from each other, centered either on iCLIP or computational BPs, and separated according to the relative position of iCLIP vs computational BP (upstream or downstream). (f) The distribution of top BPs identified by published RNA-seq (Mercer, T. R. et al., Genome Res. 25, 290-303, 2015) (i.e., the BP with most lariat-spanning reads in each intron) around the BPs identified by spliceosome iCLIP (i.e., iCLIP BPs). (g) The distribution of top BPs identified by published RNA-seq (Taggart, A. J. et al., Genome Res. 27, 639-649, 2017) (i.e., the BP with most lariat-spanning reads in each intron) around the BPs identified by spliceosome iCLIP (i.e., iCLIP BPs). (h) A table providing the number of BPs identified by spliceosome iCLIP (iCLIP BPs) in introns that also contain a BP assigned by lariat-spanning reads from RNA-seq (Mercer, T. R. et al., Genome Res. 25, 290-303, 2015). They are divided into three categories based on the distance between the iCLIP BP and the top RNA-seq BP. (i) A table providing the number of BPs identified by spliceosome iCLIP (iCLIP BPs) in introns that also contain a BP assigned by lariat-spanning reads from RNA-seq (Taggart, A. J. et al., Genome Res. 27, 639-649, 2017). They are divided into three categories based on the distance between the iCLIP BP and the top RNA-seq BP. (j) Weblogo of iCLIP BPs that overlap with RNA-seq BPs (Mercer, T. R. et al., Genome Res. 25, 290-303, 2015). (k) Weblogo of iCLIP BPs that are >5 nt away from RNA-seq BP (Mercer, T. R. et al., Genome Res. 25, 290-303, 2015). (l) Weblogo of RNA-seq BPs (Mercer, T. R. et al., Genome Res. 25, 290-303, 2015) that are >5 nt away from iCLIP BP. (m) Weblogo of iCLIP BPs that overlap with RNA-seq BPs (Taggart, A. J. et al., Genome Res. 27, 639-649, 2017). (n) Weblogo of iCLIP BPs that are >5 nt away from RNA-seq BP (Taggart, A. J. et al., Genome Res. 27, 639-649, 2017). (o) Weblogo of RNA-seq BPs (Taggart, A. J. et al., Genome Res. 27, 639-649, 2017) that are >5 nt away from iCLIP BP. (p,q) The 100 nt RNA region centered on the BP was used to calculate pairing probability with the RNAfold program with the default parameters (Lorenz, R. et al., Algorithms Mol. Biol. 6, 26, 2011), and the average pairing probability of each nucleotide around BPs is shown for the 40 nt region around RNA-specific BPs located upstream (p) or downstream (q) of BPs identified by other methods. The A and non-A BPs are shown separately.
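The distance-based grouping used in panels (d), (h) and (i) can be illustrated with a minimal Python sketch. The legend does not name the three categories or say how a distance of exactly 5 nt is handled, so the labels (`overlapping`, `nearby`, `distant`), the inclusive 5 nt cutoff and the function names below are assumptions made for illustration only, not the authors' implementation.

```python
from collections import Counter


def classify_bp_pair(iclip_pos, other_pos, window=5):
    """Classify one iCLIP BP against the top BP from another method.

    Category names and the inclusive `window` cutoff are hypothetical;
    the legend only states that three distance-based categories exist.
    """
    distance = abs(iclip_pos - other_pos)
    if distance == 0:
        return "overlapping"
    if distance <= window:
        return "nearby"
    return "distant"


def summarize(pairs, window=5):
    """Count how many (iCLIP BP, other BP) pairs fall into each category."""
    return Counter(classify_bp_pair(a, b, window) for a, b in pairs)


# Toy coordinates, one pair per intron (not real data).
pairs = [(100, 100), (250, 247), (400, 412), (530, 531)]
counts = summarize(pairs)
print(dict(counts))
```

Keeping the cutoff as a parameter makes it easy to check how sensitive the category sizes are to the chosen distance.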

Supplementary Figure 4 Spliceosome assembly at method-specific or overlapping BPs.

RNA maps showing crosslinking (as cDNA starts from eCLIP experiments) of SF3B4 from K562 cells (a, b), of U2AF2 from K562 cells (c, d) and of PRPF8 from HepG2 cells (e, f) relative to BPs. BPs were categorized according to the method they were specifically detected by (spliceosome iCLIP, RNA-seq, computational prediction or overlapping) and in case of non-overlapping BPs, according to their location relative to each other: upstream (a, c, e) or downstream (b, d, f) of the other non-overlapping BP. Crosslinking of each RBP is regionally normalized to its average crosslinking over -100..50 nt region relative to 3′SS in order to most clearly allow comparisons between the relative positions of peaks for different RBPs.
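As a rough illustration of the regional normalization described in this legend, the sketch below divides the crosslink count at each position by the average count over the -100..50 nt window relative to the 3′SS. Treating both window ends as inclusive, returning zeros for an empty window, and the function name `regionally_normalize` are assumptions; the legend does not specify these details.

```python
def regionally_normalize(counts, region=(-100, 50)):
    """Scale per-position crosslink counts by their regional average.

    `counts` maps a position relative to the 3'SS to a cDNA-start count.
    Both ends of `region` are treated as inclusive (an assumption).
    """
    start, end = region
    positions = range(start, end + 1)
    total = sum(counts.get(pos, 0) for pos in positions)
    mean = total / len(positions)
    if mean == 0:
        # No crosslinks in the window: return a flat zero profile.
        return {pos: 0.0 for pos in positions}
    return {pos: counts.get(pos, 0) / mean for pos in positions}


# Toy profile with two peaks (not real data).
example_counts = {-30: 100, -5: 51}
profile = regionally_normalize(example_counts)
print(profile[-30], profile[-5])
```

After this scaling every RBP profile averages 1 across the window, so peak positions can be compared between RBPs with very different total crosslink counts.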

Supplementary Figure 5 Crosslinking of many RBPs overlaps with peaks of spliceosomal crosslinking (1).

(a) Crosslinking patterns of selected RBPs (SF3B4, SF3B1, XRN2, EFTUD2, SUPV3L1, RBM22, U2AF2), as defined by cDNA starts of eCLIP or iCLIP in the indicated cell lines. Crosslinking of each is regionally normalized to its average crosslinking over -100..50 nt region relative to 3′SS in order to most clearly allow comparisons between the relative positions of peaks for the different RBPs. All 3′SS that contain BPs within 17..23 nt upstream of the exon are chosen, and crosslinking is plotted in the region -40..10 nt relative to 3′SS, and -40..10 nt relative to BPs. (b) Same as (a), but for all 3′SS that contain BPs within 24..39 nt upstream of the exon. (c) Same as (a), but for all 3′SS that contain BPs within 40..65 nt upstream of the exon.

Supplementary Figure 6 Crosslinking of many RBPs overlaps with peaks of spliceosomal crosslinking (2).

(a) Crosslinking patterns of selected RBPs (SMNDC1, BUD13, GPKOW, SF3A3, PRPF8, U2AF1), as defined by cDNA starts of eCLIP or iCLIP in the indicated cell lines. Crosslinking of each is regionally normalized to its average crosslinking over -100..50 nt region relative to 3′SS in order to most clearly allow comparisons between the relative positions of peaks for the different RBPs. All 3′SS that contain BPs within 17..23 nt upstream of the exon are chosen, and crosslinking is plotted in the region -40..10 nt relative to 3′SS, and -40..10 nt relative to BPs. (b) Same as (a), but for all 3′SS that contain BPs within 24..39 nt upstream of the exon. (c) Same as (a), but for all 3′SS that contain BPs within 40..65 nt upstream of the exon.

Supplementary Figure 7 Relation of BP position and consensus score to binding of splicing factors.

(a) Heatmaps depicting the normalized crosslinking of RBPs in peak regions around 10 groups of BPs that were categorized according to the distance of BPs from 3′SS. Crosslinks were derived as cDNA starts from eCLIP of K562 cells. (b) Heatmaps depicting the normalized crosslinking of RBPs in peak regions around 10 groups of BPs that were categorized according to the computational scores that define BP strength. Crosslinks were derived as cDNA starts from eCLIP of K562 cells. (c) BPs were divided into 10 quantiles based on their sequence consensus score, as determined previously (Paggi, J. M. and Bejerano, G., RNA. 24, 1647-1658, 2018). The median score of each quantile is shown on the x-axis. The 4,410 BPs chosen for this analysis satisfied two criteria: 1) They were located 23-28 nt away from intron-exon junction, and 2) they contained a total of at least 30 crosslink events of SF3 (SF3B4−K562−eCLIP, SF3B4−HepG2−eCLIP and SF3A3−HepG2−eCLIP) in the region 35-10 nt upstream of BPs and U2AF (U2AF2−HepG2−eCLIP, U2AF2−K562−eCLIP and U2AF1−K562−eCLIP) in the region 5-25 nt downstream of BPs (the peak binding region of these RBPs). The y-axis shows the ratio in binding of SF3 relative to U2AF factors (data and positions as in the preceding sentence). P-values for the indicated comparisons were calculated by the pairwise Wilcoxon Rank Sum test. Box-plot elements are defined by center line, median; box limits, upper and lower quartiles; and whiskers, 1.5× interquartile range. (d) BPs were divided into 10 quantiles as in (c). The % of Ys (C or T) in the region 1-21 nt downstream of BPs is shown on the y-axis. Box-plot elements are defined by center line, median; box limits, upper and lower quartiles; and whiskers, 1.5× interquartile range.
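The grouping and summary statistics in panels (c) and (d) can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names, the equal-size splitting rule and the `pseudocount` in the ratio are hypothetical choices that the legend does not describe.

```python
from statistics import median


def split_into_quantiles(records, key, n=10):
    """Sort records by `key` and split them into n near-equal groups."""
    ordered = sorted(records, key=key)
    size, extra = divmod(len(ordered), n)
    groups, i = [], 0
    for q in range(n):
        step = size + (1 if q < extra else 0)
        groups.append(ordered[i:i + step])
        i += step
    return groups


def sf3_u2af_ratio(sf3_count, u2af_count, pseudocount=1):
    """Ratio of SF3 to U2AF crosslinks; the pseudocount is an assumption."""
    return (sf3_count + pseudocount) / (u2af_count + pseudocount)


def pyrimidine_percent(seq, bp_index, start=1, end=21):
    """Percentage of C/T (or U) in the region start..end nt downstream of the BP."""
    window = seq[bp_index + start:bp_index + end + 1].upper()
    if not window:
        return 0.0
    return 100.0 * sum(base in "CTU" for base in window) / len(window)


# Toy records (not real data): 20 BPs with increasing scores.
records = [{"score": i, "sf3": 2 * i, "u2af": 10} for i in range(20)]
groups = split_into_quantiles(records, key=lambda r: r["score"])
median_scores = [median([r["score"] for r in g]) for g in groups]
print(median_scores)
```

The median score per group corresponds to the x-axis values described in (c), while `pyrimidine_percent` mirrors the quantity plotted in (d).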

Supplementary information

Supplementary Information

Supplementary Figs. 1–7, and Tables 1–4.

Reporting Summary

Supplementary Dataset 1

Related to Fig. 3. Enrichment of eCLIP and iCLIP samples per peak. This table contains the enrichment of cDNA starts for each peak identified by spliceosome iCLIP, which are defined by the positions marked in Fig. 3.

Supplementary Dataset 2

Related to Fig. 4. BPs identified by spliceosome iCLIP, RNA-seq and computational predictions. This table contains genomic positions of all 4 BP groups, which are defined in Fig. 4c.

Supplementary Dataset 3

Related to Fig. 7 and Supplementary Fig. 7. Selected group of BPs identified by spliceosome iCLIP and computational modeling. This table contains a group of BPs that have been identified both by spliceosome iCLIP and computational modeling and are located 23–28 nt upstream of the 3′SS. Each row gives a unique BP genomic position, the computationally predicted BP score, the number of spliceosome iCLIP cDNA-start positions, the distance in base pairs to the nearest 3′SS, the length in base pairs of the polY tract and the polY coverage between the BP and the 3′SS.

Supplementary Dataset 4

Related to the Methods. This table summarizes all the experiments used in this study. It contains two spreadsheets (tabs), both of which provide the source of the data used to generate the figures. Tabs 1 and 2 contain the source and accession number of published iCLIP and eCLIP data.
