RNA isoform screens uncover the essentiality and tumor-suppressor activity of ultraconserved poison exons

Abstract

While RNA-seq has enabled comprehensive quantification of alternative splicing, no correspondingly high-throughput assay exists for functionally interrogating individual isoforms. We describe pgFARM (paired guide RNAs for alternative exon removal), a CRISPR–Cas9-based method to manipulate isoforms independent of gene inactivation. This approach enabled rapid suppression of exon recognition in polyclonal settings to identify functional roles for individual exons, such as an SMNDC1 cassette exon that regulates pan-cancer intron retention. We generalized this method to a pooled screen to measure the functional relevance of ‘poison’ cassette exons, which disrupt their host genes’ reading frames yet are frequently ultraconserved. Many poison exons were essential for the growth of both cultured cells and lung adenocarcinoma xenografts, while a subset had clinically relevant tumor-suppressor activity. The essentiality and cancer relevance of poison exons are likely to contribute to their unusually high conservation and contrast with the dispensability of other ultraconserved elements for viability.

Fig. 1: pgFARM facilitates rapid, programmable exon skipping.
Fig. 2: An SMNDC1 poison exon modulates intron retention.
Fig. 3: Design and construction of a poison exon loss-of-function library.
Fig. 4: Unbiased detection of essential exons with pgFARM.
Fig. 5: Many conserved poison exons are essential for cell fitness.
Fig. 6: pgFARM uncovers modifiers of in vivo tumorigenesis.

Data availability

RNA-seq data generated as part of this study have been deposited in the Gene Expression Omnibus (accession number GSE120703). RNA-seq data generated by TCGA were downloaded from the Cancer Genomics Hub (CGHub) and Genomic Data Commons (GDC). Other data that support this study’s findings are available from the authors upon reasonable request. Source data for Figs. 1–4 and Extended Data Figs. 1, 2, 4, 6 and 10 are presented with the paper.

References

  1. Wang, E. T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008).
  2. Pan, Q., Shai, O., Lee, L. J., Frey, B. J. & Blencowe, B. J. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 40, 1413–1415 (2008).
  3. Baralle, F. E. & Giudice, J. Alternative splicing as a regulator of development and tissue identity. Nat. Rev. Mol. Cell Biol. 18, 437–451 (2017).
  4. Dvinge, H., Kim, E., Abdel-Wahab, O. & Bradley, R. K. RNA splicing factors as oncoproteins and tumour suppressors. Nat. Rev. Cancer 16, 413–430 (2016).
  5. Scotti, M. M. & Swanson, M. S. RNA mis-splicing in disease. Nat. Rev. Genet. 17, 19–32 (2016).
  6. Stein, C. A. & Castanotto, D. FDA-approved oligonucleotide therapies in 2017. Mol. Ther. 25, 1069–1075 (2017).
  7. Inoue, D. et al. Spliceosomal disruption of the non-canonical BAF complex in cancer. Nature 574, 432–436 (2019).
  8. Cartegni, L. & Krainer, A. R. Correction of disease-associated exon skipping by synthetic exon-specific activators. Nat. Struct. Biol. 10, 120–125 (2003).
  9. Taylor, J. K., Zhang, Q. Q., Wyatt, J. R. & Dean, N. M. Induction of endogenous Bcl-xS through the control of Bcl-x pre-mRNA splicing by antisense oligonucleotides. Nat. Biotechnol. 17, 1097–1100 (1999).
  10. Long, C. et al. Correction of diverse muscular dystrophy mutations in human engineered heart muscle by single-site genome editing. Sci. Adv. 4, eaap9004 (2018).
  11. Liu, Y. et al. Genome-wide screening for functional long noncoding RNAs in human cells by Cas9 targeting of splice sites. Nat. Biotechnol. 36, 1203–1210 (2018).
  12. Bejerano, G. et al. Ultraconserved elements in the human genome. Science 304, 1321–1325 (2004).
  13. Lareau, L. F., Inada, M., Green, R. E., Wengrod, J. C. & Brenner, S. E. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446, 926–929 (2007).
  14. Ni, J. Z. et al. Ultraconserved elements are associated with homeostatic control of splicing regulators by alternative splicing and nonsense-mediated decay. Genes Dev. 21, 708–718 (2007).
  15. Kurosaki, T., Popp, M. W. & Maquat, L. E. Quality and quantity control of gene expression by nonsense-mediated mRNA decay. Nat. Rev. Mol. Cell Biol. 20, 406–420 (2019).
  16. Zheng, Q. et al. Precise gene deletion and replacement using the CRISPR/Cas9 system in human cells. Biotechniques 57, 115–124 (2014).
  17. Zhu, S. et al. Genome-scale deletion screening of human long non-coding RNAs using a paired-guide RNA CRISPR–Cas9 library. Nat. Biotechnol. 34, 1279–1286 (2016).
  18. Gasperini, M. et al. CRISPR/Cas9-mediated scanning for regulatory elements required for HPRT1 expression via thousands of large, programmed genomic deletions. Am. J. Hum. Genet. 101, 192–205 (2017).
  19. Diao, Y. et al. A tiling-deletion-based genetic screen for cis-regulatory element identification in mammalian cells. Nat. Methods 14, 629–635 (2017).
  20. Cao, J. et al. An easy and efficient inducible CRISPR/Cas9 platform with improved specificity for multiple gene targeting. Nucleic Acids Res. 44, e149 (2016).
  21. Li, Y. et al. A versatile reporter system for CRISPR-mediated chromosomal rearrangements. Genome Biol. 16, 111 (2015).
  22. Kosicki, M., Tomberg, K. & Bradley, A. Repair of double-strand breaks induced by CRISPR–Cas9 leads to large deletions and complex rearrangements. Nat. Biotechnol. 36, 765–771 (2018).
  23. Lin, X. et al. Failure of MBNL1-dependent post-natal splicing transitions in myotonic dystrophy. Hum. Mol. Genet. 15, 2087–2097 (2006).
  24. Kino, Y. et al. Nuclear localization of MBNL1: splicing-mediated autoregulation and repression of repeat-derived aberrant proteins. Hum. Mol. Genet. 24, 740–756 (2015).
  25. Charizanis, K. et al. Muscleblind-like 2-mediated alternative splicing in the developing brain and dysregulation in myotonic dystrophy. Neuron 75, 437–450 (2012).
  26. Rappsilber, J., Ajuh, P., Lamond, A. I. & Mann, M. SPF30 is an essential human splicing factor required for assembly of the U4/U5/U6 tri-small nuclear ribonucleoprotein into the spliceosome. J. Biol. Chem. 276, 31142–31150 (2001).
  27. Dvinge, H. & Bradley, R. K. Widespread intron retention diversifies most cancer transcriptomes. Genome Med. 7, 45 (2015).
  28. Jung, H. et al. Intron retention is a widespread mechanism of tumor-suppressor inactivation. Nat. Genet. 47, 1242–1248 (2015).
  29. Saltzman, A. L. et al. Regulation of multiple core spliceosomal proteins by alternative splicing-coupled nonsense-mediated mRNA decay. Mol. Cell Biol. 28, 4320–4330 (2008).
  30. Amoasii, L. et al. Single-cut genome editing restores dystrophin expression in a new mouse model of muscular dystrophy. Sci. Transl. Med. 9, eaan8081 (2017).
  31. Yeo, G. & Burge, C. B. Maximum entropy modeling of short sequence motifs with applications to RNA splicing signals. J. Comput. Biol. 11, 377–394 (2004).
  32. Sowalsky, A. G. et al. Whole transcriptome sequencing reveals extensive unspliced mRNA in metastatic castration-resistant prostate cancer. Mol. Cancer Res. 13, 98–106 (2015).
  33. The Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature 511, 543–550 (2014).
  34. Siepel, A. et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050 (2005).
  35. Yan, Q. et al. Systematic discovery of regulated and conserved alternative exons in the mammalian brain reveals NMD modulating chromatin regulators. Proc. Natl Acad. Sci. USA 112, 3445–3450 (2015).
  36. Colombo, M., Karousis, E. D., Bourquin, J., Bruggmann, R. & Muhlemann, O. Transcriptome-wide identification of NMD-targeted human mRNAs reveals extensive redundancy between SMG6- and SMG7-mediated degradation pathways. RNA 23, 189–201 (2017).
  37. Hart, T. et al. High-resolution CRISPR screens reveal fitness genes and genotype-specific cancer liabilities. Cell 163, 1515–1526 (2015).
  38. Aguirre, A. J. et al. Genomic copy number dictates a gene-independent cell response to CRISPR/Cas9 targeting. Cancer Discov. 6, 914–929 (2016).
  39. Munoz, D. M. et al. CRISPR screens provide a comprehensive assessment of cancer vulnerabilities but generate false-positive hits for highly amplified genomic regions. Cancer Discov. 6, 900–913 (2016).
  40. Meyers, R. M. et al. Computational correction of copy number effect improves specificity of CRISPR–Cas9 essentiality screens in cancer cells. Nat. Genet. 49, 1779–1784 (2017).
  41. Haapaniemi, E., Botla, S., Persson, J., Schmierer, B. & Taipale, J. CRISPR–Cas9 genome editing induces a p53-mediated DNA damage response. Nat. Med. 24, 927–930 (2018).
  42. Adey, A. et al. The haplotype-resolved genome and epigenome of the aneuploid HeLa cancer cell line. Nature 500, 207–211 (2013).
  43. Kohtz, J. D. et al. Protein–protein interactions and 5′-splice-site recognition in mammalian mRNA precursors. Nature 368, 119–124 (1994).
  44. Anko, M. L. et al. The RNA-binding landscapes of two SR proteins reveal unique functions and binding to diverse RNA classes. Genome Biol. 13, R17 (2012).
  45. Jumaa, H. & Nielsen, P. J. The splicing factor SRp20 modifies splicing of its own mRNA and ASF/SF2 antagonizes this regulation. EMBO J. 16, 5077–5085 (1997).
  46. Doench, J. G. Am I ready for CRISPR? A user’s guide to genetic screens. Nat. Rev. Genet. 19, 67–80 (2018).
  47. Sharma, S. V. et al. A chromatin-mediated reversible drug-tolerant state in cancer cell subpopulations. Cell 141, 69–80 (2010).
  48. Shah, K. N. et al. Aurora kinase A drives the evolution of resistance to third-generation EGFR inhibitors in lung cancer. Nat. Med. 25, 111–118 (2019).
  49. Chmielecki, J. et al. Optimization of dosing for EGFR-mutant non-small cell lung cancer with evolutionary cancer modeling. Sci. Transl. Med. 3, 90ra59 (2011).
  50. Chen, S. et al. Genome-wide CRISPR screen in a mouse model of tumor growth and metastasis. Cell 160, 1246–1260 (2015).
  51. Urbanski, L. M., Leclair, N. & Anczukow, O. Alternative-splicing defects in cancer: splicing regulators and their downstream targets, guiding the way to novel cancer therapeutics. WIREs RNA 9, e1476 (2018).
  52. Karni, R. et al. The gene encoding the splicing factor SF2/ASF is a proto-oncogene. Nat. Struct. Mol. Biol. 14, 185–193 (2007).
  53. Anczukow, O. et al. The splicing factor SRSF1 regulates apoptosis and proliferation to promote mammary epithelial cell transformation. Nat. Struct. Mol. Biol. 19, 220–228 (2012).
  54. Golan-Gerstl, R. et al. Splicing factor hnRNP A2/B1 regulates tumor suppressor gene splicing and is an oncogenic driver in glioblastoma. Cancer Res. 71, 4464–4472 (2011).
  55. Huang, X. et al. Enhancers of Polycomb EPC1 and EPC2 sustain the oncogenic potential of MLL leukemia stem cells. Leukemia 28, 1081–1091 (2014).
  56. Wang, Y. et al. Epigenetic factor EPC1 is a master regulator of DNA damage response by interacting with E2F1 to silence death and activate metastasis-related gene signatures. Nucleic Acids Res. 44, 117–133 (2016).
  57. Mou, H. et al. CRISPR/Cas9-mediated genome editing induces exon skipping by alternative splicing or exon deletion. Genome Biol. 18, 108 (2017).
  58. Yuan, J. et al. Genetic modulation of RNA splicing with a CRISPR-guided cytidine deaminase. Mol. Cell 72, 380–394.e7 (2018).
  59. Gapinske, M. et al. CRISPR-SKIP: programmable gene splicing with single base editors. Genome Biol. 19, 107 (2018).
  60. Konermann, S. et al. Transcriptome engineering with RNA-targeting type VI-D CRISPR effectors. Cell 173, 665–676.e14 (2018).
  61. Jillette, N. & Cheng, A. W. CRISPR artificial splicing factors. Preprint at bioRxiv https://doi.org/10.1101/431064 (2018).
  62. Ahituv, N. et al. Deletion of ultraconserved elements yields viable mice. PLoS Biol. 5, e234 (2007).
  63. Nolte, M. J. et al. Functional analysis of limb transcriptional enhancers in the mouse. Evol. Dev. 16, 207–223 (2014).
  64. Dickel, D. E. et al. Ultraconserved enhancers are required for normal development. Cell 172, 491–499.e15 (2018).
  65. Schneider, A., Hiller, M. & Buchholz, F. Large-scale dissection suggests that ultraconserved elements are dispensable for mouse embryonic stem cell survival and fitness. Preprint at bioRxiv https://doi.org/10.1101/683565 (2019).
  66. Alsafadi, S. et al. Cancer-associated SF3B1 mutations affect alternative splicing by promoting alternative branchpoint usage. Nat. Commun. 7, 10615 (2016).
  67. Mayr, C. & Bartel, D. P. Widespread shortening of 3′UTRs by alternative cleavage and polyadenylation activates oncogenes in cancer cells. Cell 138, 673–684 (2009).
  68. Pineda, J. M. B. & Bradley, R. K. Most human introns are recognized via multiple and tissue-specific branchpoints. Genes Dev. 32, 577–591 (2018).
  69. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
  70. Goodpaster, T. & Randolph-Habecker, J. A flexible mouse-on-mouse immunohistochemical staining technique adaptable to biotin-free reagents, immunofluorescence, and multiple antibody staining. J. Histochem. Cytochem. 62, 197–204 (2014).
  71. Katz, Y., Wang, E. T., Airoldi, E. M. & Burge, C. B. Analysis and design of RNA sequencing experiments for identifying isoform regulation. Nat. Methods 7, 1009–1015 (2010).
  72. Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
  73. Huber, W. et al. Orchestrating high-throughput genomic analysis with Bioconductor. Nat. Methods 12, 115–121 (2015).
  74. Wickham, H., François, R., Henry, L. & Müller, K. dplyr: A grammar of data manipulation. R package version 0.7.6. (2018).
  75. Wickham, H. ggplot2: Elegant Graphics for Data Analysis (Springer, 2009).
  76. Dvinge, H. et al. Sample processing obscures cancer-specific alterations in leukemic transcriptomes. Proc. Natl Acad. Sci. USA 111, 16802–16807 (2014).
  77. Flicek, P. et al. Ensembl 2013. Nucleic Acids Res. 41, D48–D55 (2013).
  78. Meyer, L. R. et al. The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res. 41, D64–D69 (2013).
  79. Li, B. & Dewey, C. N. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323 (2011).
  80. Trapnell, C., Pachter, L. & Salzberg, S. L. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25, 1105–1111 (2009).
  81. Therneau, T. M. & Grambsch, P. M. Modeling Survival Data: Extending the Cox Model (Springer, 2000).

Acknowledgements

We thank M. Gasperini, G. Findlay and J. Shendure for technical assistance and sharing pgRNA constructs, Q. Yan for sharing HeLa/iCas9 cells, A. Geballe for sharing Cas9-expressing IMR90 cells and D. Bennett for sharing Melan-a cells. J.D.T. is a Washington Research Foundation Postdoctoral Fellow. R.K.B. is a Scholar of The Leukemia and Lymphoma Society (1344-18). This research was supported in part by the Edward P. Evans Foundation, NIH/NIDDK (R01 DK103854), NIH/NHLBI (R01 HL128239), NIH/NINDS (P01 NS069539) and the Experimental Histopathology and Genomics Shared Resources of the Fred Hutch/University of Washington Cancer Consortium (P30 CA015704). The results published here are based in part on data generated by The Cancer Genome Atlas Research Network (http://cancergenome.nih.gov).

Author information

J.D.T., Q.F. and R.K.B. designed the study. J.D.T., J.T.P., Q.F., E.J.D.N., E.R.H., M.V.M., J.P., A.M.G., A.E.B., J.W., N.T.N. and A.H.B. performed experiments. J.D.T., J.T.P. and R.K.B. analysed data. J.D.T. and R.K.B. wrote the paper.

Correspondence to Robert K. Bradley.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 pgFARM-induced exclusion of HPRT1 exon two and MET exon 14.

a, Sanger sequencing of pgFARM-edited HPRT1 exon two in HeLa/iCas9 cells. b, Long range RT-PCR analysis of HPRT1 exon two skipping. c, RT-PCR analysis of HPRT1 exon two (e2) inclusion before/after Cas9 induction (day 0/day 10) and one week treatment with 6-thioguanine (+6TG). d, HPRT1 western blot analysis (n = 1 independent experiment) before (-) and after (+) one week treatment with 6TG. e, Cas9-expressing HEK293T cells (n = 3 biological replicates) that were untreated (wild-type) or expressing the indicated pgRNAs followed by one week treatment with 6TG. f, RT-PCR analysis of HPRT1 exon two (e2) inclusion in Cas9-expressing HEK293T cells (n = 3 biological replicates). g, Top, RT-PCR analysis of MET exon 14 (e14) inclusion with (+) or without (-) Cas9 expression. Bottom, quantification (n = 1 independent experiment). h, As for (b), but for MET exon 14. Gray, non-targeting pgRNA; green, pgRNA targeting MET exon 14. See Source Data for uncropped gels.

Extended Data Fig. 2 pgFARM-induced exclusion of MBNL1 exon five in multiple cell lines.

a, Sanger sequencing of pgFARM-edited MBNL1 exon five in HeLa/iCas9 cells. b, Long range RT-PCR analysis of MBNL1 exon five skipping (n = 1 independent experiment). c, Left, RT-PCR analysis (n = 3 biological replicates per group) of MBNL1 exon five (e5) inclusion in Cas9-expressing IMR90 cells expressing a non-targeting pgRNA (pgNTC) or pgMBNL1.a. Right, quantification of MBNL1 exon five inclusion. d, Left and center, RT-PCR analysis and associated quantification of Mbnl1 exon five (e5) inclusion in Cas9-expressing B16-F10 cells expressing the indicated pgRNA. Right, RT-PCR analysis (n = 3 biological replicates per group) and associated quantification of Mbnl1 exon five (e5) inclusion in Cas9-expressing Melan-A cells expressing the indicated pgRNA. e, Individual Mbnl1 alleles that were cloned from gDNA of Cas9-expressing B16-F10 cells following delivery of a Mbnl1 exon five-targeting pgRNA and subjected to Sanger sequencing. f, Quantification of total MBNL1 protein levels (top) and MBNL1 protein encoded by the exon five-including isoform (bottom) before (day 0) and after (day 14) Cas9 induction in HeLa/iCas9 cells expressing the indicated pgRNA, measured by immunoblot in Fig. 1l. *, pgRNAs that induced the greatest MBNL1 exon five exclusion. Data are representative of n = 2 independent experiments. g, Scatter plot comparing pgRNA-mediated exclusion of MBNL1 exon five (e5) and inclusion of MBNL2 exon five (e5), a paralogous exon that is regulated by nuclear MBNL1. Datapoints (n = 24) are from HeLa/iCas9 cells treated with pgMBNL1.a, pgMBNL1.d, or pgMBNL1.e pgRNAs for two weeks. r, Pearson correlation; p, associated p-value computed using a two-sided Student’s t-test; shaded region, 95% confidence interval. See Source Data for uncropped gels.

Extended Data Fig. 3 SMNDC1 poison exon inclusion in cancer.

a, As Fig. 2c, but for all TCGA cohorts analyzed in Fig. 2d. p computed with two-sided Mann-Whitney U test. Hinges, notches, and whiskers indicate 25th/75th percentiles, 95% confidence interval, and most extreme datapoints within 1.5X interquartile range from hinge. Sample sizes are BLCA: n = 338; BRCA: n = 1089; COAD: n = 451; ESCA: n = 180; HNSC: n = 40; KICH: n = 62; KIRC: n = 430; KIRP: n = 262; LIHC: n = 350; LUAD: n = 502; LUSC: n = 447; PRAD: n = 481; STAD: n = 30; THCA: n = 362. b, Overall survival of lung adenocarcinoma (LUAD) patients, where patients were stratified according to the relative inclusion of the SMNDC1 poison exon. High poison exon, top tercile of samples; low poison exon, bottom tercile of samples. p computed with a two-sided logrank test. n = 237 (low) and 132 (high) samples. The uneven sample allocation arises from edge effects at the boundaries of terciles (MISO only estimates exon inclusion to two significant digits). c, As (b), but for SMNDC1 gene expression. High expression, top tercile of samples; low expression, bottom tercile of samples. p computed with a two-sided logrank test. n = 169 (low) and 174 (high) samples.
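The uneven tercile sizes noted in (b) can be illustrated with a minimal sketch. It assumes that inclusion estimates are rounded before the cut points are chosen and that samples tied with the lower cut point are assigned to the low group; the function name, the rounding to two decimal places and the tie-handling rule are illustrative assumptions, not the authors' procedure.

```python
from collections import Counter


def tercile_groups(psi_values, decimals=2):
    """Label each sample 'low', 'middle' or 'high' by tercile.

    Values are rounded first (assumption: a stand-in for an estimator that
    reports limited precision). Samples tied with a cut point all fall on
    the same side of it, so the groups need not be equal in size.
    """
    rounded = [round(v, decimals) for v in psi_values]
    ordered = sorted(rounded)
    n = len(ordered)
    low_cut = ordered[n // 3]         # value at the lower tercile boundary
    high_cut = ordered[(2 * n) // 3]  # value at the upper tercile boundary
    labels = []
    for v in rounded:
        if v <= low_cut:
            labels.append("low")
        elif v > high_cut:
            labels.append("high")
        else:
            labels.append("middle")
    return labels


# Four samples collapse onto the same rounded value at the lower boundary.
example = [0.101, 0.102, 0.103, 0.104, 0.2, 0.3, 0.4, 0.5, 0.6]
group_sizes = Counter(tercile_groups(example))
```

In this toy input the four near-identical values all round to 0.1 and land in the low group, so the low and high groups differ in size even though the cut points are nominal terciles.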

Extended Data Fig. 4 pgFARM-induced exclusion of SMNDC1’s poison exon.

a, Sanger sequencing of pgFARM-edited SMNDC1 poison exon in HeLa/iCas9 cells. Annotations of eliminated (X) or disrupted (↓) sequence elements are indicated. b, Western blot for Cas9 and ACTB in parental PC9 and PC9-Cas9 (n = 3 biological replicates) transgenic cell lines. c, Left, PC9-Cas9 cells expressing the indicated pgRNAs following treatment with 6TG for one week. Right, quantification of cell survival. d, Representative SMNDC1 allele (n = 25 total sequenced alleles) of a PC9-Cas9 clonal cell line isolated following delivery of an SMNDC1 poison exon-targeting pgRNA. e, MaxEnt 3′ splice site scores for unedited (wild-type) or edited SMNDC1 alleles from individual PC9-Cas9 clones. “small” and “medium” indicate alleles containing indels of length ~1–10 bp and >10 bp without intervening gDNA excision; “gDNA excision” indicates alleles with complete excision of intervening gDNA. Each class of editing event can effectively reduce 3′ splice site strength. f, As Fig. 2j, but restricted to introns that are not NMD-targets (NMD-irrelevant). g, As Fig. 2k, but restricted to introns that are not NMD-targets (NMD-irrelevant). See Source Data for uncropped gels.

Extended Data Fig. 5 pgRNA library design.

a, Regions used to classify each poison exon (n = 12,653) according to its sequence conservation. b, Median conservation scores for each indicated region (violin plot width represents probability density of data distribution). c, Median per-nucleotide sequence conservation for exon groups described in the text. d, Per-nucleotide sequence conservation for an SRSF3 ultraconserved poison exon. e, As (d), but for an MTX2 poorly conserved poison exon. f, The most significant biological processes associated with genes containing unconserved poison exons (n = 2,363), conserved poison exons (n = 352), or conserved non-poison exons (n = 888) (related to Fig. 3c). FDR computed using the Wallenius method and corrected using the Benjamini-Hochberg method. g, pgRNA library summary. h, On-target scores (MIT score) for all gRNAs targeting 3′ splice sites analyzed in our study (“false”) and those included in the final library (“true”). i, As (h), but for off-target scores identified using Cas-OFFinder.
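The Benjamini-Hochberg correction mentioned in (f) can be sketched as follows. This is a generic implementation of the standard step-up procedure and covers only the multiple-testing step; the Wallenius-based enrichment test that produces the input p-values is not shown, and the function name is illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each raw p-value is scaled by the number of tests divided by its rank, and the running minimum ensures that a smaller raw p-value never receives a larger adjusted value than a larger one.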

Extended Data Fig. 6 Analysis of pilot pgFARM screen.

a, pgRNA library generation for Illumina sequencing. b, pgRNA counts throughout the time course (n = 1,000; 3,604; 4,099; 805 for groups, left to right). c, Relative proliferation of HeLa/iCas9 cells expressing an SMNDC1 upstream constitutive exon-targeting pgRNA relative to control pgRNA (non-essential gene CSPG4; n = 2 independent experiments). d, Unnormalized fold-changes for non-targeting pgRNAs (n = 1,000) and pgRNAs targeting unexpressed (<1 transcript per million, TPM) genes, located in genomic regions with the indicated copy numbers (n = 2, 38, 45, and 11, left to right). e, Normalized fold-changes for all non-targeting pgRNAs (NTC; n = 1,000) and pgRNAs targeting the indicated exons (n = 9 pgRNAs per exon) in SNRNP70. f, Relative proliferation of HeLa/iCas9 cells expressing a SNRNP70 upstream constitutive exon-targeting pgRNA without (-) or with (+) simultaneous overexpression of a SNRNP70-encoding cDNA (n = 6 replicates per condition). g, Representative Sanger sequencing of a pgFARM-edited SNRNP70 upstream exon in HeLa/iCas9 cells (n = 19 total sequenced alleles). h, RNA-seq read coverage across the SNRNP70 locus containing the targeted upstream constitutive exon (gray box) from HeLa/iCas9 cells expressing the indicated pgRNA (n = 1 per pgRNA). Ψ, percent spliced in. i, SNRNP70 poison exon inclusion for HeLa/iCas9 cells expressing the indicated pgRNA relative to a non-targeting pgRNA (n = 1 per pgRNA). j, Scatter plot comparing cassette exon inclusion in HeLa/iCas9 cells treated with a non-targeting control pgRNA (pgNTC) or SNRNP70 upstream constitutive exon-targeting pgRNA (pgSNRNP70). Points are shaded by statistical significance (two-sided Mann-Whitney test). k, As (j), but comparing alternative 5′ splice site usage. For box plots, the line, hinges, and whiskers represent median, 25th and 75th percentiles, and most extreme datapoints within 1.5X interquartile range from hinge. See Source Data for uncropped gels.
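For readers unfamiliar with Ψ (percent spliced in), a simple junction-count estimate for a cassette exon is sketched below. It is a rough illustration only: the reference list includes MISO, a probabilistic model, and this sketch does not reproduce that estimator. The function name and the halving of inclusion reads are assumptions made for the example.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Estimate percent spliced in for a cassette exon from junction reads.

    inclusion_reads: reads spanning either junction that joins the exon to
                     a flanking exon (upstream plus downstream, summed).
    exclusion_reads: reads spanning the single exon-skipping junction.
    Inclusion reads are halved because two junctions support inclusion
    whereas only one supports skipping.
    """
    inclusion = inclusion_reads / 2.0
    total = inclusion + exclusion_reads
    if total == 0:
        return None  # no informative reads, so no estimate
    return 100.0 * inclusion / total


psi_example = percent_spliced_in(40, 20)
```

With 40 inclusion-junction reads and 20 skipping-junction reads, the halved inclusion count equals the skipping count, so the exon is estimated to be included in half of the transcripts.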

Extended Data Fig. 7 Analysis of pilot pgFARM screen, continued.

a, Normalized pgRNA fold-changes (n = 1,000 and 9 for non- and exon-targeting pgRNAs, respectively). The center line, hinges, and whiskers represent median, 25th and 75th percentiles, and most extreme datapoints within 1.5X interquartile range from hinge. b, RNA-seq read coverage across the SRSF3 locus containing the targeted upstream constitutive exon (gray box) from HeLa/iCas9 cells expressing the indicated pgRNA (n = 1 per pgRNA). Ψ, percent spliced in. c, SRSF3 poison exon inclusion for HeLa/iCas9 cells expressing the indicated pgRNA relative to a non-targeting pgRNA (n = 1 per pgRNA). d, SRSF3 RNA binding motif enrichment in differentially spliced exons (n = 2,046 left; 727 right) in HeLa/iCas9 cells expressing the indicated pgRNA. Data presented as mean ± 95% confidence interval computed by bootstrapping. e, Scatter plot comparing cassette exon inclusion in HeLa/iCas9 cells treated with a non-targeting control pgRNA (pgNTC) or AAVS1-targeting control pgRNA (pgAAVS1). Points are shaded by statistical significance (two-sided Mann-Whitney U test). f, RNA-seq read coverage across the entire SNRNP70 locus in HeLa/iCas9 cells expressing the indicated pgRNA (n = 1 per pgRNA). g, As (f), but for SRSF3 (n = 1 per pgRNA).
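The bootstrapped 95% confidence interval in (d) can be sketched as a percentile bootstrap of the mean. The resample count, the fixed seed and the percentile method are assumptions chosen for the example; the caption does not state which bootstrap variant was used.

```python
import random


def bootstrap_mean_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) using a percentile bootstrap of the mean."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    means = []
    for _ in range(n_resamples):
        # Resample n observations with replacement and record their mean.
        sample = [values[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return sum(values) / n, lower, upper


mean, lower, upper = bootstrap_mean_ci([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

The interval is read directly from the sorted resampled means, so it makes no assumption that the underlying values are normally distributed.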

Extended Data Fig. 8 Analysis of large-scale pgFARM screens.

a, HeLa/iCas9 cells (n = 4 biological replicates) treated with the poison exon pgRNA library and grown in the presence (+dox) or absence (-dox) of active Cas9. b, Scatter plots comparing normalized fold-changes (day 14 vs. day 0; n = 963 targeted exons) estimated with each replicate of the cell viability screen in HeLa/iCas9 cells. Pearson correlations for individual replicate comparisons are indicated. c, Normalized fold-changes for pgRNAs targeting exons in unexpressed (TPM ≤ 1; n = 96 for HeLa/iCas9 and 128 for PC9-Cas9) or highly expressed (TPM ≥ 10; n = 681 for HeLa/iCas9 and 661 for PC9-Cas9) genes. Each dot represents the median fold-change computed over all pgRNAs targeting exons in the indicated groups for a representative replicate from the screens in HeLa/iCas9 (left; n = 5) and PC9-Cas9 (right; n = 4) cells. TPM, transcripts per million. d, Normalized fold-changes for pgRNAs targeting lowly expressed genes (TPM < 5) located in genomic regions with the indicated copy numbers (n = 6, 165, and 14 per group, left to right, for HeLa/iCas9; n = 60, 107, and 45 per group, left to right, for PC9-Cas9). e, Rank plot of mean normalized fold-changes for conserved poison (orange) or upstream constitutive exons (purple) based on all replicates of the HeLa/iCas9 viability screen. f, As (e), but for all replicates of the PC9-Cas9 viability screen. For box plots, the center line, hinges, and whiskers represent median, 25th and 75th percentiles, and most extreme datapoints within 1.5X interquartile range from hinges, respectively.
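One plausible way to compute normalized fold-changes of the kind plotted in (b) to (d) is sketched below. The normalization shown (scaling counts by library depth, then centering on the median of non-targeting controls) is an assumption made for illustration, as are all function and parameter names; the exact procedure is described in the paper's methods, which are not reproduced here.

```python
import math
from statistics import median


def normalized_log2_fold_changes(counts_day0, counts_day14, control_ids,
                                 pseudocount=1.0):
    """Return {pgRNA id: log2 fold-change (day 14 vs. day 0)}.

    Counts are scaled to counts per million to remove library-depth
    differences, then centered so that the median of the control pgRNAs
    (assumed here to be non-targeting guides) is zero.
    """
    depth0 = sum(counts_day0.values())
    depth14 = sum(counts_day14.values())
    raw = {}
    for guide, c0 in counts_day0.items():
        c14 = counts_day14.get(guide, 0)
        cpm0 = (c0 + pseudocount) / depth0 * 1e6
        cpm14 = (c14 + pseudocount) / depth14 * 1e6
        raw[guide] = math.log2(cpm14 / cpm0)
    center = median(raw[g] for g in control_ids)
    return {g: fc - center for g, fc in raw.items()}


def exon_level_scores(fold_changes, guides_by_exon):
    """Summarize each exon by the median fold-change of its pgRNAs."""
    return {exon: median(fold_changes[g] for g in guides)
            for exon, guides in guides_by_exon.items()}


day0 = {"ntc1": 100, "ntc2": 100, "exonA_1": 100, "exonA_2": 100}
day14 = {"ntc1": 200, "ntc2": 200, "exonA_1": 50, "exonA_2": 50}
fold_changes = normalized_log2_fold_changes(day0, day14, ["ntc1", "ntc2"],
                                            pseudocount=0.0)
scores = exon_level_scores(fold_changes, {"exonA": ["exonA_1", "exonA_2"]})
```

In the toy data the exon-targeting pgRNAs drop fourfold relative to the controls, giving an exon-level score of about -2 on the log2 scale. Summarizing by the median keeps a single outlying pgRNA from dominating the exon's score.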

Extended Data Fig. 9 pgFARM-induced exclusion of CPSF4 and SMG1 poison exons.

a, Sanger sequencing of pgFARM-edited CPSF4 poison exon in HeLa/iCas9 cells. Annotations of eliminated (X) or disrupted (↓) sequence elements are indicated. b, RNA-seq read coverage across the entire CPSF4 locus in HeLa/iCas9 cells expressing a CPSF4 poison exon-targeting pgRNA (pgCPSF4; n = 1). We observed no read coverage indicative of cryptic splicing in pgCPSF4-treated cells. The two sets of splice junction reads downstream of the CPSF4 poison exon correspond to usage of endogenous (naturally occurring in unedited cells) competing 3′ splice sites. c, As (b), but for an SMG1 poison exon-targeting pgRNA (pgSMG1; n = 1). d, Scatter plot comparing normalized fold-changes for pgRNAs targeting a poison exon compared to matched upstream coding exon within the same gene.

Extended Data Fig. 10 Analysis of xenograft screens.

a, Tumors derived from parental PC9 or PC9-Cas9 cells (n = 4 per group). b, Mice from early and late tumor time points (n = 4 and 10 tumors, respectively). c, pgRNA Illumina libraries. d, Pearson correlation (r) matrix for xenograft screen samples. Unsupervised clustering of library depth-normalized pgRNA counts by the complete-linkage method. e, Normalized counts (mean ± S.D.) for gRNAs targeting coding exons in the indicated genes. Data from Chen et al., 2015 (n = 1, 6, 3, and 9 for groups, left to right). f, Relative cell number (mean ± S.D.) for PC9-Cas9 cells expressing a pgRNA targeting the indicated exons (n = 3 per group). g, Progression-free survival of lung adenocarcinoma patients (n = 167/171 for low/high categories), where patients were stratified by inclusion of tumor-suppressive poison exons. h, As (g), but for overall survival. i, As (g), but for essential poison exons (n = 166/169 for low/high categories). j, As (i), but for overall survival. See Source Data for uncropped gels.
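The analysis in (d), a Pearson correlation matrix followed by complete-linkage clustering, can be sketched in plain Python as follows. The use of 1 - r as the distance, the sample names and the toy counts are assumptions for illustration; the caption does not specify the distance measure.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def complete_linkage(samples):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, d).

    samples: {sample name: list of depth-normalized pgRNA counts}.
    Distance between two samples is 1 - r (an assumption); distance between
    two clusters is the largest pairwise distance (complete linkage).
    """
    names = list(samples)
    dist = {(a, b): 1.0 - pearson(samples[a], samples[b])
            for a in names for b in names if a != b}
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(dist[(a, b)]
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


toy = {
    "early_1": [1, 2, 3, 4],
    "early_2": [1, 2, 3, 5],
    "late_1": [4, 3, 2, 1],
    "late_2": [5, 3, 2, 1],
}
merges = complete_linkage(toy)
```

In the toy data the two early samples and the two late samples pair up first, and the two pairs join last at a distance greater than 1 because their counts are anticorrelated.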

Supplementary information

Supplementary Information

Supplementary Note

Reporting Summary

Supplementary Tables

Supplementary Tables 1–8

Source data

Source Data Fig. 1

Uncropped gels from Figure 1.

Source Data Fig. 2

Uncropped gels from Figure 2.

Source Data Fig. 3

Uncropped gels from Figure 3.

Source Data Fig. 4

Uncropped gels from Figure 4.

Source Data Extended Data Fig. 1

Uncropped gels from Extended Data Figure 1.

Source Data Extended Data Fig. 2

Uncropped gels from Extended Data Figure 2.

Source Data Extended Data Fig. 4

Uncropped gels from Extended Data Figure 4.

Source Data Extended Data Fig. 6

Uncropped gels from Extended Data Figure 6.

Source Data Extended Data Fig. 10

Uncropped gels from Extended Data Figure 10.

About this article

Cite this article

Thomas, J.D., Polaski, J.T., Feng, Q. et al. RNA isoform screens uncover the essentiality and tumor-suppressor activity of ultraconserved poison exons. Nat Genet 52, 84–94 (2020) doi:10.1038/s41588-019-0555-z
