Comprehensive transcriptome analysis using synthetic long-read sequencing reveals molecular co-association of distant splicing events

Abstract

Alternative splicing shapes mammalian transcriptomes, with many RNA molecules undergoing multiple distant alternative splicing events. Comprehensive transcriptome analysis, including analysis of exon co-association in the same molecule, requires deep, long-read sequencing. Here we introduce an RNA sequencing method, synthetic long-read RNA sequencing (SLR-RNA-seq), in which small pools (≤1,000 molecules/pool, ≤1 molecule/gene for most genes) of full-length cDNAs are amplified, fragmented and short-read-sequenced. We demonstrate that the RNA sequences reconstructed from the short reads of each pool are mostly close to full length and contain few insertion and deletion errors. We report many previously undescribed isoforms (human brain: 13,800 affected genes, 14.5% of molecules; mouse brain: 8,600 genes, 18% of molecules) and up to 165 human distant molecularly associated exon pairs (dMAPs) and distant molecularly and mutually exclusive pairs (dMEPs). Of 16 associated pairs detected in the mouse brain, 9 are conserved in human. Our results indicate conserved mechanisms that can produce distant but phased features on transcript and proteome isoforms.
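To make the idea of exon co-association within single molecules concrete, the following is a minimal sketch, not the authors' pipeline. It assumes each long read has already been reduced to the set of exons it contains, that every read spans both exons of interest, and that a two-sided Fisher's exact test on the resulting 2×2 table is an acceptable way to test association; the abstract does not name the test used. The function names, exon labels and toy counts are all hypothetical.

```python
from math import comb


def exon_pair_table(molecules, exon_a, exon_b):
    """Count molecules by joint inclusion of two exons.

    `molecules` is a list of sets of exon identifiers, one set per
    long-read molecule (a hypothetical, simplified representation).
    Returns (both, only_a, only_b, neither).
    """
    both = only_a = only_b = neither = 0
    for exons in molecules:
        has_a, has_b = exon_a in exons, exon_b in exons
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
        else:
            neither += 1
    return both, only_a, only_b, neither


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Toy data (assumed): 18 molecules of one gene, each spanning exons A and B.
molecules = (
    [{"e1", "A", "B", "e9"}] * 8   # both alternative exons included
    + [{"e1", "A", "e9"}]          # only A included
    + [{"e1", "B", "e9"}]          # only B included
    + [{"e1", "e9"}] * 8           # both skipped
)

table = exon_pair_table(molecules, "A", "B")
p_value = fisher_exact_two_sided(*table)
both, only_a, only_b, neither = table
# Cross-product sign separates co-inclusion from mutual exclusion.
direction = "associated" if both * neither > only_a * only_b else "mutually exclusive"
print(table, p_value, direction)
```

In a genome-wide screen many exon pairs would be tested at once, so the resulting p-values would also need a multiple-testing correction; that step is omitted here.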

Figure 1: Illustration of purpose and strategy of this work.
Figure 2: Comparison of SLRs and PacBio-CCS on the ERCC sequences.
Figure 3: Comparison of SLRs and PacBio-CCS on human and mouse transcriptomes.
Figure 4: Analysis of novel isoforms revealed by SLR-RNA-seq.
Figure 5: Analysis of distant molecularly associated exon pairs in the human brain transcriptome.
Figure 6: Conservation of distant molecularly associated exon pairs between human and mouse.

Accession codes

Primary accessions

Sequence Read Archive

References

  1. Kornblihtt, A.R. et al. Alternative splicing: a pivotal step between eukaryotic transcription and translation. Nat. Rev. Mol. Cell Biol. 14, 153–165 (2013).

  2. Nilsen, T.W. & Graveley, B.R. Expansion of the eukaryotic proteome by alternative splicing. Nature 463, 457–463 (2010).

  3. Chen, J. & Weiss, W.A. Alternative splicing in cancer: implications for biology and therapy. Oncogene 34, 1–14 (2014).

  4. Bonnal, S., Vigevani, L. & Valcárcel, J. The spliceosome as a target of novel antitumour drugs. Nat. Rev. Drug Discov. 11, 847–859 (2012).

  5. Ben-Dov, C., Hartmann, B., Lundgren, J. & Valcárcel, J. Genome-wide analysis of alternative pre-mRNA splicing. J. Biol. Chem. 283, 1229–1233 (2008).

  6. Fagnani, M. et al. Functional coordination of alternative splicing in the mammalian central nervous system. Genome Biol. 8, R108 (2007).

  7. Johnson, J.M. et al. Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science 302, 2141–2144 (2003).

  8. Nagalakshmi, U. et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320, 1344–1349 (2008).

  9. Wang, E.T. et al. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476 (2008).

  10. Djebali, S. et al. Landscape of transcription in human cells. Nature 489, 101–108 (2012).

  11. Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-seq. Nat. Methods 5, 621–628 (2008).

  12. Sultan, M. et al. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321, 956–960 (2008).

  13. Wilhelm, B.T. et al. Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 453, 1239–1243 (2008).

  14. Modrek, B., Resch, A., Grasso, C. & Lee, C. Genome-wide detection of alternative splicing in expressed sequences of human genes. Nucleic Acids Res. 29, 2850–2859 (2001).

  15. Harrow, J. et al. GENCODE: producing a reference annotation for ENCODE. Genome Biol. 7 (suppl. 1), S4 (2006).

  16. Pan, Q., Shai, O., Lee, L.J., Frey, B.J. & Blencowe, B.J. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 40, 1413–1415 (2008).

  17. Bernstein, B.E. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).

  18. Tilgner, H., Grubert, F., Sharon, D. & Snyder, M.P. Defining a personal, allele-specific, and single-molecule long-read transcriptome. Proc. Natl. Acad. Sci. USA 111, 9869–9874 (2014).

  19. Steijger, T. et al. Assessment of transcript reconstruction methods for RNA-seq. Nat. Methods 10, 1177–1184 (2013).

  20. Cho, H. et al. High-resolution transcriptome analysis with long-read RNA sequencing. PLoS ONE 9, e108095 (2014).

  21. Tilgner, H. et al. Accurate identification and analysis of human mRNA isoforms using deep long read sequencing. G3 3, 387–397 (2013).

  22. Sharon, D., Tilgner, H., Grubert, F. & Snyder, M. A single-molecule long-read survey of the human transcriptome. Nat. Biotechnol. 31, 1009–1014 (2013).

  23. Au, K.F. et al. Characterization of the human ESC transcriptome by hybrid sequencing. Proc. Natl. Acad. Sci. USA 110, E4821–E4830 (2013).

  24. Koren, S. et al. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat. Biotechnol. 30, 693–700 (2012).

  25. Eid, J. et al. Real-time DNA sequencing from single polymerase molecules. Science 323, 133–138 (2009).

  26. Kuleshov, V. et al. Whole-genome haplotyping using long reads and statistical methods. Nat. Biotechnol. 32, 261–266 (2014).

  27. McCoy, R.C. et al. Illumina TruSeq synthetic long-reads empower de novo assembly and resolve complex, highly-repetitive transposable elements. PLoS ONE 9, e106689 (2014).

  28. The External RNA Controls Consortium. The External RNA Controls Consortium: a progress report. Nat. Methods 2, 731–734 (2005).

  29. Chinwalla, A.T. et al. Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562 (2002).

  30. Wu, T.D. & Watanabe, C.K. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21, 1859–1875 (2005).

  31. Li, S. et al. Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study. Nat. Biotechnol. 32, 915–925 (2014).

  32. Harrow, J. et al. GENCODE: the reference human genome annotation for The ENCODE Project. Genome Res. 22, 1760–1774 (2012).

  33. Derrien, T. et al. The GENCODE v7 catalog of human long noncoding RNAs: analysis of their gene structure, evolution, and expression. Genome Res. 22, 1775–1789 (2012).

  34. Duret, L., Chureau, C., Samain, S., Weissenbach, J. & Avner, P. The Xist RNA gene evolved in eutherians by pseudogenization of a protein-coding gene. Science 312, 1653–1655 (2006).

  35. Braunschweig, U. et al. Widespread intron retention in mammals functionally tunes transcriptomes. Genome Res. 24, 1774–1786 (2014).

  36. Benjamini, Y. & Yekutieli, D. The control of the false discovery rate in multiple testing under dependency. Ann. Stat. 29, 1165–1188 (2001).

  37. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  38. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).

  39. Trapnell, C. et al. Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).

  40. Karolchik, D. et al. The UCSC Genome Browser database: 2014 update. Nucleic Acids Res. 42, D764–D770 (2014).

  41. Cheng, J. et al. Protection from Fas-mediated apoptosis by a soluble form of the Fas molecule. Science 263, 1759–1762 (1994).

  42. Lareau, L.F., Inada, M., Green, R.E., Wengrod, J.C. & Brenner, S.E. Unproductive splicing of SR genes associated with highly conserved and ultraconserved DNA elements. Nature 446, 926–929 (2007).

  43. Sun, S., Zhang, Z., Sinha, R., Karni, R. & Krainer, A.R. SF2/ASF autoregulation involves multiple layers of post-transcriptional and translational control. Nat. Struct. Mol. Biol. 17, 306–312 (2010).

  44. Smith, C.W. & Valcárcel, J. Alternative pre-mRNA splicing: the logic of combinatorial control. Trends Biochem. Sci. 25, 381–388 (2000).

  45. Barash, Y. et al. Deciphering the splicing code. Nature 465, 53–59 (2010).

  46. Tilgner, H. et al. Deep sequencing of subcellular RNA fractions shows splicing to be predominantly co-transcriptional in the human genome but inefficient for lncRNAs. Genome Res. 22, 1616–1625 (2012).

  47. Dujardin, G. et al. Transcriptional elongation and alternative splicing. Biochim. Biophys. Acta 1829, 134–140 (2013).

  48. Carrillo Oesterreich, F., Preibisch, S. & Neugebauer, K.M. Global analysis of nascent RNA reveals transcriptional pausing in terminal exons. Mol. Cell 40, 571–581 (2010).

  49. Vargas, D.Y. et al. Single-molecule imaging of transcriptionally coupled and uncoupled splicing. Cell 147, 1054–1065 (2011).

Acknowledgements

We thank N. Spies and F.A. Bava for a thorough reading of this manuscript and valuable comments, and S. Shringarpure, V. Kuleshov, C.S. Foo and H. Tang for valuable comments on statistics. We thank A. Brunet for providing mice and S. Munro for valuable comments on this manuscript. We also thank the Genetics Bioinformatics Service Center at Stanford for providing a well-functioning computing cluster. M.R. is paid by grant 12-131829 from the Danish Council for Independent Research. This work was supported by grants 5U01HL10739304 (to M.S. as co-PI), 1P50HG007735-01 (to M.S. as co-PI) and 5P01GM09913004 (to M.S.).

Author information

Contributions

H.T., T.B., F.C. and M.P.S. devised the project. F.J., T.B., E.J., A.M. and M.R. carried out experiments. I.H. euthanized mice and extracted brains. H.T. carried out computational analysis. C.D.B. and M.P.S. supervised the project and provided financial support. H.T. wrote the first version of the manuscript. H.T., F.J., M.R. and M.P.S. wrote the final version of the manuscript with contributions from the other authors.

Corresponding author

Correspondence to Michael P. Snyder.

Ethics declarations

Competing interests

A. Moshrefi, E. Jaeger and F. Chen are employees of Illumina. T. Blauwkamp is a former employee of Illumina. M. Snyder is on the scientific advisory board of Personalis, GenapSys and AxioMx. C. Bustamante is a founder of Identify Genomics. He is also on the Scientific Advisory Board of Identify, Etalon, Personalis and Ancestry.com. He is a former member of the advisory board of InVitae. None of these organizations played a role in the design or conduct of the work presented here.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8 and Supplementary Tables 1 and 2 and Supplementary Results (PDF 5403 kb)

Supplementary Data Set 1

This is a README describing all the supplementary datasets. (ZIP 324 kb)

Supplementary Data Set 2

Human Molecules per Million measurements for spliced genes. See associated README for file format. (ZIP 231 kb)

Supplementary Data Set 3

Mouse Molecules per Million measurements for spliced genes for both mice combined. See associated README for file format. (ZIP 228 kb)

Supplementary Data Set 4

Mouse Molecules per Million measurements for spliced genes for mouse number 2. See associated README for file format. (ZIP 220 kb)

Supplementary Data Set 5

Human Percent-Spliced-In (Psi) measurements for splice-sites. See associated README for file format. (ZIP 4739 kb)

Supplementary Data Set 6

Mouse Percent-Spliced-In (Psi) measurements for splice-sites for both mice combined. See associated README for file format. (ZIP 2622 kb)

Supplementary Data Set 7

Mouse Percent-Spliced-In (Psi) measurements for splice-sites for mouse number 1. See associated README for file format. (ZIP 2236 kb)

Supplementary Data Set 8

Mouse Percent-Spliced-In (Psi) measurements for splice-sites for mouse number 2. See associated README for file format. (ZIP 1686 kb)

Supplementary Data Set 9

Human Percent-Isoform (Pi) measurements for spliced genes. See associated README for file format. (ZIP 3023 kb)

Supplementary Data Set 10

Mouse Percent-Isoform (Pi) measurements for spliced genes for both mice combined. See associated README for file format. (ZIP 1100 kb)

Supplementary Data Set 11

Mouse Percent-Isoform (Pi) measurements for spliced genes for mouse number 1. See associated README for file format. (ZIP 932 kb)

Supplementary Data Set 12

Mouse Percent-Isoform (Pi) measurements for spliced genes for mouse number 2. See associated README for file format. (ZIP 695 kb)

Supplementary Data Set 13

Human "distant Molecularly Associated Pairs" (dMAPs) of exons and "distant Molecularly and Mutually Exclusive Pairs" (dMEPs) of exons using only human brain RNA. See associated README for file format. (ZIP 6 kb)

Supplementary Data Set 14

Human "distant Molecularly Associated Pairs" (dMAPs) of exons and "distant Molecularly and Mutually Exclusive Pairs" (dMEPs) of exons using human brain RNA and a variety of previously published long-read RNA datasets (Tilgner et al., G3, 2013; Sharon et al., Nature Biotechnology, 2013; Tilgner et al., PNAS, 2014). See associated README for file format. (ZIP 11 kb)

Supplementary Data Set 15

Mouse "distant Molecularly Associated Pairs" (dMAPs) of exons and "distant Molecularly and Mutually Exclusive Pairs" (dMEPs) of exons using only mouse brain RNA. See associated README for file format. (ZIP 1 kb)

About this article

Cite this article

Tilgner, H., Jahanbani, F., Blauwkamp, T. et al. Comprehensive transcriptome analysis using synthetic long-read sequencing reveals molecular co-association of distant splicing events. Nat Biotechnol 33, 736–742 (2015). https://doi.org/10.1038/nbt.3242

  • DOI: https://doi.org/10.1038/nbt.3242
