Abstract

We developed a massive-scale RNA sequencing protocol, short quantitative random RNA libraries or SQRL, to survey the complexity, dynamics and sequence content of transcriptomes in a near-complete fashion. This method generates directional, random-primed, linear cDNA libraries that are optimized for next-generation short-tag sequencing. We surveyed the poly(A)+ transcriptomes of undifferentiated mouse embryonic stem cells (ESCs) and embryoid bodies (EBs) at an unprecedented depth (10 Gb), using the Applied Biosystems SOLiD technology. These libraries capture the genomic landscape of expression, state-specific expression, single-nucleotide polymorphisms (SNPs), the transcriptional activity of repeat elements, and both known and new alternative splicing events. We investigated the impact of transcriptional complexity on current models of key signaling pathways controlling ESC pluripotency and differentiation, highlighting how SQRL can be used to characterize transcriptome content and dynamics in a quantitative and reproducible manner, and suggesting that our understanding of transcriptional complexity is far from complete.

Accessions

Gene Expression Omnibus

Acknowledgements

A.R.R.F. and S.M.G. are funded by the Australian National Health and Medical Research Council. A.R.R.F. is a C.J. Martin fellow (428261), and S.M.G. is a senior research fellow. N.C. is a University of Queensland postdoctoral fellow. G.J.F. and M.K.B. are supported by Australian Postgraduate awards. We acknowledge the Australian Research Council Centre for Functional and Applied Genomics Array Facility for expression profiling, and R. Nutter, G. Weightman and L. Stubberfield for support of this initiative.

Author information

Author notes

    • Alistair R R Forrest

    Present address: The Eskitis Institute for Cell and Molecular Therapies, Griffith University, Nathan, Queensland, 4111, Australia.

    • Nicole Cloonan, Alistair R R Forrest & Gabriel Kolle

    These authors contributed equally to this work.

Affiliations

  1. Expression Genomics Laboratory, Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, St. Lucia, Queensland, 4072, Australia.

    • Nicole Cloonan, Alistair R R Forrest, Gabriel Kolle, Brooke B A Gardiner, Geoffrey J Faulkner, Mellissa K Brown, Darrin F Taylor, Anita L Steptoe, Shivangi Wani, Graeme Bethel, Alan J Robertson, Andrew C Perkins, Stephen J Bruce & Sean M Grimmond

  2. Applied Biosystems Inc., 500 Cummings Center, Beverly, Massachusetts 01915, USA.

    • Clarence C Lee, Swati S Ranade, Heather E Peckham, Jonathan M Manning & Kevin J McKernan

Contributions

N.C. created and integrated the sequence mapping and visualization pipeline, performed SOLiD sequencing bioinformatics, SNP analysis and splicing studies. A.R.R.F. conceived and pioneered the SQRL library strategy, performed preliminary genomic analysis, and developed the initial visualization methods. G.K. led the array-SQRL analyses, RT-PCR, pathway analysis and contributed to SNP analysis. B.B.A.G., M.K.B., G.K. and N.C. contributed to method design. A.L.S., G.K., S.J.B. and A.C.P. contributed to sample generation. G.K., A.R.R.F. and B.B.A.G. constructed libraries. C.C.L., S.S.R., B.B.A.G., G.B. and K.J.M. contributed to library sequencing. N.C., G.K., G.J.F., A.R.R.F., S.M.G., D.F.T., H.E.P. and J.M.M. contributed to data analysis. G.K., A.J.R., S.W., N.C. and A.L.S. contributed to experimental validation. S.M.G. supervised the work and prepared the manuscript with N.C., G.K. and A.R.R.F.

Competing interests

C.C.L., S.S.R., H.E.P., J.M.M. and K.J.M. are employed by Applied Biosystems, a manufacturer of DNA sequencing instrumentation and supplies.

Corresponding author

Correspondence to Sean M Grimmond.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–13, Supplementary Tables 1–16, Supplementary Methods

About this article

DOI

https://doi.org/10.1038/nmeth.1223
