Protocol

Defining transcribed regions using RNA-seq

Nature Protocols volume 5, pages 255–266 (2010)

Abstract

Next-generation sequencing technologies are revolutionizing genomics research. It is now possible to generate gigabase pairs of DNA sequence within a week without time-consuming cloning or massive infrastructure. This technology has recently been applied to the development of 'RNA-seq' techniques for sequencing cDNA from various organisms, with the goal of characterizing entire transcriptomes. These methods provide unprecedented resolution and depth of data, enabling simultaneous quantification of gene expression, discovery of novel transcripts and exons, and measurement of splicing efficiency. We present here a validated protocol for nonstrand-specific transcriptome sequencing via RNA-seq, describing the library preparation process and outlining the bioinformatic analysis procedure. While sample preparation and sequencing take a fairly short period of time (1–2 weeks), the downstream analysis is by far the most challenging and time-consuming aspect and can take weeks to months, depending on the experimental objectives.

Acknowledgements

We thank Dr. J.-R. Landry for critical reading of the manuscript. Research in the Bähler laboratory is funded by Cancer Research UK and by PhenOxiGEn, an EU FP7 research project.

Author information

Author notes

    • Brian T Wilhelm
    • Samuel Marguerat

    These authors contributed equally to this work.

Affiliations

  1. Institute for Research in Immunology and Cancer (IRIC), Université de Montréal, Montréal, Québec, Canada.

    • Brian T Wilhelm

  2. Department of Genetics, Evolution & Environment and UCL Cancer Institute, University College London, London, UK.

    • Samuel Marguerat
    • Jürg Bähler

  3. Unit for Functional and Comparative Genomics, School of Biological Sciences, University of Liverpool, Liverpool, UK.

    • Ian Goodhead

Authors

  1. Brian T Wilhelm

  2. Samuel Marguerat

  3. Ian Goodhead

  4. Jürg Bähler

Contributions

All authors contributed extensively to the work presented in this paper.

Corresponding author

Correspondence to Jürg Bähler.

About this article

DOI

https://doi.org/10.1038/nprot.2009.229
