Quantification of the yeast transcriptome by single-molecule sequencing


We present single-molecule sequencing digital gene expression (smsDGE), a high-throughput, amplification-free method for accurate quantification of the full range of cellular polyadenylated RNA transcripts using a Helicos Genetic Analysis system. smsDGE involves a reverse-transcription and polyA-tailing sample preparation procedure followed by sequencing that generates a single read per transcript. We applied smsDGE to the transcriptome of Saccharomyces cerevisiae strain DBY746, using 6 of the available 50 channels in a single sequencing run, yielding on average 12 million aligned reads per channel. Using spiked-in RNA, accurate quantitative measurements were obtained over four orders of magnitude. High correlation was demonstrated across independent flow-cell channels, instrument runs and sample preparations. Transcript counting in smsDGE is highly efficient due to the representation of each transcript molecule by a single read. This efficiency, coupled with the high throughput enabled by the single-molecule sequencing platform, provides an alternative method for expression profiling.
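Because each transcript molecule is represented by a single read, relative abundance can in principle be estimated directly from read counts without a transcript-length correction. The following is a minimal sketch of that idea, not the authors' analysis pipeline: the function names, the per-million scaling, the pseudocount and the toy counts (GENE_A, GENE_B, GENE_C) are all illustrative assumptions.

```python
import math


def transcripts_per_million(counts):
    """Scale raw per-transcript read counts so that they sum to one million.

    Assumes one read per transcript molecule, so no length normalization
    is applied (an assumption of this sketch).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no aligned reads")
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log_pearson(a, b, pseudocount=1.0):
    """Pearson correlation of log10(count + pseudocount) between two channels.

    Genes missing from one channel are treated as having zero reads there.
    """
    genes = sorted(set(a) | set(b))
    x = [math.log10(a.get(g, 0) + pseudocount) for g in genes]
    y = [math.log10(b.get(g, 0) + pseudocount) for g in genes]
    n = len(genes)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical read counts for three placeholder genes in two channels.
channel_1 = {"GENE_A": 12000, "GENE_B": 150, "GENE_C": 3}
channel_2 = {"GENE_A": 11800, "GENE_B": 160, "GENE_C": 2}

tpm_1 = transcripts_per_million(channel_1)
r = log_pearson(channel_1, channel_2)
```

Comparing channels on a log scale keeps the few highly expressed transcripts from dominating the correlation, which matters when counts span several orders of magnitude.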

Figure 1: Sample preparation, sequencing and analysis workflow.
Figure 2: Data description.
Figure 3: Reproducibility and counting accuracy.
Figure 4: Count reproducibility.
Figure 5: TSS mapping.
Figure 6: Sequence information.


We thank all of the past and present colleagues at Helicos who have contributed to this work.

Author information

Correspondence to Tal Raz.

Ethics declarations

Competing interests

All of the authors are or have been employees of Helicos Biosciences.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–3, Supplementary Tables 4 and 6, and Supplementary Methods (PDF 369 kb)

Supplementary Table 1

Transcript counts (XLS 1784 kb)

Supplementary Table 2

qPCR measurements (XLS 25 kb)

Supplementary Table 3

Detected sequence variants (XLS 498 kb)

Supplementary Table 5

Coverage peaks in yeast genome (XLS 106 kb)
