
Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study

An Erratum to this article was published on 07 November 2014

This article has been updated

Abstract

High-throughput RNA sequencing (RNA-seq) greatly expands the potential for genomics discoveries, but the wide variety of platforms, protocols and performance capabilities has created the need for comprehensive reference data. Here we describe the Association of Biomolecular Resource Facilities next-generation sequencing (ABRF-NGS) study on RNA-seq. We carried out replicate experiments across 15 laboratory sites using reference RNA standards to test four protocols (poly-A–selected, ribo-depleted, size-selected and degraded) on five sequencing platforms (Illumina HiSeq, Life Technologies PGM and Proton, Pacific Biosciences RS and Roche 454). The results show high intra-platform (Spearman rank R > 0.86) and inter-platform (R > 0.83) concordance for expression measures across the deep-count platforms, but highly variable efficiency and cost for splice junction and variant detection between all platforms. For intact RNA, gene expression profiles from rRNA-depletion and poly-A enrichment are similar. In addition, rRNA depletion enables effective analysis of degraded RNA samples. This study provides a broad foundation for cross-platform standardization, evaluation and improvement of RNA-seq.
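The concordance figures quoted above are Spearman rank correlations. As a minimal illustrative sketch of how such a value is computed for two expression profiles, and not the study's own analysis code, the Python below ranks each profile (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and the example counts are hypothetical and are not taken from the article.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of genes")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene expression values from two sites (illustration only).
site_a = [10, 200, 35, 0, 870]
site_b = [12, 180, 40, 1, 900]
rho = spearman(site_a, site_b)
print(f"Spearman rank R = {rho:.2f}")
```

Because only the ordering of genes matters, the measure is insensitive to monotonic differences in scale between platforms, which is one reason rank correlation is a common choice for cross-platform comparisons.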

Figure 1: Experimental design and sequencing platforms.
Figure 2: Transcript coverage across all genes detected.
Figure 3: Intra- and inter-platform variation of RNA-seq transcript metrics.
Figure 4: Inter-platform consistency of splicing and differential expression analysis.
Figure 5: Differentially expressed genes in ribo-depleted and poly-A–enriched libraries.

Accession codes

Primary accessions

Gene Expression Omnibus

Change history

  • 10 October 2014

    In the version of this article initially published, author Jeffrey Rosenfeld's middle initial “A” was omitted. The error has been corrected in the HTML and PDF versions of the article.

Acknowledgements

We greatly appreciate the contribution and distribution of reference sample RNA from L. Shi (FDA) and his valuable interactions to assist in the planning of this study. This work was supported with funding from the National Institutes of Health (NIH), including R01HG006798, R01NS076465, R24RR032341, as well as funds from the Irma T. Hirschl and Monique Weill-Caulier Charitable Trusts and the STARR Consortium (I7-A765).

We thank the following contributors for their technical wisdom, including laboratory expertise, data analysis and bioinformatics contributions, and technical design guidance and consultation. Without their help, this study would not have been possible: D. Stopka (Memorial Sloan-Kettering Cancer Institute), G. Grove (Penn State Univ.), D. Hannon (Penn State Univ.), K. Jones (NIH/NCI/SAIC), C. Raley (NIH/NCI/SAIC), H. O'Geen (UC Davis), D. Zheng (Univ. Illinois-Urbana), O. Nguyen (UC Davis), Z.-W. Lu (UC Davis), J. Spisak (Cornell Univ.), D. Lin (NIH/NIAID), J. Pillardy (Cornell Univ.), P.-Y. Wu (Georgia Institute of Technology), J. Phan (Emory Univ.), D. Oschwald (New York Genome Center), H. Arnold (PerkinElmer), S. Tyndale (Univ. Southern California), H. Truong (Univ. Southern California), Y. Zhang (Univ. Florida), N. Panayotova (Univ. Florida), D. Moraga (Univ. Florida), S. Shanker (Univ. Florida), and N. Barker (US Army Environmental Quality Research Program).

We would also like to thank the platform vendors, Illumina, Life Technologies, Pacific Biosciences and Roche Life Sciences, for their support of this study, and their distinguished scientists for providing technical expertise and assistance in study designs, protocols, new methods development and significant contributions of reagents and sequencing kits. In particular, alphabetically by vendor: G. Schroth (Illumina); M. Gallad, J. Smith, T. Bittick, R. Setterquist and G. Scott (Life Technologies); J. Korlach, S. Turner and E. Tseng (Pacific Biosciences); and K. Fredrickson and C. Teiling (Roche Life Sciences).

We are sincerely appreciative of the Association of Biomolecular Resource Facilities (ABRF) for supporting this study and the contributing ABRF Research Groups. Special thanks to our ABRF executive board liaison A. Perera (Stowers Institute for Medical Research).

Author information

Contributions

All authors are members of the Association of Biomolecular Resource Facilities Next-Generation Sequencing (ABRF-NGS) Consortium. S.W.T., C.M.N., D.A.B., G.S.G. and C.E.M. managed the project. S.W.T., C.M.N., D.G., S.L., W.F., A.V., C.W., P.A.S., Y.G., D.K., J.B., B.H., R.K., N.J., N.R., J.G., N.G.-R., C.H., D.R., J.R., T.S., J.G.U., C.E.M. and P.Z. performed sequencing. S.L., S.W.T., C.M.N., D.A.B., G.S.G. and C.E.M. designed the analyses. S.L., P.A.S., J.G.U., P.Z., C.E.M. and D.K. performed the data analyses. S.L., P.Z., M.W., D.K., J.G.U. and C.E.M. made the figures. S.L., S.W.T., C.M.N., D.A.B., G.S.G. and C.E.M. wrote and revised the manuscript. The ABRF-NGS Consortium members contributed to the design and execution of the study.

Corresponding authors

Correspondence to Don A Baldwin, George S Grills or Christopher E Mason.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–39 and Supplementary Tables 1–8 (PDF 10745 kb)

Supplementary Software (ZIP 168 kb)

About this article

Cite this article

Li, S., Tighe, S., Nicolet, C. et al. Multi-platform assessment of transcriptome profiling using RNA-seq in the ABRF next-generation sequencing study. Nat Biotechnol 32, 915–925 (2014). https://doi.org/10.1038/nbt.2972

  • DOI: https://doi.org/10.1038/nbt.2972
