Differential analysis of RNA-seq incorporating quantification uncertainty

This article has been updated

Abstract

We describe sleuth (http://pachterlab.github.io/sleuth), a method for the differential analysis of gene expression data that utilizes bootstrapping in conjunction with response error linear modeling to decouple biological variance from inferential variance. sleuth is implemented in an interactive shiny app that utilizes kallisto quantifications and bootstraps for fast and accurate analysis of data from RNA-seq experiments.
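The idea of separating biological variance from inferential variance can be illustrated with a small sketch. The code below is not the sleuth implementation; it is a minimal illustration under stated assumptions: abundances are log-transformed after adding a pseudocount, inferential variance is taken as the average per-sample variance across bootstraps, total variance is the residual variance of an ordinary least-squares fit, and biological variance is the non-negative remainder. The function name `decompose_variance`, its parameters, and the pseudocount default are hypothetical.

```python
import numpy as np


def decompose_variance(point_counts, boot_counts, design, pseudocount=0.5):
    """Split residual variance for one transcript into two parts (illustrative only).

    point_counts : (n_samples,) estimated counts, one per sample
    boot_counts  : (n_samples, n_bootstraps) bootstrap re-estimates per sample
    design       : (n_samples, n_coefficients) design matrix
    pseudocount  : hypothetical offset added before taking logs
    """
    point_counts = np.asarray(point_counts, dtype=float)
    boot_counts = np.asarray(boot_counts, dtype=float)
    design = np.asarray(design, dtype=float)

    # Work on the log scale (assumption: log(count + pseudocount)).
    y = np.log(point_counts + pseudocount)
    log_boot = np.log(boot_counts + pseudocount)

    # Inferential variance: variance across bootstraps within each sample,
    # averaged over samples.
    inferential_var = float(log_boot.var(axis=1, ddof=1).mean())

    # Total variance: residual variance of an ordinary least-squares fit.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = design.shape[0] - np.linalg.matrix_rank(design)
    if dof <= 0:
        raise ValueError("need more samples than model coefficients")
    total_var = float(resid @ resid / dof)

    # Biological variance: what remains after removing the inferential part,
    # clamped at zero because a variance cannot be negative.
    biological_var = max(total_var - inferential_var, 0.0)
    return beta, total_var, inferential_var, biological_var
```

In this sketch a transcript whose bootstraps disagree strongly within each sample gets a large inferential variance, so little or none of its residual variance is attributed to biology. The sketch omits everything else a full method would need, such as normalization, filtering, and hypothesis testing.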

Figure 1: Overview of sleuth.
Figure 2: Sensitivity and false discovery rates of differential expression methods.
Figure 3: Self-consistency of differential expression methods when using less data.

Accession codes

Primary accessions

European Nucleotide Archive

Gene Expression Omnibus

Sequence Read Archive

Change history

  • 23 August 2017

    In the version of this article initially published, the final term in equation (2) in the Online Methods was incorrectly specified as x_{ti}. The correct term is ζ_{ti}. Also, the two callouts to Supplementary Note 3 in the Online Methods section were incorrect and should have referred to Supplementary Note 2. These errors have been corrected in the HTML and PDF versions of the article.

Acknowledgements

H.P. and L.P. were partially supported by NIH grant nos. R01 DK094699 and R01 HG006129. We thank D. Li, A. Tseng, and P. Sturmfels for help with implementing some of the interactive features in sleuth.

Author information

Contributions

H.P. led the development of the sleuth statistical model and was assisted by S.P., N.L.B., P.M., and L.P. The method comparison and testing framework was designed by H.P., N.L.B., P.M., and L.P. The interactive sleuth live software was designed and implemented by H.P., as was the sleuth R package. H.P. automated production of the results. H.P., N.L.B., P.M., and L.P. analyzed results and wrote the paper.

Corresponding author

Correspondence to Lior Pachter.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Integrated supplementary information

Supplementary Figure 1 Sensitivity versus FDR in the "effect from experiment" simulation at the transcript level.

Zoomed-out view of performance on the "effect from experiment" simulation at the isoform level. The gray box in the bottom left-hand corner marks the zoomed-in region shown in Figure 2. See the Figure 2 caption for more information.

Supplementary Figure 2 Sensitivity versus FDR in the "effect from experiment" simulation at the gene level.

Zoomed-out view of performance on the "effect from experiment" simulation at the gene level. The gray box in the bottom left-hand corner marks the zoomed-in region shown in Figure 2. See the Figure 2 caption for more information.

Supplementary Figure 3 Null resampling experiment at the isoform level.

Each dot represents the number of false positives reported by a method in a single shuffling, and the box plot summarizes the distribution across shufflings. Each box plot has hinges at the 25th and 75th percentiles, a line at the median, and whiskers extending to the smallest/largest value no less/more than 1.5*IQR from the median.

Supplementary Figure 4 Null resampling experiment at the gene level.

Each dot represents the number of false positives reported by a method in a single shuffling, and the box plot summarizes the distribution across shufflings. Each box plot has hinges at the 25th and 75th percentiles, a line at the median, and whiskers extending to the smallest/largest value no less/more than 1.5*IQR from the median.

Supplementary Figure 5 Sensitivity versus FDR in the "effect from experiment" simulation at the transcript level including alternative variance estimators.

Zoomed out version of performance on effect from reference simulation at the isoform level introducing additional variance estimators for sleuth. The gray box in the bottom left-hand corner represents the zoomed in region in Supplementary Figure 6.

Supplementary Figure 6 Sensitivity versus FDR in the "effect from experiment" simulation at the transcript level including alternative variance estimators.

Zoomed in version of performance on effect from reference simulation at the isoform level introducing additional variance estimators for sleuth.

Supplementary Figure 7 Sensitivity versus FDR in the "effect from experiment" simulation at the gene level including tximport.

Zoomed out version of performance on effect from reference simulation at the gene level substituting tximport for featureCounts. The gray box in the bottom left-hand corner represents the zoomed in region in Supplementary Figure 8.

Supplementary Figure 8 Sensitivity versus FDR in the "effect from experiment" simulation at the gene level including tximport.

Zoomed in version of performance on effect from reference simulation at the gene level substituting tximport for featureCounts.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–8 and Supplementary Notes 1–3

Supplementary Software

Sleuth software used in the article along with the analysis to reproduce figures.

About this article

Cite this article

Pimentel, H., Bray, N., Puente, S. et al. Differential analysis of RNA-seq incorporating quantification uncertainty. Nat Methods 14, 687–690 (2017). https://doi.org/10.1038/nmeth.4324


  • DOI: https://doi.org/10.1038/nmeth.4324
