Brief Communication

Salmon provides fast and bias-aware quantification of transcript expression

Nature Methods volume 14, pages 417–419 (2017)

Abstract

We introduce Salmon, a lightweight method for quantifying transcript abundance from RNA-seq reads. Salmon combines a new dual-phase parallel inference algorithm and feature-rich bias models with an ultra-fast read mapping procedure. It is the first transcriptome-wide quantifier to correct for fragment GC-content bias, which, as we demonstrate here, substantially improves the accuracy of abundance estimates and the sensitivity of subsequent differential expression analysis.

Acknowledgements

We wish to thank those who have been using and providing feedback on Salmon since early in its (open) development cycle. The software has been greatly improved in many ways based on their feedback. This research is funded in part by the Gordon and Betty Moore Foundation's Data-Driven Discovery Initiative through Grant GBMF4554 to C.K. It is partially funded by the US National Science Foundation (CCF-1256087, CCF-1319998, BBSRC-NSF/BIO-1564917) and the US National Institutes of Health (R21HG006913, R01HG007104). C.K. received support as an Alfred P. Sloan Research Fellow. This work was partially completed while G.D. was a postdoctoral fellow in the Computational Biology Department at Carnegie Mellon University. M.I.L. was supported by NIH grant 5T32CA009337-35. R.A.I. was supported by NIH R01 grant HG005220.

Author information

Affiliations

  1. Department of Computer Science, Stony Brook University, Stony Brook, New York, USA.

    • Rob Patro
  2. DNAnexus, Mountain View, California, USA.

    • Geet Duggal
  3. Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Cambridge, Massachusetts, USA.

    • Michael I Love
    • Rafael A Irizarry
  4. Department of Biostatistics, Harvard T.H. Chan School of Public Health, Cambridge, Massachusetts, USA.

    • Michael I Love
    • Rafael A Irizarry
  5. Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA.

    • Carl Kingsford

Authors

  1. Rob Patro

  2. Geet Duggal

  3. Michael I Love

  4. Rafael A Irizarry

  5. Carl Kingsford

Contributions

R.P. and C.K. designed the method, which was implemented by R.P. R.P., G.D., M.I.L., R.A.I., and C.K. designed the experiments, and R.P., G.D., and M.I.L. conducted the experiments. R.P., G.D., M.I.L., R.A.I., and C.K. wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding authors

Correspondence to Rob Patro or Carl Kingsford.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–7, Supplementary Tables 1–4, Supplementary Notes 1 and 2, and Supplementary Algorithms 1

About this article

DOI

https://doi.org/10.1038/nmeth.4197
