Brief Communication

Near-optimal probabilistic RNA-seq quantification

Nature Biotechnology volume 34, pages 525–527 (2016)

  • An Erratum to this article was published on 09 August 2016

This article has been updated

Abstract

We present kallisto, an RNA-seq quantification program that is two orders of magnitude faster than previous approaches and achieves similar accuracy. Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases. We use kallisto to analyze 30 million unaligned paired-end RNA-seq reads in <10 min on a standard laptop computer. This removes a major computational bottleneck in RNA-seq analysis.
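The idea of pseudoalignment described above can be illustrated with a minimal sketch. This is not the kallisto implementation: it assumes a plain k-mer hash index and a toy transcriptome, and the names `build_kmer_index`, `pseudoalign`, and the sequences are hypothetical. It only shows how a set of compatible transcripts can be found for a read by intersecting k-mer hits, without aligning individual bases.

```python
from collections import defaultdict


def build_kmer_index(transcripts, k):
    """Map each k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the transcripts compatible with every k-mer of the read.

    Simplifying assumption for this sketch: a k-mer absent from the
    index makes the read incompatible with all transcripts.
    """
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if not hits:
            return set()
        # Intersect the running compatibility set with this k-mer's hits.
        compatible = set(hits) if compatible is None else compatible & hits
        if not compatible:
            return set()
    # A read shorter than k has no k-mers and is left unassigned.
    return compatible or set()


# Hypothetical toy transcriptome: t1 and t2 share a prefix, t3 is unrelated.
transcripts = {
    "t1": "ACGTACGGTCA",
    "t2": "ACGTACGGAAT",
    "t3": "TTTTGGGGCCC",
}
index = build_kmer_index(transcripts, k=5)

shared = pseudoalign("ACGTACGG", index, k=5)  # read from the shared prefix
unique = pseudoalign("ACGGTCA", index, k=5)   # read spanning the t1-specific end
```

Here `shared` contains both `t1` and `t2`, while `unique` contains only `t1`. The output is a compatibility set per read rather than base-level coordinates, which is the information a quantification step needs.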

Change history

  • 27 July 2016

    In the version of this article initially published, in the HTML version only, the equation “αₜN > 0.01” was written as “αtN > 0.01”. In addition, in the Figure 1 legend, the formatting of the nodes was incorrect (v_1, etc., rather than v₁). The errors have been corrected in the HTML and PDF versions of the article.

Acknowledgements

N.L.B., H.P. and L.P. were partially funded by NIH R01 HG006129. P.M. was partially funded by a Fulbright fellowship.

Author information

Affiliations

  1. Innovative Genomics Initiative, University of California, Berkeley, California, USA.

    • Nicolas L Bray
  2. Department of Computer Science, University of California, Berkeley, California, USA.

    • Harold Pimentel
    • Lior Pachter
  3. Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjavik, Iceland.

    • Páll Melsted
  4. Department of Mathematics, University of California, Berkeley, California, USA.

    • Lior Pachter
  5. Department of Molecular & Cell Biology, University of California, Berkeley, California, USA.

    • Lior Pachter

Contributions

N.L.B. and L.P. developed the concept of pseudoalignment and conceived the idea of applying it to RNA-seq quantification. P.M. conceived the implementation using de Bruijn graphs. N.L.B., H.P., P.M. and L.P. designed the kallisto software, and N.L.B. implemented a prototype. H.P. and P.M. wrote the current kallisto implementation. N.L.B. and H.P. automated production of the results. N.L.B., H.P., P.M. and L.P. analyzed results and wrote the paper.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Lior Pachter.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–11

Excel files

  1. Supplementary Table 1a

    Performance of quantification as measured by SEQC qPCR

  2. Supplementary Table 1b

    Gene-level performance of quantification as measured by SEQC

  3. Supplementary Table 2

    Performance of kallisto with and without bias

Zip files

  1. Supplementary Software

  2. Supplementary Code

About this article

DOI

https://doi.org/10.1038/nbt.3519
