Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms

Patro, Rob; Mount, Stephen M; Kingsford, Carl

doi:10.1038/nbt.2862

Brief Communication
Published: 20 April 2014

Nature Biotechnology volume 32, pages 462–464 (2014)

Abstract

We introduce Sailfish, a computational method for quantifying the abundance of previously annotated RNA isoforms from RNA-seq data. Because Sailfish entirely avoids mapping reads, a time-consuming step in all current methods, it provides quantification estimates much faster than do existing approaches (typically 20 times faster) without loss of accuracy. By facilitating frequent reanalysis of data and reducing the need to optimize parameters, Sailfish exemplifies the potential of lightweight algorithms for efficiently processing sequencing reads.

**Figure 1: Overview of the Sailfish pipeline.**

**Figure 2: Speed and accuracy of Sailfish.**

Accession codes

Accessions

Sequence Read Archive

SRX016366

Acknowledgements

This work has been partially funded by the US National Science Foundation (CCF-1256087, CCF-1053918, and EF-0849899) and US National Institutes of Health (R21AI085376, R21HG006913 and R01HG007104). C.K. received support as an Alfred P. Sloan Research Fellow. We would like to thank A. Roberts for helping to diagnose and resolve an artifact in an earlier version of this manuscript pertaining to the synthetic data generated by the Flux Simulator.

Author information

Authors and Affiliations

Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
Rob Patro & Carl Kingsford
Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
Stephen M Mount
Center for Bioinformatics and Computational Biology, University of Maryland, College Park, Maryland, USA
Stephen M Mount

Contributions

R.P., S.M.M. and C.K. designed the method and algorithms, devised the experiments, and wrote the manuscript. R.P. implemented the Sailfish software.

Corresponding author

Correspondence to Carl Kingsford.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–7, Supplementary Table 1 and Supplementary Notes 1–3 (PDF 1337 kb)

Supplementary Data

Version 0.6.3 of the Sailfish source code (ZIP 666 kb)

Source data

Source data to Fig. 1

About this article

Cite this article

Patro, R., Mount, S. & Kingsford, C. Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nat Biotechnol 32, 462–464 (2014). https://doi.org/10.1038/nbt.2862

Received: 17 August 2013
Accepted: 04 March 2014
Published: 20 April 2014
Issue Date: May 2014
DOI: https://doi.org/10.1038/nbt.2862
