Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms


We introduce Sailfish, a computational method for quantifying the abundance of previously annotated RNA isoforms from RNA-seq data. Because Sailfish entirely avoids mapping reads, a time-consuming step in all current methods, it provides quantification estimates typically 20 times faster than existing approaches, without loss of accuracy. By facilitating frequent reanalysis of data and reducing the need to optimize parameters, Sailfish exemplifies the potential of lightweight algorithms for efficiently processing sequencing reads.

Figure 1: Overview of the Sailfish pipeline.
Figure 2: Speed and accuracy of Sailfish.

This work was partially funded by the US National Science Foundation (CCF-1256087, CCF-1053918 and EF-0849899) and the US National Institutes of Health (R21AI085376, R21HG006913 and R01HG007104). C.K. received support as an Alfred P. Sloan Research Fellow. We thank A. Roberts for helping to diagnose and resolve an artifact in an earlier version of this manuscript pertaining to the synthetic data generated by the Flux Simulator.

R.P., S.M.M. and C.K. designed the method and algorithms, devised the experiments, and wrote the manuscript. R.P. implemented the Sailfish software.

Correspondence to Carl Kingsford.

The authors declare no competing financial interests.

Supplementary Text and Figures

Supplementary Figures 1–7, Supplementary Table 1 and Supplementary Notes 1–3 (PDF 1337 kb)

Version 0.6.3 of the Sailfish source code (ZIP 666 kb)

Patro, R., Mount, S. & Kingsford, C. Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. Nat Biotechnol 32, 462–464 (2014).

