Brief Communication

Streaming fragment assignment for real-time analysis of sequencing experiments

Nature Methods volume 10, pages 71–73 (2013)

This article has been updated

Abstract

We present eXpress, a software package for efficient probabilistic assignment of ambiguously mapping sequenced fragments. eXpress uses a streaming algorithm with linear run time and constant memory use. It can determine abundances of sequenced molecules in real time and can be applied to ChIP-seq, metagenomics and other large-scale sequencing data. We demonstrate its use on RNA-seq data and show that eXpress achieves greater efficiency than other quantification methods.
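The streaming idea in the abstract can be illustrated with a small sketch. It is not the eXpress implementation: it assumes a simplified online-EM-style update in which each fragment is seen once, split among its compatible targets in proportion to the current abundance estimates, and then discarded. The function name `stream_assign`, the pseudo-count prior and the equal weighting of all fragments are assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Tuple

# One fragment = list of (target name, alignment likelihood) pairs.
Fragment = List[Tuple[str, float]]


def stream_assign(fragments: Iterable[Fragment],
                  targets: List[str],
                  prior: float = 1.0) -> Dict[str, float]:
    """Single-pass probabilistic assignment (illustrative sketch only).

    Memory is proportional to the number of targets, not the number of
    fragments, and each fragment is processed exactly once.
    """
    # Hypothetical pseudo-count prior so every target starts non-zero.
    counts = {t: prior for t in targets}
    total = prior * len(targets)

    for alignments in fragments:
        # Posterior weight of each compatible target under current estimates.
        weights = [counts[t] / total * lik for t, lik in alignments]
        norm = sum(weights)
        if norm == 0.0:
            continue  # fragment is incompatible with every target; skip it
        # Fractionally assign the fragment, then forget it.
        for (t, _), w in zip(alignments, weights):
            counts[t] += w / norm
        total += 1.0

    # Relative abundance estimates, summing to 1.
    return {t: c / total for t, c in counts.items()}


if __name__ == "__main__":
    # Three fragments map uniquely to A; one maps equally well to A and B.
    reads = [[("A", 1.0)]] * 3 + [[("A", 1.0), ("B", 1.0)]]
    print(stream_assign(iter(reads), ["A", "B"]))
```

In this toy input the ambiguous fragment is split mostly toward A, because the three unique fragments have already raised A's estimate, so A ends at about 0.8 and B at about 0.2. Because the function accepts any iterable, fragments can be fed from a generator as they arrive rather than loaded into memory.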

Change history

  • 04 December 2012

    In the HTML version of this article initially published online, errors in mathematical terms were present in the Online Methods section. The errors have been corrected in the HTML version.

Accessions

Sequence Read Archive

Acknowledgements

This work was supported by US National Institutes of Health grant R01HG006129. A.R. was supported in part by a National Science Foundation graduate research fellowship. We thank H. Pimentel for developing Map2GTF for converting genome mappings to transcriptome mappings and incorporating it into TopHat to help with our analysis.

Author information

Affiliations

  1. Department of Computer Science, University of California, Berkeley, Berkeley, California, USA.

    • Adam Roberts
    • Lior Pachter
  2. Department of Mathematics, University of California, Berkeley, Berkeley, California, USA.

    • Lior Pachter
  3. Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California, USA.

    • Lior Pachter

Contributions

A.R. and L.P. developed the mathematics and statistics and designed the algorithms. A.R. implemented the method in eXpress. A.R. and L.P. tested the software and performed the analysis. A.R. and L.P. wrote the manuscript.

Competing interests

The authors declare no competing financial interests.

Corresponding author

Correspondence to Lior Pachter.

Supplementary information

PDF files

  1. Supplementary Text and Figures

    Supplementary Figures 1–11 and Supplementary Tables 1 and 2

Zip files

  1. Supplementary Software

    eXpress source code and compiled binary files.

About this article

DOI

https://doi.org/10.1038/nmeth.2251
