De novo assembly and analysis of RNA-seq data

Abstract

We describe Trans-ABySS, a de novo short-read transcriptome assembly and analysis pipeline that addresses variation in local read densities by assembling read substrings with varying stringencies and then merging the resulting contigs before analysis. Analyzing 7.4 gigabases of 50-base-pair paired-end Illumina reads from an adult mouse liver poly(A) RNA library, we identified known, new and alternative structures in expressed transcripts, and achieved high sensitivity and specificity relative to reference-based assembly methods.
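The multi-stringency idea in the abstract can be illustrated with a toy sketch. It is not the Trans-ABySS implementation: it assumes that "stringency" corresponds to the substring (k-mer) length used for each assembly, and it stands in for the merging step with a simple rule that drops any contig fully contained, on either strand, in a longer contig from any assembly. The function names, the k values and the sequences are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def kmers(read, k):
    """Return all length-k substrings of a read, in order."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def merge_contig_sets(assemblies):
    """Merge contigs from several assemblies (illustrative rule only).

    assemblies: dict mapping a k value to the list of contigs assembled
    with that k. Contigs are pooled, visited longest first, and kept only
    if they are not contained (either strand) in a contig already kept.
    """
    pooled = sorted(
        {c for contigs in assemblies.values() for c in contigs},
        key=lambda c: (-len(c), c),
    )
    kept = []
    for contig in pooled:
        rc = reverse_complement(contig)
        if not any(contig in longer or rc in longer for longer in kept):
            kept.append(contig)
    return kept


if __name__ == "__main__":
    # Hypothetical contigs from two assemblies run with different k values.
    assemblies = {
        26: ["ACGTTGCA", "GGGCCC"],
        50: ["TTACGTTGCAGG", "AAATTT"],
    }
    print(kmers("ACGTA", 3))
    print(merge_contig_sets(assemblies))
```

In this sketch a smaller k yields more, shorter substrings per read, which helps connect reads from weakly expressed transcripts, while a larger k is stricter and suits densely covered regions; pooling the contig sets is meant to retain what each setting recovers.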

Figure 1: Representation of transcripts and contigs across assemblies.
Figure 2: Performance comparisons between ABySS and reference-based transcriptome analysis tools.

Acknowledgements

Funding for this work was provided in part by Genome Canada, Genome British Columbia, Michael Smith Foundation for Health Research and the Canadian Institute of Health Research (CIHR), including the CIHR Bioinformatics Training Program for Health Research. We thank S. Morrissy and G. Taylor for insightful discussions, A. He for technical assistance, A. Fejes for assistance with coverage bias calculations, and A. Tuin and N. Watkins (DNA Software) for assistance with primer design.

Author information

Contributions

G.R. and J.S. wrote the paper. J.S., G.R. and K.M. reviewed predictions and recommended analysis methods. G.R. coordinated analysis and validation. B.K., A.-L.P. and A.T. constructed libraries under the supervision of YJ.Z. S.L. generated biological material and performed RT-PCR validation. R.A.M. supervised sequencing activities. Y.S.B., T.C., R. Corbett, R. Chiu, M.F., M.G., J.Q.Q., R.N., H.M.O., N.T., R.V., S.K.C. and R.S. developed analysis methods and code and performed analyses. R. Corbett and R. Chiu performed comparisons with reference-based methods. S.D.J. develops and maintains ABySS and generated the ABySS assemblies. A.R. contributed algorithms and code for ABySS. M.A.M., S.J.M.J. and P.A.H. directed research. S.J.M.J. suggested analysis methods. YJ.Z. and M.H. developed the WTSS protocol. J.S. supervised activities. P.A.H. supervised validation. I.B. developed ABySS and Trans-ABySS and directed bioinformatics work.

Corresponding author

Correspondence to Inanc Birol.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–21, Supplementary Tables 1–4, Supplementary Note (PDF 2262 kb)

About this article

Cite this article

Robertson, G., Schein, J., Chiu, R. et al. De novo assembly and analysis of RNA-seq data. Nat Methods 7, 909–912 (2010). https://doi.org/10.1038/nmeth.1517

  • DOI: https://doi.org/10.1038/nmeth.1517
