  • Brief Communication

The GEM mapper: fast, accurate and versatile alignment by filtration

Abstract

Because of ever-increasing throughput requirements of sequencing data, most existing short-read aligners have been designed to focus on speed at the expense of accuracy. The Genome Multitool (GEM) mapper can leverage string matching by filtration to search the alignment space more efficiently, simultaneously delivering precision (performing fully tunable exhaustive searches that return all existing matches, including gapped ones) and speed (being several times faster than comparable state-of-the-art tools).
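The general idea behind string matching by filtration can be illustrated with a minimal sketch. It is not the GEM algorithm itself, only the textbook pigeonhole variant: if a read may differ from the reference in at most k positions, then splitting the read into k + 1 pieces guarantees that at least one piece matches exactly, so exact piece hits can be used to select candidate locations that are then verified in full. The function names `find_all` and `filtration_search` are hypothetical, and the sketch handles mismatches only (no gaps) and uses a plain substring scan where a real mapper would use an index.

```python
def find_all(text, piece):
    """Yield every start position of piece in text (overlaps allowed)."""
    pos = text.find(piece)
    while pos != -1:
        yield pos
        pos = text.find(piece, pos + 1)


def filtration_search(reference, read, k):
    """Return sorted start positions where read matches reference
    with at most k mismatches (Hamming distance, no gaps).

    Illustrative pigeonhole filtration, not the GEM implementation.
    """
    n_pieces = k + 1
    if len(read) < n_pieces:
        raise ValueError("read too short to split into k + 1 non-empty pieces")

    # Split the read into k + 1 contiguous, non-empty pieces.
    bounds = [i * len(read) // n_pieces for i in range(n_pieces + 1)]

    # Filtration step: every exact piece hit proposes one candidate start.
    candidates = set()
    for lo, hi in zip(bounds, bounds[1:]):
        for pos in find_all(reference, read[lo:hi]):
            start = pos - lo
            if 0 <= start <= len(reference) - len(read):
                candidates.add(start)

    # Verification step: keep candidates with at most k mismatches.
    hits = []
    for start in sorted(candidates):
        window = reference[start:start + len(read)]
        mismatches = sum(a != b for a, b in zip(read, window))
        if mismatches <= k:
            hits.append(start)
    return hits
```

For example, `filtration_search("GATTACAGATTTCA", "GATTACA", 1)` returns `[0, 7]`: an exact match at position 0 and a one-mismatch match at position 7. Because every location within k mismatches must contain an exact piece hit, the search is exhaustive for the chosen k while verifying only a small set of candidates.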

Figure 1: Salient points of our algorithmic approach.
Figure 2: Benchmarking the GEM mapper on real Illumina GA IIx and Roche 454 sequencing data.
Figure 3: Several accuracy benchmarks for the GEM mapper on simulated Illumina GA IIx and Roche 454 sequencing data.

Acknowledgements

This work was supported by grant CSD2007-00050 from the Ministerio de Educación y Ciencia (Spain) and grant 1R01MH090941-01 from the US National Institutes of Health/National Human Genome Research Institute. Additional funding was provided by the European Union 7th Framework integrating project Revolutionary Approaches and Devices for Nucleic Acid Analysis (READNA, funded under grant agreement Health-F4-2008-201418) and the European Union 7th Framework project European Sequencing and Genotyping Infrastructure (ESGI, funded under grant agreement 262055). We thank S. Heath for his thorough revision of the original manuscript. S.M.-S. thanks J. Campos-Laclaustra for his advice. On behalf of the GEM project, P.R. also thanks T. Alioto, J. Camps-Puchades, T. Derrien, S. Djebali, P. Ferreira, I. Gut, S. Heath, D. Gonzalez-Knowles, R. Kofler, V. Lacroix, J. Lagarde and A. Merkel for their continued support and insights.

Author information

Contributions

S.M.-S. designed and implemented algorithms and contributed material to the manuscript. M.S. contributed through fruitful discussions. R.G. initiated the project and contributed through fruitful discussions. P.R. designed and implemented algorithms, was the main architect of the GEM project and wrote the manuscript. All the authors read and approved the manuscript.

Corresponding author

Correspondence to Paolo Ribeca.

Ethics declarations

Competing interests

Our institutions have decided on a dual-licensing scheme for the GEM tools: they will be free for academic noncommercial use, but a fee will be required for commercial use.

Supplementary information

Supplementary Text and Figures

Supplementary Figure 1, Supplementary Tables 1–6, Supplementary Discussion, Supplementary Protocol and Supplementary Data. (PDF 813 kb)

About this article

Cite this article

Marco-Sola, S., Sammeth, M., Guigó, R. et al. The GEM mapper: fast, accurate and versatile alignment by filtration. Nat Methods 9, 1185–1188 (2012). https://doi.org/10.1038/nmeth.2221

  • DOI: https://doi.org/10.1038/nmeth.2221
