MUEGANO: A divide and conquer algorithm to overcome memory limitations when assembling shotgun projects

Abstract

When assembling a large number of reads in a genomic shotgun project, a serious limitation is the amount of random access memory (RAM) available on the computers used. The limitation arises because assembly programs must hold all the overlaps between reads in RAM at the same time in order to construct contigs; memory can fill up during this step, causing the assembly to abort. Here we propose an algorithm that overcomes any memory limitation through redundant processing, at the cost of increased computing time. The algorithm divides the reads into groups, each half the size of the largest read set that fits in the memory of the computer used, and assembles every possible pair of groups. After redundancy is eliminated from the resulting set of contigs, the process is iterated until the contig set is small enough to be handled by the assembler in a final step. Each step of the procedure increases the computing time from k to approximately k + k(k-1)/2, but in many practical cases only one step is needed to finish the assembly. The procedure is suitable for any kind of assembler and was successfully applied to the assembly of a very large set of reads from the maize genome.
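The pairwise scheme described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `muegano_sketch`, the `assemble` and `remove_redundancy` callbacks, and the toy model in which reads are `(start, end)` intervals and "assembly" merges overlapping intervals are all assumptions made for illustration. Capacity is counted in number of items, whereas a real assembler would be limited by memory use.

```python
from itertools import combinations


def merge_intervals(intervals):
    """Toy stand-in for an assembler: merge overlapping (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def drop_contained(intervals):
    """Toy redundancy removal: drop duplicates and intervals inside another one."""
    ordered = sorted(set(intervals), key=lambda iv: (iv[0], -iv[1]))
    kept, max_end = [], None
    for start, end in ordered:
        if max_end is None or end > max_end:
            kept.append((start, end))
            max_end = end
    return kept


def muegano_sketch(reads, capacity, assemble, remove_redundancy, max_rounds=10):
    """Hypothetical sketch: never pass more than `capacity` items to `assemble`."""
    current = list(reads)
    half = max(1, capacity // 2)  # group size is half the assumed capacity
    for _ in range(max_rounds):
        if len(current) <= capacity:
            return assemble(current)  # final step: everything fits at once
        groups = [current[i:i + half] for i in range(0, len(current), half)]
        contigs = []
        for a, b in combinations(groups, 2):  # every pair of groups
            contigs.extend(assemble(a + b))
        reduced = remove_redundancy(contigs)
        if len(reduced) >= len(current):
            raise RuntimeError("no reduction achieved in this round")
        current = reduced
    raise RuntimeError("did not converge within max_rounds")


reads = [(0, 5), (4, 9), (8, 12), (11, 15), (20, 25), (24, 30), (3, 6), (28, 33)]
print(muegano_sketch(reads, 4, merge_intervals, drop_contained))
# prints [(0, 15), (20, 33)], the same result as merge_intervals(reads)
```

In this toy run the eight reads form four groups of two, so one round performs 4 × 3 / 2 = 6 pairwise assemblies, followed by a single final assembly of the two remaining contigs. The sketch stops with an error when a round yields no reduction; how the original method behaves in that case is not stated in the abstract.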

Author information

Corresponding author

Correspondence to Octavio Martinez.

Rights and permissions

Creative Commons Attribution 3.0 License.

About this article

Cite this article

Martinez, O., Fernandez-Cortes, A. MUEGANO: A divide and conquer algorithm to overcome memory limitations when assembling shotgun projects. Nat Prec (2009). https://doi.org/10.1038/npre.2009.3712.1

Keywords

  • shotgun assembly maize genome algorithm
