MUEGANO: A divide and conquer algorithm to overcome memory limitations when assembling shotgun projects

Martinez, Octavio; Fernandez-Cortes, Araceli

doi:10.1038/npre.2009.3712.1

Manuscript
Open access
Published: 02 September 2009

Octavio Martinez¹ &
Araceli Fernandez-Cortes¹

Nature Precedings (2009)

Abstract

When assembling a large number of reads in a genomic shotgun project, a serious limitation is the amount of random access memory (RAM) available on the computers used. The limitation arises because all assembly programs must examine all the overlaps between reads at the same time, holding them in RAM, in order to construct contigs; memory can fill up during this step and cause the assembly process to abort. Here we propose an algorithm that can overcome any memory limitation through redundant processing, at the cost of increased computing time. The proposed algorithm divides the reads into groups whose size is half the maximum capacity of the memory of the computer used, and performs an assembly for every possible pair of groups. After redundancy is eliminated from the set of contigs obtained in the previous step, the process is iterated until the set of contigs is small enough to be handled by the assembler in a final step. Each step of the procedure increases the computing time from k to approximately k + k(k-1)/2, but in many practical cases only one step is needed to finish the assembly. The procedure is suitable for any kind of assembler and was successfully applied to the assembly of a very large set of reads from the maize genome.
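The procedure described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `muegano_sketch`, `toy_assemble`, `overlap` and `remove_redundancy` are hypothetical, memory capacity is approximated by a simple count of sequences (`max_in_memory`), redundancy removal is assumed to mean dropping duplicates and contigs contained in longer ones, and the toy greedy assembler only stands in for a real assembly program.

```python
from itertools import combinations, permutations
from typing import Callable, List, Sequence

Assembler = Callable[[List[str]], List[str]]


def remove_redundancy(contigs: Sequence[str]) -> List[str]:
    """Assumed rule: drop duplicates and contigs contained in a longer kept contig."""
    kept: List[str] = []
    for c in sorted(set(contigs), key=lambda s: (-len(s), s)):
        if not any(c in k for k in kept):
            kept.append(c)
    return kept


def overlap(a: str, b: str, min_overlap: int) -> int:
    """Length of the longest proper suffix of a that is a prefix of b, or 0."""
    for n in range(min(len(a), len(b)) - 1, min_overlap - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def toy_assemble(seqs: List[str], min_overlap: int = 3) -> List[str]:
    """Hypothetical stand-in for a real assembler: greedy exact-overlap merging."""
    contigs = remove_redundancy(seqs)
    merged = True
    while merged:
        merged = False
        for a, b in permutations(contigs, 2):
            n = overlap(a, b, min_overlap)
            if n:
                rest = [c for c in contigs if c not in (a, b)]
                contigs = remove_redundancy(rest + [a + b[n:]])
                merged = True
                break
    return contigs


def muegano_sketch(reads: Sequence[str], assemble: Assembler,
                   max_in_memory: int) -> List[str]:
    """Pairwise divide-and-conquer assembly as outlined in the abstract."""
    if max_in_memory < 2:
        raise ValueError("max_in_memory must be at least 2")
    group_size = max_in_memory // 2  # groups are half the memory capacity
    seqs = list(reads)
    while len(seqs) > max_in_memory:
        groups = [seqs[i:i + group_size]
                  for i in range(0, len(seqs), group_size)]
        contigs: List[str] = []
        # Assemble every pair of groups; each run holds at most
        # 2 * group_size <= max_in_memory sequences.
        for g1, g2 in combinations(groups, 2):
            contigs.extend(assemble(g1 + g2))
        reduced = remove_redundancy(contigs)
        if len(reduced) >= len(seqs):
            raise RuntimeError("no reduction achieved; cannot fit in memory")
        seqs = reduced  # iterate on the reduced contig set
    return assemble(seqs)  # final step: everything fits at once


if __name__ == "__main__":
    genome = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # toy sequence with no repeats
    reads = [genome[i:i + 8] for i in range(0, 19, 3)]
    print(muegano_sketch(reads, toy_assemble, max_in_memory=4))
```

In this toy run the seven reads are split into four groups, which gives six pairwise assemblies followed by one final assembly. The count of six matches the k(k-1)/2 term if k is read as the number of groups, which is an interpretation, since the abstract does not define k.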

Author information

Authors and Affiliations

Laboratorio Nacional de Genómica para la Biodiversidad (Langebio), CINVESTAV Irapuato, México
Octavio Martinez & Araceli Fernandez-Cortes

Authors

Octavio Martinez
Araceli Fernandez-Cortes

Corresponding author

Correspondence to Octavio Martinez.

Rights and permissions

Creative Commons Attribution 3.0 License.

About this article

Cite this article

Martinez, O., Fernandez-Cortes, A. MUEGANO: A divide and conquer algorithm to overcome memory limitations when assembling shotgun projects. Nat Prec (2009). https://doi.org/10.1038/npre.2009.3712.1

Received: 02 September 2009
Accepted: 02 September 2009
Published: 02 September 2009
DOI: https://doi.org/10.1038/npre.2009.3712.1

Keywords

shotgun assembly maize genome algorithm
