How to apply de Bruijn graphs to genome assembly

A mathematical concept known as a de Bruijn graph turns the formidable challenge of assembling a contiguous genome from billions of short sequencing reads into a tractable computational problem.
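As a rough illustration of that idea, the sketch below builds a de Bruijn graph from the k-mers found in a few short reads and then spells out a sequence by walking an Eulerian path through it. The function names (`kmers_from_reads`, `build_de_bruijn`, `eulerian_path`, `assemble`) and the toy reads are assumptions made for this example and are not taken from the article. It also assumes error-free reads, a linear sequence, and a graph that actually contains an Eulerian path.

```python
from collections import defaultdict


def kmers_from_reads(reads, k):
    """Collect the distinct k-mers that occur in a collection of reads."""
    return {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}


def build_de_bruijn(kmers):
    """Each k-mer becomes an edge from its (k-1)-mer prefix to its (k-1)-mer suffix."""
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Hierholzer's algorithm. Assumes the graph is non-empty and has an Eulerian path."""
    in_degree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            in_degree[node] += 1
    # Start at a node with one more outgoing than incoming edge, if there is one.
    start = next(
        (u for u in graph if len(graph[u]) - in_degree[u] == 1),
        next(iter(graph)),
    )
    remaining = {u: list(vs) for u, vs in graph.items()}  # unused edges per node
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())  # follow an unused edge
        else:
            path.append(stack.pop())  # dead end: node is finished
    return path[::-1]


def assemble(reads, k):
    """Spell the sequence along an Eulerian path of the reads' de Bruijn graph."""
    path = eulerian_path(build_de_bruijn(kmers_from_reads(reads, k)))
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    toy_reads = ["ATGGCGT", "GGCGTGC", "CGTGCA"]  # hypothetical, error-free reads
    print(assemble(toy_reads, 4))
```

With these toy reads and k = 4, every 3-mer node is distinct, so the graph is a single chain and the walk spells ATGGCGTGCA. Real data would also need handling of sequencing errors, repeated k-mers and both DNA strands, none of which this sketch attempts.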

Figure 1: Bridges of Königsberg problem.
Figure 2: De Bruijn graph.
Figure 3: Two strategies for genome assembly: from Hamiltonian cycles to Eulerian cycles.

This work was supported by grants from the Howard Hughes Medical Institute (HHMI grant 52005726), the US National Institutes of Health (NIH grant 3P41RR024851-02S1) and the National Science Foundation (NSF grant DMS-0718810). We are grateful to S. Wasserman for many helpful comments.

Correspondence to Pavel A Pevzner.

Supplementary information

Supplementary Figures 1 and 2: De Bruijn graph from reads with sequencing errors.

Compeau, P., Pevzner, P. & Tesler, G. How to apply de Bruijn graphs to genome assembly. Nat Biotechnol 29, 987–991 (2011).

