We have assembled the Escherichia coli K-12 MG1655 chromosome de novo into a single 4.6-Mb contig using only nanopore data. Our method has three stages: (i) overlaps are detected between reads, and the reads are then corrected by a multiple-alignment process; (ii) the corrected reads are assembled using the Celera Assembler; and (iii) the assembly is polished using a probabilistic model of the signal-level data. The assembly reconstructs gene order and has 99.5% nucleotide identity to the reference genome.
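To make the identity figure concrete, the following is a minimal sketch of how a percent nucleotide identity can be computed from a pairwise alignment. It is an illustration only: the function name `percent_identity`, the gap symbol, and the toy sequences are assumptions, and counting gap columns as non-matching is one of several possible conventions, not necessarily the one used by the evaluation tools in the study.

```python
def percent_identity(aligned_ref: str, aligned_asm: str, gap: str = "-") -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Assumed convention (hypothetical, for illustration): gap columns count
    as non-matching and are included in the denominator.
    """
    if len(aligned_ref) != len(aligned_asm):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for r, a in zip(aligned_ref, aligned_asm)
        if r == a and r != gap
    )
    return 100.0 * matches / len(aligned_ref)


# Toy alignment (made-up sequences): one substitution and one inserted base.
ref = "ACGTACGT-A"
asm = "ACGTTCGTAA"
identity = percent_identity(ref, asm)
print(f"{identity:.1f}% identity")
```

Under this convention, an identity of 99.5% corresponds to roughly one non-matching column per 200 aligned positions.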
Data analysis was performed on the Medical Research Council Cloud Infrastructure for Microbial Bioinformatics (CLIMB) cyberinfrastructure. N.J.L. is funded by a Medical Research Council Special Training Fellowship in Biomedical Informatics. J.Q. is funded by the UK National Institute for Health Research (NIHR) Surgical Reconstruction and Microbiology Research Centre. J.T.S. is supported by the Ontario Institute for Cancer Research through funding provided by the Government of Ontario. We thank the staff of Oxford Nanopore for technical help and advice during the MinION Access Programme. We are grateful to the EU COST action ES1103, whose funding allowed us to attend a hackathon that kick-started the work presented here. We thank L. Parts for comments on the manuscript and H. Eno for help with proofreading.
N.J.L. and J.T.S. are members of the MinION Access Programme (MAP). N.J.L. has received free-of-charge reagents for nanopore sequencing presented in this study. N.J.L., J.Q. and J.T.S. have received travel and accommodation expenses to speak at an Oxford Nanopore–organized symposium. N.J.L. and J.Q. have ongoing research collaborations with Oxford Nanopore but do not receive financial compensation for this.
Integrated supplementary information
Supplementary Figure 1 Kernel density plot showing the accuracy of reads from the four individual MinION runs used to generate the de novo assembly.
The mean accuracy varies from 78.2% (run 3) to 82.2% (run 1).
Supplementary Figure 2 Kernel density plot demonstrating the raw nanopore read accuracy and effect of two rounds of error correction on accuracy.
The mauve area represents uncorrected sequencing reads, the green area shows the improvement in accuracy after the first round of correction, and the yellow area shows the improvement from the second round. Additional rounds of correction did not improve the accuracy further.
About this article
Cite this article
Loman, N., Quick, J. & Simpson, J. A complete bacterial genome assembled de novo using only nanopore sequencing data. Nat Methods 12, 733–735 (2015) doi:10.1038/nmeth.3444