A complete bacterial genome assembled de novo using only nanopore sequencing data

Abstract

We have assembled de novo the Escherichia coli K-12 MG1655 chromosome in a single 4.6-Mb contig using only nanopore data. Our method has three stages: (i) overlaps between reads are detected, and the reads are then corrected by a multiple-alignment process; (ii) corrected reads are assembled using the Celera Assembler; and (iii) the assembly is polished using a probabilistic model of the signal-level data. The assembly reconstructs gene order and has 99.5% nucleotide identity.
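To make the identity figure concrete, the following is a minimal sketch of how a percent nucleotide identity could be computed from a gapped pairwise alignment. It assumes identity is defined as matching columns divided by all alignment columns, counting gaps as mismatches; the abstract does not state the exact definition used, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a gapped pairwise alignment.

    Assumption: identity = matching columns / total columns, where a gap
    ('-') opposite a base counts as a mismatch. Other definitions exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            # A column that is a gap in both rows carries no information.
            continue
        columns += 1
        if a == b:
            matches += 1

    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Toy alignment: 9 columns, 7 of which match.
    print(round(percent_identity("ACGT-ACGT", "ACGTTACCT"), 2))
```

Under this definition, insertions and deletions lower the identity as much as substitutions do, so the number reflects every type of base-level error.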

Figure 1: Single-contig assembly of E. coli K-12 MG1655.
Figure 2: Comparing 5-mer counts of the assembly and the reference genome before and after signal-level polishing.
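A comparison of the kind named in the Figure 2 caption can be sketched as follows. This is an illustrative example only, not the authors' analysis code: the helper names `kmer_counts` and `compare_kmer_counts` are hypothetical, and the sketch counts k-mers on the forward strand only.

```python
from collections import Counter


def kmer_counts(seq: str, k: int = 5) -> Counter:
    """Count every overlapping k-mer in seq (forward strand only)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def compare_kmer_counts(assembly: str, reference: str, k: int = 5) -> dict:
    """Map each k-mer seen in either sequence to (assembly count, reference count)."""
    a_counts = kmer_counts(assembly, k)
    r_counts = kmer_counts(reference, k)
    return {
        kmer: (a_counts[kmer], r_counts[kmer])
        for kmer in sorted(set(a_counts) | set(r_counts))
    }


if __name__ == "__main__":
    # Toy sequences standing in for an assembly and a reference genome.
    for kmer, (in_assembly, in_reference) in compare_kmer_counts(
        "AAAAAA", "AAAAAT"
    ).items():
        print(kmer, in_assembly, in_reference)
```

Plotting the two counts against each other for every 5-mer makes systematic over- or under-representation in the assembly visible as points that fall away from the diagonal.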

Accession codes

Primary accessions

European Nucleotide Archive

Acknowledgements

Data analysis was performed on the Medical Research Council Cloud Infrastructure for Microbial Bioinformatics (CLIMB) cyberinfrastructure. N.J.L. is funded by a Medical Research Council Special Training Fellowship in Biomedical Informatics. J.Q. is funded by the UK National Institute for Health Research (NIHR) Surgical Reconstruction and Microbiology Research Centre. J.T.S. is supported by the Ontario Institute for Cancer Research through funding provided by the Government of Ontario. We thank the staff of Oxford Nanopore for technical help and advice during the MinION Access Programme. We are grateful to the EU COST action ES1103, whose funding allowed us to attend a hackathon that kick-started the work presented here. We thank L. Parts for comments on the manuscript and H. Eno for help with proofreading.

Author information

N.J.L. and J.T.S. conceived the project. N.J.L., J.Q. and J.T.S. implemented the Nanocorrect pipeline. J.T.S. conceived and implemented the Nanopolish pipeline. J.Q. generated the nanopore E. coli sequence data. N.J.L. and J.T.S. performed de novo assembly and analyzed the results. N.J.L. and J.T.S. wrote the manuscript. All authors approved the final manuscript.

Correspondence to Jared T Simpson.

Ethics declarations

Competing interests

N.J.L. and J.T.S. are members of the MinION Access Programme (MAP). N.J.L. has received free-of-charge reagents for nanopore sequencing presented in this study. N.J.L., J.Q. and J.T.S. have received travel and accommodation expenses to speak at an Oxford Nanopore–organized symposium. N.J.L. and J.Q. have ongoing research collaborations with Oxford Nanopore but do not receive financial compensation for this.

Integrated supplementary information

Supplementary Figure 1: Kernel density plot showing the accuracy of reads from the four individual MinION runs used to generate the de novo assembly.

The mean accuracy varies from 78.2% (run 3) to 82.2% (run 1).

Supplementary Figure 2: Kernel density plot showing the raw nanopore read accuracy and the effect of two rounds of error correction on accuracy.

The mauve area represents uncorrected sequencing reads, the green area shows the improvement in accuracy after the first round of correction, and the yellow area shows the improvement from the second round. Further rounds of correction did not improve the accuracy.

Supplementary Figure 3: Spec file for Celera Assembler.

Supplementary information

Supplementary Text and Figures

Supplementary Figures 1–3, Supplementary Tables 1 and 2 and Supplementary Note (PDF 785 kb)

About this article

Cite this article

Loman, N., Quick, J. & Simpson, J. A complete bacterial genome assembled de novo using only nanopore sequencing data. Nat Methods 12, 733–735 (2015) doi:10.1038/nmeth.3444
