Chromosome-scale assemblies of plant genomes using nanopore long reads and optical maps

Abstract

Plant genomes are often highly repetitive and polyploid, which makes them challenging to assemble. The introduction of short-read technologies 10 years ago substantially increased the number of available plant genomes, but these assemblies are generally incomplete and fragmented, and only a few are at the chromosome scale. More recently, Pacific Biosciences and Oxford Nanopore commercialized sequencing technologies that read long DNA fragments (kilobases to megabases) and, combined with efficient algorithms, yield high-quality assemblies in terms of contiguity and completeness of repetitive regions (refs. 1–4). However, even though long-read assemblies exhibit high contig N50s (>1 Mb), long reads alone are still insufficient to decipher genome organization at the chromosome level. Here, we describe a strategy based on long reads (MinION or PromethION sequencers) and optical maps (Saphyr system) that can produce chromosome-level assemblies. We demonstrate its applicability by generating high-quality genome sequences for two new dicotyledon morphotypes, Brassica rapa Z1 (yellow sarson) and Brassica oleracea HDEM (broccoli), and one new monocotyledon, Musa schizocarpa (banana). All three assemblies show contig N50s of >5 Mb and contain scaffolds that represent entire chromosomes or chromosome arms.
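
For readers unfamiliar with the contiguity metric used in the abstract: the contig N50 is the length L such that contigs of length at least L together cover at least half of the total assembly size. The following is a minimal Python sketch of that definition. The function name n50 and the toy contig lengths are illustrative assumptions and are not taken from the study's data or pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (hypothetical helper)."""
    lengths = list(lengths)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths in arbitrary units (made up for illustration only).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # 8: 10 + 9 + 8 = 27 is half of the total of 54
```

Because the metric depends only on the length distribution, a high contig N50 says nothing about whether contigs are correctly ordered along chromosomes, which is the gap the optical maps are meant to address.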

Fig. 1: Comparison of contig N50 and genome sizes of 105 existing plant genome assemblies.
Fig. 2: Circular representation of anchored scaffolds of B. oleracea HDEM, B. rapa Z1 and M. schizocarpa genome assemblies.
Fig. 3: Base annotation of the three ONT genomes and the corresponding current references (B. rapa, B. oleracea and Musa species).

Data availability

The genome assemblies, gene predictions and genome browsers are freely available at http://www.genoscope.cns.fr/plants. The Illumina, MinION and PromethION data, the assemblies and the annotations are available in the European Nucleotide Archive under the following projects: PRJEB26620 (B. rapa), PRJEB26621 (B. oleracea) and PRJEB26661 (M. schizocarpa). Germplasm for these genomes will be made freely and publicly available to the entire community. M. schizocarpa germplasm is available at Bioversity International Transit Center under ITC number ITC0926. B. rapa ssp. trilocularis (genotype Z1) is available at the Plant Genetic Resources of Canada and B. oleracea ssp. italica (genotype HDEM) is available at the Biological Resource Center BrACySol, Rennes, France. All supporting data are included in the Supplementary Information.

References

  1. Chin, C. S. et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat. Methods 13, 1050–1054 (2016).

  2. Jiao, W. B. & Schneeberger, K. The impact of third generation genomic technologies on plant genome assembly. Curr. Opin. Plant. Biol. 36, 64–70 (2017).

  3. Michael, T. P. et al. High contiguity Arabidopsis thaliana genome assembly with a single nanopore flow cell. Nat. Commun. 9, 541 (2018).

  4. Schmidt, M. H. et al. De novo assembly of a new Solanum pennellii accession using nanopore sequencing. Plant Cell 29, 2336–2348 (2017).

  5. Arabidopsis Genome Initiative Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000).

  6. International Rice Genome Sequencing Project The map-based sequence of the rice genome. Nature 436, 793–800 (2005).

  7. Du, H. et al. Sequencing and de novo assembly of a near complete indica rice genome. Nat. Commun. 8, 15324 (2017).

  8. Edger, P. P. et al. Single-molecule sequencing and optical mapping yields an improved genome of woodland strawberry (Fragaria vesca) with chromosome-scale contiguity. Gigascience 7, 1–7 (2018).

  9. Dassanayake, M. et al. The genome of the extremophile crucifer Thellungiella parvula. Nat. Genet. 43, 913–918 (2011).

  10. International Brachypodium Initiative Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463, 763–768 (2010).

  11. Raymond, O. et al. The Rosa genome provides new insights into the domestication of modern roses. Nat. Genet. 50, 772–777 (2018).

  12. Cheng, F. et al. Subgenome parallel selection is associated with morphotype diversification and convergent crop domestication in Brassica rapa and Brassica oleracea. Nat. Genet. 48, 1218–1224 (2016).

  13. Cai, C. C. et al. Brassica rapa genome 2.0: a reference upgrade through sequence re-assembly and gene re-annotation. Mol. Plant 10, 649–651 (2017).

  14. Wang, X. W. et al. The genome of the mesopolyploid crop species Brassica rapa. Nat. Genet. 43, 1035–1039 (2011).

  15. Parkin, I. A. et al. Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea. Genome Biol. 15, R77 (2014).

  16. D’Hont, A. et al. The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 488, 213–217 (2012).

  17. Martin, G. et al. Improvement of the banana “Musa acuminata” reference sequence using NGS data and semi-automated bioinformatics methods. BMC Genomics 17, 243 (2016).

  18. Lam, E. T. et al. Genome mapping on nanochannel arrays for structural variation analysis and sequence assembly. Nat. Biotechnol. 30, 771–776 (2012).

  19. Sakai, H. et al. The power of single molecule real-time sequencing technology in the de novo assembly of a eukaryotic genome. Sci. Rep. 5, 16780 (2015).

  20. Wang, X. et al. Genomic analyses of primitive, wild and cultivated citrus provide insights into asexual reproduction. Nat. Genet. 49, 765–772 (2017).

  21. Berlin, K. et al. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat. Biotechnol. 33, 623–630 (2015).

  22. Golicz, A. A. et al. The pangenome of an agronomically important crop plant Brassica oleracea. Nat. Commun. 7, 13390 (2016).

  23. Schranz, M. E. et al. Characterization and effects of the replicated flowering time gene FLC in Brassica rapa. Genetics 162, 1457–1468 (2002).

  24. Goubet, P. M. et al. Contrasted patterns of molecular evolution in dominant and recessive self-incompatibility haplotypes in Arabidopsis. PLoS Genet. 8, e1002495 (2012).

  25. Shiba, H. et al. Genomic organization of the S-locus region of Brassica. Biosci. Biotechnol. Biochem. 67, 622–626 (2003).

  26. Bachmann, J. A., Tedder, A., Laenen, B., Steige, K. A. & Slotte, T. Targeted long-read sequencing of a locus under long-term balancing selection in Capsella. G3 (Bethesda) 8, 1327–1333 (2018).

  27. Kim, D., Jung, J., Choi, Y. O. & Kim, S. Development of a system for S locus haplotyping based on the polymorphic SLL2 gene tightly linked to the locus determining self-incompatibility in radish (Raphanus sativus L.). Euphytica 209, 525–535 (2016).

  28. Yang, J. H. et al. The genome sequence of allopolyploid Brassica juncea and analysis of differential homoeolog gene expression influencing selection. Nat. Genet. 48, 1225–1232 (2016).

  29. Jarvis, D. E. et al. The genome of Chenopodium quinoa. Nature 542, 307–312 (2017).

  30. Jiao, W. B. et al. Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data. Genome Res. 27, 778–786 (2017).

  31. Reyes-Chin-Wo, S. et al. Genome assembly with in vitro proximity ligation data and whole-genome triplication in lettuce. Nat. Commun. 8, 14953 (2017).

  32. Teh, B. T. et al. The draft genome of tropical fruit durian (Durio zibethinus). Nat. Genet. 49, 1633–1641 (2017).

  33. Gawel, N. J. & Jarret, R. L. A modified CTAB DNA extraction procedure for Musa and Ipomoea. Plant Mol. Biol. Rep. 9, 262–266 (1991).

  34. Risterucci, A. M. et al. A high-density linkage map of Theobroma cacao L. Theor. Appl. Genet. 101, 948–955 (2000).

  35. Engelen, S. & Aury, J. M. Fastxtend tool (Genoscope/CEA, 2015); http://www.genoscope.cns.fr/fastxtend/

  36. Li, R., Li, Y., Kristiansen, K. & Wang, J. SOAP: short oligonucleotide alignment program. Bioinformatics 24, 713–714 (2008).

  37. Kim, D., Song, L., Breitwieser, F. P. & Salzberg, S. L. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Res. 26, 1721–1729 (2016).

  38. Vaser, R. et al. Ra assembler. v. git commit 65bedfe (Faculty of Electrical Engineering and Computing, University of Zagreb, 2017); https://github.com/rvaser/ra

  39. Ruan, J. et al. SMARTdenovo assembler. v. git commit 3d9c22e (Agricultural Genomics Institute, China, 2015); https://github.com/ruanjue/smartdenovo

  40. Wick, R. et al. Filtlong tool. v. git commit 8d81024 (University of Melbourne, 2017); https://github.com/rrwick/Filtlong

  41. Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).

  42. Vaser, R., Sovic, I., Nagarajan, N. & Sikic, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).

  43. Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).

  44. de Givry, S., Bouchez, M., Chabrier, P., Milan, D. & Schiex, T. CARHTA GENE: multipopulation integrated genetic and radiation hybrid mapping. Bioinformatics 21, 1703–1704 (2005).

  45. Kent, W. J. BLAT—the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002).

  46. RepeatMasker Open-4.0 (Institute for Systems Biology, 2013); http://www.repeatmasker.org

  47. Chalhoub, B. et al. Plant genetics. Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science 345, 950–953 (2014).

  48. Morgulis, A., Gertz, E. M., Schaffer, A. A. & Agarwala, R. A fast and symmetric DUST implementation to mask low-complexity DNA sequences. J. Comput. Biol. 13, 1028–1040 (2006).

  49. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).

  50. Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995 (2004).

  51. Dubarry, M. et al. Gmove a tool for eukaryotic gene predictions using various evidences (poster). F1000Res. 5, 681 (2016).

  52. Waterhouse, R. M. et al. BUSCO applications from quality assessments to gene prediction and phylogenomics. Mol. Biol. Evol. 35, 543–548 (2018).

  53. Marcais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).

  54. Nattestad, M. Dot (DNAnexus, 2017); http://github.com/dnanexus/dot

  55. Dereeper, A. et al. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 36, W465–W469 (2008).

Acknowledgements

This work was supported by the Genoscope, the Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA) and France Génomique (ANR-10-INBS-09-08). We are grateful to ONT for early access to the MinION device through the MinION Access Programme and we thank their staff for technical help. Work by X.V. and M.G. is supported financially by Région Hauts-de-France, the Ministère de l’Enseignement Supérieur et de la Recherche (CPER Climibio) and the European Fund for Regional Economic Development.

Author information

Contributions

C.F., G.D., F.-C.B., E.D. and C.C. extracted the DNA. C.C. and A.L. optimized and performed the sequencing. E.D., W.B. and V.B. generated the optical maps. P.D., R.D. and M.M.-D. generated the genetic map for the B. oleracea HDEM accession. B.I., C.B. and J.-M.A. performed the genome assemblies. G.M. performed the anchoring of the M. schizocarpa scaffolds. C.F., J.M. and M.R.-G. performed the anchoring of the B. oleracea scaffolds. M.D. and J.-M.A. performed the anchoring of the B. rapa scaffolds. M.D. and B.N. performed the gene prediction for the genome assemblies. B.I., C.B., M.D., F.D., J.-M.A. and S.E. performed the bioinformatic analyses. X.V. and M.G. performed the S-locus annotation of the two Brassicaceae genomes. B.I., C.B., M.D. and J.-M.A. wrote the article. A.D., A.-M.C., P.W. and J.-M.A. supervised the study.

Corresponding author

Correspondence to Jean-Marc Aury.

Ethics declarations

Competing interests

The authors declare no competing interests. B.I., S.E., C.C., P.W. and J.-M.A. are part of the MinION Access Programme and J.-M.A. received travel and accommodation expenses to speak at ONT conferences.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Supplementary Tables 1–21 and Supplementary Figures 1–19.

Reporting Summary

Supplementary File 2

Detailed information about the 105 plant genome assemblies.

About this article

Cite this article

Belser, C., Istace, B., Denis, E. et al. Chromosome-scale assemblies of plant genomes using nanopore long reads and optical maps. Nature Plants 4, 879–887 (2018). https://doi.org/10.1038/s41477-018-0289-4

  • DOI: https://doi.org/10.1038/s41477-018-0289-4
