Abstract
Plant genomes are often highly repetitive and polyploid, which makes them challenging to assemble. The introduction of short-read technologies 10 years ago substantially increased the number of available plant genomes, but these assemblies are generally incomplete and fragmented, and only a few reach the chromosome scale. More recently, Pacific Biosciences and Oxford Nanopore commercialized sequencing technologies that can read long DNA fragments (kilobases to megabases) and, combined with efficient algorithms, yield high-quality assemblies in terms of contiguity and completeness of repetitive regions (refs. 1–4). However, even though genome assemblies based on long reads exhibit high contig N50s (>1 Mb), they are still insufficient to decipher genome organization at the chromosome level. Here, we describe a strategy based on long reads (MinION or PromethION sequencers) and optical maps (Saphyr system) that can produce chromosome-level assemblies, and we demonstrate its applicability by generating high-quality genome sequences for two new dicotyledon morphotypes, Brassica rapa Z1 (yellow sarson) and Brassica oleracea HDEM (broccoli), and one new monocotyledon, Musa schizocarpa (banana). All three assemblies show contig N50s of >5 Mb and contain scaffolds that represent entire chromosomes or chromosome arms.
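The contig N50 values quoted in the abstract can be made concrete with a short calculation. The following is a minimal sketch of the usual N50 definition (the length of the contig at which the cumulative length, counted from the longest contig down, first reaches half of the total assembly length). The function name `n50` and the example contig lengths are illustrative assumptions and are not taken from the article or its assemblies.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are sorted from longest to shortest; N50 is the length of the
    contig at which the running total first reaches half the assembly size.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical contig lengths in base pairs (illustrative only).
contigs = [8_000_000, 6_000_000, 5_000_000, 3_000_000, 1_000_000, 500_000]
print(n50(contigs))  # 6000000
```

In this made-up example the two longest contigs already cover more than half of the 23.5 Mb total, so the N50 is 6 Mb, which would satisfy the ">5 Mb" threshold mentioned above.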
Data availability
The genome assemblies, gene predictions and genome browsers are freely available at http://www.genoscope.cns.fr/plants. The Illumina, MinION and PromethION data, the assemblies and the annotations are available in the European Nucleotide Archive under the following projects: PRJEB26620 (B. rapa), PRJEB26621 (B. oleracea) and PRJEB26661 (M. schizocarpa). Germplasm for these genomes will be made freely and publicly available to the entire community. M. schizocarpa germplasm is available at Bioversity International Transit Center under ITC number ITC0926. B. rapa ssp. trilocularis (genotype Z1) is available at the Plant Genetic Resources of Canada and B. oleracea ssp. italica (genotype HDEM) is available at the Biological Resource Center BrACySol, Rennes, France. All supporting data are included in the Supplementary Information.
References
Chin, C. S. et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat. Methods 13, 1050–1054 (2016).
Jiao, W. B. & Schneeberger, K. The impact of third generation genomic technologies on plant genome assembly. Curr. Opin. Plant. Biol. 36, 64–70 (2017).
Michael, T. P. et al. High contiguity Arabidopsis thaliana genome assembly with a single nanopore flow cell. Nat. Commun. 9, 541 (2018).
Schmidt, M. H. et al. De novo assembly of a new Solanum pennellii accession using nanopore sequencing. Plant Cell 29, 2336–2348 (2017).
Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408, 796–815 (2000).
International Rice Genome Sequencing Project. The map-based sequence of the rice genome. Nature 436, 793–800 (2005).
Du, H. et al. Sequencing and de novo assembly of a near complete indica rice genome. Nat. Commun. 8, 15324 (2017).
Edger, P. P. et al. Single-molecule sequencing and optical mapping yields an improved genome of woodland strawberry (Fragaria vesca) with chromosome-scale contiguity. Gigascience 7, 1–7 (2018).
Dassanayake, M. et al. The genome of the extremophile crucifer Thellungiella parvula. Nat. Genet. 43, 913–918 (2011).
International Brachypodium Initiative. Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature 463, 763–768 (2010).
Raymond, O. et al. The Rosa genome provides new insights into the domestication of modern roses. Nat. Genet. 50, 772–777 (2018).
Cheng, F. et al. Subgenome parallel selection is associated with morphotype diversification and convergent crop domestication in Brassica rapa and Brassica oleracea. Nat. Genet. 48, 1218–1224 (2016).
Cai, C. C. et al. Brassica rapa genome 2.0: a reference upgrade through sequence re-assembly and gene re-annotation. Mol. Plant 10, 649–651 (2017).
Wang, X. W. et al. The genome of the mesopolyploid crop species Brassica rapa. Nat. Genet. 43, 1035–1039 (2011).
Parkin, I. A. et al. Transcriptome and methylome profiling reveals relics of genome dominance in the mesopolyploid Brassica oleracea. Genome Biol. 15, R77 (2014).
D’Hont, A. et al. The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 488, 213–217 (2012).
Martin, G. et al. Improvement of the banana “Musa acuminata” reference sequence using NGS data and semi-automated bioinformatics methods. BMC Genomics 17, 243 (2016).
Lam, E. T. et al. Genome mapping on nanochannel arrays for structural variation analysis and sequence assembly. Nat. Biotechnol. 30, 771–776 (2012).
Sakai, H. et al. The power of single molecule real-time sequencing technology in the de novo assembly of a eukaryotic genome. Sci. Rep. 5, 16780 (2015).
Wang, X. et al. Genomic analyses of primitive, wild and cultivated citrus provide insights into asexual reproduction. Nat. Genet. 49, 765–772 (2017).
Berlin, K. et al. Assembling large genomes with single-molecule sequencing and locality-sensitive hashing. Nat. Biotechnol. 33, 623–630 (2015).
Golicz, A. A. et al. The pangenome of an agronomically important crop plant Brassica oleracea. Nat. Commun. 7, 13390 (2016).
Schranz, M. E. et al. Characterization and effects of the replicated flowering time gene FLC in Brassica rapa. Genetics 162, 1457–1468 (2002).
Goubet, P. M. et al. Contrasted patterns of molecular evolution in dominant and recessive self-incompatibility haplotypes in Arabidopsis. PLoS Genet. 8, e1002495 (2012).
Shiba, H. et al. Genomic organization of the S-locus region of Brassica. Biosci. Biotechnol. Biochem. 67, 622–626 (2003).
Bachmann, J. A., Tedder, A., Laenen, B., Steige, K. A. & Slotte, T. Targeted long-read sequencing of a locus under long-term balancing selection in Capsella. G3 (Bethesda) 8, 1327–1333 (2018).
Kim, D., Jung, J., Choi, Y. O. & Kim, S. Development of a system for S locus haplotyping based on the polymorphic SLL2 gene tightly linked to the locus determining self-incompatibility in radish (Raphanus sativus L.). Euphytica 209, 525–535 (2016).
Yang, J. H. et al. The genome sequence of allopolyploid Brassica juncea and analysis of differential homoeolog gene expression influencing selection. Nat. Genet. 48, 1225–1232 (2016).
Jarvis, D. E. et al. The genome of Chenopodium quinoa. Nature 542, 307–312 (2017).
Jiao, W. B. et al. Improving and correcting the contiguity of long-read genome assemblies of three plant species using optical mapping and chromosome conformation capture data. Genome Res. 27, 778–786 (2017).
Reyes-Chin-Wo, S. et al. Genome assembly with in vitro proximity ligation data and whole-genome triplication in lettuce. Nat. Commun. 8, 14953 (2017).
Teh, B. T. et al. The draft genome of tropical fruit durian (Durio zibethinus). Nat. Genet. 49, 1633–1641 (2017).
Gawel, N. J. & Jarret, R. L. A modified CTAB DNA extraction procedure for Musa and Ipomoea. Plant Mol. Biol. Rep. 9, 262–266 (1991).
Risterucci, A. M. et al. A high-density linkage map of Theobroma cacao L. Theor. Appl. Genet. 101, 948–955 (2000).
Engelen, S. & Aury, J. M. Fastxtend tool (Genoscope/CEA, 2015); http://www.genoscope.cns.fr/fastxtend/
Li, R., Li, Y., Kristiansen, K. & Wang, J. SOAP: short oligonucleotide alignment program. Bioinformatics 24, 713–714 (2008).
Kim, D., Song, L., Breitwieser, F. P. & Salzberg, S. L. Centrifuge: rapid and sensitive classification of metagenomic sequences. Genome Res. 26, 1721–1729 (2016).
Vaser, R. et al. Ra assembler. v. git commit 65bedfe (Faculty of Electrical Engineering and Computing, University of Zagreb, 2017); https://github.com/rvaser/ra
Ruan, J. et al. SMARTdenovo assembler. v. git commit 3d9c22e (Agricultural Genomics Institute, China, 2015); https://github.com/ruanjue/smartdenovo
Wick, R. et al. Filtlong tool. v. git commit 8d81024 (University of Melbourne, 2017); https://github.com/rrwick/Filtlong
Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).
Vaser, R., Sovic, I., Nagarajan, N. & Sikic, M. Fast and accurate de novo genome assembly from long uncorrected reads. Genome Res. 27, 737–746 (2017).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).
de Givry, S., Bouchez, M., Chabrier, P., Milan, D. & Schiex, T. CARHTA GENE: multipopulation integrated genetic and radiation hybrid mapping. Bioinformatics 21, 1703–1704 (2005).
Kent, W. J. BLAT—the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002).
RepeatMasker Open-4.0 (Institute for Systems Biology, 2013); http://www.repeatmasker.org
Chalhoub, B. et al. Plant genetics. Early allopolyploid evolution in the post-Neolithic Brassica napus oilseed genome. Science 345, 950–953 (2014).
Morgulis, A., Gertz, E. M., Schaffer, A. A. & Agarwala, R. A fast and symmetric DUST implementation to mask low-complexity DNA sequences. J. Comput. Biol. 13, 1028–1040 (2006).
Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).
Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995 (2004).
Dubarry, M. et al. Gmove a tool for eukaryotic gene predictions using various evidences (poster). F1000Res. 5, 681 (2016).
Waterhouse, R. M. et al. BUSCO applications from quality assessments to gene prediction and phylogenomics. Mol. Biol. Evol. 35, 543–548 (2018).
Marcais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).
Nattestad, M. Dot (DNA Nexus, 2017); http://github.com/dnanexus/dot
Dereeper, A. et al. Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 36, W465–W469 (2008).
Acknowledgements
This work was supported by the Genoscope, the Commissariat à l’Energie Atomique et aux Energies Alternatives (CEA) and France Génomique (ANR-10-INBS-09-08). We are grateful to ONT for early access to the MinION device through the MinION Access Programme and we thank their staff for technical help. Work by X.V. and M.G. is supported financially by Région Hauts-de-France, the Ministère de l’Enseignement Supérieur et de la Recherche (CPER Climibio) and the European Fund for Regional Economic Development.
Author information
Contributions
C.F., G.D., F.-C.B., E.D. and C.C. extracted the DNA. C.C. and A.L. optimized and performed the sequencing. E.D., W.B. and V.B. generated the optical maps. P.D., R.D. and M.M.-D. generated the genetic map for the B. oleracea HDEM accession. B.I., C.B. and J.-M.A. performed the genome assemblies. G.M. performed the anchoring of the M. schizocarpa scaffolds. C.F., J.M. and M.R.-G. performed the anchoring of the B. oleracea scaffolds. M.D. and J.-M.A. performed the anchoring of the B. rapa scaffolds. M.D. and B.N. performed the gene prediction for the genome assemblies. B.I., C.B., M.D., F.D., J.-M.A. and S.E. performed the bioinformatic analyses. X.V. and M.G. performed the S-locus annotation of the two Brassicaceae genomes. B.I., C.B., M.D. and J.-M.A. wrote the article. A.D., A.-M.C., P.W. and J.-M.A. supervised the study.
Ethics declarations
Competing interests
The authors declare no competing interests. B.I., S.E., C.C., P.W. and J.-M.A. are part of the MinION Access Programme and J.-M.A. received travel and accommodation expenses to speak at ONT conferences.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Supplementary Information
Supplementary Tables 1–21 and Supplementary Figures 1–19.
Supplementary File 2
Detailed information about the 105 plant genome assemblies.
About this article
Cite this article
Belser, C., Istace, B., Denis, E. et al. Chromosome-scale assemblies of plant genomes using nanopore long reads and optical maps. Nature Plants 4, 879–887 (2018). https://doi.org/10.1038/s41477-018-0289-4
DOI: https://doi.org/10.1038/s41477-018-0289-4