Chromosome-level genomes of three key Allium crops and their trait evolution

An Author Correction to this article was published on 14 December 2023

This article has been updated

Abstract

Allium crop breeding remains severely hindered due to the lack of high-quality reference genomes. Here we report high-quality chromosome-level genome assemblies for three key Allium crops (Welsh onion, garlic and onion), which are 11.17 Gb, 15.52 Gb and 15.78 Gb in size with the highest recorded contig N50 of 507.27 Mb, 109.82 Mb and 81.66 Mb, respectively. Beyond revealing the genome evolutionary process of Allium species, our pathogen infection experiments and comparative metabolomic and genomic analyses showed that genes encoding enzymes involved in the metabolic pathway of Allium-specific flavor compounds may have evolved from an ancient uncharacterized plant defense system widely existing in many plant lineages but extensively boosted in alliums. Using in situ hybridization and spatial RNA sequencing, we obtained an overview of cell-type categorization and gene expression changes associated with spongy mesophyll cell expansion during onion bulb formation, thus indicating the functional roles of bulb formation genes.
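The contig N50 values quoted above follow the standard definition: the length of the contig at which the running total, taken from the longest contig downwards, first reaches half of the total assembly length. A minimal sketch of that calculation is given here for orientation; the function name and the toy lengths are illustrative assumptions and are not taken from the assemblies or from the authors' pipeline.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths (unit in = unit out)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)   # longest contig first
    half_total = sum(ordered) / 2             # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:             # first contig to reach the halfway point
            return length


# Toy lengths in Mb (hypothetical values, for illustration only).
toy_lengths = [50, 40, 30, 20, 10]
print(contig_n50(toy_lengths))
```

Because N50 depends only on the sorted length distribution, it says nothing about base-level accuracy, which is why completeness and consensus-quality metrics are normally reported alongside it.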

Fig. 1: Evolution of Allium and African lily genomes.
Fig. 2: Comparative genomic analyses.
Fig. 3: Insights into allicin/isoallicin biosynthesis.
Fig. 4: Spatial RNA-seq reveals spatial distribution of gene expression in metabolic process and onion bulb formation.
Fig. 5: Evolution of genes involved in bulb formation in Allium plants and expression pattern divergence.

Data availability

The raw sequencing data of onion, garlic, Welsh onion and African lily were deposited in the National Center for Biotechnology Information Sequence Read Archive under the accession PRJNA948806 and in the National Genomics Data Center (https://ngdc.cncb.ac.cn/?lang=en) under the accession PRJCA016760. The assemblies of the four genomes reported in this paper have been deposited in the National Center for Biotechnology Information under accessions JASDDO000000000, JASFAV000000000, JASFAW000000000 and JASFAX000000000, and in the Genome Warehouse in the National Genomics Data Center (refs. 128,129), Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation, under accessions GWHCBHY00000000, GWHCBHZ00000000, GWHCBIA00000000 and GWHCBIB00000000, which are publicly accessible at https://ngdc.cncb.ac.cn/gwh.

Code availability

All software used in the study is publicly available on the Internet, as described in the Methods and Reporting Summary.

Change history

References

  1. Chase, M. W. et al. An update of the Angiosperm Phylogeny Group classification for the orders and families of flowering plants: APG IV. Bot. J. Linn. Soc. 181, 1–20 (2016).

  2. Jones, M. G. et al. Biosynthesis of the flavour precursors of onion and garlic. J. Exp. Bot. 55, 1903–1918 (2004).

  3. Yoshimoto, N. & Saito, K. S-Alk(en)ylcysteine sulfoxides in the genus Allium: proposed biosynthesis, chemical conversion, and bioactivities. J. Exp. Bot. 70, 4123–4137 (2019).

  4. Reis, A. C. et al. rDNA mapping, heterochromatin characterization and AT/GC content of Agapanthus africanus (L.) Hoffmanns (Agapanthaceae). An. Acad. Bras. Cienc. 88, 1727–1734 (2016).

  5. Sharaibi, O. J. & Afolayan, A. J. Micromorphological characterization of the leaf and rhizome of Agapanthus praecox subsp. praecox Willd. (Amaryllidaceae). J. Bot. 2017, 1–10 (2017).

  6. Fenwick, G. R., Hanley, A. B. & Whitaker, J. R. The genus Allium—part 1. Crit. Rev. Food Sci. Nutr. 22, 199–271 (1985).

  7. Boulos, L. Flora of Egypt (Al Hadara Publishing, 1999).

  8. Harris, J., Cottrell, S., Plummer, S. & Lloyd, D. Antimicrobial properties of Allium sativum (garlic). Appl. Microbiol. Biotechnol. 57, 282–286 (2001).

  9. Capasso, A. Antioxidant action and therapeutic efficacy of Allium sativum L. Molecules 18, 690–700 (2013).

  10. Borlinghaus, J., Albrecht, F., Gruhlke, M. C., Nwachukwu, I. D. & Slusarenko, A. J. Allicin: chemistry and biological properties. Molecules 19, 12591–12618 (2014).

  11. Fu, J. et al. Identification and characterization of abundant repetitive sequences in Allium cepa. Sci. Rep. 9, 16756 (2019).

  12. Khandagale, K. et al. Omics approaches in Allium research: progress and way ahead. PeerJ 8, e9824 (2020).

  13. King, J., Bradeen, J., Bark, O., McCallum, J. & Havey, M. J. A low-density genetic map of onion reveals a role for tandem duplication in the evolution of an extremely large diploid genome. Theor. Appl. Genet. 96, 52–62 (1998).

  14. Jakše, J. et al. Pilot sequencing of onion genomic DNA reveals fragments of transposable elements, low gene densities, and significant gene enrichment after methyl filtration. Mol. Genet. Genomics 280, 287–292 (2008).

  15. Shigyo, M., Khar, A. & Abdelrahman, M. (eds) The Allium Genomes pp. 99–112 (Springer, 2018).

  16. Kiseleva, A., Kirov, I. & Khrustaleva, L. Chromosomal organization of centromeric Ty3/gypsy retrotransposons in Allium cepa L. and Allium fistulosum L. Russ. J. Genet. 50, 586–592 (2014).

  17. Hertweck, K. L. Assembly and comparative analysis of transposable elements from low coverage genomic sequence data in Asparagales. Genome 56, 487–494 (2013).

  18. Peška, V., Mandáková, T., Ihradská, V. & Fajkus, J. Comparative dissection of three giant genomes: Allium cepa, Allium sativum, and Allium ursinum. Int. J. Mol. Sci. 20, 733 (2019).

  19. Sun, X. et al. A chromosome-level genome assembly of garlic (Allium sativum) provides insights into genome evolution and allicin biosynthesis. Mol. Plant 13, 1328–1339 (2020).

  20. Ohri, D., Fritsch, R. M. & Hanelt, P. Evolution of genome size in Allium (Alliaceae). Plant Syst. Evol. 210, 57–86 (1998).

  21. Duchoslav, M., Šafářová, L. & Jandová, M. Role of adaptive and non-adaptive mechanisms forming complex patterns of genome size variation in six cytotypes of polyploid Allium oleraceum (Amaryllidaceae) on a continental scale. Ann. Bot. 111, 419–431 (2013).

  22. Khrustaleva, L., Kudryavtseva, N., Romanov, D., Ermolaev, A. & Kirov, I. Comparative tyramide-FISH mapping of the genes controlling flavor and bulb color in Allium species revealed an altered gene order. Sci. Rep. 9, 12007 (2019).

  23. Shigyo, M., Khar, A. & Abdelrahman, M. (eds) The Allium Genomes pp. 197–214 (Springer, 2018).

  24. Ricroch, A., Yockteng, R., Brown, S. C. & Nadot, S. Evolution of genome size across some cultivated Allium species. Genome 48, 511–520 (2005).

  25. Liao, N. et al. Chromosome-level genome assembly of bunching onion illuminates genome evolution and flavor formation in Allium crops. Nat. Commun. 13, 6690 (2022).

  26. Finkers, R. et al. Insights from the first genome assembly of onion (Allium cepa). G3 (Bethesda) 11, jkab243 (2021).

  27. Nystedt, B. et al. The Norway spruce genome sequence and conifer genome evolution. Nature 497, 579–584 (2013).

  28. Neale, D. B. et al. Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies. Genome Biol. 15, R59 (2014).

  29. Guan, R. et al. Draft genome of the living fossil Ginkgo biloba. GigaScience 5, 49 (2016).

  30. Li, G. et al. A high-quality genome assembly highlights rye genomic characteristics and agronomically important genes. Nat. Genet. 53, 574–584 (2021).

  31. Seppey, M., Manni, M. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness. Methods Mol. Biol. 1962, 227–245 (2019).

  32. Ming, R. et al. The pineapple genome and the evolution of CAM photosynthesis. Nat. Genet. 47, 1435–1442 (2015).

  33. Harkess, A. et al. The asparagus genome sheds light on the origin and evolution of a young Y chromosome. Nat. Commun. 8, 1279 (2017).

  34. Zhang, Y. et al. Chromosome-scale assembly of the Dendrobium chrysotoxum genome enhances the understanding of orchid evolution. Hortic. Res. 8, 183 (2021).

  35. Magallón, S., Gómez‐Acevedo, S., Sánchez‐Reyes, L. L. & Hernández‐Hernández, T. A metacalibrated time‐tree documents the early rise of flowering plant phylogenetic diversity. New Phytol. 207, 437–453 (2015).

  36. Zhang, G. Q. et al. The Apostasia genome and the evolution of orchids. Nature 549, 379–383 (2017).

  37. Wang, X. et al. Genome alignment spanning major Poaceae lineages reveals heterogeneous evolutionary rates and alters inferred dates for key evolutionary events. Mol. Plant 8, 885–898 (2015).

  38. Sensalari, C., Maere, S. & Lohaus, R. ksrates: positioning whole-genome duplications relative to speciation events in KS distributions. Bioinformatics 38, 530–532 (2022).

  39. Yamaguchi, Y. & Kumagai, H. Characteristics, biosynthesis, decomposition, metabolism and functions of the garlic odour precursor, S‑allyl‑L‑cysteine sulfoxide. Exp. Ther. Med. 19, 1528–1535 (2020).

  40. Michelmore, R. W. & Meyers, B. C. Clusters of resistance genes in plants evolve by divergent selection and a birth-and-death process. Genome Res. 8, 1113–1130 (1998).

  41. Guo, L. et al. The opium poppy genome and morphinan production. Science 362, 343–347 (2018).

  42. Yuan, M. et al. Pattern-recognition receptors are required for NLR-mediated plant immunity. Nature 592, 105–109 (2021).

  43. Lescot, M. et al. PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Res. 30, 325–327 (2002).

  44. Chen, A. et al. Large field of view-spatially resolved transcriptomics at nanoscale resolution. Preprint at bioRxiv https://doi.org/10.1101/2021.01.17.427004 (2021).

  45. Xia, K. et al. The single-cell stereo-seq reveals region-specific cell subtypes and transcriptome profiling in Arabidopsis leaves. Dev. Cell 57, 1299–1310 (2022).

  46. Chen, A. et al. Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays. Cell 185, 1777–1792 (2022).

  47. Zhang, C. et al. Transcriptome sequencing and metabolism analysis reveals the role of cyanidin metabolism in dark-red onion (Allium cepa L.) bulbs. Sci. Rep. 8, 14109 (2018).

  48. Goławska, S., Sprawka, I., Łukasik, I. & Goławski, A. Are naringenin and quercetin useful chemicals in pest-management strategies? J. Pest Sci. 87, 173–180 (2014).

  49. Kurepa, J., Shull, T. E. & Smalle, J. A. Quercetin feeding protects plants against oxidative stress (version 1; peer review: 1 approved, 1 approved with reservations). F1000Res. 5, 2430 (2016).

  50. Sossountzov, L. et al. Spatial and temporal expression of a maize lipid transfer protein gene. Plant Cell 3, 923–933 (1991).

  51. Suh, M. C. et al. Cuticular lipid composition, surface structure, and gene expression in Arabidopsis stem epidermis. Plant Physiol. 139, 1649–1665 (2005).

  52. DeBono, A. et al. Arabidopsis LTPG is a glycosylphosphatidylinositol-anchored lipid transfer protein required for export of lipids to the plant surface. Plant Cell 21, 1230–1238 (2009).

  53. Yeats, T. H. & Rose, J. K. The biochemistry and biology of extracellular plant lipid‐transfer proteins (LTPs). Protein Sci. 17, 191–198 (2008).

  54. Heath, O. Formative effects of environmental factors as exemplified in the development of the onion plant. Nature 155, 623–626 (1945).

  55. Mita, T. & Shibaoka, H. Changes in microtubules in onion leaf sheath cells during bulb development. Plant Cell Physiol. 24, 109–117 (1983).

  56. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat. Biotechnol. 32, 381–386 (2014).

  57. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with Census. Nat. Methods 14, 309–315 (2017).

  58. Zhang, C. et al. Transcriptome analysis of sucrose metabolism during bulb swelling and development in onion (Allium cepa L.). Front. Plant Sci. 7, 1425 (2016).

  59. Atif, M. J. et al. Mechanism of Allium crops bulb enlargement in response to photoperiod: a review. Int. J. Mol. Sci. 21, 1325 (2020).

  60. Shibaoka, H. Plant hormone-induced changes in the orientation of cortical microtubules: alterations in the cross-linking between microtubules and the plasma membrane. Annu. Rev. Plant Biol. 45, 527–544 (1994).

  61. Zhong, R., Burk, D. H., Morrison, W. H. & Ye, Z. H. A kinesin-like protein is essential for oriented deposition of cellulose microfibrils and cell wall strength. Plant Cell 14, 3101–3117 (2002).

  62. Ganguly, A., Zhu, C., Chen, W. & Dixit, R. FRA1 kinesin modulates the lateral stability of cortical microtubules through cellulose synthase–microtubule uncoupling proteins. Plant Cell 32, 2508–2524 (2020).

  63. Rao, G., Zeng, Y., He, C. & Zhang, J. Characterization and putative post-translational regulation of α- and β-tubulin gene families in Salix arbutifolia. Sci. Rep. 6, 19258 (2016).

  64. Goley, E. D. & Welch, M. D. The ARP2/3 complex: an actin nucleator comes of age. Nat. Rev. Mol. Cell Biol. 7, 713–726 (2006).

  65. Singh, R. et al. Oil palm genome sequence reveals divergence of interfertile species in Old and New worlds. Nature 500, 335–339 (2013).

  66. Paterson, A. H., Bowers, J. E. & Chapman, B. A. Ancient polyploidization predating divergence of the cereals, and its consequences for comparative genomics. Proc. Natl Acad. Sci. USA 101, 9903–9908 (2004).

  67. Tuskan, G. A. et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313, 1596–1604 (2006).

  68. Vanneste, K., Baele, G., Maere, S. & Van de Peer, Y. Analysis of 41 plant genomes supports a wave of successful genome duplications in association with the Cretaceous–Paleogene boundary. Genome Res. 24, 1334–1347 (2014).

  69. D'hont, A. et al. The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature 488, 213–217 (2012).

  70. Putnik, P. et al. An overview of organosulfur compounds from Allium spp.: from processing and preservation to evaluation of their bioavailability, antimicrobial, and anti-inflammatory properties. Food Chem. 276, 680–691 (2019).

  71. Ellmore, G. S. & Feldberg, R. S. Alliin lyase localization in bundle sheaths of the garlic clove (Allium sativum). Am. J. Bot. 81, 89–94 (1994).

  72. Stotz, H. U. et al. Role of camalexin, indole glucosinolates, and side chain modification of glucosinolate‐derived isothiocyanates in defense of Arabidopsis against Sclerotinia sclerotiorum. Plant J. 67, 81–93 (2011).

  73. Hématy, K. et al. Moonlighting function of phytochelatin synthase1 in extracellular defense against fungal pathogens. Plant Physiol. 182, 1920–1932 (2020).

  74. Matern, A. et al. A substrate of the ABC transporter PEN3 stimulates bacterial flagellin (flg22)-induced callose deposition in Arabidopsis thaliana. J. Biol. Chem. 294, 6857–6870 (2019).

  75. Birnbaum, K. et al. A gene expression map of the Arabidopsis root. Science 302, 1956–1960 (2003).

  76. Jean-Baptiste, K. et al. Dynamics of gene expression in single root cells of Arabidopsis thaliana. Plant Cell 31, 993–1011 (2019).

  77. Shulse, C. N. et al. High-throughput single-cell transcriptome profiling of plant cell types. Cell Rep. 27, 2241–2247 (2019).

  78. Olsen, J. L. et al. The genome of the seagrass Zostera marina reveals angiosperm adaptation to the sea. Nature 530, 331–335 (2016).

  79. Shusei, S. et al. The tomato genome sequence provides insights into fleshy fruit evolution. Nature 485, 635–641 (2012).

  80. Project, A. G. et al. The Amborella genome and the evolution of flowering plants. Science 342, 1241089 (2013).

  81. Hyde, P. T., Earle, E. D. & Mutschler, M. A. Doubled haploid onion (Allium cepa L.) lines and their impact on hybrid performance. HortScience 47, 1690–1695 (2012).

  82. Healey, A., Furtado, A., Cooper, T. & Henry, R. J. Protocol: a simple method for extracting next-generation sequencing quality genomic DNA from recalcitrant plant species. Plant Methods 10, 21 (2014).

  83. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  84. Xie, T. et al. De novo plant genome assembly based on chromatin interactions: a case study of Arabidopsis thaliana. Mol. Plant 8, 489–492 (2015).

  85. Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with Hifiasm. Nat. Methods 18, 170–175 (2021).

  86. Durand, N. C. et al. Juicer provides a one-click system for analyzing loop-resolution Hi-C experiments. Cell Syst. 3, 95–98 (2016).

  87. Dudchenko, O. et al. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds. Science 356, 92–95 (2017).

  88. Rhie, A., Walenz, B. P., Koren, S. & Phillippy, A. M. Merqury: reference-free quality, completeness, and phasing assessment for genome assemblies. Genome Biol. 21, 1–27 (2020).

  89. Fajkus, P. et al. Allium telomeres unmasked: the unusual telomeric sequence (CTCGGTTATGGG)n is synthesized by telomerase. Plant J. 85, 337–347 (2016).

  90. Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).

  91. Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).

  92. Tichenor, C. A new software metric to complement function points: the software non-functional assessment process (SNAP). https://apps.dtic.mil/sti/pdfs/ADA592012.pdf (2013).

  93. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).

  94. Birney, E. & Durbin, R. Using GeneWise in the Drosophila annotation experiment. Genome Res. 10, 547–548 (2000).

  95. Haas, B. J. et al. De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512 (2013).

  96. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

  97. Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).

  98. Hunter, S. et al. InterPro: the integrative protein signature database. Nucleic Acids Res. 37, D211–D215 (2009).

  99. Li, L., Stoeckert, C. J. & Roos, D. S. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 13, 2178–2189 (2003).

  100. Edgar, R. C. MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinformatics 5, 113 (2004).

  101. Zhang, C., Rabiee, M., Sayyari, E. & Mirarab, S. ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees. BMC Bioinformatics 19, 15–30 (2018).

  102. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).

  103. Bell, C. D., Soltis, D. E. & Soltis, P. S. The age and diversification of the angiosperms re‐revisited. Am. J. Bot. 97, 1296–1303 (2010).

  104. Xu, Y. et al. Corrigendum #2 to ‘VGSC: a web-based vector graph toolkit of genome synteny and collinearity’. BioMed Res. Int. 2019, 2150291 (2019).

  105. Nei, M. & Gojobori, T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3, 418–426 (1986).

  106. Sun, P. et al. WGDI: a user-friendly toolkit for evolutionary analyses of whole-genome duplications and ancestral karyotypes. Mol. Plant 15, 1841–1851 (2022).

  107. Soltis, P. S. & Soltis, D. E. Ancient WGD events as drivers of key innovations in angiosperms. Curr. Opin. Plant Biol. 30, 159–165 (2016).

  108. Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).

  109. Sayyari, E., Whitfield, J. B. & Mirarab, S. DiscoVista: interpretable visualizations of gene tree discordance. Mol. Phylogenet. Evol. 122, 110–115 (2018).

  110. Zwaenepoel, A. & Van de Peer, Y. wgd—simple command line tools for the analysis of ancient whole-genome duplications. Bioinformatics 35, 2153–2155 (2019).

  111. Tarailo‐Graovac, M. & Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics 25, 1–14 (2009).

  112. Sato, K., Tanaka, T., Shigenobu, S., Motoi, Y. & Itoh, T. Improvement of barley genome annotations by deciphering the Haruna Nijo genome. DNA Res. 23, 21–28 (2016).

  113. Price, A. L., Jones, N. C. & Pevzner, P. A. De novo identification of repeat families in large genomes. Bioinformatics 21, i351–i358 (2005).

  114. Edgar, R. C. & Myers, E. W. PILER: identification and classification of genomic repeats. Bioinformatics 21, i152–i158 (2005).

  115. Xu, Z. & Wang, H. LTR_FINDER: an efficient tool for the prediction of full-length LTR retrotransposons. Nucleic Acids Res. 35, W265–W268 (2007).

  116. Benson, G. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 27, 573–580 (1999).

  117. Wang, H. et al. Identification of antibiotic resistance genes in the multidrug-resistant Acinetobacter baumannii strain, MDR-SHH02, using whole-genome sequencing. Int. J. Mol. Med. 39, 364–372 (2017).

  118. Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).

  119. Mistry, J., Finn, R. D., Eddy, S. R., Bateman, A. & Punta, M. Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res. 41, e121 (2013).

  120. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

  121. Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490 (2010).

  122. Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).

  123. Hao, Y. et al. Integrated analysis of multimodal single-cell data. Cell 184, 3573–3587 (2021).

  124. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).

  125. Carmona-Saez, P., Chagoyen, M., Tirado, F., Carazo, J. & Pascual-Montano, A. GENECODIS: a web-based tool for finding significant concurrent annotations in gene lists. Genome Biol. 8, R3 (2007).

  126. Huang, T. et al. Deciphering the effects of gene deletion on yeast longevity using network and machine learning approaches. Biochimie 94, 1017–1025 (2012).

  127. Chen, L., Li, B.-Q. & Feng, K.-Y. Predicting biological functions of protein complexes using graphic and functional features. Curr. Bioinform. 8, 545–551 (2013).

  128. Chen, M. et al. Genome warehouse: a public repository housing genome-scale data. Genom. Proteom. Bioinform. 19, 584–589 (2021).

  129. CNCB-NGDC Members and Partners. Database resources of the National Genomics Data Center, China National Center for Bioinformation in 2023. Nucleic Acids Res. 51, D18–D28 (2023).

Acknowledgements

We thank M. J. Havey from the Department of Horticulture, University of Wisconsin (Madison, WI, USA) for providing the onion seeds of the doubled haploid line and L. Cui from the Liaoning Academy of Agricultural Sciences (Shenyang, Liaoning, China) for providing the inbred lines of Welsh onion. We thank Y. Gu, L. Chen, Q. Lin, L. Chen, B. Mu, H. Sun, X. Wei, J. Li, S. Li, H. Lu and S. Zhang for general technical assistance or discussion. We thank G. Zhang from Zhejiang University and K. Wang from Northwestern Polytechnical University (NWPU) for their comments and suggestions when preparing our manuscript. We thank Y. Zeng from BGI-Shenzhen for sharing his idea on Allium genome study. This work was supported by the Thousand Talents Plan (5113190037), the Talents Team Construction Fund of NWPU and the Fundamental Research Funds for the Central Universities (3102019JC007) to J.C.; the Talents Team Construction Fund of NWPU and the Projects of Interdisciplinary of NWPU (0202022GH0306) to W.W.; National Natural Science Foundation of China (21801206), the Joint Research Funds of Department of Science & Technology of Shaanxi Province and NWPU (2020GXLH-Z-015) to Z.R.; Guangdong Provincial Key Laboratory of Genome Read and Write (2017B030301011) to X.X.

Author information

Authors and Affiliations

Authors

Contributions

F.H. and J.C. managed the project. J.C., W.W., R.M., H.Y. and X.X. conceived the study. F.H. and J.C. wrote the manuscript with contributions from all other authors. B.Z. and H.Z. were responsible for genome assembly into contigs and Hi-C scaffolding. X.L. handled genome annotation and analysis of phylogeny, transcriptome, gene family evolution, allicin/isoallicin pathway and immune system. B.Z., Z.T., X.L. and Z.L. were responsible for the analysis of whole-genome synteny and whole-genome duplication (WGD). Z.T. and P.Z. handled repeat annotation and conducted analysis of LTR retrotransposons. L.Z., J.H., J.Q., Q.L., Y.Z. and K.W. were responsible for in situ hybridization experiments. L.Z. focused on studying the evolution and formation of tunicated bulbs. H.Z. was responsible for genome evaluation and conducted ultra-high-performance liquid chromatography/tandem mass spectrometry (UHPLC-MS/MS) analysis. H.Z., K.X., X.G., L.L., W.S., B.Z., S.L. and L.P. conducted spatial RNA sequencing and analyzed spatial transcriptome data. Y.P., W.Z., F.L., Z.R. and J.M. were in charge of plant material collection and DNA/RNA preparation. H.Y., L.H. and W.C. supervised genome sequencing and library construction and also collected plant material for bulb development.

Corresponding authors

Correspondence to Xun Xu, Hui Yang, Richard C. Macknight, Wen Wang or Jing Cai.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Genetics thanks Zhangjun Fei and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Genome overview and phylogenetic analyses of Allium species and African lily, related to Fig. 1.

a, Overview of genomes of Allium species and African lily: track a corresponds to chromosome length. From b to d, three rings represent density distribution of different genomic features, including coding gene, GC, and repeat sequences, respectively. Track e corresponds to syntenic blocks. b, Species tree by summarizing each gene tree with Astral. c, Phylogenies from three supergene matrices, each of which was constructed by concatenating ortholog gene sets defined by reciprocal best hit (RBH) by BLAST, single-copy genes (SCG) identified using OrthoFinder and single-copy genes (SCG) identified using OrthoMCL, respectively. d, Single-copy genes identified using OrthoMCL. e, Single-copy genes identified using OrthoFinder. f, Relative evolutionary rate comparison among three Allium species and African lily.

Extended Data Fig. 2 Analysis of repetitive sequences in three Allium species and African lily.

a, Overall composition of repetitive elements in different genomes. DNA, DNA element; LINE, long interspersed nuclear element; LTR, long terminal repeat transposable element; SINE, short interspersed nuclear element. Divergence distribution of Copia (b) and Gypsy (c) retrotransposons in four assembled genomes. X-axis represents divergence measured in percentage of sequence differences with consensus in TE library. d, Phylogenetic relationships of Gypsy and Copia retrotransposon domains across genomes of asparagus, rice, and four species sequenced in this study (Supplementary Data 3 for alignments of sequences and a version with support values on branches).

Extended Data Fig. 3 Evolution of gene families, related to Fig. 3.

a, Pie diagram on each branch of tree represents the proportion of gene families undergoing gain (green) or loss (red) events. Numbers below pie diagram denote total number of expanded and contracted gene families. b, Distribution of single-copy, multiple-copy, unique, and other orthologs in 10 plant species, and three columns on right show number of genes in families, family number, and average genes per family in 10 plant species. c, Point-line and Venn diagrams represent shared and unique gene families among 10 species or four closely related Amaryllidaceae species (Welsh onion, garlic, onion, and African lily). Each number represents number of gene families. d, Significant GO terms of molecular function, biological process, and cellular component enriched in expanded gene families in ancestral branch of three Allium plants. e, Significant KEGG pathways enriched in expanded gene families in ancestral branch of three Allium plants. f, Network diagram of significant KEGG pathways enriched in expanded gene families in ancestral branch of three Allium plants. GO terms or KEGG pathways probably related to alliinase, plant immune system, and bulb development discussed in this study are labeled with red arrows. The p values (unadjusted, one-sided) presented in d and e were calculated using the hypergeometric test.
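
As an illustration of the one-sided hypergeometric test mentioned in this legend, the following is a minimal sketch only; the legend does not state which software the authors used, and the function name, argument names and counts below are hypothetical. It computes the probability of seeing at least k annotated genes among n genes of interest when K of the N background genes carry the annotation.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """One-sided (upper-tail) hypergeometric p value, P(X >= k).

    Hypothetical argument names, not taken from the article:
      N: number of genes in the background set
      K: number of background genes annotated with the term (GO or KEGG)
      n: number of genes of interest (e.g. genes in expanded families)
      k: number of genes of interest annotated with the term
    """
    upper = min(n, K)  # X cannot exceed either the sample size or K
    total = comb(N, n)  # number of ways to draw n genes from the background
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Toy counts chosen for illustration only (not data from the study).
p_value = hypergeom_enrichment_p(k=2, n=3, K=4, N=10)
print(p_value)
```

The p value is left unadjusted here, matching the wording of the legend; a multiple-testing correction would be a separate step.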

Extended Data Fig. 4 Clustering of alliinase gene family in Allium and outgroup species, related to Fig. 3.

Phylogenetic analysis of alliinase gene family, with alliinase genes forming 15 distinct groups (A–O). Bootstrap values are shown on each branch. For the alliinase gene family, we constructed a gene tree using standard maximum-likelihood phylogenetic analysis implemented in IQ-TREE with default parameters and 1,000 bootstraps.

Extended Data Fig. 5 Clustering of lachrymatory factor synthase gene family (LFS) in Allium, related to Fig. 3.

Phylogenetic analysis of LFS gene family, with LFS genes forming eight distinct groups (A–H). Bootstrap values are shown on each branch. For the LFS gene family, we constructed a gene tree using standard maximum-likelihood phylogenetic analysis implemented in IQ-TREE with default parameters and 1,000 bootstraps.

Extended Data Fig. 6 Location of genes associated with allicin/isoallicin biosynthesis on chromosomes.

Localization of genes related to allicin/isoallicin biosynthetic pathways on chromosomes of a, Welsh onion, b, garlic, and c, onion.

Extended Data Fig. 7 Gene expression changes upon pathogen infection in onion.

a, Relative expression levels of alliin biosynthesis genes in transcriptome of fungal-infected blades. b, Significant KEGG pathways enriched in up-regulated genes in response to fungal infection. Comparing gene expression profiles between leaf blades of fungal-infected and healthy plants, we identified 1,035 up-regulated genes during infection. c, Relative expression levels of alliin biosynthesis genes after artificial injury and exposure to bacterial culture by qRT-PCR. d and e, Relative expression levels of homologous genes of alliin biosynthesis genes in Arabidopsis thaliana. TPM: transcripts per kilobase of exon model per million mapped reads. Transcriptome data were downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE142747. Four-week-old Arabidopsis thaliana plant leaves were infiltrated with sterile water (Mock) or different Pst strains, then harvested at 3 or 6 h after infiltration. Two bacterial strains were used: P. s. pv. tomato (Pst) DC3000 (avrRpt2), which activates RPS2 (resistance to P. syringae 2)-dependent effector-triggered immunity (ETI) and pattern-triggered immunity (PTI), and Pst DC3000 without the ‘avirulent’ effector, which activates PTI only in wild-type plants. Data presented in a and c–e are mean ± standard error of three or four independent experiments, and bars with p values were significantly different based on t test (two-sided). The p values (unadjusted, one-sided) presented in b were calculated using the hypergeometric test.

Extended Data Fig. 8 Cell marker genes in different clusters, related to Fig. 4.

a, Dot plot showing expression profiles of marker genes in all 14 cell clusters. ID of cell marker genes in onion are below x-axis. Expression level for each bin was calculated by scaled number of molecular identifiers (MIDs) for each marker gene. Average expression (AE) level of all bins in each cell cluster is denoted by dot color. Percentage of bins expressing marker gene (PE) in each cell cluster is denoted by dot size. b, Spatial expression patterns of cell marker genes from in situ hybridization (left) and spatial RNA-seq (right), showing that in situ hybridization and Stereo-seq data were highly consistent and supported cell categorization. Scale bars equal to 1 mm for in situ hybridization, 800 μm for spatial RNA-seq. Number in parentheses represents number of samples with same signal pattern over total number of samples.

Extended Data Fig. 9 Spatial expression patterns of genes related to synthesis of flavonoid compounds and cuticular wax, related to Fig. 4e,f.

a, Spatial expression patterns of genes (CHS, CHIL, CHI, F3H, F3'H, LTP2 and LTP3) in different onion bulb development stages based on Stereo-seq data. Co-expression of genes in epidermal cells was observed in all samples, showing that flavonoid and cuticular wax biosynthesis is mainly active in epidermal cells. b, Expression pattern of LTP in leaf base of onion by RNA in situ hybridization. Lower panels are images with reverse colors produced by ImageJ based on images in upper panels. Expression patterns based on in situ hybridization confirmed LTP results in Fig. 4f from large-scale Stereo-seq analysis. Scale bars equal to 800 μm for a, 1 mm for b. Number in parentheses represents number of samples with same signal pattern over total number of samples.

Extended Data Fig. 10 Spongy mesophyll cell lineages and onion bulb formation, related to Figs. 4g,5.

a, Visualization of spongy mesophyll cell lineages, including six sections at three development stages. Sections from different stages and individuals show similar patterns of spongy mesophyll cell development: spongy mesophyll cells from inner to outer layer, and for each layer from base to top and from outer to inner, represent early to late points along the expansion process. Progression of numbers from small to large indicates pseudotime sequence, reflecting progression of spongy mesophyll cell expansion. Same color pattern is also used in panel b. b and c, UMAP dimensionality reduction projection of spongy mesophyll cells grouped by pseudotime scores and sections, respectively. General pattern of pseudotime increasing from left to right can be seen in panel b, but no clear pattern can be seen in c, suggesting that expansion of spongy mesophyll cell development is a fundamentally short process, starting asynchronously at different development stages as indicated in panel a. d and e, Spatial visualization of expression of indicated genes related to onion bulb formation. Scale bars equal to 800 μm for a, d and e.

Supplementary information

Supplementary Information

Supplementary Results 1–6, Supplementary Methods and Supplementary Figs. 1–20.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–54.

Supplementary Data 1

Sequence alignment for the analysis of positive selection of genes in the three Allium species.

Supplementary Data 2

Sequence alignment of alliinase gene family and LFS gene family.

Supplementary Data 3

Sequence alignment of phylogenetic relationships of Gypsy and Copia retrotransposon domains across genomes of asparagus, rice and four species sequenced in this study, and a version with support values on branches.

Supplementary Data 4

Sequence alignment of positively selected or rapidly evolving genes involved in bulb formation in onion or garlic.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Hao, F., Liu, X., Zhou, B. et al. Chromosome-level genomes of three key Allium crops and their trait evolution. Nat Genet 55, 1976–1986 (2023). https://doi.org/10.1038/s41588-023-01546-0


  • DOI: https://doi.org/10.1038/s41588-023-01546-0
