Genome sequencing reveals the genetic architecture of heterostyly and domestication history of common buckwheat

Abstract

Common buckwheat, Fagopyrum esculentum, is an orphan crop domesticated in southwest China that exhibits heterostylous self-incompatibility. Here we present chromosome-scale assemblies of a self-compatible F. esculentum accession and a self-compatible wild relative, Fagopyrum homotropicum, together with the resequencing of 104 wild and cultivated F. esculentum accessions. Using these genomic data, we report the roles of transposable elements and whole-genome duplications in the evolution of Fagopyrum. In addition, we show that (1) the breakdown of heterostyly occurs through the disruption of a hemizygous gene jointly regulating the style length and female compatibility and (2) southeast Tibet was involved in common buckwheat domestication. Moreover, we obtained mutants conferring the waxy phenotype for the first time in buckwheat. These findings demonstrate the utility of our F. esculentum assembly as a reference genome and promise to accelerate buckwheat research and breeding.

Fig. 1: Sequencing and assembly of F. esculentum PL4.
Fig. 2: Genome structure of F. esculentum PL4.
Fig. 3: Development of a waxy buckwheat.
Fig. 4: Genetic architecture of the F. esculentum mating system.
Fig. 5: Population structure of cultivated and wild common buckwheat.

Data availability

The final genome assemblies, annotations, RNA sequences and raw genome sequence data generated in this study have all been deposited in the DNA Data Bank of Japan (DDBJ) database under BioProjects PRJDB15031 (assembly of F. esculentum PL4: BSUD01000001–BSUD01003041, assembly of F. esculentum GS1: BSUE01000001–BSUE01042256, raw reads of various F. esculentum accessions: DRR438014–DRR438137 and DRR477348–DRR477353) and PRJDB15175 (assembly of F. homotropicum: BSWB01000001–BSWB01000436, raw reads of F. homotropicum: DRR438312). The final genome assemblies and annotations, as well as the initial scaffolds and contigs used to construct the final assemblies, are also available from the Buckwheat Genome DataBase (BGDB) (http://buckwheat.kazusa.or.jp). Publicly available datasets from the following databases and websites were used in this study: the NR database of NCBI, UniProtKB (https://www.uniprot.org), PFAM, Phytozome (https://phytozome-next.jgi.doe.gov), TAIR (https://www.arabidopsis.org), MBKbase (https://www.mbkbase.org) and CARNIVOROM. Source data are provided with this paper.

References

  1. Ohnishi, O. Discovery of the wild ancestor of common buckwheat. Fagopyrum 11, 5–10 (1991).

  2. Ohnishi, O. & Matsuoka, Y. Search for the wild ancestor of buckwheat II. Taxonomy of Fagopyrum (Polygonaceae) species based on morphology, isozymes and cpDNA variability. Genes Genet. Syst. 71, 383–390 (1996).

  3. Weisskopf, A. & Fuller, D. Q. in Encyclopedia of Global Archaeology (ed. Smith, C.) 1025–1028 (Springer, 2014).

  4. Hunt, H. V., Shang, X. & Jones, M. K. Buckwheat: a crop from outside the major Chinese domestication centres? A review of the archaeobotanical, palynological and genetic evidence. Veg. Hist. Archaeobot. 27, 493–506 (2018).

  5. Yasui, Y. et al. Assembly of the draft genome of buckwheat and its applications in identifying agronomically useful genes. DNA Res. 23, 215–224 (2016).

  6. Penin, A. A. et al. High-resolution transcriptome atlas and improved genome assembly of common buckwheat, Fagopyrum esculentum. Front. Plant Sci. 12, 612382 (2021).

  7. Zhang, L. et al. The tartary buckwheat genome provides insights into rutin biosynthesis and abiotic stress tolerance. Mol. Plant 10, 1224–1237 (2017).

  8. Matsui, K. & Yasui, Y. Genetic and genomic research for the development of an efficient breeding system in heterostylous self-incompatible common buckwheat (Fagopyrum esculentum). Theor. Appl. Genet. 133, 1641–1653 (2020).

  9. Matsui, K. & Yasui, Y. Buckwheat heteromorphic self-incompatibility: genetics, genomics and application to breeding. Breed. Sci. 70, 32–38 (2020).

  10. Darwin, C. The Different Forms of Flowers on Plants of the Same Species (John Murray, 1897).

  11. Yasui, Y. et al. S-LOCUS EARLY FLOWERING 3 is exclusively present in the genomes of short-styled buckwheat plants that exhibit heteromorphic self-incompatibility. PLoS ONE 7, e31264 (2012).

  12. Matsui, K., Tetsuka, T., Nishio, T. & Hara, T. Heteromorphic incompatibility retained in self-compatible plants produced by a cross between common and wild buckwheat. New Phytol. 159, 701–708 (2003).

  13. Tsai, H. et al. Discovery of rare mutations in populations: TILLING by sequencing. Plant Physiol. 156, 1257–1268 (2011).

  14. Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).

  15. Ou, S., Chen, J. & Jiang, N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 46, e126 (2018).

  16. Yabe, S. et al. Rapid genotyping with DNA micro-arrays for high-density linkage mapping and QTL mapping in common buckwheat (Fagopyrum esculentum Moench). Breed. Sci. 64, 291–299 (2014).

  17. Yang, Y. et al. Improved transcriptome sampling pinpoints 26 ancient and more recent polyploidy events in Caryophyllales, including two allopolyploidy events. New Phytol. 217, 855–870 (2018).

  18. Verde, I. et al. The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nat. Genet. 45, 487–494 (2013).

  19. Jaillon, O. et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449, 463–467 (2007).

  20. Dohm, J. C. et al. The genome of the recently domesticated crop plant sugar beet (Beta vulgaris). Nature 505, 546–549 (2014).

  21. Fawcett, J. A., Maere, S. & Van de Peer, Y. Plants with double genomes might have had a better chance to survive the Cretaceous–Tertiary extinction event. Proc. Natl Acad. Sci. USA 106, 5737–5742 (2009).

  22. Vanneste, K., Baele, G., Maere, S. & Van de Peer, Y. Analysis of 41 plant genomes supports a wave of successful genome duplications in association with the Cretaceous–Paleogene boundary. Genome Res. 24, 1334–1347 (2014).

  23. Kreft, I. et al. Breeding buckwheat for nutritional quality. Breed. Sci. 70, 67–73 (2020).

  24. Tanaka, K. et al. Pepsin-resistant 16-kD buckwheat protein is associated with immediate hypersensitivity reaction in patients with buckwheat allergy. Int. Arch. Allergy Immunol. 129, 49–56 (2002).

  25. Satoh, R., Koyano, S., Takagi, K., Nakamura, R. & Teshima, R. Identification of an IgE-binding epitope of a major buckwheat allergen, BWp16, by SPOTs assay and mimotope screening. Int. Arch. Allergy Immunol. 153, 133–140 (2010).

  26. Barrett, S. C. ‘A most complex marriage arrangement’: recent advances on heterostyly and unresolved questions. New Phytol. 224, 1051–1067 (2019).

  27. Pamela, V. & Dowrick, J. Heterostyly and homostyly in Primula obconica. Heredity 10, 219–236 (1956).

  28. Li, J. et al. Genetic architecture and evolution of the S locus supergene in Primula vulgaris. Nat. Plants 2, 16188 (2016).

  29. Huu, C. N. et al. Presence versus absence of CYP734A50 underlies the style-length dimorphism in primroses. eLife 5, e17956 (2016).

  30. Cocker, J. M. et al. Primula vulgaris (primrose) genome assembly, annotation and gene expression, with comparative genomics on the heterostyly supergene. Sci. Rep. 8, 17942 (2018).

  31. Shore, J. S. et al. The long and short of the S-locus in Turnera (Passifloraceae). New Phytol. 224, 1316–1329 (2019).

  32. Matzke, C. M. et al. Pistil mating type and morphology are mediated by the brassinosteroid inactivating activity of the S-locus gene BAHD in heterostylous Turnera species. Int. J. Mol. Sci. 22, 10603 (2021).

  33. Gutiérrez-Valencia, J. et al. Genomic analyses of the Linum distyly supergene reveal convergent evolution at the molecular level. Curr. Biol. 32, 4360–4371 (2022).

  34. Matzke, C. M., Shore, J. S., Neff, M. M. & McCubbin, A. G. The Turnera style S-locus gene TsBAHD possesses brassinosteroid-inactivating activity when expressed in Arabidopsis thaliana. Plants 9, 1566 (2020).

  35. Huu, C. N., Plaschil, S., Himmelbach, A., Kappel, C. & Lenhard, M. Female self-incompatibility type in heterostylous Primula is determined by the brassinosteroid-inactivating cytochrome P450 CYP734A50. Curr. Biol. 32, 671–676 (2022).

  36. Huu, C. N., Keller, B., Conti, E., Kappel, C. & Lenhard, M. Supergene evolution via stepwise duplications and neofunctionalization of a floral-organ identity gene. Proc. Natl Acad. Sci. USA 117, 23148–23157 (2020).

  37. Potente, G. et al. Comparative genomics elucidates the origin of a supergene controlling floral heteromorphism. Mol. Biol. Evol. 39, msac035 (2022).

  38. Konishi, T., Yasui, Y. & Ohnishi, O. Original birthplace of cultivated common buckwheat inferred from genetic relationships among cultivated populations and natural populations of wild common buckwheat revealed by AFLP analysis. Genes Genet. Syst. 80, 113–119 (2005).

  39. Konishi, T. & Ohnishi, O. Close genetic relationship between cultivated and natural populations of common buckwheat in the Sanjiang area is not due to recent gene flow between them—an analysis using microsatellite markers. Genes Genet. Syst. 82, 53–64 (2007).

  40. Ohnishi, O. On the origin of cultivated common buckwheat based on allozyme analyses of cultivated and wild populations of common buckwheat. Fagopyrum 26, 3–9 (2009).

  41. Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).

  42. Novembre, J. & Stephens, M. Interpreting principal component analyses of spatial population genetic variation. Nat. Genet. 40, 646–649 (2008).

  43. Alexander, D., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).

  44. Saitou, N. & Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987).

  45. Yi, X. et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329, 75–78 (2010).

  46. Krzyzanska, M., Hunt, H. V., Crema, E. R. & Jones, M. K. Modelling the potential ecological niche of domesticated buckwheat in China: archaeological evidence, environmental constraints and climate change. Veg. Hist. Archaeobot. 31, 331–345 (2022).

  47. Jones, M. & Brown, T. in Rethinking Agriculture: Archaeological and Ethnoarchaeological Perspectives 1st edn (eds Denham, T. P. et al.) 36–49 (Routledge, 2007).

  48. Tanno, K. & Willcox, G. How fast was wild wheat domesticated? Science 311, 1886 (2006).

  49. Brown, T. A., Jones, M. K., Powell, W. & Allaby, R. G. The complex origins of domesticated crops in the Fertile Crescent. Trends Ecol. Evol. 24, 103–109 (2009).

  50. Meyer, R. S. et al. Domestication history and geographical adaptation inferred from a SNP map of African rice. Nat. Genet. 48, 1083–1088 (2016).

  51. Zhou, Y., Massonnet, M., Sanjak, J. S., Cantu, D. & Gaut, B. S. Evolutionary genomics of grape (Vitis vinifera ssp. vinifera) domestication. Proc. Natl Acad. Sci. USA 114, 11715–11720 (2017).

  52. Qian, L.-S., Chen, J.-H., Deng, T. & Sun, H. Plant diversity in Yunnan: current status and future directions. Plant Divers. 42, 281–291 (2020).

  53. Matsui, K., Tetsuka, T., Hara, T. & Morishita, T. Breeding and characterization of a new self-compatible common buckwheat [Fagopyrum esculentum] parental line, ‘Buckwheat Norin-PL1’. Bull. Natl Agric. Res. Cent. Kyushu Okinawa Reg. 49, 11–17 (2008).

  54. Yabe, S. et al. Potential of genomic selection in mass selection breeding of an allogamous crop: an empirical study to increase yield of common buckwheat. Front. Plant Sci. 9, 276 (2018).

  55. Tomiyoshi, M., Yasui, Y., Ohsako, T., Li, C.-Y. & Ohnishi, O. Phylogenetic analysis of AGAMOUS sequences reveals the origin of the diploid and tetraploid forms of self-pollinating wild buckwheat, Fagopyrum homotropicum Ohnishi. Breed. Sci. 62, 241–247 (2012).

  56. Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).

  57. Gonda, I. et al. The genome sequence of tetraploid sweet basil, Ocimum basilicum L., provides tools for advanced genome editing and molecular breeding. DNA Res. 27, dsaa027 (2020).

  58. Avni, R. et al. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication. Science 357, 93–97 (2017).

  59. Luo, M.-C. et al. Genome sequence of the progenitor of the wheat D genome Aegilops tauschii. Nature 551, 498–502 (2017).

  60. Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).

  61. Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).

  62. Takeshima, R., Ogiso-Tanaka, E., Yasui, Y. & Matsui, K. Targeted amplicon sequencing + next-generation sequencing-based bulked segregant analysis identified genetic loci associated with preharvest sprouting tolerance in common buckwheat (Fagopyrum esculentum). BMC Plant Biol. 21, 18 (2021).

  63. Ogiso-Tanaka, E., Shimizu, T., Hajika, M., Kaga, A. & Ishimoto, M. Highly multiplexed AmpliSeq technology identifies novel variation of flowering time-related genes in soybean (Glycine max). DNA Res. 26, 243–260 (2019).

  64. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).

  65. Iwata, H. & Ninomiya, S. AntMap: constructing genetic linkage maps using an ant colony optimization algorithm. Breed. Sci. 56, 371–377 (2006).

  66. Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).

  67. Jain, C., Koren, S., Dilthey, A., Phillippy, A. M. & Aluru, S. A fast adaptive algorithm for computing whole-genome homology maps. Bioinformatics 34, i748–i756 (2018).

  68. Marçais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).

  69. Kikuchi, S., Matsui, K., Tanaka, H., Ohnishi, O. & Tsujimoto, H. Chromosome evolution among seven Fagopyrum species revealed by fluorescence in situ hybridization (FISH) probed with rDNAs. Chromosome Sci. 11, 37–43 (2008).

  70. Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).

  71. Steinbiss, S., Willhoeft, U., Gremme, G. & Kurtz, S. Fine-grained annotation and classification of de novo predicted LTR retrotransposons. Nucleic Acids Res. 37, 7002–7013 (2009).

  72. Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).

  73. Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).

  74. Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).

  75. Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).

  76. Neumann, P., Novák, P., Hoštáková, N. & Macas, J. Systematic survey of plant LTR-retrotransposons elucidates phylogenetic relationships of their polyprotein domains and provides a reference for element classification. Mob. DNA 10, 1 (2019).

  77. Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).

  78. Hoang, D. T., Chernomor, O., Von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).

  79. Roberts, A. & Pachter, L. Streaming fragment assignment for real-time analysis of sequencing experiments. Nat. Methods 10, 71–73 (2013).

  80. Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).

  81. Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).

  82. Pertea, M. et al. Stringtie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).

  83. Brůna, T., Hoff, K. J., Lomsadze, A., Stanke, M. & Borodovsky, M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genom. Bioinform. 3, lqaa108 (2021).

  84. Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).

  85. Mistry, J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2021).

  86. Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).

  87. Wu, T. D. & Watanabe, C. K. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21, 1859–1875 (2005).

  88. Rifkin, J. L. et al. Recombination landscape dimorphism and sex chromosome evolution in the dioecious plant Rumex hastatulus. Philos. Trans. R. Soc. Lond. B 377, 20210226 (2022).

  89. Palfalvi, G. et al. Genomes of the Venus flytrap and close relatives unveil the roots of plant carnivory. Curr. Biol. 30, 2312–2320 (2020).

  90. Gilman, I. S., Moreno-Villena, J. J., Lewis, Z. R., Goolsby, E. W. & Edwards, E. J. Gene co-expression reveals the modularity and integration of C4 and CAM in Portulaca. Plant Physiol. 189, 735–753 (2022).

  91. McGrath, J. M. M. et al. A contiguous de novo genome assembly of sugar beet EL10 (Beta vulgaris L.). DNA Res. 30, dsac033 (2022).

  92. Jarvis, D. E. et al. The genome of Chenopodium quinoa. Nature 542, 307–312 (2017).

  93. Lightfoot, D. et al. Single-molecule sequencing and Hi-C-based proximity-guided assembly of amaranth (Amaranthus hypochondriacus) chromosomes provide insights into genome evolution. BMC Biol. 15, 74 (2017).

  94. Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49 (2012).

  95. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).

  96. Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).

  97. Yao, G. et al. Plastid phylogenomic insights into the evolution of Caryophyllales. Mol. Phylogenet. Evol. 134, 74–86 (2019).

  98. Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).

  99. Bouckaert, R. et al. BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 15, e1006650 (2019).

  100. Ng, K. K. S. et al. The genome of Shorea leprosula (Dipterocarpaceae) highlights the ecological relevance of drought in aseasonal tropical rainforests. Commun. Biol. 4, 1166 (2021).

  101. Drummond, A. J., Ho, S. Y. W., Phillips, M. J. & Rambaut, A. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4, e88 (2006).

  102. Le, S. Q. & Gascuel, O. An improved general amino acid replacement matrix. Mol. Biol. Evol. 25, 1307–1320 (2008).

  103. Yang, Z. Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol. Evol. 11, 367–372 (1996).

  104. Yule, G. U. A mathematical theory of evolution, based on the conclusions of Dr. J. C. Willis, F.R.S. Philos. Trans. R. Soc. B 213, 21–87 (1925).

  105. Magallón, S., Gómez-Acevedo, S., Sánchez-Reyes, L. L. & Hernández-Hernández, T. A metacalibrated time-tree documents the early rise of flowering plant phylogenetic diversity. New Phytol. 207, 437–453 (2015).

  106. Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).

  107. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  108. Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).

  109. Koboldt, D. C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).

  110. Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).

  111. Nakamura, T., Yamamori, M., Hirano, H., Hidaka, S. & Nagamine, T. Production of waxy (amylose-free) wheats. Mol. Gen. Genet. 248, 253–259 (1995).

  112. Chang, S., Puryear, J. & Cairney, J. A simple and efficient method for isolating RNA from pine trees. Plant Mol. Biol. Rep. 11, 113–116 (1993).

  113. Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 7 (2015).

  114. Tajima, F. Evolutionary relationship of DNA sequences in finite populations. Genetics 105, 437–460 (1983).

  115. Korunes, K. L. & Samuk, K. pixy: unbiased estimation of nucleotide diversity and divergence in the presence of missing data. Mol. Ecol. Resour. 21, 1359–1368 (2021).

  116. Weir, B. S. & Cockerham, C. C. Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370 (1984).

  117. Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007).

  118. Hill, W. & Robertson, A. Linkage disequilibrium in finite populations. Theor. Appl. Genet. 38, 226–231 (1968).

  119. Nagano, M., Aii, J., Campbell, C., Kawasaki, S. & Adachi, T. Genome size analysis of the genus Fagopyrum. Fagopyrum 17, 35–39 (2000).

Acknowledgements

This work was supported by KAKENHI (grants 20K06761 and 21H00356 to J.A.F., 22H05172 and 22H05181 to K.S., and 18KK0172 to Y.Y.); ACT-X ‘Environments and Biotechnology’ from the Japan Science and Technology Agency (JST) (grant JPMJAX20BA to R.T.); the Cabinet Office, Government of Japan, Moonshot R&D Program for Agriculture, Forestry and Fisheries to Y.Y.; the research programme on development of innovative technology from the Project of the Bio-oriented Technology Research Advancement Institution (BRAIN) (grant JPJ007097 to T.H.); and the Leverhulme Trust (grant RPG-2017-196 to M.K.J.). We thank S. Wright for helpful discussions and for providing genomic data of R. hastatulus, K. L. Farquharson for language-editing support of the manuscript, and RIKEN HOKUSAI and the National Institute of Genetics for providing computational resources.

Author information

Contributions

J.A.F., R.T., S.K., T.K.-T., M.K.J., H.H., T. Ota and Y.Y. wrote the paper. R.T., K.M., E.O.-T., T.H. and Y.Y. prepared materials for genome assembly. J.A.F., Y.D., C.L., M.L., H.V.H., M.K.J., D.L.L., T. Ohsako and Y.Y. prepared materials for whole-genome sequencing used for population genetic analyses. N.M., K.N., T.N., H.S., M.U. and Y.Y. prepared materials for mutagenesis. J.A.F., E.Y., N.T., K.S., H.H., T. Ota and Y.Y. performed computational data analyses. S.K., E.O.-T., K.F., T.H., K.M., N.M., H.S., M.U., D.M., M.N., K.S. and Y.Y. performed experiments. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Jeffrey A. Fawcett, Chengyun Li, Hideki Hirakawa, Tatsuya Ota or Yasuo Yasui.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Plants thanks the anonymous reviewers for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Comparison of F. esculentum PL4 pseudomolecules with linkage map.

Left grey bars indicate the PL4 pseudomolecules and right grey bars indicate linkage maps from Yabe et al.16. Positions of the array markers developed in that study are shown by black lines. Markers whose order matches between the pseudomolecules and the linkage map are connected by blue lines, and those whose order does not match are connected by red lines. The thickness of the lines is proportional to the number of markers. The marker on P1_3 (FE140468) that is anchored to Chr1 is connected by a red dotted line. The map position of the S locus, which contains the S-ELF3 gene, is 84.2 cM on the linkage map of Yabe et al.16. Sh denotes the locus containing the homologue of S-ELF3, which has a nonsense mutation in F. esculentum PL4 (see also Supplementary Fig. 30). We note that the notations of P1_8.1 and P1_8.2 are incorrectly interchanged in Figure 2 of Yabe et al.16.

Extended Data Fig. 2 Nucleotide divergence between F. esculentum PL4 and F. homotropicum across the genome.

Nucleotide divergence, that is, the average number of differences per site, was calculated across a sliding window of 2 Mb with a step of 400 kb based on the results of MUMmer. Regions in F. esculentum PL4 that were masked by RepeatMasker using the TE library constructed for F. esculentum PL4 were excluded. Windows with < 10,000 aligned sites were not plotted. Regions with a divergence of < 0.001, which are likely to be regions in F. esculentum PL4 that are derived from F. homotropicum, are shown in grey and correspond to the regions indicated in Fig. 2a.
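
As a rough illustration of the windowing described in this legend, the following Python sketch computes divergence in 2 Mb windows with a 400 kb step, leaves out windows with fewer than 10,000 aligned sites, and flags windows below 0.001. The input format (per-block counts of aligned sites and differences, assumed to be already filtered for masked repeats), the assignment of each block to windows by its midpoint, and all function names are assumptions made for this sketch; it does not reproduce the MUMmer-based pipeline used in the study.

```python
from typing import List, Optional, Tuple

WINDOW = 2_000_000         # window size stated in the legend (2 Mb)
STEP = 400_000             # step size stated in the legend (400 kb)
MIN_SITES = 10_000         # windows with fewer aligned sites are not plotted
LOW_DIVERGENCE = 0.001     # windows below this value are shaded grey

# Hypothetical input: (start, end, aligned_sites, differences) per aligned block,
# with repeat-masked regions assumed to be removed beforehand.
Block = Tuple[int, int, int, int]


def sliding_divergence(
    blocks: List[Block],
    chrom_len: int,
    window: int = WINDOW,
    step: int = STEP,
    min_sites: int = MIN_SITES,
) -> List[Tuple[int, int, Optional[float]]]:
    """Return (window_start, window_end, divergence) for each window.

    Divergence is differences / aligned sites, or None when the window
    holds fewer than `min_sites` aligned sites.
    """
    results = []
    last_start = max(chrom_len - window, 0)
    for w_start in range(0, last_start + 1, step):
        w_end = w_start + window
        sites = 0
        diffs = 0
        for start, end, n_sites, n_diffs in blocks:
            midpoint = (start + end) // 2  # simplification: assign by midpoint
            if w_start <= midpoint < w_end:
                sites += n_sites
                diffs += n_diffs
        divergence = diffs / sites if sites >= min_sites else None
        results.append((w_start, w_end, divergence))
    return results


def is_low_divergence(divergence: Optional[float]) -> bool:
    """True for windows that would be shaded grey under the legend's cutoff."""
    return divergence is not None and divergence < LOW_DIVERGENCE
```

Because the windows overlap (each position is covered by up to five windows), a single block of near-identical sequence contributes to several adjacent windows, which smooths the plotted profile.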

Extended Data Fig. 3 Nucleotide divergence between the flanking LTRs of full-length LTR retrotransposons of the three Fagopyrum species.

Nucleotide divergence was calculated between the two flanking LTRs of each element whose alignment length was ≥100 nucleotides. a, Gypsy (F. esculentum: n = 24,585, F. homotropicum: n = 24,342, F. tataricum: n = 4,359) and Copia-type LTR retrotransposons (F. esculentum: n = 2,750, F. homotropicum: n = 2,700, F. tataricum: n = 1,028). b, Athila (F. esculentum: n = 9,830, F. homotropicum: n = 10,389, F. tataricum: n = 425), CRM (F. esculentum: n = 1,154, F. homotropicum: n = 1,073, F. tataricum: n = 166), and Tekay family (F. esculentum: n = 1,174, F. homotropicum: n = 1,197, F. tataricum: n = 1,427) of the Gypsy-type LTR retrotransposons.
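
The per-element calculation can be sketched in Python as follows, assuming the two flanking LTRs have already been aligned to equal length (for example with MAFFT). The function name is hypothetical, and both the use of a simple proportion of differing sites and the decision to skip gapped or ambiguous columns are assumptions of this sketch; the legend does not state which estimator or gap treatment was used.

```python
from typing import Optional

VALID_BASES = set("ACGT")


def ltr_divergence(aln_5p: str, aln_3p: str, min_len: int = 100) -> Optional[float]:
    """Proportion of differing sites between two aligned flanking LTRs.

    Columns containing a gap or an ambiguous base in either sequence are
    skipped (an assumption of this sketch). Returns None when fewer than
    `min_len` comparable columns remain, mirroring the >=100-nucleotide
    alignment-length filter in the legend.
    """
    if len(aln_5p) != len(aln_3p):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    mismatches = 0
    for a, b in zip(aln_5p.upper(), aln_3p.upper()):
        if a in VALID_BASES and b in VALID_BASES:
            compared += 1
            if a != b:
                mismatches += 1
    if compared < min_len:
        return None
    return mismatches / compared
```

The two LTRs of a retrotransposon are identical at the time of insertion, so a lower value from this calculation is generally read as a more recent insertion.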

Source data

Extended Data Fig. 4 Distribution of various types of Transposable Elements across the F. esculentum genome.

For LTR retrotransposons, only full-length LTR retrotransposons whose nucleotide divergence could be calculated with an alignment length of ≥100 nucleotides are shown, and the colour represents the nucleotide divergence between the flanking LTRs. The divergence values correspond to those shown in Extended Data Fig. 3 and Supplementary Fig. 11. Those with divergence > 0.1 are shown as 0.1. Athila_11 and Athila_18 are the two largest Gypsy-type subfamilies. CRM_44 and CRM_88 are Gypsy-type subfamilies of the CRM clade not associated with centromeric regions in F. esculentum and F. homotropicum, whereas the remaining CRM subfamilies (that is, other) are associated with centromeric regions in F. esculentum and F. homotropicum (FeCEN). The numbers 11, 18, 44, and 88 correspond to the ClusterIDs of Supplementary Table 17 (see also Supplementary Fig. 12). The numbers of elements plotted are as follows: Copia: n = 2,750, Athila_18: n = 2,712, Athila_11: n = 2,603, Athila_other: n = 4,515, Tekay: n = 1,174, CRM_44: n = 201, CRM_88: n = 321, CRM_other: n = 632, LINE/SINE: n = 6,818, Helitron: n = 1,098.

Source data

Extended Data Fig. 5 Number and timing of whole-genome duplications (WGDs) in the ancestor of Fagopyrum.

a, Phylogenetic relationship of Caryophyllales species relevant to determining the timing of WGDs. b, Number of gene families (n = 159) where 0, 1, 2, 3, or 4 gene duplications likely corresponding to WGDs were placed at each branch shown in a by phylogenetic analysis. Filled and unfilled bars indicate the number of gene families with and without ≥70% bootstrap support, respectively. c, Age estimates of the two WGDs based on phylogenetic dating analysis of gene families with two WGDs in branch C. The bar plots indicate the median age estimates of each gene family (younger WGD: n = 80, older WGD: n = 77), which correspond to the estimates of Prior Setting 1 in Supplementary Table 21. The line plots are based on all ages of the MCMC analyses combined (F. esculentum-F. tataricum: n = 2,367,263, Fagopyrum-Rumex: n = 1,368,152, younger WGD: n = 720,080, older WGD: n = 693,077). Note that the age distributions of F. esculentum-F. tataricum and Fagopyrum-Rumex follow the prior age constraints assigned to both nodes (Supplementary Table 20). See also Supplementary Fig. 19 for age estimates without various prior age constraints. d, Ks distributions of orthologous gene pairs of F. esculentum and F. homotropicum (n = 21,192), F. esculentum and F. tataricum (n = 17,682) identified by OrthoFinder. e, Ks distributions of orthologous gene pairs of F. esculentum and R. hastatulus (n = 10,397), F. esculentum and A. vesiculosa (n = 9,215), F. esculentum and B. vulgaris (n = 9,466) identified by OrthoFinder, and collinear gene duplicates (n = 3,620) of F. esculentum identified by MCScanX.

Source data

Extended Data Fig. 6 Molecular evolution of Fag e 2 genes and sequence of Fag e 2 knockout mutant.

a, Amino acid alignment of Fag e 2 genes in the three Fagopyrum species. The epitope sequence25 is indicated by a red square. Conserved cysteine residues of Fag e 2 homologues are indicated by red arrowheads. The background colour is proportional to the degree of similarity of each residue compared to its aligned column. b, Maximum likelihood phylogenetic tree of Fag e 2 genes constructed with IQ-TREE based on the amino acid alignment of the gene family including Caryophyllales species identified by OrthoFinder. The tree is unrooted and bootstrap values of ≥80% are indicated next to each node. The scale bar indicates branch length. Nodes corresponding to tandem duplications are indicated by blue diamonds. c, Amino acid alignment of the epitope sequence indicated by a red square in a. The position conserved across all sequences is indicated by an asterisk. The three groups correspond to those in b. d, Upper panel shows the amino acid sequence encoded by the EMS-induced Fag e 2 gene. Red letters indicate epitope amino acids. Green letters indicate the eight Cys residues conserved within plant 2S albumins. Lower panel indicates the results of Sanger sequencing of the mutant/wild type heterozygote (left) and wild type homozygote (right).

Extended Data Fig. 7 Characterization of S- and s-haplotypes in F. esculentum GS1 scaffolds.

a, Dotplot based on minimap2 between scaffold 2156, which contains S-ELF3 (S-haplotype), and scaffold 21180 (s-haplotype). b, Gene-based collinearity between scaffold 2156 and scaffold 21180 and the collinear regions in F. esculentum PL4 Chr 1 and Chr 6. Orange and light blue horizontal bars indicate genes on Chr 1 and Chr 6, respectively. Genes in squares of the same colours are homologous and thus most probably allelic in F. esculentum GS1. The hemizygous region containing the S locus, indicated by thick black lines, can therefore be restricted to the interval between the two genes FesPL4_r1.1_Chr1.g191810.1 and FesPL4_r1.1_Chr1.g199860.1. The genomic structure and RNA-Seq analysis of the scaffold 21180 region between these two genes are shown in Supplementary Fig. 27. c, Diagram describing the proposed origin of the Sh-haplotype in F. homotropicum. The hemizygous region of the S-haplotype flanked by two genes was translocated from Chr 6 to Chr 1. This translocation is consistent with the comparison with a previously developed linkage map (Extended Data Fig. 1) and the results of allelism test crosses (Supplementary Fig. 31). A frameshift mutation resulted in a loss-of-function S-ELF3 (s-elf3-ψ1) before or after the translocation, whereas the putative genes encoding the stamen phenotype, IPPA, have remained functional, where IP, P, and A encode pollen incompatibility, pollen size, and anther height, respectively.

Extended Data Fig. 8 Change of thrum to long homostyle flowers caused by loss-of-function mutation in S-ELF3.

a, Illustration of the phenotype changes caused by a loss-of-function mutation in S-ELF3 of the EMS mutant. b, Pollen tube growth three hours after crossing the F. esculentum EMS mutant (long homostyle) pistil with the wild type thrum pollen (left panel) and the wild type pin pollen (right panel). Thrum pollen tubes reached the ovule (left panel, yellow arrowhead), whereas pin pollen tubes were arrested in the long styles of the EMS mutant (right panel, red arrowheads). Scale bar = 0.2 mm. c, Style length of 10 wild type thrum, wild type pin, F. esculentum PL4 (long homostyle), and F. esculentum EMS mutant (long homostyle) flowers. d, Anther filament length of 10 wild type thrum, wild type pin, F. esculentum PL4 (long homostyle), and F. esculentum EMS mutant (long homostyle) flowers. The box plots show the median and first and third quartiles with the whiskers extending to 1.5 times the inter-quartile range. All wild types used here are F. esculentum cv. Harunoibuki.
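The box plot convention stated in the legend (median, first and third quartiles, whiskers at 1.5 times the inter-quartile range) can be sketched in Python as follows. This is only an illustrative sketch: the function name `box_stats` is hypothetical, the quartile definition (the "inclusive" method of the standard library) is an assumption because plotting packages differ on this point, and the whiskers are assumed to stop at the most extreme data points that lie within the 1.5 × IQR fences.

```python
from statistics import quantiles

def box_stats(values):
    """Summarise values the way a Tukey-style box plot would draw them."""
    data = sorted(values)
    # Quartiles by linear interpolation over the observed data (assumed method).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }

# Made-up measurements for ten flowers (not data from the study).
stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```

With these made-up values, the single observation above the upper fence is reported as an outlier rather than stretching the whisker.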

Extended Data Fig. 9 Molecular evolutionary analyses of S-ELF3.

a, Maximum likelihood phylogenetic tree based on the amino acid sequences of the homologues of S-ELF3 and putative evolutionary scenario of S-ELF3. Sequences of S-ELF3 are from a previous study11 under the following GenBank accession numbers: AB641416 (F. esculentum S-ELF3), AB641417 (F. cymosum S-ELF3), and AB641418 (F. urophyllum S-ELF3). Sources of the remaining sequences are as described in Supplementary Table 19. Alignment was performed using MAFFT with the option -linsi and filtered with trimAl with the options ‘-automated1 -resoverlap 0.7 -seqoverlap 50’. The phylogenetic tree was constructed with IQ-TREE using the resulting alignment as input, with 1,000 bootstrap replicates. The tree was rooted using the two Arabidopsis genes and FesPL4_sc0096.1.g000180.1 as the outgroup. The F. tataricum ortholog of FesPL4_r1.1_Chr5.g038430.1 (FtPinG0006101400.01.T01) was filtered out by trimAl. Bootstrap values of ≥80% are indicated next to each node. The magenta dot corresponds to a whole-genome duplication. G and IS encode style length and style incompatibility, respectively. b, Ks between the S-ELF3 genes in F. esculentum, F. cymosum, F. urophyllum, and the paralogs in F. esculentum. Ks was calculated using codeml of the PAML package. The Ks estimates between FesPL4_sc0096.1.g000180.1 and the other Fagopyrum S-ELF3 homologues are > 3 and not shown.

Extended Data Fig. 10 Candidate region of artificial selection.

a, Nucleotide diversity π of cultivated, Wild Tib, and other wild accessions (excluding Wild Tib) in the 50-60 Mb region of Chr1 containing the peak of PBS. π was calculated for each 1 Mb sliding window with a step of 200 kb. b, Lengths of the haplotypes identical to the reference genome in each phased genome. Each haplotype was extended up- and downstream from a core SNP (Chr1 52,562,436) within the low diversity region until a mismatch to the reference genome was encountered. The region depicted corresponds to Chr1 52,325,106 to 52,906,013. Black rectangles indicate predicted protein-coding genes in the region.
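The haplotype extension described for panel b can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `identical_tract`, the input representation (sorted variant positions with a per-site flag saying whether the phased haplotype carries the reference allele) and the choice to report the outermost matching sites as tract boundaries are all assumptions made for illustration.

```python
def identical_tract(positions, matches_ref, core_index):
    """Extend from a core SNP in both directions until the first mismatch.

    positions:   sorted genomic coordinates of variant sites (assumed input).
    matches_ref: True where the haplotype carries the reference allele.
    core_index:  index of the core SNP within positions.
    Returns (leftmost, rightmost) matching positions, or None if the
    haplotype already differs from the reference at the core SNP.
    """
    if not matches_ref[core_index]:
        return None
    left = core_index
    while left > 0 and matches_ref[left - 1]:
        left -= 1  # extend upstream while alleles still match
    right = core_index
    while right < len(positions) - 1 and matches_ref[right + 1]:
        right += 1  # extend downstream while alleles still match
    return positions[left], positions[right]

# Toy haplotype with six variant sites; the core SNP is the third site.
sites = [100, 200, 300, 400, 500, 600]
flags = [False, True, True, True, False, True]
tract = identical_tract(sites, flags, core_index=2)
tract_length = tract[1] - tract[0] + 1
```

In the toy haplotype, extension stops at the mismatches at positions 100 and 500, so the identical tract spans positions 200 to 400.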

Supplementary information

Supplementary Information

Supplementary text and Figs. 1–44.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–41.

Source data

Source Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 4

Statistical source data.

Source Data Extended Data Fig. 5

Statistical source data.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Fawcett, J.A., Takeshima, R., Kikuchi, S. et al. Genome sequencing reveals the genetic architecture of heterostyly and domestication history of common buckwheat. Nat. Plants 9, 1236–1251 (2023). https://doi.org/10.1038/s41477-023-01474-1

  • DOI: https://doi.org/10.1038/s41477-023-01474-1
