Common buckwheat, Fagopyrum esculentum, is an orphan crop domesticated in southwest China that exhibits heterostylous self-incompatibility. Here we present chromosome-scale assemblies of a self-compatible F. esculentum accession and a self-compatible wild relative, Fagopyrum homotropicum, together with the resequencing of 104 wild and cultivated F. esculentum accessions. Using these genomic data, we report the roles of transposable elements and whole-genome duplications in the evolution of Fagopyrum. In addition, we show that (1) the breakdown of heterostyly occurs through the disruption of a hemizygous gene jointly regulating the style length and female compatibility and (2) southeast Tibet was involved in common buckwheat domestication. Moreover, we obtained mutants conferring the waxy phenotype for the first time in buckwheat. These findings demonstrate the utility of our F. esculentum assembly as a reference genome and promise to accelerate buckwheat research and breeding.
The final genome assemblies, annotations, RNA sequences and raw genome sequence data generated in this study have all been deposited in the DNA Data Bank of Japan (DDBJ) database under BioProjects PRJDB15031 (assembly of F. esculentum PL4: BSUD01000001–BSUD01003041, assembly of F. esculentum GS1: BSUE01000001–BSUE01042256, raw reads of various F. esculentum accessions: DRR438014–DRR438137 and DRR477348–DRR477353) and PRJDB15175 (assembly of F. homotropicum: BSWB01000001–BSWB01000436, raw reads of F. homotropicum: DRR438312). The final genome assemblies and annotations, as well as the initial scaffolds and contigs used to construct them, are also available from the Buckwheat Genome DataBase (BGDB) (http://buckwheat.kazusa.or.jp). Publicly available datasets from the following databases and websites were used in this study: NR database of NCBI, UniProtKB (https://www.uniprot.org), PFAM, Phytozome (https://phytozome-next.jgi.doe.gov), TAIR (https://www.arabidopsis.org), MBKbase (https://www.mbkbase.org) and CARNIVOROM. Source data are provided with this paper.
Ohnishi, O. Discovery of the wild ancestor of common buckwheat. Fagopyrum 11, 5–10 (1991).
Ohnishi, O. & Matsuoka, Y. Search for the wild ancestor of buckwheat II. Taxonomy of Fagopyrum (Polygonaceae) species based on morphology, isozymes and cpDNA variability. Genes Genet. Syst. 71, 383–390 (1996).
Weisskopf, A. & Fuller, D. Q. in Encyclopedia of Global Archaeology (ed. Smith, C.) 1025–1028 (Springer, 2014).
Hunt, H. V., Shang, X. & Jones, M. K. Buckwheat: a crop from outside the major Chinese domestication centres? A review of the archaeobotanical, palynological and genetic evidence. Veg. Hist. Archaeobot. 27, 493–506 (2018).
Yasui, Y. et al. Assembly of the draft genome of buckwheat and its applications in identifying agronomically useful genes. DNA Res. 23, 215–224 (2016).
Penin, A. A. et al. High-resolution transcriptome atlas and improved genome assembly of common buckwheat, Fagopyrum esculentum. Front. Plant Sci. 12, 612382 (2021).
Zhang, L. et al. The tartary buckwheat genome provides insights into rutin biosynthesis and abiotic stress tolerance. Mol. Plant 10, 1224–1237 (2017).
Matsui, K. & Yasui, Y. Genetic and genomic research for the development of an efficient breeding system in heterostylous self-incompatible common buckwheat (Fagopyrum esculentum). Theor. Appl. Genet. 133, 1641–1653 (2020).
Matsui, K. & Yasui, Y. Buckwheat heteromorphic self-incompatibility: genetics, genomics and application to breeding. Breed. Sci. 70, 32–38 (2020).
Darwin, C. The Different Forms of Flowers on Plants of the Same Species (John Murray, 1897).
Yasui, Y. et al. S-LOCUS EARLY FLOWERING 3 is exclusively present in the genomes of short-styled buckwheat plants that exhibit heteromorphic self-incompatibility. PLoS ONE 7, e31264 (2012).
Matsui, K., Tetsuka, T., Nishio, T. & Hara, T. Heteromorphic incompatibility retained in self-compatible plants produced by a cross between common and wild buckwheat. New Phytol. 159, 701–708 (2003).
Tsai, H. et al. Discovery of rare mutations in populations: TILLING by sequencing. Plant Physiol. 156, 1257–1268 (2011).
Simão, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).
Ou, S., Chen, J. & Jiang, N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res. 46, e126 (2018).
Yabe, S. et al. Rapid genotyping with DNA micro-arrays for high-density linkage mapping and QTL mapping in common buckwheat (Fagopyrum esculentum Moench). Breed. Sci. 64, 291–299 (2014).
Yang, Y. et al. Improved transcriptome sampling pinpoints 26 ancient and more recent polyploidy events in Caryophyllales, including two allopolyploidy events. New Phytol. 217, 855–870 (2018).
Verde, I. et al. The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nat. Genet. 45, 487–494 (2013).
Jaillon, O. et al. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449, 463–467 (2007).
Dohm, J. C. et al. The genome of the recently domesticated crop plant sugar beet (Beta vulgaris). Nature 505, 546–549 (2014).
Fawcett, J. A., Maere, S. & Van de Peer, Y. Plants with double genomes might have had a better chance to survive the Cretaceous–Tertiary extinction event. Proc. Natl Acad. Sci. USA 106, 5737–5742 (2009).
Vanneste, K., Baele, G., Maere, S. & Van de Peer, Y. Analysis of 41 plant genomes supports a wave of successful genome duplications in association with the Cretaceous–Paleogene boundary. Genome Res. 24, 1334–1347 (2014).
Kreft, I. et al. Breeding buckwheat for nutritional quality. Breed. Sci. 70, 67–73 (2020).
Tanaka, K. et al. Pepsin-resistant 16-kD buckwheat protein is associated with immediate hypersensitivity reaction in patients with buckwheat allergy. Int. Arch. Allergy Immunol. 129, 49–56 (2002).
Satoh, R., Koyano, S., Takagi, K., Nakamura, R. & Teshima, R. Identification of an IgE-binding epitope of a major buckwheat allergen, BWp16, by SPOTs assay and mimotope screening. Int. Arch. Allergy Immunol. 153, 133–140 (2010).
Barrett, S. C. ‘A most complex marriage arrangement’: recent advances on heterostyly and unresolved questions. New Phytol. 224, 1051–1067 (2019).
Pamela, V. & Dowrick, J. Heterostyly and homostyly in Primula obconica. Heredity 10, 219–236 (1956).
Li, J. et al. Genetic architecture and evolution of the S locus supergene in Primula vulgaris. Nat. Plants 2, 16188 (2016).
Huu, C. N. et al. Presence versus absence of CYP734A50 underlies the style-length dimorphism in primroses. eLife 5, e17956 (2016).
Cocker, J. M. et al. Primula vulgaris (primrose) genome assembly, annotation and gene expression, with comparative genomics on the heterostyly supergene. Sci. Rep. 8, 17942 (2018).
Shore, J. S. et al. The long and short of the S-locus in Turnera (Passifloraceae). New Phytol. 224, 1316–1329 (2019).
Matzke, C. M. et al. Pistil mating type and morphology are mediated by the brassinosteroid inactivating activity of the S-locus gene BAHD in heterostylous Turnera species. Int. J. Mol. Sci. 22, 10603 (2021).
Gutiérrez-Valencia, J. et al. Genomic analyses of the Linum distyly supergene reveal convergent evolution at the molecular level. Curr. Biol. 32, 4360–4371 (2022).
Matzke, C. M., Shore, J. S., Neff, M. M. & McCubbin, A. G. The Turnera style S-locus gene TsBAHD possesses brassinosteroid-inactivating activity when expressed in Arabidopsis thaliana. Plants 9, 1566 (2020).
Huu, C. N., Plaschil, S., Himmelbach, A., Kappel, C. & Lenhard, M. Female self-incompatibility type in heterostylous Primula is determined by the brassinosteroid-inactivating cytochrome P450 CYP734A50. Curr. Biol. 32, 671–676 (2022).
Huu, C. N., Keller, B., Conti, E., Kappel, C. & Lenhard, M. Supergene evolution via stepwise duplications and neofunctionalization of a floral-organ identity gene. Proc. Natl Acad. Sci. USA 117, 23148–23157 (2020).
Potente, G. et al. Comparative genomics elucidates the origin of a supergene controlling floral heteromorphism. Mol. Biol. Evol. 39, msac035 (2022).
Konishi, T., Yasui, Y. & Ohnishi, O. Original birthplace of cultivated common buckwheat inferred from genetic relationships among cultivated populations and natural populations of wild common buckwheat revealed by AFLP analysis. Genes Genet. Syst. 80, 113–119 (2005).
Konishi, T. & Ohnishi, O. Close genetic relationship between cultivated and natural populations of common buckwheat in the Sanjiang area is not due to recent gene flow between them—an analysis using microsatellite markers. Genes Genet. Syst. 82, 53–64 (2007).
Ohnishi, O. On the origin of cultivated common buckwheat based on allozyme analyses of cultivated and wild populations of common buckwheat. Fagopyrum 26, 3–9 (2009).
Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, e190 (2006).
Novembre, J. & Stephens, M. Interpreting principal component analyses of spatial population genetic variation. Nat. Genet. 40, 646–649 (2008).
Alexander, D., Novembre, J. & Lange, K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 19, 1655–1664 (2009).
Saitou, N. & Nei, M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425 (1987).
Yi, X. et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science 329, 75–78 (2010).
Krzyzanska, M., Hunt, H. V., Crema, E. R. & Jones, M. K. Modelling the potential ecological niche of domesticated buckwheat in China: archaeological evidence, environmental constraints and climate change. Veg. Hist. Archaeobot. 31, 331–345 (2022).
Jones, M. & Brown, T. in Rethinking Agriculture: Archaeological and Ethnoarchaeological Perspectives 1st edn (eds Denham, T. P. et al.) 36–49 (Routledge, 2007).
Tanno, K. & Willcox, G. How fast was wild wheat domesticated? Science 311, 1886 (2006).
Brown, T. A., Jones, M. K., Powell, W. & Allaby, R. G. The complex origins of domesticated crops in the Fertile Crescent. Trends Ecol. Evol. 24, 103–109 (2009).
Meyer, R. S. et al. Domestication history and geographical adaptation inferred from a SNP map of African rice. Nat. Genet. 48, 1083–1088 (2016).
Zhou, Y., Massonnet, M., Sanjak, J. S., Cantu, D. & Gaut, B. S. Evolutionary genomics of grape (Vitis vinifera ssp. vinifera) domestication. Proc. Natl Acad. Sci. USA 114, 11715–11720 (2017).
Qian, L.-S., Chen, J.-H., Deng, T. & Sun, H. Plant diversity in Yunnan: current status and future directions. Plant Divers. 42, 281–291 (2020).
Matsui, K., Tetsuka, T., Hara, T. & Morishita, T. Breeding and characterization of a new self-compatible common buckwheat [Fagopyrum esculentum] parental line, ‘Buckwheat Norin-PL1’. Bull. Natl Agric. Res. Cent. Kyushu Okinawa Reg. 49, 11–17 (2008).
Yabe, S. et al. Potential of genomic selection in mass selection breeding of an allogamous crop: an empirical study to increase yield of common buckwheat. Front. Plant Sci. 9, 276 (2018).
Tomiyoshi, M., Yasui, Y., Ohsako, T., Li, C.-Y. & Ohnishi, O. Phylogenetic analysis of AGAMOUS sequences reveals the origin of the diploid and tetraploid forms of self-pollinating wild buckwheat, Fagopyrum homotropicum Ohnishi. Breed. Sci. 62, 241–247 (2012).
Cheng, H., Concepcion, G. T., Feng, X., Zhang, H. & Li, H. Haplotype-resolved de novo assembly using phased assembly graphs with hifiasm. Nat. Methods 18, 170–175 (2021).
Gonda, I. et al. The genome sequence of tetraploid sweet basil, Ocimum basilicum L., provides tools for advanced genome editing and molecular breeding. DNA Res. 27, dsaa027 (2020).
Avni, R. et al. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication. Science 357, 93–97 (2017).
Luo, M.-C. et al. Genome sequence of the progenitor of the wheat D genome Aegilops tauschii. Nature 551, 498–502 (2017).
Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).
Marçais, G. & Kingsford, C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics 27, 764–770 (2011).
Takeshima, R., Ogiso-Tanaka, E., Yasui, Y. & Matsui, K. Targeted amplicon sequencing + next-generation sequencing-based bulked segregant analysis identified genetic loci associated with preharvest sprouting tolerance in common buckwheat (Fagopyrum esculentum). BMC Plant Biol. 21, 18 (2021).
Ogiso-Tanaka, E., Shimizu, T., Hajika, M., Kaga, A. & Ishimoto, M. Highly multiplexed AmpliSeq technology identifies novel variation of flowering time-related genes in soybean (Glycine max). DNA Res. 26, 243–260 (2019).
Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).
Iwata, H. & Ninomiya, S. AntMap: constructing genetic linkage maps using an ant colony optimization algorithm. Breed. Sci. 56, 371–377 (2006).
Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinformatics 10, 421 (2009).
Jain, C., Koren, S., Dilthey, A., Phillippy, A. M. & Aluru, S. A fast adaptive algorithm for computing whole-genome homology maps. Bioinformatics 34, i748–i756 (2018).
Marçais, G. et al. MUMmer4: a fast and versatile genome alignment system. PLoS Comput. Biol. 14, e1005944 (2018).
Kikuchi, S., Matsui, K., Tanaka, H., Ohnishi, O. & Tsujimoto, H. Chromosome evolution among seven Fagopyrum species revealed by fluorescence in situ hybridization (FISH) probed with rDNAs. Chromosome Sci. 11, 37–43 (2008).
Ellinghaus, D., Kurtz, S. & Willhoeft, U. LTRharvest, an efficient and flexible software for de novo detection of LTR retrotransposons. BMC Bioinformatics 9, 18 (2008).
Steinbiss, S., Willhoeft, U., Gremme, G. & Kurtz, S. Fine-grained annotation and classification of de novo predicted LTR retrotransposons. Nucleic Acids Res. 37, 7002–7013 (2009).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol. Biol. Evol. 30, 772–780 (2013).
Li, W. & Godzik, A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659 (2006).
Fu, L., Niu, B., Zhu, Z., Wu, S. & Li, W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152 (2012).
Eddy, S. R. Accelerated profile HMM searches. PLoS Comput. Biol. 7, e1002195 (2011).
Neumann, P., Novák, P., Hoštáková, N. & Macas, J. Systematic survey of plant LTR-retrotransposons elucidates phylogenetic relationships of their polyprotein domains and provides a reference for element classification. Mob. DNA 10, 1 (2019).
Minh, B. Q. et al. IQ-TREE 2: new models and efficient methods for phylogenetic inference in the genomic era. Mol. Biol. Evol. 37, 1530–1534 (2020).
Hoang, D. T., Chernomor, O., Von Haeseler, A., Minh, B. Q. & Vinh, L. S. UFBoot2: improving the ultrafast bootstrap approximation. Mol. Biol. Evol. 35, 518–522 (2018).
Roberts, A. & Pachter, L. Streaming fragment assignment for real-time analysis of sequencing experiments. Nat. Methods 10, 71–73 (2013).
Chen, S., Zhou, Y., Chen, Y. & Gu, J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).
Kim, D., Paggi, J. M., Park, C., Bennett, C. & Salzberg, S. L. Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. 37, 907–915 (2019).
Pertea, M. et al. Stringtie enables improved reconstruction of a transcriptome from RNA-seq reads. Nat. Biotechnol. 33, 290–295 (2015).
Brůna, T., Hoff, K. J., Lomsadze, A., Stanke, M. & Borodovsky, M. BRAKER2: automatic eukaryotic genome annotation with GeneMark-EP+ and AUGUSTUS supported by a protein database. NAR Genom. Bioinform. 3, lqaa108 (2021).
Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods 12, 59–60 (2015).
Mistry, J. et al. Pfam: the protein families database in 2021. Nucleic Acids Res. 49, D412–D419 (2021).
Li, H. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34, 3094–3100 (2018).
Wu, T. D. & Watanabe, C. K. GMAP: a genomic mapping and alignment program for mRNA and EST sequences. Bioinformatics 21, 1859–1875 (2005).
Rifkin, J. L. et al. Recombination landscape dimorphism and sex chromosome evolution in the dioecious plant Rumex hastatulus. Philos. Trans. R. Soc. Lond. B 377, 20210226 (2022).
Palfalvi, G. et al. Genomes of the Venus flytrap and close relatives unveil the roots of plant carnivory. Curr. Biol. 30, 2312–2320 (2020).
Gilman, I. S., Moreno-Villena, J. J., Lewis, Z. R., Goolsby, E. W. & Edwards, E. J. Gene co-expression reveals the modularity and integration of C4 and CAM in Portulaca. Plant Physiol. 189, 735–753 (2022).
McGrath, J. M. M. et al. A contiguous de novo genome assembly of sugar beet EL10 (Beta vulgaris L.). DNA Res. 30, dsac033 (2022).
Jarvis, D. E. et al. The genome of Chenopodium quinoa. Nature 542, 307–312 (2017).
Lightfoot, D. et al. Single-molecule sequencing and Hi-C-based proximity-guided assembly of amaranth (Amaranthus hypochondriacus) chromosomes provide insights into genome evolution. BMC Biol. 15, 74 (2017).
Wang, Y. et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 40, e49 (2012).
Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).
Emms, D. M. & Kelly, S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 20, 238 (2019).
Yao, G. et al. Plastid phylogenomic insights into the evolution of Caryophyllales. Mol. Phylogenet. Evol. 134, 74–86 (2019).
Capella-Gutiérrez, S., Silla-Martínez, J. M. & Gabaldón, T. trimAl: a tool for automated alignment trimming in large-scale phylogenetic analyses. Bioinformatics 25, 1972–1973 (2009).
Bouckaert, R. et al. BEAST 2.5: an advanced software platform for Bayesian evolutionary analysis. PLoS Comput. Biol. 15, e1006650 (2019).
Ng, K. K. S. et al. The genome of Shorea leprosula (Dipterocarpaceae) highlights the ecological relevance of drought in aseasonal tropical rainforests. Commun. Biol. 4, 1166 (2021).
Drummond, A. J., Ho, S. Y. W., Phillips, M. J. & Rambaut, A. Relaxed phylogenetics and dating with confidence. PLoS Biol. 4, e88 (2006).
Le, S. Q. & Gascuel, O. An improved general amino acid replacement matrix. Mol. Biol. Evol. 25, 1307–1320 (2008).
Yang, Z. Among-site rate variation and its impact on phylogenetic analyses. Trends Ecol. Evol. 11, 367–372 (1996).
Yule, G. U. A mathematical theory of evolution, based on the conclusions of Dr. J. C. Willis, F.R.S. Philos. Trans. R. Soc. B 213, 21–87 (1925).
Magallón, S., Gómez-Acevedo, S., Sánchez-Reyes, L. L. & Hernández-Hernández, T. A metacalibrated time-tree documents the early rise of flowering plant phylogenetic diversity. New Phytol. 207, 437–453 (2015).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120 (2014).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Danecek, P. et al. Twelve years of SAMtools and BCFtools. GigaScience 10, giab008 (2021).
Koboldt, D. C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
Cingolani, P. et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff. Fly 6, 80–92 (2012).
Nakamura, T., Yamamori, M., Hirano, H., Hidaka, S. & Nagamine, T. Production of waxy (amylose-free) wheats. Mol. Gen. Genet. 248, 253–259 (1995).
Chang, S., Puryear, J. & Cairney, J. A simple and efficient method for isolating RNA from pine trees. Plant Mol. Biol. Rep. 11, 113–116 (1993).
Chang, C. C. et al. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4, 7 (2015).
Tajima, F. Evolutionary relationship of DNA sequences in finite populations. Genetics 105, 437–460 (1983).
Korunes, K. L. & Samuk, K. pixy: unbiased estimation of nucleotide diversity and divergence in the presence of missing data. Mol. Ecol. Resour. 21, 1359–1368 (2021).
Weir, B. S. & Cockerham, C. C. Estimating F-statistics for the analysis of population structure. Evolution 38, 1358–1370 (1984).
Browning, S. R. & Browning, B. L. Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007).
Hill, W. & Robertson, A. Linkage disequilibrium in finite populations. Theor. Appl. Genet. 38, 226–231 (1968).
Nagano, M., Aii, J., Campbell, C., Kawasaki, S. & Adachi, T. Genome size analysis of the genus Fagopyrum. Fagopyrum 17, 35–39 (2000).
This work was supported by KAKENHI (grants 20K06761 and 21H00356 to J.A.F., 22H05172 and 22H05181 to K.S., and 18KK0172 to Y.Y.); ACT-X ‘Environments and Biotechnology’ from the Japan Science and Technology Agency (JST) (grant JPMJAX20BA to R.T.); Cabinet Office, Government of Japan, Moonshot R&D Program for Agriculture, Forestry and Fisheries to Y.Y.; the research programme on development of innovative technology from the Project of the Bio-oriented Technology Research Advancement Institution (BRAIN) (grant JPJ007097 to T.H.); and the Leverhulme Trust (grant RPG-2017-196 to M.K.J.). We thank S. Wright for helpful discussions and for providing genomic data of R. hastatulus, K. L. Farquharson for language-editing support of the manuscript, and RIKEN HOKUSAI and the National Institute of Genetics for providing computational resources.
The authors declare no competing interests.
Peer review information
Nature Plants thanks the anonymous reviewers for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Left grey bars indicate the PL4 pseudomolecules and right grey bars indicate linkage maps from Yabe et al.16. Positions of the array markers developed in the same study are shown by black lines. Markers whose order matches between the pseudomolecules and the linkage map are connected by blue lines, and those whose order does not match are connected by red lines. The thickness of the lines is proportional to the number of markers. The marker on P1_3 (FE140468) that is anchored to Chr1 is connected by a red dotted line. The map position of the S locus, which contains the S-ELF3 gene, is 84.2 cM on the linkage map of Yabe et al.16. Sh denotes the locus containing the homologue of S-ELF3 which has a nonsense mutation in F. esculentum PL4 (see also Supplementary Fig. 30). We note that the notations of P1_8.1 and P1_8.2 are incorrectly interchanged in Figure 2 of Yabe et al.16.
Extended Data Fig. 2 Nucleotide divergence between F. esculentum PL4 and F. homotropicum across the genome.
Nucleotide divergence, that is, the average number of differences per site, was calculated across a sliding window of 2 Mb with a step of 400 kb based on the results of MUMmer. Regions in F. esculentum PL4 that were masked by RepeatMasker using the TE library constructed for F. esculentum PL4 were excluded. Windows with < 10,000 aligned sites were not plotted. Regions with a divergence of < 0.001, which are likely to be regions in F. esculentum PL4 that are derived from F. homotropicum, are shown in grey and correspond to the regions indicated in Fig. 2a.
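The windowing described in this legend can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the alignment has already been summarised into per-bin counts of aligned, unmasked sites and of differences, with one bin per 400 kb step, so that a 2 Mb window spans five bins. The function names and the input layout are hypothetical.

```python
from typing import List, Optional, Tuple


def windowed_divergence(
    diffs_per_bin: List[int],
    aligned_per_bin: List[int],
    bin_size: int = 400_000,      # step size from the legend
    bins_per_window: int = 5,     # 5 x 400 kb = 2 Mb window
    min_sites: int = 10_000,      # windows below this are not plotted
) -> List[Tuple[int, Optional[float]]]:
    """Return (window_start, divergence) pairs; divergence is None when
    the window has fewer than min_sites aligned sites.

    Assumption: masked (repeat) sites were already excluded from both
    input lists before calling this function.
    """
    results = []
    n_windows = max(len(diffs_per_bin) - bins_per_window + 1, 0)
    for i in range(n_windows):
        sites = sum(aligned_per_bin[i:i + bins_per_window])
        diffs = sum(diffs_per_bin[i:i + bins_per_window])
        value = diffs / sites if sites >= min_sites else None
        results.append((i * bin_size, value))
    return results


def is_low_divergence(value: Optional[float], threshold: float = 0.001) -> bool:
    """Flag windows below the 0.001 threshold used for the grey regions."""
    return value is not None and value < threshold
```

Pooling counts across bins before dividing weights each site equally, which is one reasonable reading of "average number of differences per site"; averaging per-bin ratios would give a different result when bins differ in aligned length.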
Extended Data Fig. 3 Nucleotide divergence between the flanking LTRs of full-length LTR retrotransposons of the three Fagopyrum species.
Nucleotide divergence was calculated between each pair of flanking LTRs whose alignment length was ≥100 nucleotides. a, Gypsy (F. esculentum: n = 24,585, F. homotropicum: n = 24,342, F. tataricum: n = 4,359) and Copia-type LTR retrotransposons (F. esculentum: n = 2,750, F. homotropicum: n = 2,700, F. tataricum: n = 1,028). b, Athila (F. esculentum: n = 9,830, F. homotropicum: n = 10,389, F. tataricum: n = 425), CRM (F. esculentum: n = 1,154, F. homotropicum: n = 1,073, F. tataricum: n = 166), and Tekay family (F. esculentum: n = 1,174, F. homotropicum: n = 1,197, F. tataricum: n = 1,427) of the Gypsy-type LTR retrotransposons.
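As an illustration of the quantity plotted here, the sketch below computes the proportion of differing sites between the two flanking LTRs of one element. It assumes the two LTRs are supplied as equal-length aligned strings with '-' for gaps, and it interprets "alignment length" as the number of columns where both sequences carry an unambiguous base; the legend does not specify either point, so both are assumptions, and the function name is hypothetical.

```python
from typing import Optional


def ltr_divergence(ltr5: str, ltr3: str, min_len: int = 100) -> Optional[float]:
    """Proportion of mismatching sites between two aligned flanking LTRs.

    Returns None when fewer than min_len columns can be compared,
    mirroring the >=100-nucleotide filter in the legend.
    """
    if len(ltr5) != len(ltr3):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    diffs = 0
    for a, b in zip(ltr5.upper(), ltr3.upper()):
        # Skip gap columns and ambiguous bases (assumption).
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                diffs += 1
    if compared < min_len:
        return None
    return diffs / compared
```

Because the two LTRs of an element are identical at insertion, a lower value indicates a more recent insertion; no correction for multiple substitutions is applied in this sketch.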
Extended Data Fig. 4 Distribution of various types of Transposable Elements across the F. esculentum genome.
For LTR retrotransposons, only full-length LTR retrotransposons whose nucleotide divergence could be calculated with an alignment length of ≥100 nucleotides are shown, and the colour represents the nucleotide divergence between the flanking LTRs. The divergence values correspond to those shown in Extended Data Fig. 3 and Supplementary Fig. 11. Those with divergence > 0.1 are shown as 0.1. Athila_11 and Athila_18 are the two largest Gypsy-type subfamilies. CRM_44 and CRM_88 are Gypsy-type subfamilies of the CRM clade not associated with centromeric regions in F. esculentum and F. homotropicum, whereas the remaining CRM subfamilies (that is, other) are associated with centromeric regions in F. esculentum and F. homotropicum (FeCEN). The numbers 11, 18, 44, and 88 correspond to the ClusterIDs of Supplementary Table 17 (see also Supplementary Fig. 12). The numbers of elements plotted are as follows: Copia: n = 2,750, Athila_18: n = 2,712, Athila_11: n = 2,603, Athila_other: n = 4,515, Tekay: n = 1,174, CRM_44: n = 201, CRM_88: n = 321, CRM_other: n = 632, LINE/SINE: n = 6,818, Helitron: n = 1,098.
Extended Data Fig. 5 Number and timing of whole-genome duplications (WGDs) in the ancestor of Fagopyrum.
a, Phylogenetic relationship of Caryophyllales species relevant to determining the timing of WGDs. b, Number of gene families (n = 159) where 0, 1, 2, 3, or 4 gene duplications likely corresponding to WGDs were placed at each branch shown in a by phylogenetic analysis. Filled and unfilled bars indicate the number of gene families with and without ≥70% bootstrap support, respectively. c, Age estimates of the two WGDs based on phylogenetic dating analysis of gene families with two WGDs in branch C. The bar plots indicate the median age estimates of each gene family (younger WGD: n = 80, older WGD: n = 77), which correspond to the estimates of Prior Setting 1 in Supplementary Table 21. The line plots are based on all ages of the MCMC analyses combined (F. esculentum-F. tataricum: n = 2,367,263, Fagopyrum-Rumex: n = 1,368,152, younger WGD: n = 720,080, older WGD: n = 693,077). Note that the age distributions of F. esculentum-F. tataricum and Fagopyrum-Rumex follow the prior age constraints assigned to both nodes (Supplementary Table 20). See also Supplementary Fig. 19 for age estimates without various prior age constraints. d, Ks distributions of orthologous gene pairs of F. esculentum and F. homotropicum (n = 21,192), F. esculentum and F. tataricum (n = 17,682) identified by OrthoFinder. e, Ks distributions of orthologous gene pairs of F. esculentum and R. hastatulus (n = 10,397), F. esculentum and A. vesiculosa (n = 9,215), F. esculentum and B. vulgaris (n = 9,466) identified by OrthoFinder, and collinear gene duplicates (n = 3,620) of F. esculentum identified by MCScanX.
a, Amino acid alignment of Fag e 2 genes in the three Fagopyrum species. The epitope sequence25 is indicated by a red square. Conserved cysteine residues of Fag e 2 homologues are indicated by red arrowheads. The background colour is proportional to the degree of similarity of each residue compared to its aligned column. b, Maximum likelihood phylogenetic tree of Fag e 2 genes constructed with IQ-TREE based on the amino acid alignment of the gene family including Caryophyllales species identified by OrthoFinder. The tree is unrooted and bootstrap values of ≥80% are indicated at each node. The scale bar indicates branch length. Nodes corresponding to tandem duplications are indicated by blue diamonds. c, Amino acid alignment of the epitope sequence indicated by a red square in a. The position conserved across all sequences is indicated by an asterisk. The three groups correspond to those in b. d, Upper panel shows the amino acid sequence encoded by the EMS-induced Fag e 2 gene. Red letters indicate epitope amino acids. Green letters indicate the eight Cys residues conserved within plant 2S albumins. Lower panel indicates the results of Sanger sequencing of the mutant/wild type heterozygote (left) and wild type homozygote (right).
a, Dotplot based on minimap2 between scaffold 2156, which contains S-ELF3 (S-haplotype), and scaffold 21180 (s-haplotype). b, Gene-based collinearity between scaffold 2156 and scaffold 21180 and the collinear regions in F. esculentum PL4 Chr 1 and Chr 6. Orange and light blue horizontal bars indicate genes on Chr 1 and Chr 6, respectively. Genes in squares of the same colours are homologous and thus most probably allelic in F. esculentum GS1. The hemizygous region containing the S locus, indicated by thick black lines, can therefore be restricted to the interval between the two genes FesPL4_r1.1_Chr1.g191810.1 and FesPL4_r1.1_Chr1.g199860.1. Genomic structure and RNA-Seq analysis of the scaffold 21180 region between these two genes are shown in Supplementary Fig. 27. c, Diagram describing the proposed origin of the Sh-haplotype in F. homotropicum. The hemizygous region of the S-haplotype flanked by two genes was translocated from Chr 6 to Chr 1. This translocation is consistent with the comparison with a previously developed linkage map (Extended Data Fig. 1) and the results of allelism test crosses (Supplementary Fig. 31). A frameshift mutation resulted in a loss-of-function S-ELF3 (s-elf3-ψ1) before or after the translocation, whereas the putative genes encoding the stamen phenotype, IPPA, have remained functional; IP, P, and A encode pollen incompatibility, pollen size, and anther height, respectively.
Extended Data Fig. 8 Change of thrum to long homostyle flowers caused by loss-of-function mutation in S-ELF3.
a, Illustration of phenotype changes caused by the loss-of-function mutation in S-ELF3 of the EMS mutant. b, Pollen tube growth three hours after crossing the F. esculentum EMS mutant (long homostyle) pistil with the wild type thrum pollen (left panel) and the wild type pin pollen (right panel). Thrum pollen tubes reached the ovule (left panel, yellow arrowhead), whereas pin pollen tubes were arrested in the long styles of the EMS mutant (right panel, red arrowheads). Scale bar = 0.2 mm. c, Style length of 10 wild type thrum, wild type pin, F. esculentum PL4 (long homostyle), and F. esculentum EMS mutant (long homostyle) flowers. d, Anther filament length of 10 wild type thrum, wild type pin, F. esculentum PL4 (long homostyle), and F. esculentum EMS mutant (long homostyle) flowers. The box plots show the median and first and third quartiles with the whiskers extending to 1.5 times the inter-quartile range. All wild types used here are F. esculentum cv. Harunoibuki.
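For readers unfamiliar with the box plot convention stated in this legend, the summary statistics can be sketched as below. The quartile interpolation method is not given in the legend; this sketch assumes linear interpolation over the observed range (Python's "inclusive" method) and draws each whisker to the most extreme observation within 1.5 times the inter-quartile range of the box, which is the common plotting convention. The function name and the example values are hypothetical, not measurements from the study.

```python
import statistics
from typing import Dict, Sequence


def box_stats(values: Sequence[float]) -> Dict[str, float]:
    """Median, quartiles and whisker ends for a box plot."""
    # Quartile method is an assumption; the legend does not specify one.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme data points inside the fences.
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


# Ten made-up lengths; the value 30 lies beyond the upper fence.
example = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
```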
a, Maximum likelihood phylogenetic tree based on the amino acid sequences of the homologues of S-ELF3 and putative evolutionary scenario of S-ELF3. Sequences of S-ELF3 are from a previous study11 under the following GenBank accession numbers: AB641416 (F. esculentum S-ELF3), AB641417 (F. cymosum S-ELF3), and AB641418 (F. urophyllum S-ELF3). Sources of the remaining sequences are as described in Supplementary Table 19. Alignment was performed using MAFFT with the option -linsi and filtered with trimAl with the options ‘-automated1 -resoverlap 0.7 -seqoverlap 50’. The phylogenetic tree was constructed with IQ-TREE using the resulting alignment as input, with 1,000 bootstrap replicates. The tree was rooted using the two Arabidopsis genes and FesPL4_sc0096.1.g000180.1 as outgroup. The F. tataricum ortholog of FesPL4_r1.1_Chr5.g038430.1 (FtPinG0006101400.01.T01) was filtered out by trimAl. Bootstrap values of ≥80% are indicated next to each node. The magenta dot corresponds to a whole-genome duplication. G and IS encode style length and style incompatibility, respectively. b, Ks between the S-ELF3 genes in F. esculentum, F. cymosum, F. urophyllum, and the paralogs in F. esculentum. Ks was calculated using codeml of the PAML package. The Ks estimates between FesPL4_sc0096.1.g000180.1 and the other Fagopyrum S-ELF3 homologues are > 3 and not shown.
a, Nucleotide diversity π of cultivated, Wild Tib, and other wild accessions (excluding Wild Tib) in the 50–60 Mb region of Chr1 containing the peak of PBS. π was calculated for each 1 Mb sliding window with a step of 200 kb. b, Lengths of the haplotypes identical to the reference genome in each phased genome are shown. Each haplotype was extended up- and downstream from a core SNP (Chr1 52,562,436) within the low diversity region until a mismatch to the reference genome was encountered. The region depicted corresponds to Chr1 52,325,106 to 52,906,013. Black rectangles indicate predicted protein-coding genes in the region.
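The haplotype extension in panel b can be sketched as a simple scan outward from the core SNP. This is an illustrative sketch under stated assumptions, not the authors' code: each phased haplotype is assumed to be reduced to a sorted list of SNP positions with a flag saying whether its allele matches the reference, and the tract is reported as the span between the outermost matching SNPs (the legend does not say whether the boundary is placed at the last match or at the first mismatch). The function name and inputs are hypothetical.

```python
from typing import List, Optional, Tuple


def identical_tract(
    positions: List[int],
    matches_ref: List[bool],
    core_pos: int,
) -> Optional[Tuple[int, int]]:
    """Extend from the core SNP in both directions until a mismatch.

    positions   : sorted SNP coordinates on the chromosome
    matches_ref : True where the haplotype allele equals the reference
    core_pos    : coordinate of the core SNP (must be in positions)

    Returns (start, end) of the identical tract, or None if the
    haplotype already differs from the reference at the core SNP.
    """
    core = positions.index(core_pos)
    if not matches_ref[core]:
        return None
    left = core
    while left > 0 and matches_ref[left - 1]:
        left -= 1
    right = core
    while right < len(positions) - 1 and matches_ref[right + 1]:
        right += 1
    return positions[left], positions[right]
```

The tract length for one haplotype is then `end - start`; long tracts shared by many haplotypes around the core SNP are what a low-diversity region would be expected to show.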
Fawcett, J.A., Takeshima, R., Kikuchi, S. et al. Genome sequencing reveals the genetic architecture of heterostyly and domestication history of common buckwheat. Nat. Plants 9, 1236–1251 (2023). https://doi.org/10.1038/s41477-023-01474-1