Introgressing the Aegilops tauschii genome into wheat as a basis for cereal improvement


Increasing crop production is necessary to feed the world’s expanding population, and crop breeders often utilize genetic variations to improve crop yield and quality. However, the narrow diversity of the wheat D genome seriously restricts its selective breeding. A practical solution is to exploit the genomic variations of Aegilops tauschii via introgression. Here, we established a rapid introgression platform for transferring the overall genetic variations of A. tauschii to elite wheats, thereby enriching the wheat germplasm pool. To accelerate the process, we assembled four new reference genomes, resequenced 278 accessions of A. tauschii and constructed the variation landscape of this wheat progenitor species. Genome comparisons highlighted diverse functional genes or novel haplotypes with potential applications in wheat improvement. We constructed the core germplasm of A. tauschii, including 85 accessions covering more than 99% of the species’ overall genetic variations. This was crossed with elite wheat cultivars to generate an A. tauschii-wheat synthetic octoploid wheat (A-WSOW) pool. Laboratory and field analysis with two examples of the introgression lines confirmed its great potential for wheat breeding. Our high-quality reference genomes, genomic variation landscape of A. tauschii and the A-WSOW pool provide valuable resources to facilitate gene discovery and breeding in wheat.

Fig. 1: RHI accelerates the transformation of the wild fragment to wheat.
Fig. 2: Geographical distribution and phylogenetic analysis of 278 resequenced A. tauschii accessions.
Fig. 3: Evolution of the A. tauschii genome.
Fig. 4: Variation landscape of six D genomes and 278 resequenced accessions of A. tauschii.
Fig. 5: Examples of stable A-WI lines in wheat improvement.

Data availability

All raw data of the genome, RNA sequencing and resequencing of A. tauschii were deposited in the National Center for Biotechnology Information (NCBI) under BioProject number PRJNA663737. All raw data of the wild population resequencing of A. tauschii were deposited in NCBI under BioProject number PRJNA705859. The assembly and annotation of these four genomes are available at China National GeneBank (CNGB) under accession number CNP0001325. All germplasm materials generated from this research have been stored in the State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University. These materials can be shared with researchers for academic purposes upon request to C.-P.S., Y.Z., H. Li or C.Z. Source data are provided with this paper.

Code availability

The custom pipelines and scripts used in the project have been deposited in GitHub (


References

  1. McFadden, E. S. & Sears, E. R. The origin of Triticum spelta and its free-threshing hexaploid relatives. J. Hered. 37, 81–89 (1946).

  2. Kihara, H. Discovery of the DD-analyser, one of the ancestors of Triticum vulgare. Agric. Hortic. 19, 13–14 (1944).

  3. Huang, S. et al. Genes encoding plastid acetyl-CoA carboxylase and 3-phosphoglycerate kinase of the Triticum/Aegilops complex and the evolutionary history of polyploid wheat. Proc. Natl Acad. Sci. USA 99, 8133–8138 (2002).

  4. Singh, N. et al. Genomic analysis confirms population structure and identifies inter-lineage hybrids in Aegilops tauschii. Front. Plant. Sci. 10, 9 (2019).

  5. Wang, J. et al. Aegilops tauschii single nucleotide polymorphisms shed light on the origins of wheat D-genome genetic diversity and pinpoint the geographic origin of hexaploid wheat. New Phytol. 198, 925–937 (2013).

  6. Dvorak, J. et al. The origin of spelt and free-threshing hexaploid wheat. J. Hered. 103, 426–441 (2012).

  7. Voss-Fels, K. et al. Subgenomic diversity patterns caused by directional selection in bread wheat gene pools. Plant Genome 8, plantgenome2015.2003.0013 (2015).

  8. Pont, C. et al. Tracing the ancestry of modern bread wheats. Nat. Genet. 51, 905–911 (2019).

  9. Zhou, Y. et al. Triticum population sequencing provides insights into wheat adaptation. Nat. Genet. 52, 1412–1422 (2020).

  10. Mirzaghaderi, G. & Mason, A. S. Broadening the bread wheat D genome. Theor. Appl. Genet. 132, 1295–1307 (2019).

  11. Wang, H. et al. Horizontal gene transfer of Fhb7 from fungus underlies Fusarium head blight resistance in wheat. Science 368, eaba5435 (2020).

  12. Kishii, M. An update of recent use of Aegilops species in wheat breeding. Front. Plant. Sci. 10, 585 (2019).

  13. Zhao, G. et al. The Aegilops tauschii genome reveals multiple impacts of transposons. Nat. Plants 3, 946–955 (2017).

  14. Luo, M. C. et al. Genome sequence of the progenitor of the wheat D genome Aegilops tauschii. Nature 551, 498–502 (2017).

  15. Matsuoka, Y., Takumi, S. & Kawahara, T. Natural variation for fertile triploid F1 hybrid formation in allohexaploid wheat speciation. Theor. Appl. Genet. 115, 509–518 (2007).

  16. Das, M. K., Bai, G., Mujeeb-Kazi, A. & Rajaram, S. Genetic diversity among synthetic hexaploid wheat accessions (Triticum aestivum) with resistance to several fungal diseases. Genet. Resour. Crop. Evol. 63, 1285–1296 (2016).

  17. Li, A. L., Liu, D. C., Yang, W. Y., Kishii, M. & Mao, L. Synthetic hexaploid wheat: yesterday, today, and tomorrow. Engineering 4, 552–558 (2018).

  18. Cox, T. S. et al. Comparing two approaches for introgression of germplasm from Aegilops tauschii into common wheat. Crop J. 5, 355–362 (2017).

  19. Zhang, D. et al. Development and utilization of introgression lines using synthetic octaploid wheat (Aegilops tauschii × hexaploid wheat) as donor. Front. Plant. Sci. 9, 1113 (2018).

  20. Hao, M. et al. The resurgence of introgression breeding, as exemplified in wheat improvement. Front. Plant. Sci. 11, 252 (2020).

  21. Watson, A. et al. Speed breeding is a powerful tool to accelerate crop research and breeding. Nat. Plants 4, 23–29 (2018).

  22. Rasheed, A. et al. Crop breeding chips and genotyping platforms: progress, challenges, and perspectives. Mol. Plant 10, 1047–1064 (2017).

  23. Sun, C. et al. The wheat 660K SNP array demonstrates great potential for marker assisted selection in polyploid wheat. Plant Biotechnol. J. 18, 1354–1360 (2020).

  24. Van Slageren, M. Wild Wheats: a Monograph of Aegilops L. and Amblyopyrum (Jaub. & Spach) Eig (Poaceae) (Wageningen Agricultural Univ., 1994).

  25. Jones, H. et al. Strategy for exploiting exotic germplasm using genetic, morphological, and environmental diversity: the Aegilops tauschii Coss. example. Theor. Appl. Genet. 126, 1793–1808 (2013).

  26. Zhang, C. et al. An ancestral NB-LRR with duplicated 3′UTRs confers stripe rust resistance in wheat and barley. Nat. Commun. 10, 4023 (2019).

  27. Arora, S. et al. Resistance gene cloning from a wild crop relative by sequence capture and association genetics. Nat. Biotechnol. 37, 139–143 (2019).

  28. Kihara, H. & Tanaka, M. Morphological and physiological variation among Aegilops squarrosa strains collected in Pakistan, Afghanistan and Iran. Preslia 30, 241–251 (1958).

  29. Eig, A. Monographisch-Kritische Ubersicht der Gattung Aegilops Vol. 55 (Verlag des Repertoriums, 1929).

  30. Tanaka, M. Geographical distribution of Aegilops species based on the collections at the Plant Germ-Plasm Institute, Kyoto University. In Proc. of the 6th International Wheat Genetics Symposium (ed. Sakamoto, S.) 1009–1024 (Kyoto University, 1983).

  31. Jaaska, V. NAD-dependent aromatic alcohol dehydrogenase in wheats (Triticum L.) and goatgrasses (Aegilops L.): evolutionary genetics. Theor. Appl. Genet. 67, 535–540 (1984).

  32. Dvorak, J., Luo, M. C., Yang, Z. L. & Zhang, H. B. The structure of the Aegilops tauschii genepool and the evolution of hexaploid wheat. Theor. Appl. Genet. 97, 657–670 (1998).

  33. Mizuno, N., Yamasaki, M., Matsuoka, Y., Kawahara, T. & Takumi, S. Population structure of wild wheat D-genome progenitor Aegilops tauschii Coss.: implications for intraspecific lineage diversification and evolution of common wheat. Mol. Plant 19, 999–1013 (2010).

  34. Zhao, L. B. et al. Fluorescence in situ hybridization karyotyping reveals the presence of two distinct genomes in the taxon Aegilops tauschii. BMC Genom. 19, 3 (2018).

  35. Dudnikov, A. J. Multivariate analysis of genetic variation in Aegilops tauschii from the world germplasm collection. Genet. Resour. Crop. Evol. 47, 185–190 (2000).

  36. Dudnikov, A. J. Allozyme variation in transcaucasian populations of Aegilops squarrosa. Heredity 80, 248–258 (1998).

  37. Zhang, D. et al. An advanced backcross population through synthetic octaploid wheat as a ‘bridge’: development and QTL detection for seed dormancy. Front. Plant. Sci. 8, 2123 (2017).

  38. Cheng, H. et al. Frequent intra- and inter-species introgression shapes the landscape of genetic variation in bread wheat. Genome Biol. 20, 136 (2019).

  39. Montenegro, J. D. et al. The pangenome of hexaploid bread wheat. Plant J. 90, 1007–1013 (2017).

  40. Jia, J. et al. Aegilops tauschii draft genome sequence reveals a gene repertoire for wheat adaptation. Nature 496, 91–95 (2013).

  41. Luo, M. C. et al. A 4-gigabase physical map unlocks the structure and evolution of the complex genome of Aegilops tauschii, the wheat D-genome progenitor. Proc. Natl Acad. Sci. USA 110, 7940–7945 (2013).

  42. Sun, S. L. et al. Extensive intraspecific gene order and gene structural variations between Mo17 and other maize genomes. Nat. Genet. 50, 1289–1295 (2018).

  43. Thind, A. K. et al. Chromosome-scale comparative sequence analysis unravels molecular mechanisms of genome dynamics between two wheat cultivars. Genome Biol. 19, 104 (2018).

  44. McHale, L. K. et al. Structural variants in the soybean genome localize to clusters of biotic stress-response genes. Plant Physiol. 159, 1295–1308 (2012).

  45. Dolatabadian, A. et al. Characterization of disease resistance genes in the Brassica napus pangenome reveals significant structural variation. Plant Biotechnol. J. 18, 969–982 (2020).

  46. Zhang, W. J. et al. Identification and characterization of Sr13, a tetraploid wheat gene that confers resistance to the Ug99 stem rust race group. Proc. Natl Acad. Sci. USA 114, E9483–E9492 (2017).

  47. Wang, M. et al. TaCYP81D5, one member in a wheat cytochrome P450 gene cluster, confers salinity tolerance via reactive oxygen species scavenging. Plant Biotechnol. J. 18, 791–804 (2020).

  48. Beales, J., Turner, A., Griffiths, S., Snape, J. W. & Laurie, D. A. A pseudo-response regulator is misexpressed in the photoperiod insensitive Ppd-D1a mutant of wheat (Triticum aestivum L.). Theor. Appl. Genet. 115, 721–733 (2007).

  49. Turner, A., Beales, J., Faure, S., Dunford, R. & Laurie, D. The pseudo-response regulator Ppd-H1 provides adaptation to photoperiod in barley. Science 310, 1031–1034 (2005).

  50. Eiko et al. Development of PCR markers for Tamyb10 related to R-1, red grain colour gene in wheat. Theor. Appl. Genet. 122, 1561–1576 (2011).

  51. Yong, Z. et al. Genome-wide association study for pre-harvest sprouting resistance in a large germplasm collection of Chinese wheat landraces. Front. Plant. Sci. 8, 401 (2017).

  52. Dong, Z. D., Chen, J., Li, T., Chen, F. & Cui, D. Q. Molecular survey of Tamyb10-1 genes and their association with grain colour and germinability in Chinese wheat and Aegilops tauschii. J. Genet. 94, 453–459 (2015).

  53. Lang, J. et al. Myb10-D confers PHS-3D resistance to pre-harvest sprouting by regulating NCED in ABA biosynthesis pathway of wheat. New Phytol. (2021).

  54. Hickey, L. T., Hafeez, A. N., Robinson, H., Jackson, S. A. & Wulff, B. B. H. Breeding crops to feed 10 billion. Nat. Biotechnol. 37, 744–754 (2019).

  55. Gao, C. Genome engineering for crop improvement and future agriculture. Cell (2021).

  56. Liu, Y. et al. Pan-genome of wild and cultivated soybeans. Cell 182, 162–176 (2020).

  57. Della Coletta, R., Qiu, Y., Ou, S., Hufford, M. B. & Hirsch, C. N. How the pan-genome is changing crop genomics and improvement. Genome Biol. 22, 3 (2021).

  58. Alkan, C., Sajjadian, S. & Eichler, E. E. Limitations of next-generation genome sequence assembly. Nat. Methods 8, 61–65 (2011).

  59. Pellicer, J., Fay, M. F. & Leitch, I. J. The largest eukaryotic genome of them all? Botanical J. Linn. Soc. 164, 10–15 (2010).

  60. Appels, R. et al. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science 361, eaar7191 (2018).

  61. Tao, Y., Zhao, X., Mace, E., Henry, R. & Jordan, D. Exploring and exploiting pan-genomics for crop improvement. Mol. Plant 12, 156–169 (2019).

  62. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009).

  63. McKenna, A. et al. The genome analysis toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).

  64. Danecek, P. et al. The variant call format and VCFtools. Bioinformatics 27, 2156–2158 (2011).

  65. Wang, K., Li, M. & Hakonarson, H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 38, e164 (2010).

  66. Retief, J. D. in Bioinformatics Methods and Protocols Vol. 132 (eds Misener, S. & Krawetz, S. A.) 243–258 (Humana Press, 2000).

  67. Patterson, N., Price, A. L. & Reich, D. Population structure and eigenanalysis. PLoS Genet. 2, 2074–2093 (2006).

  68. Van Berkum, N. L. et al. Hi-C: A method to study the three-dimensional architecture of genomes. J. Vis. Exp. (2010).

  69. Servant, N. et al. HiC-Pro: an optimized and flexible pipeline for Hi-C data processing. Genome Biol. 16, 259 (2015).

  70. Liu, B. et al. Estimation of genomic characteristics by analyzing k-mer frequency in de novo genome projects. Preprint at (2013).

  71. Koren, S. et al. Canu: scalable and accurate long-read assembly via adaptive k-mer weighting and repeat separation. Genome Res. 27, 722–736 (2017).

  72. Chin, C.-S. et al. Phased diploid genome assembly with single-molecule real-time sequencing. Nat. Methods 13, 1050–1054 (2016).

  73. Chin, C. S. et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat. Methods 10, 563–569 (2013).

  74. Walker, B. J. et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963 (2014).

  75. Burton, J. N. et al. Chromosome-scale scaffolding of de novo genome assemblies based on chromatin interactions. Nat. Biotechnol. 31, 1119–1125 (2013).

  76. Ruan, J. & Li, H. Fast and accurate long-read assembly with wtdbg2. Nat. Methods 17, 155–158 (2020).

  77. Kurtz, S. et al. Versatile and open software for comparing large genomes. Genome Biol. 5, R12 (2004).

  78. Simao, F. A., Waterhouse, R. M., Ioannidis, P., Kriventseva, E. V. & Zdobnov, E. M. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics 31, 3210–3212 (2015).

  79. Tang, H. et al. ALLMAPS: robust scaffold ordering based on multiple maps. Genome Biol. 16, 3 (2015).

  80. He, Y. et al. Long-read assembly of the Chinese Rhesus macaque genome and identification of ape-specific structural variants. Nat. Commun. 10, 4233 (2019).

  81. Chen, N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr. Protoc. Bioinformatics 4, 4.10 (2004).

  82. Yu, X. J., Zheng, H. K., Wang, J., Wang, W. & Su, B. Detecting lineage-specific adaptive evolution of brain-expressed genes in human using Rhesus macaque as outgroup. Genomics 88, 745–751 (2006).

  83. Birney, E., Clamp, M. & Durbin, R. GeneWise and Genomewise. Genome Res. 14, 988–995 (2004).

  84. Kim, D. et al. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36 (2013).

  85. Trapnell, C. et al. Transcript assembly and quantification by RNA-seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat. Biotechnol. 28, 511–515 (2010).

  86. Grabherr, M. G. et al. Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat. Biotechnol. 29, 644–652 (2011).

  87. Haas, B. J. et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 31, 5654–5666 (2003).

  88. Kent, W. J. BLAT - the BLAST-like alignment tool. Genome Res. 12, 656–664 (2002).

  89. Stanke, M. et al. AUGUSTUS: ab initio prediction of alternative transcripts. Nucleic Acids Res. 34, W435–W439 (2006).

  90. Burge, C. & Karlin, S. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78–94 (1997).

  91. Majoros, W. H., Pertea, M. & Salzberg, S. L. TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics 20, 2878–2879 (2004).

  92. Alioto, T., Blanco, E., Parra, G. & Guigó, R. Using geneid to identify genes. Curr. Protoc. Bioinformatics 64, e56 (2018).

  93. Bromberg, Y. & Rost, B. SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res. 35, 3823–3835 (2007).

  94. Haas, B. J. et al. Automated eukaryotic gene structure annotation using EVidenceModeler and the Program to Assemble Spliced Alignments. Genome Biol. 9, R7 (2008).

  95. Avni, R. et al. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication. Science 357, 93–97 (2017).

  96. Lowe, T. M. & Chan, P. P. tRNAscan-SE on-line: integrating search and context for analysis of transfer RNA genes. Nucleic Acids Res. 44, W54–W57 (2016).

  97. Nawrocki, E. P. & Eddy, S. R. Infernal 1.1: 100-fold faster RNA homology searches. Bioinformatics 29, 2933–2935 (2013).

  98. Hunter, S. et al. InterPro: the integrative protein signature database. Nucleic Acids Res. 37, D211–D215 (2009).

  99. Yu, G., Wang, L. G., Han, Y. & He, Q. Y. ClusterProfiler: an R package for comparing biological themes among gene clusters. OMICS 16, 284–287 (2012).

  100. Emms, D. M. & Kelly, S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 16, 157 (2015).

  101. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).

  102. Yang, Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591 (2007).

  103. Akdemir, K. C. & Chin, L. HiCPlotter integrates genomic data with interaction matrices. Genome Biol. 16, 198 (2015).

  104. Durand, E. Y., Patterson, N., Reich, D. & Slatkin, M. Testing for ancient admixture between closely related populations. Mol. Biol. Evol. 28, 2239–2252 (2011).

  105. Martin, S. H., Davey, J. W. & Jiggins, C. D. Evaluating the use of ABBA–BABA statistics to locate introgressed loci. Mol. Biol. Evol. 32, 244–257 (2015).

  106. Bosse, M. et al. Genomic analysis reveals selection for Asian genes in European pigs following human-mediated introgression. Nat. Commun. 5, 4392 (2014).

  107. Li, H. et al. Recombination between homoeologous chromosomes induced in durum wheat by the Aegilops speltoides Su1-Ph1 suppressor. Theor. Appl. Genet. 132, 3265–3276 (2019).

  108. Komuro, S., Endo, R., Shikata, K. & Kato, A. Genomic and chromosomal distribution patterns of various repeated DNA sequences in wheat revealed by a fluorescence in situ hybridization procedure. Genome 56, 131–137 (2013).

  109. Du, P. et al. Development of oligonucleotides and multiplex probes for quick and accurate identification of wheat and Thinopyrum bessarabicum chromosomes. Genome 60, 93–103 (2017).

  110. Meng, L., Li, H., Zhang, L. & Wang, J. QTL IciMapping: integrated software for genetic linkage map construction and quantitative trait locus mapping in biparental populations. Crop J. 3, 269–283 (2015).

  111. Kosambi, D. D. The estimation of map distances from recombination values. Ann. Eugen. 12, 172–175 (1943).

  112. Chen, S., Zhou, Y., Chen, Y. & Gu, J. Fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34, i884–i890 (2018).

  113. Kokot, M., Długosz, M. & Deorowicz, S. KMC 3: counting and manipulating k-mer statistics. Bioinformatics 33, 2759–2761 (2017).

  114. Guo, Z. et al. Discovery, evaluation and distribution of haplotypes of the wheat Ppd-D1 gene. New Phytol. 185, 841–851 (2010).

Acknowledgements


We are grateful to J. Dvorak (University of California, Davis), J. Wang (Sichuan Agricultural University), W. Ji (Northwest A&F University) and Z. Ru (Henan Institute of Science and Technology) for sharing germplasm. We thank M.-C. Luo (University of California, Davis), Y. Jiao (Institute of Botany, Chinese Academy of Sciences (CAS)), Z. Ni (China Agricultural University), Z. Tian (Institute of Genetics and Developmental Biology, CAS), W. Song (Northwest A&F University), L. Mao (Institute of Crop Sciences, CAAS), D. Wang (Henan Agricultural University) and J. Sun (Institute of Crop Sciences, CAAS) for helpful discussion on the project. We also thank E. Wang (Institute of Plant Physiology and Ecology, CAS), S. Song (University of Pennsylvania) and J. Adams (Nanjing University) for critical reading of the manuscript. This project was supported by grants from the Ministry of Agriculture of China (2016ZX08009), National Natural Science Foundation of China (31430061, 32001492 and 31871615) and Natural Science Foundation of Henan Province (202300410053).

Author information

Contributions

C.-P.S. and Y.Z. initiated and designed the project. S.B., C.Z., L.C., J.M., J.L. and J. Hu carried out the sequence data analysis. H. Li, D.Z., F.N., L.Z., R.F., H. Liang, Y.G., H.X. and S.X. developed the A-WSOW populations. G.G., S.B. and J. Hou extracted DNA and RNA. G.S., T.S. and W.J. contributed to the genome sequencing and resequencing. H. Li, F.N., L.Z., R.F. and A.S. performed the cytology experiment and karyotype pattern analysis. Y.Z., H. Li, F.M., D.Z., S.L., X.Z., G.G., L.L., F.N., X.Q., A.S. and Z.Z. performed the laboratory and field experiments and QTL analysis. C.-P.S., Y.Z., C.Z., H. Li and S.B. wrote the manuscript. C.-P.S., Y.Z., C.Z., H. Li, S.B. and J. Huang revised the manuscript.

Corresponding authors

Correspondence to Changsong Zou or Chun-Peng Song.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Plants thanks Alexandra Przewieslik-Allen and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 Rapid introgression of Ae. tauschii into wheat.

a, Workflow from the production of inter-hybrids to a stable introgression population in elite wheat. The upper picture shows the cross between Ae. tauschii and wheat, which produces the tetra-haploid (ABDD). The orange capital letters A, B and D represent the wheat A, B and D genomes, respectively. The purple D indicates the genome of Ae. tauschii, and the pink D indicates the genome formed by chromosomal exchange between Ae. tauschii and wheat. b, Distribution of introgressed alleles across the wheat D genome in the A-WI population derived from Ae. tauschii T015 and wheat cultivar Z18. Purple lines represent alleles introgressed from Ae. tauschii. The nicks on the pseudochromosomes mark the positions of the centromeric regions. c, Genome-FISH indicated that a genome fragment of Ae. tauschii (i) recombined onto the end of wheat 5DS (ii), forming a novel introgression line (iii). White, probes oligo-pSc119.2 and oligo-pTa71; green, oligo-(GAA)10; red, oligo-pTa535.

Extended Data Fig. 2 The A-WSOW pool formed by crossing between wheat and 85 core germplasm of Ae. tauschii.

a, Components of the core germplasm of Ae. tauschii. The core germplasm pool consists of 85 accessions from all sublineages: 24 L1EX, 38 L1EY, three L1E, two L1W, five L2W and 13 L2E accessions. Sublineage names are as described in Fig. 2. L1E denotes inter-sublineage accessions between L1EX and L1EY. The numbers in parentheses indicate the number of accessions in each sublineage. b, Spike and grain phenotypes of a subset of the A-WSOWs. AK58 and T093 are the wheat cultivar and the Ae. tauschii accession, respectively. c, Karyotypic patterns of a subset of the A-WSOWs. Most of the SOWs carry a complete chromosome set of Ae. tauschii, although they occasionally lose chromosomes. The ND-FISH probes oligo-pSc119.2 (green), oligo-pTa535 (red) and oligo-(GAA)10 (yellow) were used.

Extended Data Fig. 3 Comparison of karyotype pattern and subspecies index (SI) of different groups in Ae. tauschii.

a, The probe oligo-(GAA)10 showed diverse signal patterns among the different sublineages of Ae. tauschii. b, Sequential ND-FISH with the probe oligo-pTa535 was conducted to distinguish chromosomes 1D-7D of Ae. tauschii. Chinese Spring was used as the representative of wheat. c, Clustering based on the karyotype pattern. Red and blue squares indicate the presence and absence of a signal, respectively. The ND-FISH signals were converted into a digital matrix for cluster analysis. The green dot indicates Ae. tauschii AY61. The oligo-(GAA)10 probe can divide Ae. tauschii into L1 and L2. d, Distribution of the subspecies index (SI) values of spikelets. The upper right corner of the panel shows the measurement points for spikelet glume width (G) and rachis segment width (R). The SI value is calculated as SI = G/R. Violin plots illustrate the density distribution of SI. The box shows the interquartile range, and the horizontal line in the box represents the median. The dashed line marks an SI value of 1.3. The difference between L1 and L2 was analysed with the two-sided Wilcoxon signed-rank test (P < 2.2 × 10^-16); *** indicates P < 0.001.
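As a small worked illustration of the SI formula in panel d, the sketch below computes SI = G/R for a few spikelet measurements and reports on which side of the 1.3 reference line each value falls. The function names and the measurement values are hypothetical and are not taken from the paper; the caption does not state which lineage lies on which side of the line, so the code makes no lineage call.

```python
def subspecies_index(glume_width, rachis_width):
    """Return SI = G / R for one spikelet (both widths in the same unit)."""
    if rachis_width <= 0:
        raise ValueError("rachis segment width must be positive")
    return glume_width / rachis_width


def is_above_reference(si, reference=1.3):
    """True if SI lies above the dashed reference line (1.3 in the caption)."""
    return si > reference


# Hypothetical (G, R) measurements in millimetres, for illustration only.
measurements = [(4.2, 2.8), (3.0, 3.1), (3.6, 3.0)]
for g, r in measurements:
    si = subspecies_index(g, r)
    side = "above" if is_above_reference(si) else "at or below"
    print(f"G={g}, R={r}, SI={si:.2f} ({side} 1.3)")
```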

Extended Data Fig. 4 Heatmaps represent the chromatin interaction matrix of Hi-C pseudo-chromosomes.

a and b show the maps of Ae. tauschii AY61 and T093, respectively. The Hi-C heatmaps are shown at a resolution of 1 Mb.

Extended Data Fig. 5 Comparison between the long-read based and short-read based genome assembly.

a, Treemaps showing the difference in fragmentation between long-read (PacBio) and short-read (Illumina) assemblies of the Ae. tauschii genome. The colored rectangles represent the longest contigs, which together account for ~200 Mb of the whole assembly. b-e, Assemblies of AY61, T093, AL8/78 and CS-D, respectively. Yellow boxes, contigs > 3 Mb; gray boxes, contigs of 0.5-3 Mb; pink boxes, contigs of 0.1-0.5 Mb; orange boxes, contigs < 0.1 Mb. f, Distribution of gaps across the gene body and the upstream and downstream regions in the different assemblies. Each gene body was divided into ten equal-sized bins, and the gap content of each bin was normalized across the total gene set.
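The binning described for panel f can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: gaps are taken to be runs of `N` in the assembled sequence, and "normalized" is read here as the mean gap fraction per bin across all genes. The function names and toy sequences are hypothetical.

```python
def gap_fraction_per_bin(gene_seq, n_bins=10):
    """Split a gene-body sequence into n_bins near-equal bins and
    return the fraction of gap bases ('N') in each bin."""
    length = len(gene_seq)
    fractions = []
    for k in range(n_bins):
        start = k * length // n_bins
        end = (k + 1) * length // n_bins
        chunk = gene_seq[start:end].upper()
        fractions.append(chunk.count("N") / len(chunk) if chunk else 0.0)
    return fractions


def mean_gap_profile(gene_seqs, n_bins=10):
    """Average the per-bin gap fraction over a set of genes
    (one possible reading of 'normalized' in the caption)."""
    profiles = [gap_fraction_per_bin(seq, n_bins) for seq in gene_seqs]
    return [sum(p[k] for p in profiles) / len(profiles) for k in range(n_bins)]


# Toy gene bodies: the first has a gap at its 5' end, the second has none.
genes = ["NN" + "ACGT" * 4 + "AC", "ACGT" * 5]
print(mean_gap_profile(genes))
```

Upstream and downstream flanks could be profiled the same way by passing the flanking sequence instead of the gene body.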

Extended Data Fig. 6 L1 of Ae. tauschii has no significant introgression to wheat at the population level.

a, D-statistics based on ABBA and BABA SNP frequency differences. We found significant introgression events from wheat to L2, L2 to wheat, and L1 to L2. However, no significant event was detected from L1 to wheat. The genome-wide D statistic was calculated with AdmixTools. If no introgression occurred from population P3 to P2, the expected D statistic is zero. If introgression occurred from P3 to P2, the D statistic is significantly smaller than zero, whereas a significantly positive value indicates introgression from P3 to P1. We divided the chromosomes into 500 kb bins and then performed a block jackknife to calculate the Z-score. The D statistic was considered significant if the absolute Z-score was greater than 3. b and c, The distribution and overlap of introgression blocks from L1 to the wheat D genome identified with three independent methods: phylogenetic topology (gene tree), identity by descent (IBD) and fd. A 500 kb window was used for the analysis. Single-copy genes of Ae. tauschii T093, AY61, CS-D and CS-A were used to construct gene trees in 500 kb windows. Regions with the phylogenetic topology ((AY61, (CS-D, T093)), CS-A) were taken as introgression blocks from D1 to CS-D. IBD was identified among wheat, L1 and L2, and the number of recorded IBD tracts between wheat and the two groups was computed in 500 kb windows. The putative introgression segments from L1 to each of the wheat accessions were identified according to a previously reported method106. At the population level, introgression fragments detected in more than half of the wheat individuals were considered candidate fragments. The fd statistic was computed under the four-taxon topology ((L2, wheat), L1, SS) in 500 kb windows; an fd value between 0 and 1 indicates the proportion of introgression from population L1 to wheat. 
All three methods detected less than 6% of the area detected by any detections and none of the region.
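The D statistic and block jackknife described for panel a can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the AdmixTools implementation: it uses derived-allele frequencies per site, takes D = (BABA - ABBA) / (BABA + ABBA) so that the sign matches the caption (negative for introgression from P3 to P2), and applies an unweighted delete-one-block jackknife, whereas AdmixTools uses a weighted one. All function names and the example numbers are hypothetical.

```python
import math


def abba_baba(p1, p2, p3, po):
    """Expected ABBA and BABA pattern frequencies at one site,
    given derived-allele frequencies in P1, P2, P3 and the outgroup."""
    abba = (1 - p1) * p2 * p3 * (1 - po)
    baba = p1 * (1 - p2) * p3 * (1 - po)
    return abba, baba


def sum_by_block(sites, block_size=500_000):
    """Sum ABBA and BABA within fixed-size bins (500 kb in the caption).

    sites: iterable of (chrom, pos, p1, p2, p3, po).
    """
    blocks = {}
    for chrom, pos, p1, p2, p3, po in sites:
        abba, baba = abba_baba(p1, p2, p3, po)
        acc = blocks.setdefault((chrom, pos // block_size), [0.0, 0.0])
        acc[0] += abba
        acc[1] += baba
    return [tuple(blocks[key]) for key in sorted(blocks)]


def d_statistic(block_sums):
    """D with the sign convention assumed above."""
    abba = sum(a for a, _ in block_sums)
    baba = sum(b for _, b in block_sums)
    return (baba - abba) / (baba + abba)


def block_jackknife_z(block_sums):
    """Return (D, Z) using an unweighted delete-one-block jackknife."""
    n = len(block_sums)
    d_full = d_statistic(block_sums)
    leave_one_out = [
        d_statistic(block_sums[:i] + block_sums[i + 1:]) for i in range(n)
    ]
    mean = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((d - mean) ** 2 for d in leave_one_out)
    if variance == 0:
        raise ValueError("jackknife variance is zero; Z-score is undefined")
    return d_full, d_full / math.sqrt(variance)


# Hypothetical per-block (ABBA, BABA) sums with an excess of ABBA.
example_blocks = [(12.0, 8.0), (10.0, 7.0), (15.0, 9.0), (11.0, 6.0)]
d_value, z_score = block_jackknife_z(example_blocks)
significant = abs(z_score) > 3  # threshold stated in the caption
print(d_value, z_score, significant)
```

With real data the number of 500 kb blocks is in the thousands, so the unequal block sizes at chromosome ends matter little; the weighted jackknife in AdmixTools accounts for them explicitly.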

Extended Data Fig. 7 Examples of the variation landscape in the wild population of Ae. tauschii.

a, Example of a 5-kb presence/absence variation (PAV) of Sr13-like genes, members of the NB-ARC gene family. b, An example of tandem-repeat variation of P450 genes involved in positively regulating salt stress resistance in wheat. c, Heatmap showing the expression diversification of the tandem-repeat genes mentioned in b in Ae. tauschii and CS; the expression level of the tandem-repeat genes in response to ABA, salt and PEG stress treatments was significantly higher than that in the wheat genome. d, Genetic variation of Ppd1 in the Ae. tauschii and wheat populations. In the heatmap, red indicates polymorphic sites (1/1), pink indicates heterozygous sites (0/1), gray indicates no polymorphism (0/0), and light green indicates missing data (./.). Circled numbers 1 to 4 indicate the variations reported in previous studies114. Circles 1 and 2 indicate the 24 bp and 15 bp insertions in the upstream region, respectively. Circles 3 and 4 indicate the 5 bp deletion in the seventh exon and the 18 bp insertion in the eighth exon, respectively. The two sites marked by red asterisks can cause amino acid changes and are significantly associated with flowering date. e, Using the two loci mentioned in d, Ae. tauschii can be divided into three haplotypes, and the flowering date of haplotype 1 (n = 53) is significantly earlier than that of haplotypes 2 (n = 97) and 3 (n = 15). The middle bars represent the median, the bottom and top of each box represent the 25th and 75th percentiles, respectively, and the whiskers extend to 1.5 times the interquartile range. A two-tailed Wilcoxon test was used to assess the statistical significance of differences between groups.

Supplementary information

Supplementary Information

Supplementary Text 1.1–1.4, Figs. 1–12, Tables 4, 5, 7–12, 14, 17 and 18 and references.

Reporting Summary

Supplementary Tables

Supplementary Tables 1–3, 6, 13, 15 and 16.

About this article

Cite this article

Zhou, Y., Bai, S., Li, H. et al. Introgressing the Aegilops tauschii genome into wheat as a basis for cereal improvement. Nat. Plants (2021).
