Introgressions from crop wild relatives (CWRs) have been used to introduce beneficial traits into cultivated plants. Introgressions have traditionally been detected using cytological methods. Recently, single nucleotide polymorphism (SNP)-based methods have been proposed to detect introgressions in crosses for which both parents are known. However, for unknown material, no method was available to detect introgressions and predict the putative donor species. Here, we present a method to detect introgressions and the putative donor species. We demonstrate the utility of this method using 10 publicly available wheat genome sequences and identify nine major introgressions. We show that the method can distinguish different introgressions at the same locus. We trace introgressions to early wheat cultivars and show that natural introgressions were utilised in early breeding history and still influence elite lines today. Finally, we provide evidence that these introgressions harbour resistance genes.
Wheat (Triticum aestivum, 2n = 6x = 42, BAD) is one of the most widely grown1 and consumed crops in the world2. Due to expected population growth and declining acreage for crop cultivation, projections indicate that the wheat yield per hectare will need to increase significantly over the next decades3. More frequent and severe abiotic and biotic stresses will also affect crop yield4,5, so breeding for yield stability in wheat is becoming more important6. As a result of rapid advances in sequencing technologies combined with decreasing costs, the first complete wheat genome has been sequenced7 and the first step towards a wheat pan-genome sequence has been made8. These resources, combined with high-density SNP matrices and high-throughput phenotyping, allow for the identification of genes related to specific traits and the improvement of breeding strategies.
Traditional breeding strategies, which are based on crossing elite cultivars and selecting the best offspring, have increased yield, but have decreased the genetic diversity of crop plants compared with that of their CWRs9. Interspecific hybridisation, also known as interspecies crossing, wide hybridisation, or distant hybridisation, allows the transfer of DNA from a donor species into a crop plant. This strategy has been used in breeding to add desired traits to crop plants, for example, resistance or tolerance to biotic or abiotic stress10,11,12. The foreign DNA that is derived from the donor species and has been integrated into the genome is called an introgression. Traditionally, large introgressions have been detected using cytological methods13,14,15,16, but these methods are low-throughput.
Fortunately, the wealth of wheat reference quality assemblies (RQAs)8 and next generation sequencing (NGS) data for CWRs make it possible to investigate interspecific hybridisation events without cytological methods. Analyses of SNPs can be used to detect introgressed regions in crop plants17,18. If the donor is unknown, coverage analysis or transposable element analysis can be used to detect candidate regions of introgressions8,19. Such computational methods can be used to trace introgressed regions in breeding materials, and this information can be used in breeding programs to minimise such regions to increase wheat yield and yield stability.
However, the methods mentioned above do not identify putative donor species, which is especially interesting for elite lines with multiple introgressions from different donors or for old landraces. For such wheat accessions, the origin of a genomic region that might influence an important trait is often unknown. This also hampers the search for additional beneficial alleles. Here, we describe a method for identifying introgressions and predicting putative donor species and demonstrate its applicability using 10 RQAs of wheat.
Detection of nine introgressions in 10 wheat RQAs
We aimed to detect introgressions by mapping publicly available short reads to wheat reference genomes8 (Table S1). We identified putative introgressions as regions with decreased coverage of reads from the progenitor species of the wheat subgenomes; that is, for a genomic window, the proportion covered by reads is decreased (Methods). Triticum urartu (2n = 2x = 14, Au) and Aegilops tauschii (2n = 2x = 14, D) are the donors of the A and D subgenomes of wheat20,21,22, respectively, whereas the donor of the B subgenome is either not known yet or extinct23. Aegilops speltoides (2n = 2x = 14, S) is the most closely related remnant taxon to the wheat B subgenome donor23,24,25. For this reason, decreased coverage of reads from the subgenome donors can be expected at introgressed regions of the A and D subgenomes, but not for the B subgenome using Ae. speltoides as the subgenome donor. Increased coverage of reads from a wild relative in the same region indicates that it is a putative donor species of this introgression. Because several wheat relatives share a common ancestor or were subgenome donors for polyploid Triticum and Aegilops taxa, increased coverage of reads from a wheat relative at an introgressed region in an elite background provides hints about the source of the introgression, but does not necessarily identify the exact donor species.
Using this method and a wide range of wheat relatives8,26,27 (Table S2), we detected nine large introgressions in the 10 wheat RQAs, located on chromosomes 2A, 2B, 2D, 3D, and 4A (Fig. 1, Table S3, Supplementary Data 1). For all nine introgressions, we identified a putative donor.
Interestingly, two introgressions were detected in all 10 RQAs; one was close to, but downstream of the centromere of chromosome 2A (introgression 3) and the other was on chromosome 4AL (introgression 9). On the one hand, we detected regions with decreased coverage on chromosomes 2A and 4AL in Triticum urartu, Triticum boeoticum (2n = 2x = 14, Ab) and Triticum monococcum (2n = 2x = 14, Ab) indicating that these three species share similar sequences in these regions of 2A and 4AL. On the other hand, normal coverage in these regions was detected in polyploid relatives including Triticum dicoccoides (2n = 4x = 28, BA), Triticum dicoccon (2n = 4x = 28, BA), Triticum spelta (2n = 6x = 42, BAD), and Triticum sphaerococcum (2n = 6x = 42, BAD) (Fig. 2), suggesting that these species share a similar sequence compared to the reference sequence of cv. Julius. This introgressions might be common to all these polyploid taxa. One exception was accession K240104 of T. dicoccoides originating from Syria27, in which a low-coverage region was detected on chromosome 2A. Based on the assumption that Triticum urartu is the A-subgenome donor and Aegilops speltoides, which shows increased coverage in these regions, is the B-subgenome donor, we might speculate that these introgressions might date back to the time of hybridization between A and B subgenome.
The low-coverage region on chromosome 2A is located at about 395–407 Mb in Chinese Spring and harbours 17 annotated protein-coding genes (TraesCS2A01G257500–TraesCS2A01G259100). Upstream of this introgression is another small low-coverage region overlapping with the position of the centromere8,26,27. Similar small low-coverage regions overlapping with the centromere were also detected on chromosomes 1A to 5A.
Chromosome 4A is structurally highly rearranged and has retained a large portion of chromosome 7BS through a species-specific translocation28,29. Studies using cytological and NGS methods have clearly shown the rearrangement on 4AL30,31. Interestingly, we did not detect regions with increased coverage of reads from wheat chromosome 7B in T. urartu, T. boeoticum, or T. monococcum. Since both introgressions (introgressions 3 and 9) are very old and were found in all wheat RQAs as well as in the polyploid CWRs, they were not treated as introgressions in a recent RQA analysis8.
We also detected four previously described introgressions. Firstly, two introgressions on chromosome 2B and 2BL (introgressions 4 and 5), potentially originating from Triticum timopheevii (2n = 4x = 28, GAt), were detected in cv. LongReach Lancer and cv. Julius, respectively. Both cultivars are descendants of cv. Wisconsin-2458,32, which has an introgression of the nearly complete chromosome 2G of T. timopheevii (Fig. S1). This introgression harbours the resistance gene Sr36 on chromosome 2BS33,34. Interestingly, cv. LongReach Lancer has a long introgression spanning chromosome 2BS and 2BL8, whereas cv. Julius harbours a much smaller introgression on chromosome 2BL that, to the best of our knowledge, has not been previously associated with T. timopheevii.
Secondly, an introgression was detected only in cv. LongReach Lancer on chromosome 3DL with the putative donor Thinopyrum ponticum (2n = 10x = 70, EEEStESt) (introgression 8). A previous study of cv. LongReach Lancer detected an introgression on chromosome 3DL from Th. ponticum8.
Thirdly, an introgression on chromosome 2AS (introgression 1) was detected in four cultivars; cv. CDC Stanley, cv. Jagger, cv. Mace, and cv. SY Mattis, with Aegilops comosa (2n = 2x = 14, M) and Aegilops uniaristata (2n = 2x = 14, N) identified as the putative diploid donors (Fig. 3). It is known from their pedigree that these four cultivars harbour an introgression from tetraploid Aegilops ventricosa (2n = 4x = 28, DN)35,36,37, which shares the N subgenome with Ae. uniaristata38. Using recently published whole genome sequencing (WGS) data for Ae. ventricosa8, a high-coverage region was detected in the same region on chromosome 2AS (Fig. 3).
In the same region on chromosome 2AS, a second introgression (introgression 2) was detected in the remaining six cultivars (cv. ArinaLrFor, cv. CDC Landmark, cv. Chinese Spring, cv. Julius, cv. LongReach Lancer, and cv. Norin61), with Aegilops markgrafii (2n = 2x = 14, C) identified as the putative diploid donor (Fig. 3). Recently, an introgression from Ae. markgrafii was produced for this region39, but it cannot be the source for these six cultivars because some of them were released a long time ago. Nevertheless, it shows that an introgression from Ae. markgrafii is likely in this region.
For chromosome 2D, an introgression (introgression 6) was detected in four cultivars (cv. ArinaLrFor, cv. Jagger, cv. Julius, and cv. SY Mattis) with the putative donor identified as Ae. markgrafii or Ae. umbellulata (2n = 2x = 14, U). A previously reported introgression in this region19 is enriched in elite materials from Western Europe (Lehnert et al., unpublished data) and explains the highest proportion of variance in the grain yield of these materials when cultivated under optimum conditions40.
Finally, an introgression on 3DS (introgression 7) from the putative diploid donor species Ae. comosa (2n = 2x = 14, M) or Ae. uniaristata (2n = 2x = 14, N) was detected in four cultivars (cv. CDC Landmark, cv. CDC Stanley, cv. Julius, and cv. SY Mattis). We suggest that, due to the very narrow native distribution range of these two species, a polyploid taxon such as Ae. ventricosa (2x = 4n = 28, DN) or Ae. geniculata (2x = 4n = 28, MU), which are much more widespread and share a subgenome with these diploid putative donors38, is more likely to be the donor of this introgression, similar to the observation for introgression 1. To our knowledge, introgression 7 has not yet been described in the literature.
Most of these introgressions were detected using low cost and low-coverage GBS data26,27. To further reduce the costs of genotyping, we tested how much data needs to be generated from CWRs to reasonably describe introgressions in a species. To determine how much data is required to detect an introgression, reads of Ae. markgrafii mapped to cv. Julius were exemplarily subsampled from 100 to 1%. Again, the percentage of fixed-size chromosomal windows that was covered by reads was determined (Fig. S2). When using 10% and 1% of the data, higher values were detected for the introgressed region compared with the flanking regions, indicating that even 1% of these low-coverage GBS data is enough to detect introgressions.
Besides the nine introgressions, several additional patterns were detected in the coverage profiles (Supplementary Data 1). These patterns included several chromosomes with increased coverage of reads from T. boeoticum and T. monococcum, while the coverage profile of the A subgenome donor T. urartu was not decreased. Such patterns were detected on chromosome 1AS in cv. ArinaLrFor and cv. Norin61; on chromosome 5AL in cv. ArinaLrFor, cv. CDC Stanley, cv. Jagger, cv. Julius, cv. LongReach Lancer, and cv. SY Mattis; on chromosome 6AS and 6AL in all 10 cultivars; and on chromosome 7AL in cv. CDC Stanley, cv. Mace, and cv. LongReach Lancer (Supplementary Data 1). Previous studies have reported introgressions on wheat chromosomes 5AL and 7AL from T. monococcum41 and T. boeoticum42, respectively, although the introgressed regions differ. In contrast, the introgression on chromosome 1AS has been described for cv. Norin61 without any evidence for its donor species or putative origin43. Smaller patterns were also detected, for instance, on chromosome 2AL in cv. CDC Stanley and cv. SY Mattis, and on chromosome 5DS in all 10 wheat genomes. These patterns were quite diverse and need to be analysed in more detail in further studies. In the remainder of this manuscript, we focus on the previously uncharacterised introgressions on chromosomes 2AS, 2DL, and 3DS (introgression 2, 6, 7).
Some introgressions are derived from natural interspecific hybridisations
The observed introgressions were classified into three groups based on their first appearance. The first group (introgressions 3 and 9) consisted of ancient introgressions that probably date back to the time of tetraploidisation of wheat. The second group (introgressions 1, 4, 5, and 8) could be assigned to known interspecific crosses during research and breeding in recent decades. The third group consisted of introgressions whose time of first appearance is still unknown; namely introgression 2 on chromosome 2AS, introgression 6 on chromosome 2DL, and introgression 7 on chromosome 3DS. We checked whether these introgressions were present in some old wheat cultivars that were released before interspecific hybridisation was introduced as a breeding method. For example, the old French wheat cultivar Vilmorin-27, which was released in 1928, is an ancestor of cv. Julius and many contemporary European elite cultivars32. Based on publicly available genotyping-by-sequencing (GBS) data for cv. Vilmorin-2719, no decreased coverage was detected for the introgressed regions in cv. Julius, indicating that these introgressions were already present in cv. Vilmorin-27 (Fig. S3). To further explore the distribution of these introgressions, the ancestors of cv. Vilmorin-27 were extracted from the pedigree and their seeds were obtained from the genebank at Gatersleben (Table S4). The whole genomes of four plants of each of these cultivars were re-sequenced with low sequencing depth (Table S5) and compared with the cv. Julius reference genome (Fig. S3).
For the low-coverage region on chromosome 2AS (introgression 2), cv. Gros Bleu and cv. Japhet showed normal coverage for most of the region and only low coverage at the end of the region, while the other cultivars showed uniform and high coverage in the complete region, indicating that this introgression was potentially widely utilised in the nineteenth century.
For chromosome 2DL (introgression 6), three different states were observed: First, cv. Hatif Inversable, cv. Dattel, cv. Ble Seigle, and cv. Noe had low coverage in this region, indicating that the analysed individuals did not carry the introgression; second, cv. Japhet had high coverage only at the beginning of the region; and third, cv. Gros Bleu and cv. Bon Fermier had low coverage only at the end of the region. We did not find the complete introgression in any individual of the analysed ancestors.
Three different states were also observed for chromosome 3DS (introgression 7): First, cv. Noe and cv. Dattel had low coverage in this region, indicating that the analysed individuals did not carry the introgression; second, cv. Japhet and cv. Gros Bleu had low coverage only at the very end of this region; and third, cv. Hatif Inversable, cv. Bon Fermier, and cv. Ble Seigle had high coverage in the complete region.
Some of the observed patterns can be traced in the pedigree of cv. Vilmorin-27. For instance, cv. Bon Fermier probably obtained the introgression on chromosome 2DL from cv. Gros Bleu and that on chromosome 3DS from cv. Ble Seigle. Other observed patterns, for example, the introgression on chromosome 2DL in cv. Gros Bleu, could not explained by the pedigree or the current data. Notably, genebank accessions can be heterogeneous, so we may have missed individuals harbouring the complete introgression. Some descendants of cv. Noe may have inherited the complete introgression on chromosome 2DL from cv. Vilmorin-27. Interestingly, cv. Vilmorin-27 is the oldest of the analysed cultivars that carries all three complete introgressions. Nevertheless, single introgressions or parts thereof, as well as combinations, were detected in old cultivars.
Wide heterogeneity exists within and among genebank accessions
To search for the first occurrence of introgressions on chromosome 2DL in wheat, the genomes of old cultivars maintained ex situ were sequenced. To avoid missing introgressions, we did not use materials descended from single seeds in these analyses. Hence, seeds obtained directly from the genebank stocks at Gatersleben were used. The DNA of four individuals per accession was isolated and sequenced (Table S5), and the coverage profiles were compared among the four individuals. We detected remarkable differences in coverage profiles among the four individuals in six of the eight accessions (Supplementary Data 2). Thus, at least one of these four plants carried a chromosomal modification. An extreme example are the three different profiles detected for chromosome 6B in cv. Krymka (shown in red, green, and blue/black in Fig. 4). The differences were large, consisting of several megabases. Interestingly, there were no obvious differences in the phenotypes of these four individual plants when cultivated under greenhouse conditions (Fig. S4 shows the spikes of four individual cv. Krymka plants). We also analysed some CWR accessions using this method. For Aegilops cylindrica, one individual from genebank accession AE 656 showed a different coverage profile for chromosomes 1D, 3D, 4D, and 5D, indicating that it also carries introgressions (Fig. S5). This is plausible given earlier studies on the potential for gene transfer in Ae. cylindrica44.
Introgressed regions harbour homologues of resistance genes
Introgressions have often been initiated to transfer resistance genes from CWRs into breeding materials. Often, the resistance gene is then tracked with linked markers in the introgression or in its flanking sequences45. We utilised recently published markers to identify the candidate gene PGSB_gene_194535, which may be homologous to the leaf rust resistance gene LrM39. In addition, we used the recently described yellow rust resistance genes Yr5 and Yr746 for homology-based gene prediction within the described introgression regions in the 10 RQAs. Besides a high amino acid sequency identity in a pairwise alignment of predicted and reference proteins (≥ 80%), all predicted genes had the same number of coding exons (ce) as the reference gene (rce) and encoded a protein of similar length (aa vs. raa) (Table S6).
For PGSB_gene_1945, homology-based gene prediction using GeMoMa predicted 10 genes—one in each RQA on chromosome 2AS located in the region of the introgression. Interestingly, there were only two sequences of these predicted genes. For cv. CDC Stanley, cv. Jagger, cv. Mace, and cv. SY Mattis, the predicted gene product showed 100% amino acid sequence identity to the product of PGSB_gene_1945. The other six cultivars harboured a gene whose predicted product showed 95.6% amino acid sequence identity to the product of PGSB_gene_1945. These two groups of cultivars were completely consistent with the clusters formed based on the introgression on chromosome 2AS.
Two putative homologues of the resistance gene Yr7 were detected on chromosome 2B. The homologue in this region in cv. CDC Landmark, cv. CDC Stanley, and cv. Mace, encoded a protein with 99.9% amino acid sequence identity to Yr7. The pedigree of these three cultivars indicates that Yr7 was introduced from the tetraploid durum wheat cv. Iumillo into the wheat cv. Thatcher47. The homologue in this region in cv. CDC Landmark, cv. CDC Stanley, cv. Mace, and cv. SY Mattis encoded a protein with 84.9% amino acid sequence identity to Yr7. The remaining three cultivars did not harbour any genes encoding a protein with at least 80% amino acid sequence identity to Yr7.
Three homologues of the resistance gene Yr5 were identified. The Yr5 homologue on chromosome 2B in cv. Julius and cv. LongReach Lancer encoded a protein with 91.1% amino acid sequence identity to Yr5. This predicted gene was located within a potential introgression from T. timopheevii, which is present on chromosome 2BL in both cultivars. The second and third predicted genes were located within the described introgression on chromosome 2DL. The homologue in cv. Jagger and cv. Julius encoded a protein with 99.2% amino acid sequence identity to Yr5, and that in cv. ArinaLrFor and cv. SY Mattis encoded a protein with 98.1% amino acid sequence identity to Yr5. Interestingly, the predicted genes in cv. Jagger and cv. Julius were identical to the Yr5 allele from cv. Claire46. Since the predicted genes were located within the introgression on chromosome 2DL, we speculate that there have been at least two independent events or mutations at this locus. The predicted genes were located in the first part of the introgression that was present in cv. Japhet, cv. Gros Bleu and cv. Bon Fermier. Hence, these cultivars might harbour an allele conferring increased resistance against yellow rust.
Hybridisation between wheat and its wild relatives occurs naturally, but can also be conducted during breeding to introduce beneficial traits into elite wheat breeding material. Here, we have shown for the first time that introgressions and their putative donor species can be identified without prior knowledge of the pedigree. Moreover, these introgressions can be easily traced, facilitating the desirable but challenging task to decrease introgression fragments to reduce linkage drag. The described bioinformatics method can be used for GBS, whole genome exome capture, and whole genome resequencing data. Large introgressions can be detected from GBS data, which is relatively cheap to obtain. Smaller introgressions may be identified from data generated using more expensive, but also more informative techniques such as exome capture and whole genome resequencing.
We have identified several known and previously unknown introgressions in 10 RQAs of wheat. Some of these introgressions are well described and were conducted in the framework of research and breeding programs. Interspecies crosses in wheat were described for the first time in the last quarter of the nineteenth century48. However, we also identified introgressions in old cultivars released in the first half of the nineteenth century. This observation is consistent with findings of introgressions from T. monococcum in bread wheat cv. Mediterranean41 released in 1837. We hypothesise that the introgressions found in old cultivars resulted from spontaneous natural interspecific crosses that were subsequently selected. For various reasons, it is almost impossible to determine the exact origin of these introgressions. These reasons include (i) the common subgenomes of wheat relatives; (ii) the heterogeneity of old landraces; (iii) conservation issues over the centuries, including seed exchange and multiplication leading to the potential loss of alleles and/or contamination that have affected the integrity of genebank accessions; and (iv) the lack of data on the exact timing and location of the first occurrence of these introgressions. Nevertheless, we were able to narrow down the putative donors for several introgressions. These findings will be useful in the search for alternative alleles in donor species that might be valuable for crop improvement under changing climatic conditions and increasingly severe biotic stress.
Further, within the introgressed DNA regions, we identified genes showing strong sequence identity to known resistance genes. Further research is required to test their efficacy against pathogens. Interestingly, the old cv. Noe without the 2DL introgression was described as quite susceptible to rust. In contrast, cv. Gros Bleu and cv. Japhet, which are different selections of cv. Noe that carry a large and a small fragment of the 2DL introgression, respectively, are much more resistant to rust than is cv. Noe49. Within the common part of this introgression, we identified a homologue of the resistance gene Yr5. The discovery of a Yr5 homologue in the introgressed region on chromosome 2DL of cv. Julius and cv. Jagger proves that the Yr5 allele of cv. Claire is indeed located on chromosome 2D and not on chromosome 2B like the original Yr546. In addition, we predicted another allele of Yr5 in the introgressed region on chromosome 2DL of cv. ArinaLrFor and cv. SY Mattis, indicating that there may have been independent introgression events. We also detected Yr5 orthologs on chromosome 2BL and 2DL in cv. Julius, demonstrating that the combination of introgressions on different chromosomes allows for stacking of homoeologous resistance genes. Finding homologue resistance genes in CWRs that can be introgressed into different subgenomes of wheat might facilitate combining resistance genes to yield durable and broadly resistant lines.
Interestingly, we detected Yr5 at the beginning of the introgression on chromosome 2DL. Some wheat cultivars, e.g. cv. Japhet., carry only the first part of the introgression. Nevertheless, the complete introgression on chromosome 2DL is highly enriched in Western European elite winter wheat materials, indicating that at least one other trait may be affected by this introgression, potentially by a gene or genes located in the latter part.
Genome assemblies have been published for some wheat relatives50,51,52,53,54,55, but not for all of them. Therefore, for building a wheat super-pangenome, it is more promising to conduct genome sequencing and assembly for CWRs than for elite cultivars that share a large part of the genome with already sequenced wheat cultivars56. With the genome assemblies of wheat CWRs and the described coverage method, large collections of wheat materials could be analysed to detect natural or induced introgressions, for example, in genebank accessions or breeding material.
Genebank accessions are heterogeneous to some extent, despite careful management including splitting of phenotypically different plants within an accession57,58,59. For this reason, introgression events and important alleles may be lost when using bulks or materials descended from a single seed60. Therefore, it needs to be discussed whether original genebank accessions should be split into independent lines on the basis of genetic data. It has been proposed that duplicates of genebank accessions should be removed to save money and reduce the time and materials needed for their management, while preserving genetic diversity61. We suggest that duplicates should not be identified by detecting SNPs in one seed per accession, but on the basis of a combination of SNPs and coverage analysis for multiple seeds per accession. The capacity gained by removing true duplicates could be used for splitting and maintaining additional genebank accessions.
Genomic data for entire genebank collections (https://www.pflanzenforschung.de/de/forschung-plant-2030/projekte/274/detail#english) will be soon be available for analysis. This could accelerate the breeding process, if accessions of crop species harbouring interesting introgressions could be identified and used immediately. Although we demonstrate the utility of the method for wheat, it could easily be applied to other species.
Plant materials and whole genome resequencing
For each selected genebank accession (Table S4), 10 seeds were retrieved from the Federal ex situ Genebank for Agricultural and Horticultural Plants of Germany maintained at the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK) in Gatersleben. Seeds were sown in small pots (Ø 5 cm) filled with a soil mixture of 70% Substrat 1 (Fa. Klasmann-Deilmann, Geeste, Germany), 20% compost, and 10% sand. Seedlings were grown under controlled greenhouse conditions (14-h/10-h day/night, ~ 15–18/ ~ 12–15 °C day/night). For DNA extraction, fresh leaves were cut from four plants of each accession at the two-leaf stage. The leaves were dried with silica gel at room temperature for 10 days. At the three-leaf stage, plants were transferred to pots (Ø 14 cm) filled with a soil mixture of 40% Substrat 2 (Fa. Klasmann-Deilmann), 50% compost, and 10% sand, and grown under controlled greenhouse conditions (16-h/8-h day/night, ~ 20–23/ ~ 17–20 °C day/night) until maturity.
Total DNA was extracted separately from dried leaf tissue of four individuals per genebank accession using a DNeasy Plant Kit (QIAGEN, Hilden, Germany) according to the manufacturer’s instructions. The WGS library (Nextera DNA Flex, genomic DNA input: 100–500 ng) was prepared according to the standard protocols of the manufacturer (Illumina, Inc., San Diego, CA, USA). The library was quantified by qPCR (KAPA Library Quantification Kit; KAPA Biosystems, Wilmington, MA, USA) and sequenced on the NovaSeq 6000 platform (Illumina, Inc.; run type: SP PE 151) at the IPK.
Raw sequencing data were adapter- and quality-trimmed with Trim Galore (version 0.4.0; non default parameters: quality ≥ 30, read length ≥ 50; https://github.com/FelixKrueger/TrimGalore). Trimmed reads were individually mapped against the wheat RQAs using BWA-mem (v0.7.15-r1140) (Li, 2013). Unmapped reads, supplementary reads, and non-primary alignments were removed from mapped reads using SAMtools (–F 2308) before computing depth. Finally, depth was aggregated and visualised with R62. For visualisation, coverage was displayed on a logarithmic scale using log(x + ε) transformation with ε = 0.01.
For predicting resistance genes in introgressed regions, all 10 wheat RQAs63,64 were analysed using GeMoMa (version 1.8), a homology-based gene prediction program that allows annotations to be transferred from one genome to another. Known resistance genes located in the potential introgressions were used as reference genes. We used the known resistance genes Yr5 and Yr746, as well as TraesCS2A01G040000, which was identified in Chinese Spring using the primers AX_948171722AS and AX_945219402AS39, and corresponds to PGSB_gene_1945 in the cv. Jagger RQA35. Predicted genes were filtered based on the amino acid identity of their encoded products to the reference protein (≥ 80%).
We confirm that experimental research and field studies on plants (either cultivated or wild), including the collection of plant material, comply with relevant institutional, national, and international guidelines and legislation.
Publicly available data were downloaded from65 https://doi.org/10.5447/IPK/2019/18. Additional genomic data was downloaded from EMBL ENA with the following IDs: ERR2936519 and ERR2936122 (GBS data of cv. Vilmorin-27), SRR2061020 (Th. ponticum), SRR13484813 (WGS data of Ae. ventricosa), and PRJNA601245 (GBS data of Triticum and Ae. triuncialis). Own sequence data is available from ENA (EMBL-EBI) with the following ID PRJEB49121. Please note that several classification schemes have been proposed for wheat and Aegilops, e.g., based on morphological, cytogenetic, and genomic characteristics. Sometimes the published information is incorrect or misleading. In this study, we consider the taxon names used by Sharma et al.65. However, we cannot change existing database entries. Overviews of most important wheat and Aegilops classifications can be found in recently published literature65.
FAOSTAT. FAOSTAT Database Collections. http://faostat.fao.org/ (2018).
Venske, E., dos Santos, R. S., Busanello, C., Gustafson, P. & de Oliveira, A. C. Bread wheat: A role model for plant domestication and breeding. Hereditas 156, 16. https://doi.org/10.1186/s41065-019-0093-9 (2019).
Curtis, T. & Halford, N. Food security: The challenge of increasing wheat yield and the importance of not compromising food safety. Ann. Appl. Biol. 164, 354–372. https://doi.org/10.1111/aab.12108 (2014).
Oshunsanya, S. O., Nwosu, N. J. & Li, Y. in Sustainable Agriculture, Forest and Environmental Management (eds M. K. Jhariya, A. Banerjee, R. Swaroop Meena, & D. K. Yadav) 71–100 (Springer, 2019).
Dresselhaus, T. & Hückelhoven, R. Biotic and abiotic stress responses in crop plants. Agronomy 8, 267. https://doi.org/10.3390/agronomy8110267 (2018).
Hickey, L. T. et al. Breeding crops to feed 10 billion. Nat. Biotechnol. 37, 744–754. https://doi.org/10.1038/s41587-019-0152- (2019).
IWGSC. Shifting the limits in wheat research and breeding using a fully annotated reference genome. Science 361, 6403. https://doi.org/10.1126/science.aar7191 (2018).
Walkowiak, S. et al. Multiple wheat genomes reveal global variation in modern breeding. Nature 588, 277–283. https://doi.org/10.1038/s41586-020-2961-x (2020).
Kilian, B. et al. Crop Science special issue: Adapting agriculture to climate change: A walk on the wild side. Crop Sci. 61, 32–36. https://doi.org/10.1002/csc2.20418 (2021).
Hao, M. et al. The resurgence of introgression breeding, as exemplified in wheat improvement. Front. Plant Sci. 11, 252. https://doi.org/10.3389/fpls.2020.00252 (2020).
Wulff, B. B. H. & Moscou, M. J. Strategies for transferring resistance into wheat: From wide crosses to GM cassettes. Front. Plant Sci. 5, 3389. https://doi.org/10.3389/fpls.2014.00692 (2014).
Molnár-Láng, M., Ceoloni, C. & Doležel, J. Alien Introgression in Wheat 1st edn. (Springer, 2015).
Benavente, E., Cifuentes, M., Dusautoir, J. C. & David, J. The use of cytogenetic tools for studies in the crop-to-wild gene transfer scenario. Cytogenet. Genome Res. 120, 384–395. https://doi.org/10.1159/000121087 (2008).
Friebe, B., Jiang, J., Raupp, W. J., McIntosh, R. A. & Gill, B. S. Characterization of wheat-alien translocations conferring resistance to diseases and pests: Current status. Euphytica 91, 59–87. https://doi.org/10.1007/BF00035277 (1996).
Badaeva, E. D., Budashkina, E. B., Badaev, N. S., Kalinina, N. P. & Shkutina, F. M. General features of chromosome substitutions in Triticum aestivum x T. timopheevii hybrids. Theor. Appl. Genet. 82, 227–232. https://doi.org/10.1007/BF00226218 (1991).
Badaeva, E. D. et al. Genetic classification of Aegilops columnaris Zhuk. (2n = 4x = 28, UcUcXcXc) chromosomes based on FISH analysis and substitution patterns in common wheat × Ae. columnaris introgressive lines. Genome 61, 131–143. https://doi.org/10.1139/gen-2017-0186 (2018).
Wendler, N. et al. Bulbosum to go: A toolbox to utilize Hordeum vulgare/bulbosum introgressions for breeding and beyond. Mol. Plant. 8, 1507–1519. https://doi.org/10.1016/j.molp.2015.05.004 (2015).
Scholten, O. E. et al. SNP-markers in Allium species to facilitate introgression breeding in onion. BMC Plant Biol. 16, 187. https://doi.org/10.1186/s12870-016-0879-0 (2016).
Keilwagen, J. et al. Detecting large chromosomal modifications using short read data from genotyping-by-sequencing. Front. Plant Sci. 10, 1133. https://doi.org/10.3389/fpls.2019.01133 (2019).
Dvořák, J., McGuire, P. E. & Cassidy, B. Apparent sources of the A genomes of wheats inferred from polymorphism in abundance and restriction fragment length of repeated nucleotide sequences. Genome 30, 680–689. https://doi.org/10.1139/g88-115 (1988).
Dvořák, J., Terlizzi, P. D., Zhang, H.-B. & Resta, P. The evolution of polyploid wheats: Identification of the A genome donor species. Genome 36, 21–31. https://doi.org/10.1139/g93-004 (1993).
Kihara, H. Die entdeckung des DD-analysators beim weizen. Agric. Hortic. 19, 889–890 (1944).
Feldman, M. & Levy, A. A. in Alien Introgression in Wheat: Cytogenetics, Molecular Biology, and Genomics (eds M. Molnár-Láng, C. Ceoloni, & J. Doležel) 21–76 (Springer, 2015).
Feldman, M. & Levy, A. Allopolyploidy: A shaping force in the evolution of wheat genomes. Cytogenet. Genome Res. 109, 250–258. https://doi.org/10.1159/000082407 (2005).
Kilian, B. et al. Independent wheat B and G genome origins in outcrossing Aegilops progenitor haplotypes. Mol. Biol. Evol. 24, 217–227. https://doi.org/10.1093/molbev/msl151 (2007).
Bernhardt, N. et al. Genome-wide sequence information reveals recurrent hybridization among diploid wheat wild relatives. Plant J. 102, 493–506. https://doi.org/10.1111/tpj.14641 (2020).
Hyun, D. Y. et al. Genotyping-by-sequencing derived single nucleotide polymorphisms provide the first well-resolved phylogeny for the genus Triticum (Poaceae). Front. Plant Sci. https://doi.org/10.3389/fpls.2020.00688 (2020).
Liu, C. J., Atkinson, M. D., Chinoy, C. N., Devos, K. M. & Gale, M. D. Nonhomoeologous translocations between group 4, 5 and 7 chromosomes within wheat and rye. Theor. Appl. Genet. 83, 305–312. https://doi.org/10.1007/BF00224276 (1992).
Jorgensen, C. et al. A high-density genetic map of wild emmer wheat from the Karaca Dağ region provides new evidence on the structure and evolution of wheat chromosomes. Front. Plant Sci. https://doi.org/10.3389/fpls.2017.01798 (2017).
Jiang, J. & Gill, B. S. Different species-specific chromosome translocations in Triticum timopheevii and T. turgidum support the diphyletic origin of polyploid wheats. Chromosome Res. 2, 59–64. https://doi.org/10.1007/BF01539455 (1994).
Berkman, P. J. et al. Sequencing wheat chromosome arm 7BS delimits the 7BS/4AL translocation and reveals homoeologous gene conservation. Theor. Appl. Genet. 124, 423–432. https://doi.org/10.1007/s00122-011-1717-2 (2012).
Martynov, S. & Dobrotvorskyi, D. Genetic Resources Information and Analytical System (GRIS) for Wheat and Triticale. http://wheatpedigree.net/. (2012).
McIntosh, R. A. et al. in 12th International Wheat Genetics Symposium (2013).
Allard, R. & Shands, R. Inheritance of resistance to stem rust and powdery mildew in cytologically stable spring wheats derived from Triticum timopheevi. Phytopathology 44, 266–274 (1954).
Gao, L. et al. The Aegilops ventricosa 2NvS segment in bread wheat: Cytology, genomics and breeding. Theor. Appl. Genet. 134, 529–542. https://doi.org/10.1007/s00122-020-03712-y (2021).
Bariana, H. & McIntosh, R. Cytogenetic studies in wheat. XV. Location of rust resistance genes in VPM1 and their genetic linkage with other disease resistance genes in chromosome 2A. Genome 36, 476–482. https://doi.org/10.1139/g93-065 (1993).
Helguera, M. et al. PCR assays for the Lr37-Yr17-Sr38 cluster of rust resistance genes and their use to develop isogenic hard red spring wheat lines. Crop Sci. 43, 1839–1847. https://doi.org/10.2135/cropsci2003.1839 (2003).
Kilian, B. et al. Wild Crop Relatives: Genomic and Breeding Resources 1–76 (Springer, 2011).
Rani, K. et al. A novel leaf rust resistance gene introgressed from Aegilops markgrafii maps on chromosome arm 2AS of wheat. Theor. Appl. Genet. 133, 2685–2694. https://doi.org/10.1007/s00122-020-03625-w (2020).
Voss-Fels, K. P. et al. Breeding improves wheat productivity under contrasting agrochemical input levels. Nat. Plants 5, 706–714. https://doi.org/10.1038/s41477-019-0445-5 (2019).
Chen, S. et al. Stripe rust resistance gene Yr34 (synonym Yr48) is located within a distal translocation of Triticum monococcum chromosome 5AmL into common wheat. Theor. Appl. Genet. https://doi.org/10.1007/s00122-021-03816-z (2021).
The, T. Chromosome location of genes conditioning stem rust resistance transferred from diploid to hexaploid wheat. Nat. New Biol. 241, 256–256. https://doi.org/10.1038/newbio241256a0 (1973).
Shimizu, K. K. et al. De novo genome assembly of the japanese wheat cultivar norin 61 highlights functional variation in flowering time and fusarium-resistant genes in east asian genotypes. Plant Cell Physiol. 62, 8–27. https://doi.org/10.1093/pcp/pcaa152 (2020).
Zemetra, R. S., Hansen, J. & Mallory-Smith, C. A. Potential for gene transfer between wheat (Triticum aestivum) and jointed goatgrass (Aegilops cylindrica). Weed Sci. 46, 313–317. https://doi.org/10.1017/S0043174500089475 (1998).
Bush, W. S. & Moore, J. H. Genome-wide association studies. PLoS Comput. Biol. 8, e1002822. https://doi.org/10.1371/journal.pcbi.1002822 (2012).
Marchal, C. et al. BED-domain-containing immune receptors confer diverse resistance spectra to yellow rust. Nat. Plants 4, 662–668. https://doi.org/10.1038/s41477-018-0236-4 (2018).
Hayes, H. K. et al. Thatcher Wheat (Springer, 1936).
Wilson, S. II. Wheat and rye hybrids. Trans. Bot. Soc. Edinb. 12, 286–288. https://doi.org/10.1080/03746607309469536 (1873).
Vilmorin, H. L. Les Meilleurs Blés: Description et Culture des Principales Variétés de Froments D’hiver et de Printemps (Vilmorin-Andrieux et cie, 1880).
Ling, H.-Q. et al. Genome sequence of the progenitor of wheat A subgenome Triticum urartu. Nature 557, 424–428. https://doi.org/10.1038/s41586-018-0108-0 (2018).
Luo, M.-C. et al. Genome sequence of the progenitor of the wheat D genome Aegilops tauschii. Nature 551, 498–502. https://doi.org/10.1038/nature24486 (2017).
Maccaferri, M. et al. Durum wheat genome highlights past domestication signatures and future improvement targets. Nat. Genet. 51, 885–895. https://doi.org/10.1038/s41588-019-0381-3 (2019).
Li, L.-F. et al. Genome sequences of the five Sitopsis species of Aegilops and the origin of polyploid wheat B-subgenome. BioRxiv https://doi.org/10.1101/2021.07.05.444401 (2021).
Avni, R. et al. Wild emmer genome architecture and diversity elucidate wheat evolution and domestication. Science 357, 93–97. https://doi.org/10.1126/science.aan0032 (2017).
Avni, R. et al. Genome sequences of Aegilops species of section Sitopsis reveal phylogenetic relationships and provide resources for wheat improvement. BioRxiv https://doi.org/10.1101/2021.08.09.455628 (2021).
Khan, A. W. et al. Super-pangenome by integrating the wild side of a species for accelerated crop improvement. Trends Plant Sci. 25, 148–158. https://doi.org/10.1016/j.tplants.2019.10.012 (2020).
Hamilton, N. R. S., Engels, J. M. & Van Hintum, T. J. Accession Management: Combining or Splitting Accessions as a Tool to Improve Germplasm Management Efficiency (Bioversity International, 2002).
Cross, R. J. & Wallace, A. R. Loss of genetic diversity from heterogeneous self-pollinating genebank accessions. Theor. Appl. Genet. 88, 885–890. https://doi.org/10.1007/BF01254001 (1994).
Lehmann, C. O. & Mansfeld, R. Zur technik der sortimentserhaltung. Die Kulturpflanze 5, 108–138. https://doi.org/10.1007/BF02095492 (1957).
Kyratzis, A. C., Nikoloudakis, N. & Katsiotis, A. Genetic variability in landraces populations and the risk to lose genetic variation: The example of landrace ‘Kyperounda’ and its implications for ex situ conservation. PLoS ONE 14, e0224255. https://doi.org/10.1371/journal.pone.0224255 (2019).
Singh, N. et al. Efficient curation of genebanks using next generation sequencing reveals substantial duplication of germplasm accessions. Sci. Rep. 9, 650. https://doi.org/10.1038/s41598-018-37269-0 (2019).
R Core Team. R: A Language and Environment for Statistical Computing. (R Foundation for Statistical Computing, 2019) http://www.R-project.org/.
Keilwagen, J. et al. Using intron position conservation for homology-based gene prediction. Nucleic Acids Res. 44, e89–e89. https://doi.org/10.1093/nar/gkw092 (2016).
Keilwagen, J., Hartung, F., Paulini, M., Twardziok, S. O. & Grau, J. J. B. B. Combining RNA-seq data and homology-based gene prediction for plants, animals and fungi. BMC Bioinform. 19, 189. https://doi.org/10.1186/s12859-018-2203-5 (2018).
Sharma, S. et al. Introducing beneficial alleles from plant genetic resources into the wheat germplasm. Biology https://doi.org/10.3390/biology10100982 (2021).
We thank Edit Lantos and Ines Walde for excellent technical assistance. We are greatly indebted to Marta Molnár-Láng, Frank Ordon, Antje Rohde, Shivali Sharma, Hakan Özkan, Carolina Sansaloni, Peter Werner for discussions and support. This work was supported by the Julius Kuehn Institute (JKI, Quedlinburg, Germany), the Leibniz Institute of Plant Genetics and Crop Research (IPK, Gatersleben, Germany), the Russian Science Foundation (grant no. 21-76-30003); and the Crop Wild Relatives Project (Adapting Agriculture to Climate Change: Collecting, Protecting and Preparing Crop Wild Relatives) which is supported by the Government of Norway and managed by the Global Crop Diversity Trust [https://www.cwrdiversity.org/].
Open Access funding enabled and organized by Projekt DEAL.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Keilwagen, J., Lehnert, H., Berner, T. et al. Detecting major introgressions in wheat and their putative origins using coverage analysis. Sci Rep 12, 1908 (2022). https://doi.org/10.1038/s41598-022-05865-w
This article is cited by
Strategies for utilization of crop wild relatives in plant breeding programs
Theoretical and Applied Genetics (2022)
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.