Discovery of beneficial haplotypes for complex traits in maize landraces

Genetic variation is of crucial importance for crop improvement. Landraces are valuable sources of diversity, but for quantitative traits efficient strategies for their targeted utilization are lacking. Here, we map haplotype-trait associations at high resolution in ~1000 doubled-haploid lines derived from three maize landraces to make their native diversity for early development traits accessible for elite germplasm improvement. A comparative genomic analysis of the discovered haplotypes in the landrace-derived lines and a panel of 65 breeding lines, both genotyped with 600k SNPs, points to untapped beneficial variation for target traits in the landraces. The superior phenotypic performance of lines carrying favorable landrace haplotypes as compared to breeding lines with alternative haplotypes confirms these findings. Stability of haplotype effects across populations and environments as well as their limited effects on undesired traits indicate that our strategy has high potential for harnessing beneficial haplotype variation for quantitative traits from genetic resources.

H arnessing the allelic diversity of genetic resources is considered essential for overcoming the challenges of climate change and for meeting future demands on crop production 1,2 . For most traits of agronomic importance, modern breeding material captures only a fraction of the available diversity within crop species 1 . In the case of maize (Zea mays L.), today's elite germplasm went through several bottlenecks, first by geographical dispersion from its center of origin 3,4 , second through the selection of only a few key ancestors sampled from a small number of landraces to establish heterotic groups 5,6 , and third through decades of advanced cycle breeding with high selection intensities 7,8 . For traits that were not targets of selection in the past, but are important today, like abiotic stress tolerance and resource-use efficiency 9 , this might have resulted in the loss of favorable alleles during the breeding process. In addition, unfavorable alleles might have become fixed during the selection process due to drift and/or hitchhiking effects [10][11][12] .
Impressive examples exist where introgression of alleles from genetic resources has improved mono-or oligogenic traits [13][14][15] , but for broadening the genetic diversity of complex traits, such as yield or abiotic stress tolerance successful examples are scarce 2 . Up to date, the genomic characterization of genetic resources has been based predominantly on sampling individuals across a wide range of accessions, maximizing the level of diversity in the genetic material under study 2,[16][17][18][19][20] . Such diverse samples are characterized by high variation in adaptive traits and strong population structure, leading to spurious associations and limited power for detecting associations with nonadaptive traits of agronomic importance 21,22 . Furthermore, alleles which are locally common, but globally rare likely remain undetected in broad, species-wide samples, whereas in a more targeted approach they might show sufficiently high frequencies for detection 22 .
Here, we propose a genome-based strategy (Supplementary Fig. 1) for making native diversity of maize landraces accessible for improving quantitative traits, showing limited genetic variation in elite germplasm, such as cold tolerance and early plant development [23][24][25] . Capitalizing on low levels of linkage disequilibrium (LD), we map haplotype-trait associations at high resolution in~1000 doubled-haploid (DH) lines derived from three European flint maize landraces. The genetic material has been preselected for adaptation to target environments to avoid confounding effects of strong adaptive alleles as suggested by Mayer et al. 26 . We assess promising haplotypes genotypically by quantifying their frequency in a diverse panel of 65 European flint breeding lines. Phenotypically, we evaluate the direction and magnitude of haplotype effects relative to a subset of breeding lines. Many of the discovered haplotypes show stable trait associations across populations and environments. In addition, most of them do not exhibit undesired trait associations, making them ideal for introgression into elite germplasm. We show that our strategy to sample comprehensively individuals from a limited set of preselected landraces is successful in linking molecular variation to meaningful phenotypes, and in identifying alleles for quantitative traits that will enrich the genetic diversity of our crops.

Results
Molecular variation in landraces and breeding lines. The genetic differentiation of 941 DH lines derived from three landraces (Kemater Landmais Gelb, KE; Lalin, LL; and Petkuser Ferdinand Rot, PE) and a diverse panel of 65 European breeding lines 27 based on principal coordinate analysis (PCoA) with 501,124 single-nucleotide polymorphism (SNP) markers is shown in Fig. 1a. The first principal coordinate explained 6.2% of the molecular variation and separated the landrace-derived and the breeding lines based on their geographical origin within Europe from northeast (Germany) to southwest (southern France, Spain). The second principal coordinate explained 5.4% of the variation and separated the two landraces KE and PE from the panel of breeding lines. Diversity parameters polymorphism information content (PIC), gene diversity (H), and minimum number of historical recombination events (nR) were higher in the set of breeding lines compared to the three landrace-derived DH libraries, irrespectively if they were calculated from SNP or haplotype information (Supplementary Table 1 of different source populations across Europe 27 . Combined across landraces, the DH lines almost reached the level of diversity of the breeding line panel. In total, the landrace and breeding line panels comprised 356,724 and 363,290 haplotypes (Fig. 1a), constructed for nonoverlapping windows of ten SNPs, corresponding to an average of 7.12 and 7.25 haplotypes per window, respectively. As expected for genetic material originating from the same germplasm group (European flint maize), haplotype frequencies were positively correlated (Pearson's r = 0.74, P < 2.2e−16) between the two panels (Fig. 1b). Overall, 26.2% of the haplotypes of the landrace panel were not present in the breeding lines, indicating untapped haplotype variation. For those haplotypes, median and mean frequencies in the landrace panel were 0.005 and 0.039, respectively. Only 2.7% of those haplotypes occurred in all three landraces, whereas 82.8% occurred in only one landrace. Within the respective individual landraces their median and mean frequencies increased to 0.065 and 0.101, respectively. The landrace panel captured 72.4% of the haplotypes present in the panel of breeding lines.
Trait-associated genomic regions. A key question for the targeted utilization of genetic resources is, if molecular inventories of landrace-derived material are predictive of their potential to improve traits of agronomic importance. Using SNPs and haplotypes for genome-wide association scans (GWAS), we identified associations for all nine traits under study. Results were very similar for both types of genomic information, as exemplarily shown in Supplementary Fig. 2 for the trait tillering (TILL). As haplotypes are more informative than biallelic SNPs for the comparison with breeding lines, we focused on haplotypes in further analyses. Traitassociated genomic regions were defined based on LD between significant haplotypes ("Methods" section; Fig. 2 and Supplementary Data 1). As landraces were preselected for variation in early plant development 26,28 , most associations (37)(38)(39)(40)(41)(42)(43)(44)(45)(46)(47)(48)(49)(50)(51)(52)(53)(54)(55) were detected for the traits early vigor (EV_V4/V6) and early plant height (PH_V4/ V6). Haplotypes explained between 2 (female flowering time, FF) and 57% (lodging, LO) of the total genetic variance of the respective traits (Fig. 2). Despite the large sample size (n = 899), the proportion of genetic variance explained might be somewhat overestimated 29,30 , and thus has to be interpreted with caution. Only few genomic regions were detected for flowering time, indicating that alleles with large effects were fixed during adaptation of the respective landraces to their geographical region, thus having little impact on GWAS for other traits.
Average r 2 decay distances (r 2 < 0.2) within the three DH libraries were 203 (LL), 484 (PE), and 973 kb (KE), and 201 kb for the combined set. This is consistent with previous results 26 and warrants high mapping resolution in the three DH libraries under study. For comparison, the diverse panel of 65 breeding lines across Europe exhibited an average r 2 decay distance of 107 kb. The lower LD level in the breeding line panel can be explained by admixture of many different source populations with varying linkage phases, which is generally undesired in GWAS. The median size of genomic regions associated with the nine traits under study was 92 kb, with a median number of three annotated genes per region ( Supplementary Fig. 3), enabling prediction of candidate genes and functional analyses. Only for a few regions (<5%) resolution was not optimal, as they comprised >100 annotated genes. Mapping resolution in the three DH libraries is best demonstrated by an example of an already well-characterized locus: teosinte branched 1 (tb1). The gene tb1 played a major role in the transition from highly branched teosinte to maize with strongly reduced branch development 31 . In our study, a strong significant association for TILL was found in a genomic region on chromosome 1 comprising the tb1 locus (size 1.3 Mb, including in total 22 genes; Supplementary Data 1 and Supplementary Fig. 2). In silico fine-mapping in the respective region ("Methods" section) identified a ten-SNP window, which overlapped perfectly with tb1 and its regulatory upstream region.
Effect size and stability of trait-associated haplotypes. The potential of the identified landrace haplotypes for elite germplasm improvement depends on the size and direction of their effects on the traits of interest, their environmental stability and their dependence on the genetic background. In a given trait-associated genomic region, one window of ten SNPs comprising several haplotypes was selected. Significant haplotypes, hereafter referred to as focus haplotypes, entered into a multi-environment model ( Supplementary Fig. 4) and were classified into favorable, unfavorable, or interacting based on the direction and stability of their effects in the different test environments ( Supplementary Fig. 5). According to this categorization scheme, a high number of favorable haplotypes for early plant development traits were found in the DH libraries (Table 1 and Fig. 3a), representing potential candidates for introgression into elite germplasm. For the undesirable traits LO and TILL, many haplotypes had unfavorable effects. Overall, haplotypes identified for all nine traits showed moderate to high effect stability across environments, with similar patterns for favorable and unfavorable haplotypes (Fig. 3a, b). The dependency of haplotype effects on the genomic background can be evaluated comparing effect significance and sign of the identified focus haplotypes between landraces. From the 48 haplotypes associated with PH_V6, 19 haplotypes were present in both KE and PE. Together, these 19 haplotypes showed 115 environment-specific haplotype-trait associations, of which 35 (30%) were significant for both landraces ( Supplementary Fig. 6a). All of those 35 associations had equal effect signs for both landraces. Also for the 80 environment-specific associations significant for only one of the two landraces, a large majority (90%) had equal effect signs for both landraces. Similar patterns were observed for PH_V4 ( Supplementary Fig. 6b).
Haplotype congruency in landraces and breeding lines. The ultimate criterion for assessing the usefulness of landrace haplotypes for germplasm improvement is their frequency in breeding material. If favorable haplotypes are already present at high frequency in the genetic material to be improved, they are of no additional value. We assessed the frequencies of the identified trait-associated focus haplotypes in a panel of 65 breeding lines based on genotypic data. When tracking an ancestral haplotype potentially shared between landrace and breeding material, recombination might have broken up the respective haplotype, but the trait-associated causal mutation might still be present.  pointed to a low probability of haplotypes being broken up by recombination. Frequency distributions of favorable landrace haplotypes in the 65 breeding lines are given for early development traits (EV_V4, EV_V6, PH_V4, and PH_V6) in Fig. 4. As the haplotypes identified for each of the four single traits (Table 1) were partly from similar genomic regions, we only considered 53 favorable haplotypes with a minimum distance of 1 Mb and/or r 2 < 0.8. The frequency of favorable haplotypes (mean = 0.20) was significantly increased (P < 0.01) compared to randomly drawn haplotypes (mean = 0.16). Six favorable focus haplotypes (11%) were absent in the set of breeding lines, and thus have potential for elite germplasm improvement. The mean frequency of 80 unfavorable haplotypes associated with early plant development did not differ significantly (P > 0.30) from the frequency of random haplotypes. A substantial proportion of unfavorable haplotypes (27.5%) were common in the breeding lines (Fig. 4), suggesting that a targeted substitution with favorable haplotypes could lead to further germplasm improvement.
Linking haplotype variation to phenotypes. The potential of individual focus haplotypes to improve elite germplasm has to be evaluated phenotypically, comparing the performance of landrace-derived lines carrying these focus haplotypes and elite lines of the breeding pool one aims to improve. Using a subset of the breeding line panel (n = 14) phenotyped at six locations in 2017, we report exemplarily the results of such comparisons for two genomic regions on chromosomes 3 and 9, found to affect PH_V6 in the GWAS analysis (Fig. 5). On chromosome 3, the focus haplotype (haplotype A in Fig. 3a and Supplementary  Fig. 5) was localized in a ten-SNP window, which explained 4.8% of the genetic variation for PH_V6 and comprised eight additional haplotypes in the DH lines. The focus haplotype had a frequency of 4.1% in the DH lines, outperformed six out of the eight alternative haplotypes significantly and was absent in the panel of breeding lines. A proportion of 93.8% of the 65 breeding lines carried one of the six haplotypes with significant negative effects relative to the focus haplotype (on average 0.61 standard deviations) in almost all environments. The remaining breeding lines (6.2%) carried a haplotype absent in the landrace panel, and thus without effect estimate. Averaged across environments, DH lines carrying the focus haplotype showed an increase of 6.06 cm over breeding lines, but the difference was not significant (P > 0.056; Fig. 5a). When looking at individual environments, however, significant differences (P < 0.044) were observed for locations OLI, EIN, and ROG ( Supplementary Fig. 7a), which showed the lowest temperatures in the field 28 , suggesting that the relative advantage of the identified haplotype might be temperature dependent.
On chromosome 9 in a genomic region of~3 Mb, three independent focus haplotypes affected PH_V6 significantly (two favorably, one unfavorably). One of the three focus haplotypes (haplotype D in Fig. 3a and Supplementary Fig. 5) increased PH_V6 compared to the six alternative haplotypes in the respective window. The genetic variance explained by the haplotypes in this window was small (1.7%) most likely due to the low frequency (0.4%) of the focus haplotype in the DH lines. The focus haplotype was absent in the panel of 65 breeding lines. Instead, 95.4% of the breeding lines carried one of the six inferior haplotypes, while 4.6% carried haplotypes not present in the landrace panel. DH lines carrying the focus haplotype showed a significant increase of 15.1 cm compared to the breeding lines (P < 0.009). Similar as for the haplotype on chromosome 3, the difference was most pronounced in environments showing low temperature during early plant development ( Supplementary  Fig. 7b).
We also assessed genomic regions in more detail where the focus haplotype was unfavorable like, for example, the window comprising the tb1 locus, which explained 13.1% of the genetic variance for TILL in the landrace panel. DH lines carrying the unfavorable focus haplotype showed a significant increase of 1.51 scores compared to the 14 phenotyped breeding lines not carrying the haplotype (Supplementary Fig. 8a; P < 0.0001). Here, the focus haplotype was carried by only two of the 65 breeding lines, but for other genomic regions associated with TILL frequencies were higher, e.g., 15.5% for a region on chromosome 5 explaining 6.6% of the genetic variance in the DH lines. In this case, DH lines carrying the focus haplotype showed a significant increase of 1.69 scores compared to 13 breeding lines not carrying the haplotype (Supplementary Fig. 8b; P < 0.0004). For a genomic region on chromosome 1 associated with EV_V4 ( Supplementary  Fig. 9), more than half of the 65 breeding lines carried the unfavorable focus haplotype, including six of the 14 phenotyped lines. The window in which the focus haplotype was located comprised four additional haplotypes and accounted for 5.1% of the genetic variance in the DH lines. We tested the effect of the focus haplotype in the 14 breeding lines and found a significant difference of 0.875 scores between lines with and without the focus haplotype (P < 0.039, Supplementary Fig. 9), indicating that a targeted substitution of the focus haplotype with one of the alternative haplotypes could lead to germplasm improvement.
Introducing landrace alleles into elite germplasm for a target trait comes at the risk of undesired effects on other traits due to pleiotropy or linkage. We tested the identified focus haplotypes for each of the early plant development traits in bivariate models for significant effects on other traits (PH_final, FF, MF, LO, and TILL). Of the 53 favorable haplotypes referred to in Fig. 4, 20 had a significant effect on at least one out of the five other traits. Thereof, only three haplotypes increased LO or TILL, whereas four haplotypes slightly decreased LO or TILL. Fourteen   Supplementary Fig. 10a increasing LO). An enrichment of such haplotypes in the breeding germplasm is therefore not advisable. In contrast, haplotypes which explained more of the genetic variance for early plant development than for other traits (e.g., haplotypes E or G in Supplementary Fig. 10a) can still be used for improving germplasm for early plant development resulting in only slightly altered flowering time and/or PH_final. Of the 80 focus haplotypes unfavorable for early plant development (Fig. 4), 48 were significant for at least one other trait. Thereof, 14 haplotypes decreased TILL, while 40 decreased PH_final and/or delayed flowering. However, most of them had only moderate effects on these traits ( Supplementary Fig. 10b). Therefore, in many cases selection against those haplotypes can still be recommended.

Discussion
The importance of genetic variation for selection and genetic improvement of crops is undisputed. Genetic resources of domesticated species, such as landraces, are a valuable source of diversity for broadening the genetic base of elite germplasm 1 . However, efficient strategies for utilizing this native diversity for the improvement of quantitative traits are lacking. Here, we developed a strategy to discover beneficial haplotypes for quantitative traits in maize landraces ( Supplementary Fig. 1). The combination of comprehensive molecular inventories and meaningful phenotypes collected in landrace-derived DH libraries in multi-environment trials allowed detection of haplotype-trait associations for quantitative traits with limited genetic variation in elite material. Even though the DH libraries were derived from only three preselected populations, 26% of landrace haplotypes were absent in the panel of breeding lines, representing the allelic diversity of multiple diverse source populations 27 . While most of these haplotypes can be expected to be neutral 32 or disadvantageous, some might represent beneficial novel variation. Landraces represent self-contained populations adapted to their geographical origin 33 . By focusing on diversity within rather than across landraces, confounding effects of strong adaptive alleles are avoided. Consequently, individual trait-associated haplotypes are expected to have moderate to small effects only. Our results meet these expectations. The majority of haplotypetrait associations detected in the DH libraries explained <5% of the genetic variance for all traits under study, including flowering time. However, as shown for the haplotype affecting PH_V6 on chromosome 9 (Fig. 5b), the genetic variance explained in GWAS is not only a function of effect size, but also of haplotype frequency. As DH and breeding lines were sampled from the same germplasm group (European flint maize), haplotype frequencies were positively correlated between the two panels (Fig. 1b). This exemplifies one of the key challenges when searching for untapped variation for quantitative traits, as haplotypes absent in the breeding material tend to have low frequencies also in landraces with shared historical ancestry. Focusing on a set of landraces preselected for variation in target traits increases the chances that they harbor alleles at frequencies large enough to be detected in GWAS. The success of this strategy was reflected in the high number of significant haplotype-trait associations found for target traits early vigor and early plant height.
The large sample of landrace-derived DH lines employed in this study enabled mapping of haplotypes with moderate effect size and comparably low frequency, but as is known for GWAS studies, some of these significant trait associations might be spurious 34 . Here, the sequential determination of significance ( Supplementary Fig. 4) should have minimized the proportion of false positives 35 . In addition, the haplotype-based approach enabled tracking of ancestral alleles between landrace-derived and breeding material, and the phenotypic comparison between the two groups supported the usefulness of identified haplotypes for germplasm improvement. Nevertheless, the construction of haplotypes in landrace-derived material warrants further research. Different methods for haplotype construction exist, generating population-specific haplotype blocks based on LD 36,37 or linkage 38 . Here, we used fixed window sizes, as it is advantageous in comparing haplotype frequencies across datasets varying in their extent of LD. The choice of window size depends on the available marker density and affects the number of haplotypes per window as well as the risk of haplotypes being broken up by recombination. Thus, defining the haplotype inventories of landraces and comparing them to elite germplasm is not trivial. Comprehensive sampling of individuals or lines from a limited number of landraces mitigates difficulties in haplotype  Fig. 3a; PE_Focus, 38 lines) and b a focus haplotype on chromosome 9 (haplotype D in Fig. 3a; PE_Focus; based on three data points only). Vertical lines indicate the mean of each group. The difference in means between BL_all and PE_all was not significant (P > 0.514; permutation test, two-sided). Source data are provided as a Source data file.
construction and at the same time warrants sufficient statistical power and mapping resolution in GWAS through absence of pronounced population structure, rapid decay of LD, and consistency of linkage phases 26 . Here, we put this strategy into practice and showed its potential in identifying favorable alleles not present in breeding lines for improving quantitative traits. For early development traits, overall performance did not differ significantly between the DH libraries and the subset of phenotyped breeding lines, but DH lines carrying specific focus haplotypes not present in breeding lines outperformed the set of breeding lines significantly in environments favoring trait differentiation. This is a first step toward identifying haplotypes from genetic resources for germplasm improvement, but the final proof of concept will have to come from crosses of landracederived material with elite material. As landraces represent openpollinated populations, background dependency of the identified trait-associated haplotypes should not be as pronounced as in mapping populations tracing back to few genetic founders, such as multi-or biparental crosses. In our study, the vast majority of trait-associated haplotypes occurring in landraces KE and PE had equal effect signs across landraces and environments, supporting this hypothesis. In addition, for cases where it was possible to contrast different haplotypes in the breeding lines (Supplementary Fig. 9), the effect of the focus haplotype in the breeding lines was consistent with the effect in the DH lines. If the selected landraces and the target germplasm to be improved share historical ancestry, we expect only minor genetic background effects when introducing favorable haplotypes discovered in landraces into elite material.
After identification of trait associations, fine-mapping of the respective genomic regions and functional characterization of candidate genes is a logical next step. With a limited number of annotated genes per trait-associated genomic region, high mapping resolution was obtained in this study. The envisaged functional validation of relevant haplotypes opens many options for utilization: targeted allele mining from genetic resources, unlocking diversity trapped in disadvantageous or incompatible haplotypes, broadening the genetic diversity at relevant loci in elite germplasm, and improvement of unfavorable haplotypes through gene editing 39 . In addition to targeted haplotype management, genome-wide approaches will also profit from functional knowledge. Pre-breeding programs 2 might be accelerated through the use of genome-based prediction 40,41 . It has been shown that the prediction accuracy is increased if known trait associations are included as fixed effects in prediction models 42 . As our results indicate high stability of haplotype effects across environments and genetic background, as well as limited haplotype-induced correlations between traits, the prospects of the germplasm improvement through the use of landrace-derived material are promising.
By successfully linking molecular inventories of landraces to meaningful phenotypes and identifying beneficial variation for quantitative traits of agronomic importance, the results of this study represent a first step toward the long-term goal of accessing native biodiversity in an informed and targeted way. The strategy proposed in this study and demonstrated experimentally with the European flint germplasm can be extended to other maize germplasm groups and even to other allogamous crop species. The key to an efficient use of genetic resources is to understand how genomic information of gene bank accessions can be translated into plant performance 43 . We envision a future where haplotypes characterized for their genomic structure, allele content and functional relevance can be freely moved between populations. Our goal is to create plants with novel combinations of alleles that will lead to varieties with novel combinations of traits, thus securing sustainable crop production in a changing world.

Methods
Plant materials. We generated >1000 DH lines derived from three European maize landraces: Kemater Landmais Gelb (KE), Lalin (LL) and Petkuser Ferdinand Rot (PE) 28 . The landraces were preselected for phenotypic variation in cold-related traits assessed in field trials and population genetic analyses 26 . The set of breeding lines used in this study was selected from a broad panel of 68 flint lines 27 . The initial dataset included two US sweetcorn lines, IL14H and P39, which we excluded from our analyses. The remaining 66 lines, released between~1950 and 2010, were selected to represent the genetic diversity of the European flint elite breeding germplasm. The panel also includes prominent founder lines like EP1, F2, F7, and DK105 (ref. 44 ).
Genotypic data. In total, 1015 landrace-derived DH lines were genotyped with the 600k Affymetrix® Axiom® Maize Array 45 . After stringent quality filtering 28 , 941 lines (KE = 501, LL = 31, and PE = 409), and 501,124 markers mapped to B73 AGPv4 (ref. 46 ) remained for genetic analyses. Calls indicating heterozygosity (0.19%) were set to missing as in DH lines they can be assumed to result from technical artefacts, and all missing values were imputed separately for each landrace using Beagle version 5.0 (ref. 47 ) with default settings. From the set of 66 breeding lines, 64 lines were genotyped with the same 600k array 27 , whereas for two lines (EZ5 and F64) overlapping SNP positions (85%) were extracted from the HapMap data 48 , which is based on whole-genome sequences. For making the 600k genotyping data comparable to the HapMap data, all alleles were coded according to the B73 AGPv4 (ref. 46 ) forward strand. The breeding line data were filtered for the 501,124 high-quality markers of the set of DH lines. Applying the same quality filter criteria as for the DH panel (heterozygous calls < 5%; callrate > 90%, except for EZ5 and F64 with callrate >84%), one breeding line (FV66) was removed due to an increased number of heterozygous calls. For the remaining 65 lines, calls indicating heterozygosity (0.31%) were set to missing and missing values imputed using Beagle version 5.0 (ref. 47 ) with default settings. For the combined set of landrace-derived DH lines and breeding lines, PCoA 49 was conducted based on modified Rogers' distances 50 , using the R-package ape version 5.3 (ref. 51 ). Pairwise r 2 (ref. 52 ) between SNPs within 1 Mb distance was calculated for the DH libraries (within and across the three landraces) and the panel of breeding lines, respectively. Average LD decay distance (r 2 < 0.2) was estimated using nonlinear regression 53 . If not denoted otherwise, analyses were done using R version 3.6.0 (ref. 54 ). For plotting of results, R-packages ggplot2 version 3.2.0 (ref. 55 ), plot3D version 1.3, and VennDiagram version 1.6.20 were used.
Phenotypic data. In total, 958 DH lines were phenotyped for 25 traits in replicated field trials 28 . Briefly, line per se performance was evaluated in five locations across Germany, and in two locations in northern Spain in 2017 and 2018, resulting in up to 11 environments (location by year combinations) per trait. In each environment, up to ten separate 10 × 10 lattice designs with two replicates per DH line were used. In addition to the DH lines, 15 breeding lines (duplicate entries) and the original landraces (quadruplicate entries) were included as checks in 2017. Fourteen checks comprised important lines of the European flint breeding pool and were included in the set of 65 genotyped breeding lines. In 2018, only four breeding lines were used as checks (three flint lines). A subset of nine traits was analyzed in this study (Supplementary Table 2), related to early plant development, maturity, as well as agronomic characteristics. After stringent quality filtering based on genotypic data 28 , phenotypic data of 899 DH lines (KE = 471, LL = 26, and PE = 402) remained for further analyses. Best linear unbiased estimates (BLUEs) for each DH line and check were calculated across environments using a mixed linear model, with genotypes as fixed and environment as well as design factors as random effects 28 . Analogously, BLUEs were calculated within each environment using the same model without environment-related model terms.
Haplotype construction. For both, the landrace-derived DH lines, as well as the breeding lines, haplotypes were defined as a given nucleotide sequence within nonoverlapping windows of ten SNPs ( Supplementary Fig. 4a), using the Rpackage zoo version 1.8-6 (ref. 56 ). For the 600k chip, the density of SNPs along the chromosomes follows the average recombination rate 45 . Therefore, using a fixed number of SNPs per window leads to similar window sizes as defined based on genetic map units. The median physical window size was 13.5 kb (mean = 37.8 kb), corresponding to 0.006 cM (mean = 0.026 cM) according to a genetic map generated from a F 2 mapping population of a cross of EP1 × PH207 (ref. 44 ). Within each window, haplotypes were coded as presence/absence markers, yielding genotype scores 0 and 2, respectively. To evaluate the potential of untapped variation in landraces for elite germplasm improvement, we compared haplotype frequencies between the landrace-derived DH lines and the panel of 65 breeding lines.
Diversity measures. PIC 57 and H 58 were calculated based on SNPs (PIC SNP and H SNP ) as well as on haplotypes (PIC hap and H hap ) constructed as described above. nR 59 was calculated within the genomic windows used for haplotype construction. For all five parameters mean values across SNPs or ten-SNP windows, respectively, were calculated for each DH library individually and for the combined set of 941 DH lines, as well as for the set of 65 breeding lines. Identification of trait-associated haplotypes. For GWAS in the DH lines, haplotypes which were present less than three times in the panel of 899 phenotyped DH lines were excluded from the analysis. For haplotypes with r 2 = 1, only one was retained, resulting in 154,104 haplotypes used for GWAS ( Supplementary Fig. 4a), with on average 5.73 haplotypes per window. The identification of trait-associated haplotypes was conducted in two steps following Millet et al. 35 , (i) identification of candidate haplotypes in GWAS (Supplementary Fig. 4b) and (ii) backward elimination in a multi-locus multi-environment model ( Supplementary Fig. 4c). GWAS were conducted for single environments, as well as across environments using the corresponding environment-specific and across-environment BLUEs as response variable in the model, respectively. A univariate linear mixed model, implemented in GEMMA version 0.98.1 (ref. 60 ), was used: where y is the n-dimensional vector of phenotypic values (BLUEs), with n being the number of lines; α is a three-dimensional vector of fixed effects (intercept and landrace effects of KE and LL); β is the fixed effect of the tested haplotype; x is the vector of corresponding genotype scores coded as 0 and 2; u is the ndimensional vector of random genotypic effects, with uNð0; Kσ 2 g Þ; and e is the n-dimensional vector of random residual effects, with e~N(0, I n σ 2 ). K denotes the (n × n) genomic relationship matrix based on SNP markers according to Astle and Balding 61 , calculated using the R-package synbreed version 0.12-9 (ref. 62 ). I n denotes the (n × n) identity matrix. σ 2 g and σ 2 refer to the genetic and residual variance pertaining to the model defined in Eq. (1), respectively. Matrices W (n × 3) and Z (n × n) assign phenotypic values to fixed and random effects, respectively. Significance of haplotype-trait associations was assessed for each single-environment as well as for the across-environment GWAS based on the likelihood ratio test, as implemented in GEMMA, using a 15% false discovery rate 63 . Haplotypes with a physical distance of <1 Mb and in high LD (r 2 ≥ 0.8) were considered to mark the same genomic region. The corresponding traitassociated genomic region was described by the start and end positions of the first and last haplotype fulfilling the defined criteria, respectively. To represent genomic regions equally in subsequent analyses, only the most significant haplotype, the focus haplotype, was retained per region in the respective GWAS, resulting in a set of candidate haplotypes.
In the multi-locus, multi-environment mixed linear model, we conducted a backward elimination of those candidate haplotypes as suggested by Millet et al. 35 , using the ASReml-R package version 3.0 (ref 64 ): where y ijk is the phenotypic value (BLUE) of line k belonging to landrace j tested in environment i; μ is the common intercept; ω i is the fixed effect of environment i; δ j is the fixed effect of landrace j; x kq is the genotype score (0 or 2) of line k for haplotype q; β i q is the fixed effect of haplotype q in environment i, comprising the haplotype main and haplotype by environment interaction effect, i.e., u k is the random genotypic effect of line k, and e ijk is the random residual error with environment-specific residual error variance. Q represents the final set of haplotypes obtained through step-wise backward elimination based on the Wald test for β i q (ref. 65 ). At each step, significance of each haplotype was tested when it was the last one entering the model and the least significant haplotype was removed if P ≥ 0.01. The proportion of genetic variance explained by the set of trait-associated haplotypes was estimated by calculating the reduction in genetic variance between models including and excluding the haplotype effects, following Millet et al. 35 . For evaluating effect stability across landraces for the final set of haplotypes Q, we extended Eq. (2) by changing the term P q2Q x kq β i q to P q2Q x kq β ij q , with For comparison, GWAS was also performed with 175,810 SNPs (minor allele counts ≥ 3 and r 2 ≠ 1), analogously as described above.
Favorable and unfavorable haplotypes and their effect stability. The number of environments in which a haplotype was significant was estimated by generating 95% confidence intervals (CI = effect estimate ± 1.96 × standard error) based on Eq.
(2), following Millet et al. 35  Haplotypes associated with multiple traits. We tested if haplotypes identified for early plant development also had an effect on other traits using a bivariate model, similar to Stich et al. 66 : where y tijk is the phenotypic value (BLUE) for trait t of line k belonging to landrace j tested in environment i; μ t is the intercept for trait t; ω ti is the fixed effect of environment i for trait t; δ tj is the fixed effect of landrace j for trait t; x k is the genotype score (0 or 2) of line k for the tested haplotype; β t is the fixed effect of the haplotype for trait t; u tk is the random genotypic effect of line k for trait t, with u~N(0, G ⊗ K); and e tijk is the residual with e~N(0, E ⊗ I n ). G and E correspond to the (t × t) genetic and error variance-covariance matrices among traits pertaining to the model defined in Eq. (5), respectively, and ⊗ denotes the Kronecker product. Haplotypes for which the 95% CIs for both β t did not include 0 were considered significant for both traits. The proportion of genetic variance explained per trait by significant haplotypes was estimated by calculating the respective reduction in G between models including and excluding the haplotype.
Haplotype comparison between landraces and breeding lines. We assessed frequency distributions of identified trait-associated favorable and unfavorable landrace haplotypes in the panel of 65 breeding lines, and compared them with 500 haplotypes randomly drawn out of the set of haplotypes occurring at least three times in the landrace panel. Significance for differences in means between the frequencies of favorable and random haplotypes, as well as unfavorable and random haplotypes was tested with the Mann-Whitney test (two-sided). When tracking potentially shared ancestral haplotypes between populations, the probability of a haplotype being broken up by recombination depends on the haplotype length, the recombination rate in the respective genomic region and the time span back to the most recent common ancestor. To evaluate to what extent recombination might have occurred in the haplotypes constructed in this study, we considered the physical and genetic length of each haplotype, as well as haplotype similarity (1 -H hap ) and nR within the respective genomic windows.
To evaluate the effect of the selected focus haplotype relative to the alternative haplotypes in a given ten-SNP window, we followed the approach of Bustos-Korts et al. 67 , extending Eq. (2) to: where Q 0 represents the set of haplotypes Q as described above without the respective focus haplotype of the window tested, x kh identifies the haplotype (categorical variable) in the window tested carried by line k, and β i h represents the effect of the respective haplotype relative to the focus haplotype. Similar as above, significance of haplotype effects relative to the focus haplotype was determined by constructing 95% CIs. We further estimated the proportion of genetic variance explained by the given window by calculating the reduction in genetic variance between the null model (without P q2Q 0 x kq β i q þ x kh β i h ) and the model with the x kh β i h term.
To evaluate to what extent haplotypes with favorable or unfavorable effects in landraces also have favorable or unfavorable effects in elite material, respectively, we compared performance levels between the landrace-derived DH lines and the 14 breeding lines used as checks. As phenotypic data for the 14 breeding lines were only available for 2017, only the six environments from 2017 were considered. For some traits, differences in means between the landraces were observed 28 , thus comparisons were conducted for each landrace separately. Significance for differences in means between the respective landrace and the 14 checks was tested based on 10,000 permutations (two-sided test). In addition to a comparison of the overall performance level between all lines of the respective landrace and the 14 breeding lines, we compared means between groups of lines carrying a particular haplotype and lines not carrying the haplotype.
Reporting summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.

Data availability
A reporting summary for this article is available as a Supplementary Information file. The datasets generated and analyzed during the current study are available from the corresponding author upon request. Seeds from all genotypes used in the study are available through material transfer agreements. The genotypic data of 941 DH lines and the phenotypic data of 899 DH lines and 14 breeding lines are available in figshare (https://doi.org/10.6084/m9.figshare.12137142). The 600k data of 63 breeding lines can be accessed at figshare (https://doi.org/10.6084/m9.figshare.3427040.v1), while for two lines genotypic data based on whole-genome sequences were downloaded from CyVerse Data Store (http://cbsusrv04.tc.cornell.edu/users/panzea/download.aspx? filegroupid=34). The Source data underlying Figs. 1 and 3-5, as well as Supplementary  Figs. 2, 3, 5, and 7-10 are provided as a Source data file. Source data are provided with this paper.