Discovery of novel haplotypes for complex traits in landraces

Genetic variation is of crucial importance for selection and genetic improvement of crops. Landraces are valuable sources of diversity for germplasm improvement, but efficient strategies for their targeted utilization for quantitative traits are lacking. Here, we propose a genome-based strategy for making native diversity accessible for traits with limited genetic variation in elite germplasm. We generated ~1,000 doubled-haploid (DH) lines from three European maize landraces, pre-selected based on molecular and phenotypic information. Using GWAS, we mapped haplotype-trait associations for early development traits at high resolution in eleven environments. Molecular haplotype inventories of landrace-derived DH libraries and a broad panel of 65 breeding lines based on 501,124 SNPs revealed novel variation for target traits in the landraces. DH lines carrying these novel haplotypes outperformed breeding lines not carrying the respective haplotypes. Most haplotypes associated with target traits showed stable effects across populations and environments and only limited correlated effects with undesired traits, making them ideal for introgression into elite germplasm. Our strategy was successful in linking molecular variation to meaningful phenotypes and identifying novel variation for quantitative traits in plant genetic resources.


Harnessing the allelic diversity of genetic resources is considered essential for overcoming the challenges of climate change and for meeting future demands on crop production 1,2. For [...]

Trait-associated genomic regions

To evaluate if molecular inventories of landrace-derived material are predictive for their potential to improve traits of agronomic importance, we performed haplotype-based genome-wide association scans (GWAS) for nine traits. Trait-associated genomic regions were defined based on LD between significant haplotypes (Methods; Fig. 2 [...] to 55) were detected for the traits early vigor (EV_V4/V6) and early plant height (PH_V4/V6).

Haplotypes explained between 2% (female flowering time, FF) and 57% (lodging, LO) of the total genetic variance of the respective traits (Fig. 2). Despite the large sample size (n = 899), the proportion of genetic variance explained might be somewhat overestimated 29,30 and thus has to be interpreted with caution. Only a few genomic regions were detected for flowering time, indicating that alleles with large effects were fixed during adaptation of the respective landraces to their geographical region, thus having little impact on GWAS for other traits. [...] resolution was not optimal as they comprised more than 100 annotated genes. Mapping resolution in the three DH libraries is best demonstrated by the example of an already well-characterized locus: teosinte branched 1 (tb1). The gene tb1 played a major role in the transition from highly branched teosinte to maize with strongly reduced branch development 31. In our study, a strong significant association for tillering (TILL) was found in a genomic region comprising the tb1 locus (size 1.3 Mb, including 22 genes in total; Supplementary Table 1). In silico fine mapping in the respective region (Methods) identified a 10-SNP window that overlapped perfectly with tb1 and its regulatory upstream region.
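
To make the notion of a 10 SNP window concrete, the following is a minimal sketch of how haplotypes could be inventoried per window in fully homozygous DH lines. It is an illustration under stated assumptions, not the pipeline used in the study: the function name `window_haplotypes`, the 0/1 string coding of SNP alleles, the use of non-overlapping windows and the toy data are all hypothetical.

```python
from collections import Counter


def window_haplotypes(genotypes, window_size=10):
    """Count the distinct haplotypes observed in each SNP window.

    genotypes: list of equal-length strings, one per DH line, with one
               character per biallelic SNP (assumed coding: '0' or '1').
    Windows are non-overlapping here, which is an assumption of this sketch.
    Returns one Counter (haplotype string -> number of lines) per window.
    """
    n_snps = len(genotypes[0])
    counts = []
    for start in range(0, n_snps - window_size + 1, window_size):
        counts.append(Counter(g[start:start + window_size] for g in genotypes))
    return counts


# Toy data: four hypothetical DH lines genotyped at 20 SNPs.
dh_lines = [
    "00000000001111111111",
    "00000000000000000000",
    "00000000001111111111",
    "11111111110000000000",
]
inventory = window_haplotypes(dh_lines)
for i, counter in enumerate(inventory):
    print(f"window {i}: {dict(counter)}")
```

Because DH lines are homozygous, each line contributes exactly one haplotype per window, so no phasing step is needed in this toy setting.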

Effect size and stability of trait-associated haplotypes

The potential of the identified haplotypes for elite germplasm improvement depends on the size and direction of their effects on the traits of interest, their environmental stability and their dependence on the genetic background. In a given trait-associated genomic region, one window of 10 SNPs comprising several haplotypes was selected. Significant haplotypes, hereafter referred to as focus haplotypes, were entered into a multi-environment model (Supplementary Fig. 3) and were classified as "favorable", "unfavorable" or "interacting" based on the direction and stability of their effects in the different test environments (Supplementary Fig. 4).
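
The three categories can be illustrated with a small sketch. The exact decision rule is given in Supplementary Fig. 4 and is not reproduced here, so the rule below is an assumption made for illustration only: a haplotype is called "interacting" when its significant effects change sign across environments, and "favorable" or "unfavorable" when all significant effects point in the same direction. The function name `classify_haplotype` and its arguments are hypothetical.

```python
def classify_haplotype(effects, significant, higher_is_better=True):
    """Classify a focus haplotype from its per-environment effects.

    effects:          effect estimates, one per test environment
    significant:      booleans, True where the effect is significant
    higher_is_better: whether a positive effect is desirable for the trait
    The rule used here is an assumption, not the published scheme.
    """
    # Collect the signs of all significant, non-zero effects.
    signs = {e > 0 for e, s in zip(effects, significant) if s and e != 0}
    if not signs:
        return "not significant"
    if len(signs) == 2:
        # Significant effects in opposite directions across environments.
        return "interacting"
    positive = signs.pop()
    return "favorable" if positive == higher_is_better else "unfavorable"


print(classify_haplotype([1.2, 0.8, 0.1], [True, True, False]))
print(classify_haplotype([1.0, -1.0], [True, True]))
```

For a trait where lower values are desirable, passing `higher_is_better=False` flips the favorable and unfavorable labels.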

According to this categorization scheme, a high number of favorable haplotypes for early plant development traits were found in the DH libraries ([...] Supplementary Fig. 5a). All of those 35 associations had equal effect signs for both landraces.

Also for the 80 environment-specific associations significant for only one of the two landraces, [...] increase of 6.06 cm over breeding lines, but the difference was not significant (P > 0.056; Fig. 5a). When looking at individual environments, however, significant differences (P < 0.044) were observed for locations OLI, EIN and ROG (Supplementary Fig. 6a), which showed the lowest temperatures in the field 28, suggesting that the relative advantage of the identified haplotype might be temperature dependent.

On chromosome 9, in a genomic region of about 3 Mb, three independent focus haplotypes affected PH_V6 significantly (two favorably, one unfavorably). One of the three focus haplotypes (Haplotype D in Fig. 3a and Supplementary Fig. 4) increased PH_V6 compared to [...] showing low temperature during early plant development (Supplementary Fig. 6b).

We also assessed genomic regions in more detail where the focus haplotype was

Plant material

We generated more than 1,000 doubled-haploid (DH) lines derived from three European [...] which the 95% CIs for both did not include 0 were considered significant for both traits.

The proportion of genetic variance explained per trait by significant haplotypes was estimated by calculating the respective reduction in genetic variance between models including and excluding the haplotype.

We assessed frequency distributions of identified trait-associated favorable and unfavorable [...] (1 - haplotype heterozygosity 60) and the minimum number of historical recombination events 61 within the respective genomic windows.
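
The variance-explained estimate described at the start of this passage can be written as a simple ratio. The sketch below assumes that the genetic variance components have already been estimated from two fitted mixed models, one excluding and one including the haplotype effects; the function name `proportion_variance_explained` and the example values are hypothetical and are not taken from the study.

```python
def proportion_variance_explained(var_g_without, var_g_with):
    """Relative reduction in genetic variance when haplotypes are fitted.

    var_g_without: genetic variance from the model excluding the haplotypes
    var_g_with:    residual genetic variance from the model including them
    Both values are assumed to come from previously fitted models.
    """
    if var_g_without <= 0:
        raise ValueError("genetic variance of the reduced model must be positive")
    return (var_g_without - var_g_with) / var_g_without


# Illustrative values only: a drop from 4.0 to 3.0 corresponds to 25%.
share = proportion_variance_explained(4.0, 3.0)
print(f"{share:.0%} of genetic variance explained")
```

Estimates of this kind tend to be biased upward when the same data are used for detection and for estimation of effects, which is why the main text recommends interpreting them with caution.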

To evaluate the effect of the selected focus haplotype relative to the alternative haplotypes in a given 10 SNP window, we followed the approach of Bustos-Korts et al. 62 [...]

Tillering; TILL; Score 1-9, 1 = no tillers, 9 = many and long tillers; V8-V10; 899; 5