Genome-wide association study Identified multiple Genetic Loci on Chilling Resistance During Germination in Maize

Maize (Zea mays, L.) cultivation has expanded greatly from tropical to temperate zones; however, its sensitivity to chilling often results in decreased germination rates, weak seedlings with reduced survival rates, and eventually lower yields. We conducted germination tests on the maize-282-diverse-panel (282 inbred lines) under normal (25 °C) and chilling (8 °C) conditions. Three raw measurements of germination were recorded under each condition: 1) germination rate, 2) days to 50% germination, and 3) germination index. Three relative traits were derived as indicators of cold-tolerance. By using the 2,271,584 single nucleotide polymorphisms (SNPs) on the panel from previous studies, and genome-wide association studies by using FarmCPU R package to identify 17 genetic loci associated with cold tolerance. Seven associated SNPs hit directly on candidate genes; four SNPs were in high linkage disequilibrium with candidate genes within 366 kb. In total, 18 candidate genes were identified, including 10 candidate genes supported by previous QTL studies and five genes supported by previous gene cloning studies in maize, rice, and Arabidopsis. Three new candidate genes revealed by two associated SNPs were supported by both QTL analyses and gene cloning studies. These candidate genes and associated SNPs provide valuable resources for future studies to develop cold-tolerant maize varieties.


Results
Germination performance. The diverse maize association panel demonstrated much greater variation in germination rates (GRs) on day 21 under chilling conditions (8 °C) compared to the panel's GRs on day 7 under normal conditions (25 °C). Overall, seed GR was substantially suppressed under chilling conditions (median = 30.0%) in contrast to normal conditions (median = 90.00%) -even when GR was measured as far out as the 21 st day under chilling (Table 1).
Although day 21 has been used as the standard time to measure germination rate under chilling conditions, extending the observations and measuring additional traits are critically needed to adequately monitor germination rate. One of the additional traits we measured was days to 50% germination (DT50). To obtain robust measurements of DT50, we continued observing germination for a total of 31 days. The accumulative measurements indicated that the initial and final values of DT50 were 10.3 and 29.6 days, respectively, using the Boltzmann function (Supplementary Table 1 and Supplementary Figure 1), which fit the observed DT50 well (adjusted R square = 99.56%). This result suggested that the 31-day period was sufficient to cover the observations. All but 37 lines reached 50% germination by the end of 31 days. For these 37 lines, we adjusted the observations back to 31 days using the method developed to handle outlier gene expressions from 5,000 maize inbred lines 25 .
Another additional trait we measured was germination index (GI), which was detailed in the Materials and Methods section. For GR, DT50 and GI, we derived corresponding relative traits by dividing each trait's chilling condition value by its normal condition value, allowing us to make valid comparisons between the two conditions. The relative GR (RGR) had a median of 0.40 ( Table 1). The relative DT50 (RDT50) had a median of 8. 45, which means that maize seedlings grown under chilling conditions took eight-fold longer to reach 50% germination than when grown under normal conditions.
In total, we measured nine traits, including the six combinations of three traits (GR, DT50, and GI) and two treatments (chilling and normal) plus the three relative traits derived from GR, DT50, and GI values. Seven of the nine traits had mono-modal distributions; both DT50_C and DT50_N displayed weak indications of bi-modal distributions (Fig. 2). Nevertheless, a robust statistical analysis was achieved by treating these traits as normally distributed and using a mixed linear model to further derive BLUPs. The overall inbred line performances relative to these nine traits were calculated as BLUPs, using the same method previously used to calculate the overall performances of 5,000 maize inbred lines while eliminating environment effects 26 .
SCIentIfIC RepoRts | 7: 10840 | DOI:10.1038/s41598-017-11318-6 Correlations were weak between each trait under chilling and normal conditions (Supplementary Table 2 and Fig. 2). The weak correlations suggest that germination under chilling conditions might be controlled by different genes. The three traits under chilling conditions were more highly correlated than under normal conditions. The two primary traits (RGR and RDT50) were characterized for their distributions within subpopulations defined in previous study 27 . Associated SNPs. GWAS were conducted on the BLUPs of the three relative traits (RGR, RDT50, and RGI) for 241 maize inbred lines genotyped by 2,271,584 SNPs, using recently developed FarmCPU method 28 and software package (http://zzlab.net/FarmCPU). As the measurements were strongly associated with population structure (Supplementary Figure 2), the principal components derived from all the SNPs were fitted as covariates to control population structure in the association study. The P values were fully controlled without inflation, except very small proportions of SNPs that exceeded the null hypothesis expectation (2.2E-08) at a type I error of 5%. We identified 17 associated SNPs ( Fig. 3 and Table 2) that surpassed the Bonferroni threshold. All SNPs had a Minor Allele Frequency (MAF) of 10% or above. The strongest associated SNP had a P value of 2.96E-19. Among the 17 associated SNPs, only one SNP (ss196436428) was associated with two traits (RDT50 and RGI). Five SNPs were associated with RDT50; two SNPs were located on chromosome 7 and three were distributed on chromosome 2. Eight SNPs were associated with RGI and distributed on chromosomes 1, 2, 4, and 6. The remaining five SNPs were associated with RGR, with two located on chromosome 2, and one each distributed on chromosomes 1, 6, and 9.
Comparison of associated SNPs and QTLs of cold-related traits. Among the 17 associated SNPs, five were located directly within QTL regions identified in previous studies and reported to be responsible for cold traits in maize (Supplementary Table 3). Two SNPs associated with RDT50 on chromosome 2 (S2_117871531 at 117,871,531 bp and ss196436428 at 88,979,688 bp) were located in QTL regions associated with ascorbate, Chlorophyll b, and Chlorophyll a + b content under suboptimal growing temperatures 14 ; trapping efficiency of PSII (F′ v /F′ m ), carbon exchange rate (CER) at 15 °C, and leaf greenness (SPAD) across temperatures 17 ; and Fv/Fm across different sowing stages in a field experiment 29 .
One SNP associated with RGI and two SNPs associated with RGR corresponded to QTLs on chromosomes 1, 2, and 6. SNP S1_258878734 on chromosome 1 at 258,878,734 bp was detected in QTL regions associated with malic enzyme content at suboptimal temperature 14 , specific leaf area (SLA) at suboptimal temperature 12 , operating quantum efficiency of PSII photochemistry (ɸPSII) at 15 °C, and minimal fluorescence (Fo) across temperatures 17 . SNP S2_154533439 (chr2: 154,533,439 bp) was detected in a QTL region associated with CO 2 fixation and ɸPSII in the third-leaf stage at 15 °C 11 . SNP S6_156520680 (chr6: 156,520,680 bp) was located in a QTL region Distributions and correlations among nine direct and derived germination traits. Germination was defined as root emergence from seed. We directly measured Germination Rate (GR) on the 21st day under chilling conditions (GR21_C) and on the 7th day under normal conditions (GR_N). We also directly measured Days To 50% germination under Normal conditions (DT50_N) and Chilling conditions (DT50_C). We derived relative trait values by dividing chilling condition values by their corresponding normal condition values. In total, we evaluated nine germination traits, defined specifically as follows: GR21_C represents germination rate under chilling conditions (8 °C), measured at 21 days; GR_N (control) represents germination rate under normal conditions (25 °C), measured at 7 days; RGR represents relative rate of germination (GR21_C/ GR_N); DT50_C represents days to 50% germination (up to 31 days) under chilling conditions; DT50_N (control) represents days to 50% germination under normal conditions; RDT50 represents relative days to 50% germination (DT50_C/DT50_N); GI_C represents germination index (GI = ∑(Gt/Tt), where Gt was the number of seeds newly germinated on day t and Tt was the number of days elapsed) under chilling conditions; GI_N represents germination index under normal conditions; and RGI represents relative germination index (GI_C/GI_N). Frequency distributions for each trait/index are illustrated as histograms in the center diagonal. Scatter plots of correlations and the numerical correlation coefficients between every two traits are shown in the areas below and above the diagonal, respectively. The red line in the scatter plots represents the correlation trend with diamonds displaying the medians. associated with shoot nitrogen content (N%) under 15 °C growing conditions 17 . These findings suggested most of the detected SNPs (10 out of 18) are located within previously identified QTL regions associated with maize cold-related traits.

Identification of candidate genes in maize.
We used the B73 RefGen_v2 Maize Gene Database 30 (http:// www.maizegdb.org/) to identify candidate genes hit directly by associated SNPs or nearby genes in high LD with the associated SNPs. Among the 17 associated SNPs, seven had hits directly on candidate genes. An additional 11 candidate genes were identified with extended screening on genes in high LD (r 2 ≥ 0.8) and within distance of 200 kb of four associated SNPs. The candidate genes of direct and indirect hits are summarized with table and figure (Table 3 and Fig. 4). Of the total 18 candidate genes identified, three contained the SNP associated with both RDT50 and RGI, six contained the SNPs associated with RDT50, five contained the SNPs associated with RGR, and four contained the SNPs associated with RGI. We were unable to find candidate genes that associated with six SNPs, including one SNP on chromosome 1 (ss196425965) and five SNPs on chromosome 2 (S2_15690029, S2_46211425, S2_161121602, S2_202342038, and S2_43176376). There is potential for new discovery of candidate genes around these SNPs with extended knowledge on gene functions. Pleiotropic SNPs. All the three traits investigated in this study are correlated with phenotypic correlation coefficient of 70% or above (Supplementary Table 2). Among them, RGR and GRI are more correlated (phenotypic correlation coefficient of 90%). Pleiotropic SNPs were observed among all the three traits, especially between RGR and GRI. The pleiotropic effect was demonstrated at both the levels of the SNPs with and without candidate genes identified. Germination was defined as root emergence from seed. GWAS were performed on 241 inbred lines, genotyped with 2,271,584 SNPs, using the recently developed FarmCPU method and software package (http://zzlab. net/FarmCPU). The left panel displays the signals of associations across maize genome. The right panel demonstrates the overlapped and exceeded associations between the observed signals (black dots) and the expected (red lines) signals under the null hypotheses. GWAS identified a total of 17 SNPs that surpassed the Bonferroni threshold (horizontal green lines) for the three traits. One associated SNP was shared by RDT50 and RGI. Among the 17 associated SNPs, 11 were underlain by candidate genes (vertical dash lines) that were previously reported to be associated with cold tolerance.   Table 3. Functions of candidate genes associated with three derived germination traits. All candidate genes contain either an associated SNP or a SNP in high Linkage Disequilibrium (LD) (R Square ≥ 80%) with an associated SNP and within a distance <400 kb. * and # indicate the gene with supports of cloned genes and previous QTL studies, respectively.
The SNP tagged gene GRMZM2G178486 was above the Bonferroni significant threshold for both RDT50 and RGI (Fig. 3). The SNP tagged gene GRMZM2G462797 was above the threshold for both RGI and RGR. The SNP tagged gene GRMZM2G389768 was above the threshold for RGI and was the most significant SNP on chromosome 4 for RDT50 (8.52E-6). The SNP tagged gene GRMZM5G802338 was above the threshold for RGR and was the third significant SNP on chromosome 6 for both RDT50 (1.36E-7) and RGI (2.99E-5).
There were also SNPs that tagged multiple traits and these SNPs had been identified to be associate with candidate genes yet. The top significant SNP for RGI on chromosome 2 was the second significant SNP for RGR and the sixth SNP for RDT50. The top significant SNP for RGR on chromosome 5 was also the top significant for RDT50. The top significant SNP for RDT50 on chromosome 9 was the second significant SNP for RGR.

Identification of candidate genes in analogous species. For each candidate gene identified in maize,
we identified the homologous genes in Arabidopsis and rice. We also performed a comparative functional analysis of these corresponding genes ( Table 4). Five of 18 candidate genes are predicted to respond to abiotic stress based on the known function of their Arabidopsis homologs. These five genes and the remaining 13 genes have been elucidated based on prediction (Supplementary Candidate Genes).
Functional prediction of candidate genes. We found 343 GO terms associated with the 18 candidate genes associated with the three traits (Supplementary Table 4). These GO terms belong to multiple functions. The first type of function is related to biological processes, including protein phosphorylation, metabolism, Figure 4. Diagram of associated SNPs, candidate genes, and support by QTL and gene cloning studies. In total, 18 candidate genes were identified for the 17 associated SNPs. Among these candidate genes, 10 were supported by QTL studies and five by gene cloning studies. Three genes were supported by both types of studies.  Table 4. Homologous genes in Arabidopsis and rice that correspond to the candidate genes associated with the three derived germination traits in maize.

No. Gene ID
methylation, mitochondrial fission, proteolysis, and response to stimulus, brassinosteroid, oxidative stress, and transcription regulation. The second type is related to molecular functions, which involve lyase calmodulin, metal ion binding, zinc ion binding, adenyl-nucleotide exchange factor, and enzyme (N-acetyltransferase, GDP-D-glucose phosphorylase, peroxidase, methyltransferase, peptidase) and translation initiation factor activity. The third type of function is related to cellular components involving the plasma membrane and its integral components, mitochondrial matrix, plasma membrane, and extracellular regions. We identified three candidate genes (GRMZM2G318156, GRMZM5G871707, and GRMZM2G148793) with unknown encoding functions; thus, gene ontology annotations were unavailable. From those genes with known functions, we observed that some candidate genes can be involved in multiple functions. For example, we found four genes (GRMZM2G462797, GRMZM2G081928, GRMZM2G389768, GRMZM2G113158) that are involved in all three functional types. Twelve genes are involved in two types, molecular function and biological process (Supplementary Table 4).
Most of the GO terms were specific to each trait (Supplementary Figure 3). Only 55.6%, 29.9%, and 27.2% were shared with other traits for RDT50, RGI, and RGR, respectively. Although RGR did not share any associated SNPs with the other two relative traits, we found 13 GO terms shared by all three traits. This finding suggests that these 13 GO terms have a greater chance for a stronger association with cold tolerance. Therefore, we detailed the relationships between the 18 candidate genes and these 13 GO terms (Supplementary Table 5 and Supplementary  Figure 4).

Discussion
Increasing maize yields in temperate zones by improving cold tolerance during germination and early growth stages is a complex process. This study focused on germination under chilling conditions; specifically, germination was defined as root emergence from seed. Experimental temperatures were also narrowly set to 25 °C for normal conditions and 8 °C for chilling conditions. Nevertheless, the 17 associated SNPs and 18 candidate genes we identified in this study might lead the direction for future studies. We found that 10 candidate genes were supported by QTL studies on traits related to cold tolerance, even though these previous studies measured their traits differently and at different growth stages. The overlap of five of our candidate genes with cloned genes for cold tolerance in maize, rice, and Arabidopsis also provided hints that genes related to cold tolerance may be shared across species [31][32][33][34][35] . This study provides valuable resources for future studies to enhance the understanding of the genetic architecture of maize for cold tolerance, and eventually, to improve maize varieties through breeding.

Development of radicles under chilling condition.
Root emergence under chilling conditions is the most important indicator of germination activation for cold tolerance 36 . We defined our chilling condition as 8 °C. Visible radicles were used as indicators of germination. Under the chilling condition, the 282-maize diverse panel exhibited a huge variation in the development of radicles, especially radicle length (Fig. 1). But, we only recorded germination as a binary trait (germinated or not) and the variation in length was not reported here. We observed a seemingly strong correlation between radicle length and germination rate. Even though, the genetic loci might be different from the length and could be controlled by different mechanisms of biological development.

Germination analysis.
Measuring GR at the optimal temperature, 25 °C, is a standard practice. Root emergence usually peaks at 2~3 days and germination is completed within 7 days 37 . Therefore, days/hours to reach 50% germination, DT50, and the total GR on the 7th day are the two typical measurements in germination studies. However, the dynamic process of germination is completely different under chilling conditions. For example, under the chilling condition (8 °C), not one line in the maize 282-diverse-panel exhibited 50% germination within 7 days.
Historically, GR on the 21st day has been used as the standard measurement under chilling conditions (8 °C), which corresponds to GR on the 7th day under normal conditions 38 . However, only about one-half of the lines in the maize diversity panel experienced a GR over 50% before the 21st day. For the majority of the remaining lines, germination continued for almost 31 days.
Using the GR at 21 days has two advantages. One advantage is that the methodology and analyses will be consistent with the existing literature. The other advantage is that the length of the experiment is reduced from 31 days to 21 days when GR on the 21st day is of interest. However, because a large portion of our lines were slow to germinate under chilling conditions, we extended our observations and used two complementary metrics-GR on the 21st day and number of days to reach 50% germination. The Pearson correlation coefficient was 85% in the maize 282-diverse-panel. We identified one SNP that was associated with both metrics. And, as expected, we also identified associated SNPs specific to each metric.
Pleiotropy among three germination traits. We found strong correlations among the three relative traits derived from the three raw germination traits. The absolute values of Pearson correlations coefficients were all above 70% and 80% for phenotypic and genetic correlations, respectively. The absolute values of Pearson correlations coefficients between RGI and RGR were 90% and 97% for phenotypic and genetic correlations, respectively. More pleiotropic SNPs were found for the two traits (RGI and RGR) that are closely correlated.

Priority of implications.
This study did not include experimental data to evaluate candidate gene association with maize yield. We did not validate the expression and biological function for candidate genes either. Therefore, implications for future studies can only be prioritized by the confidence of our study results. We investigated multiple dimensions of confidence, including distance, pleiotropy, signal strength, linkage disequilibrium, cloning support, QTL analyses, and genes with GO terms shared with other candidate genes.
SCIentIfIC RepoRts | 7: 10840 | DOI:10.1038/s41598-017-11318-6 The null results include the six associated SNPs without candidate genes identified, eight candidate genes without support of QTLs, and three genes without GO terms identified. The possible causes include false positives, lack of recognition, and incomplete or incorrect gene annotation. These SNPs and genes have the lowest priority.
The Highest priority goes to five of the candidate genes (GRMZM2G389768, GRMZM2G057186, GRMZM2G012148, GRMZM2G178486, and GRMZM5G806387) that involve functions related to cold resistance, such as freezing tolerance response, ascorbate biosynthesis, and ABA, in maize or analogous species. Two of these five genes (GRMZM2G389768 and GRMZM2G057186) were hit directly, in the exonic region, by the associated SNPs identified in this study. The other three candidate genes (GRMZM2G178486, GRMZM5G806387, and GRMZM2G12148) were indirectly hit by two SNPs highly correlated and nearby associated SNPs (ss196436428 and s2_117871531) and were supported by previous QTL analyses (Fig. 4). The overlap among different studies suggested their potential values for future gene cloning studies and breeding to develop cold-tolerant maize varieties.

Materials and Methods
Plant materials and germination experiment. The plant seed materials originated from the USDA-ARS North Central Regional Plant Introduction Station, located in Ames, Iowa. The station collected maize diversity lines and made them accessible for public use. The core collection used in this study consisted of 282 maize lines and is commonly known as the "282 association panel". This panel covers over 75% of the total variation for the entire maize collection, including tropical, subtropical, and temperate germplasms. In 2010, the 282 association panel was introduced into the Chinese Germplasm Bank, located at the Chinese Academy of Agricultural Sciences (CAAS) in Beijing, China. CAAS increased the size of this maize seed collection in 2011 and 2012.
To cover the environment requirements for seed production, we grew the CAAS-produced, 282 association panel seeds in both tropical and temperate regions to ensure seed quality in the extreme lines. Seeds were planted in China's tropical Hainan province in winter 2013 and in temperate Heilongjiang province in spring 2014. Diseases, insects, and weeds were adequately controlled during the growing seasons. Physiologically mature ears were harvested from each environment, tropical and temperate. The seeds were dried under 25~30 °C conditions to achieve 14% water content and then stored at −4 °C. A total of 266 lines qualified for germination tests based on the quantity and physiological maturity status of seeds from both regions. Prior to seed germination experiments, all seeds were surface-disinfested with 1% NaOCl (sodium hypochlorite) for 5 minutes and then triple rinsed with sterile distilled water.
Following the ISTA protocol 37 , we conducted the experiments in two growth chambers (SANYO, MIR-253). The temperature was set at 8.0 °C 38 for chilling conditions (treatment) and at 25.0 °C for normal conditions (control). Seeds were germinated in Petri dishes with sterile wetted filter paper. The Petri dishes were completely randomized. For each condition (treatment and control), we used three replicates per line and per batch of region where the seeds produced. Each replicate contained 30 seeds. Measurements were conducted daily, starting from day 1. We defined seed germination as observed root emergence. Following the advanced protocol of germination developed by Noli, E. et al. 39 and Wen, W. et al. 38 , daily measurements spanned 7 days under normal conditions and 31 days under chilling conditions. The batches of seeds produced in two regions (temperate or tropical) were averaged for mapping genes underlying cold tolerance.
Phenotypic data. We measured a total of nine phenotypic traits. The first six traits included the following: 1) germination (defined as root emergence) rate at 21 days under 8 °C (chilling) (GR21_C); 2) germination at 7 days under 25 °C (control) (GR_N); 3) relative germination rate (RGR), comparing chilling to control conditions (GR21_C/GR_N); 4) days to 50% germination under chilling (DT50_C); 5) days to 50% germination under control (DT50_N); and 6) relative days to 50% germination (RDT50), compared chilling to control conditions (DT50_C/DT50_N). The last three traits were based on a germination index (GI), calculated as GI = ∑(Gt/Tt), where Gt was the number of seeds newly germinated on day t and Tt was the number of days elapsed 40 . The GI under chilling conditions (GI_C) was calculated daily, from 0 to 31 days; the GI under control conditions (GI_N) was calculated daily, from 0 to 7 days (Mhatre and Chaphekar 41 ). The relative GI (RGI) was used to compare chilling to control conditions (GI_C/GI_N).
For each trait above, the estimate breeding value (EBV) was calculated by the Best Linear Unbiased Prediction (BLUP) method. The genetic relationship matrix was estimated using about 51,742 SNPs. Together, the phenotypes of each trait and the genetic relationship matrix of individuals were used in rrBLUP to calculate each trait's EBV [41][42][43] .
The standard deviation, range, mean, and median were calculated for each of the nine traits. EBV was used to assess genetic correlations between the traits. R language 3.12 and R package 'psych' was used to calculate correlations and to produce the correlation diagram 44 .
Genotypic data. For genotypic data, we used a set of 51,742 SNP markers derived from an Illumina maize 50 K array 45 and a GBS (genotyping by sequencing) dataset containing 2,219,845 SNP markers downloaded from http://mirrors.iplantcollaborative.org/browse/iplant/home/shared/panzea/genotypes/GBS/v23/Maize282_ imputed_AllZea_GBS_Build_July_2012_FINAL.zip (accessed on September 12, 2016). The two datasets were combined and filtered for duplicate SNPs. A total of 2,271,584 SNPs, distributed over the entire maize genome, were used in this study.

GWAS.
Genome-wide association analysis on the three relative traits, RDT50, RGI, RGR, was performed using the FarmCPU method 28 . The first three principal components were fitted as the covariates to control population structure. The plots of the first three principal components reflected the subpopulation structure, including stiff stalk, non-stiff stalk and tropical/subtropical (Supplementary Figure 5). The threshold for detecting significant SNPs was determined by the Bonferroni multiple test correction at a type I error of 5% 46 .
Identification of known QTLs. MaizeGDB was used to identify the QTLs on traits related to cold tolerance in maize. The intervals of the QTLs were defined by the starting and ending locations. The associated SNPs were compared with the intervals to examine the overlap between the linkage and association analyses.
Identification of candidate genes. Genes hit directly by the associated SNPs and genes in high linkage disequilibrium (LD) with the associated SNPs and within 400 kb were selected as candidate genes. We used the PLINK analysis toolset (http://pngu.mgh.harvard.edu/~purcell/plink/) to calculate LD as the squared correlation coefficient (r 2 ) between each marker within an intra-chromosomal distance of 1000 kb. A histogram was then plotted to analyze the LD decay according genetic distances and pairwise LD across the entire genome. SNPs with a paired r 2 ≥ 0.8 were identified within a 400 kb region up-and downstream of the significant cold-trait-associated SNPs. Then, we examined all candidate genes either containing the associated SNPs or in high linkage disequilibrium (r 2 ≥ 0.8) with the associated SNPs and within a distance of 400 kb. The B73 RefGen_V2 gene model from the maizeGDB website (http://www.maizegdb.org/) was used to map the loci and genetic information.