Genome-Wide Association Study Dissecting the Genetic Architecture Underlying the Branch Angle Trait in Rapeseed (Brassica napus L.)

The rapeseed branch angle is an important morphological trait because an adequate branch angle enables more efficient light capture under high planting densities. Here, we report that the average angle of the five top branches provides a reliable representation of the average angle of all branches. Statistical analyses revealed a significantly positive correlation between the branch angle and multiple plant-type and yield-related traits. The 60 K Brassica Infinium® single nucleotide polymorphism (SNP) array was utilized to genotype an association panel with 520 diverse accessions. A genome-wide association study was performed to determine the genetic architecture of branch angle, and 56 loci were identified as being significantly associated with the branch angle trait via three models, including a robust, novel, nonparametric Anderson-Darling (A-D) test. Moreover, these loci explained 51.1% of the phenotypic variation when a simple additive model was applied. Within the linkage disequilibrium (LD) decay ranges of 53 loci, we observed plausible candidates orthologous to documented Arabidopsis genes, such as LAZY1, SGR2, SGR4, SGR8, SGR9, PIN3, PIN7, CRK5, TIR1, and APD7. These results provide insight into the genetic basis of the branch angle trait in rapeseed and might facilitate marker-based breeding for improvements in plant architecture.

control branch angles in Arabidopsis and peach 9 . Moreover, a recent study in Arabidopsis showed that mutants with defects in auxin homeostasis or auxin response genes, such as wei8 tar2, tir1-1, afb4-2 afb5-5, and arf10-3 arf16-2, have altered branch angles 10 . However, despite the increasing understanding of branch angle mechanisms in model plants, the genetic basis of branch angle in rapeseed has not been elucidated, a situation that reflects the complexity of genetic studies of polyploid plants.
A genome-wide association study (GWAS) provides a methodical analysis of the genetic architecture of complex traits in crops. GWAS identifies the underlying QTLs at a relatively high resolution, taking full advantage of ancient recombination events 11,12 . To date, GWAS has been successfully performed in many crops, including rice 13 , maize 14,15 and rapeseed [16][17][18] . Among the proposed statistical approaches for GWAS, the mixed linear model (MLM) is a popular method that can eliminate the excess of low p-values for most traits 19,20 . However, MLM can lead to false negatives by overcompensating for population structure and kinship 21 , and MLM has also limited statistical power to detect rare alleles, which in fact constitute a substantial proportion of the natural variation 22 and have potentially large phenotypic effects 23 . Accordingly, a complementary strategy, the Anderson-Darling (A-D) test, has recently been proposed to rectify these shortcomings 24 . The A-D test is a novel nonparametric statistical method that offers a higher power than MLM for traits that have abnormal phenotypic distributions and are controlled by moderate effect loci or rare variations 24 .
In the present study, we investigated correlations between the angles of branches at different positions and between these angles and other important agronomic traits. Genome-wide SNPs of this panel were assessed using the 60K Brassica Infinium ® SNP array, and the corresponding phenotype was evaluated in four environments.
GWAS was performed with 520 diverse accessions to identify underlying QTLs that contribute to rapeseed branch angle variations. A total of 56 loci that were significantly associated with the branch angle trait were identified by three association methods: MLM, the general linear model (GLM) and the A-D test. Considerable candidate genes were identified based on LD decay range of these loci, including multiple orthologues of well-characterized Arabidopsis genes. This study demonstrates that GWAS can be used as an effective approach for dissecting complex quantitative traits in rapeseed.

Results
Phenotypic variations in branch angle among accessions. High positive correlations were observed among the angles of different branches, indicating similar genetic control (Table 1). In general, the correlation coefficient between branches was reduced with decreasing physical proximity. For example, the angle of the second branch from the top showed a maximum correlation with the angle of the third branch (r = 0.72) and then with the fourth and fifth branches (r = 0.61, 0.53, respectively). In addition, we observed that the average angle of a different number of branches from the top was significantly positively correlated with the average angle of all branches, particularly when the branch number reached five (r = 0.93, Table 1). Thus, it is feasible to measure the five top branches as a representation of the branch angle phenotype. Phenotypic data for 30 individual plants are presented in Supplementary Data S1.
We collected phenotypic data for the association panel in four environments; the trial performed at the Changsha farm was the only trial with phenotypic data across two growing seasons (Table 2). Extensive phenotypic variations were observed for branch angles in the association panel, as indicated by the descriptive statistics shown in Table 2. In the four environments, the branch angle varied from 21.7 ± 1.9° to 71.7 ± 4.8°, with an average ranging from 40.3 ± 6.3° to 43.2 ± 6.3°. The coefficient of variation was constant in the different environments and ranged from 14.5% to 16.2%. The phenotypic data for all accessions in the four environments as well as BLUP values are presented in Supplementary Data S2.
The branch angles of the association panel in the four environments exhibited significantly positive correlations with each other, indicating the reliability and repeatability of these phenotypic data (Supplementary Table S1). Analysis of variance (ANOVA) revealed that the genotype, environment (year and location) and genotype × environment interaction all had significant effects on the branch angle, suggesting the crucial influence of environment on branch angle regulation (Supplementary Table S2). Based on phenotypic data for the four environments, the broad-sense heritability of the branch angle was as high as 78.5%. We then analysed the relationship between the branch angle and other important agronomic traits (only data for 2012/2013 and 2013/2014 at Changsha were available for these traits). Notably, branch angle was significantly positively correlated with plant height, which had the highest coefficient (r = 0.25, 2012/2013 Changsha), followed by branch number (r = 0.17, 2012/2013 Changsha, Table 3). Moreover, significant positive correlations were also observed between the branch angle and four yield-related traits, including the main inflorescence pod number, pod length, seed number per pod and seed yield (2012/2013 Changsha, Table 3). Similar results were observed in the 2013/2014 Changsha phenotypic data (Supplementary Table S3). Therefore, the results indicated a close relationship between branch angle and multiple plant-type and yield-related traits.
SNP performance, quality and in silico mapping. The Illumina Brassica 60K Infinium ® SNP array was used to genotype 530 rapeseed accessions. The raw data generated using the Illumina Infinium platform were further analysed with Genome Studio software by cluster refinement with an optimum accession Call Rate > 0.7; SNP Call Freq > 0.75; Minor Freq > 0.05; AA, BB frequency > 0.03; and GenTrain Score > 0.5. Through this analysis, 520 accessions and 33,218 polymorphic SNPs (63.7%) were retained. After excluding SNPs lacking clearly defined clusters or with multiple loci in the genome, 19,167 high-quality SNPs (36.7%) genotyped across 520 rapeseed accessions were utilized for association mapping. The genotyping scores for all polymorphic SNPs are presented in Supplementary Data S3.
Population structure and linkage disequilibrium. The population structure of the association panel was calculated for the 19,167 SNPs using STRUCTURE, and the parameters LnP(D) and Delta K suggested that the 520 genotypes could be assigned to two groups. A probability of membership threshold of 0.60 was used, and 65 and 398 lines were assigned to Groups 1 and 2, respectively, with the remaining 57 lines classified into a mixed group (Supplementary Data S4). In addition, the lm procedure in R showed that population structure accounted for 15.8% of the phenotypic variation of branch angle. The data for population structure and kinship are presented in Supplementary Data S5.
When r 2 = 0.1, seven chromosomes (A01 to A07) exhibited comparatively modest LD, with distances ranging from 708 to 873 kb ( Fig. 1, Supplementary Table S4). Chromosomes A09, C03, C05, C06 and C09 showed stronger LD, with distances ranging from 1,039 to 2,968 kb. However, particularly reinforced LD patterns were observed for chromosomes A08, A10, C01, C02, C04, C07 and C08, which presented a corresponding LD decay ranging from 4,264 to 8,704 kb. Consistent with the performance of the major A chromosomes, the A subgenome exhibited modest LD, with a distance of up to 1,046 kb, whereas the C subgenome exhibited extremely conserved LD of 7,882 kb when r 2 = 0.1. The average LD decay for the entire genome was 6,660 kb when r 2 = 0.1 ( Supplementary Fig. S1, Supplementary Table S4).

Environment
Min ± SD (°) Max ± SD (°) Mean ± SD (°) CV (%)   To better utilize the genotyping and phenotyping information obtained in the present study, two permissive models, GLM and the A-D test, were introduced into our association analysis. Briefly, the two models detected 48 (GLM) and 24 (A-D test) loci significantly associated with BLUP and individual environmental data at the corresponding Bonferroni threshold, values of − log 10 (p) = 4.3 for GLM and − log 10 (p) = 5.6 for the A-D test (Table 4, Fig. 2). We then compared the consistency of the loci identified among the three methods. All loci detected using MLM were repeatedly detected through either GLM or the A-D test, and four loci were consistently detected across all three models, including Bn-A03-p6228570 on A03, Bn-A04-p4410144 on A04, Bn-A07-p13172047 on A07 and Bn-scaff_16062_1-p345501 on C05 (Table 4). A total of 16 associated loci were consistently detected between GLM and the A-D test, whereas 20 and 8 loci were exclusive to GLM and the A-D test, respectively (Table 4). Altogether, the three methods identified 56 unique loci significantly associated with the branch angle trait. Except for C01, these loci are unevenly distributed over all chromosomes. A03 and A07 both have a maximum of 10 loci, A04 and C08 five loci, and A10 and C03 three loci. The remaining chromosomes have either one or two loci (Table 4). Approximately two-thirds of the loci (38/56) are distributed in the A subgenome; the remaining loci are distributed in the C subgenome. When using a simple additive model, the 56 loci explained up to 51.1% of the phenotypic variation.
Candidate gene mining. When using the whole genome genes as reference, two categories of genes, genes with auxin efflux transmembrane transporter activity (GO:0010329) and genes with auxin transmembrane transporter activity (GO:0080161), were found to enrich in the LD decay ranges of significant loci (false discovery rate < 0.05, Supplementary Fig. S3, Supplementary Data S6). Based on the GO annotation, Arabidopsis orthologue information and published gravistimulation microarray data, we further predicted candidate causal genes for loci significantly associated with the branch angle trait within the observed LD decay (r 2 > 0.1), with 77 plausible candidate genes predicted for 53 loci. Due to the low rate of LD decay (776.8 kb on average when r 2 = 0.1), more than one-third of the GWAS loci (20/53) have at least two candidate genes (Supplementary Data S7). For example, three candidate genes, BnaA01g12950, BnaA01g13320 and BnaA01g13580, which are orthologous to Arabidopsis CRK5 25 , ARF9 26 and TRH1 27 , were collectively identified within the LD decay of the GWAS locus Bn-A01-p7430311 (r 2 > 0.2, Supplementary Data S7). Briefly, 48 (62.3%) candidate genes are related to auxin asymmetric redistribution, and 10 (13.0%), 5 (6.5%), and 5 (6.5%) candidate genes are involved in gravity perception, gravity signal transduction and organ curvature, respectively (Supplementary Data S7). The remaining nine genes (11.7%) are associated with ROS, phototropism, ethylene, and strigolactone (Supplementary Data S7).
In Arabidopsis, sgr1-sgr9 represent a series of defective shoot gravity perception mutants with abnormal branch angles 5 . The vacuolar membrane dynamics of the stem gravity-sensing cells of the sgr2, sgr3, sgr4 and sgr8 mutants are abnormal and affect the sedimentable movements of statoliths (amyloplasts) 5 . In the sgr9 mutant, interaction between statoliths and actin filaments is perturbed, resulting in attenuated statolith sedimentation 28 . In the present study, two orthologues of SGR4, BnaA04g09380 and BnaC04g31610, are located at 8.4 Mb on A04 and 33.4 Mb on C04, 205.4 kb downstream from the peak SNP Bn-A04-p6929056 and 370.9 kb upstream from the peak SNP Bn-scaff_16876_1-p1162532, respectively (Supplementary Data S7). We also identified an orthologue of SGR2 at 13.6 Mb on C08, which is 302.7 kb downstream from the peak SNP Bn-scaff_16468_1-p450133 (Supplementary Data S7). In addition, the SGR9 orthologue BnaA10g26980 was identified at 17.1 Mb on A10, 132.8 kb upstream from the peak SNP Bn-A10-p17414621 (Supplementary Data S7). Notably, the loci harbouring the SGR4 orthologue on C04 and the SGR9 orthologue on A10 were both identified by stringent MLM. In addition, the orthologues of Arabidopsis SGR3 and SGR5 were detected within the LD decay of the SNPs at 18.1 Mb on A02 and 22.7 Mb on A06 by using the A-D test, respectively, though the corresponding signals were not significant (− log 10 (p) = 4.6 < 5.6, − log 10 (p) = 4.8 < 5.6).

Discussion
The ideotype theory has prompted crop geneticists to map and clone plant type-related QTLs. To date, several genes that control the rice tiller angle, including LAZY1 6 , TAC1 8 13 . In the present study, we performed GWAS in 520 diverse rapeseed accessions to reveal the QTLs affecting the branch angle trait. Based on the results of GWAS and GO annotation, we identified 77 plausible genes underlying the abundant phenotypic variation, including orthologues of well-characterized Arabidopsis genes, such as SGR2, SGR4, SGR9, LAZY1, TIR1, AFB4, PIN3 and PIN7.
The phenotypic data obtained for branches at different positions enabled us to determine the branch(es) that provide the most reliable representation of the branch angle phenotype. High correlation (r = 0.93) was observed between the average angle of five branches from the top and the average angle of all branches, and measuring only these five branches rather than all branches would reduce the time required to perform such an evaluation and preclude confounding due to the variable branch numbers of different plants. Our statistical analyses revealed that plant height and branch angle are positively correlated, suggesting that shorter accessions tend to accompany a more compact canopy architecture. In addition, we also observed positive correlations between plant-type traits and yield-related traits. These results are informative to breeders attempting to adapt branch angles to achieve the ideal canopy architecture and high yields.
The association panel examined in the present study exhibited a strong LD of up to 6,660 kb (1,046 and 7,882 kb for the A and C subgenomes) at a cut-off value of r 2 = 0.1, which can be explained by three possible reasons. First, the often cross-pollination habit of rapeseed (natural outcrossing rate of 10-30%) could partially   account for this phenomenon, as limited recombination from inadequate outcrossing is insufficient to break strong LD. The similar phenomenon has been observed in crops with low natural outcrossing rate, such as rice, which showed residual LD at a distance of 2,000 kb 21 . Second, the majority of accessions in the present association panel are Chinese elite breeding accessions; therefore, strong artificial selection for certain traits, such as punctual flowering time, high yield, and low erucic acid and glucosinolate levels, has exerted strong selective sweeps on the flanking regions of favourable genes and consequently has caused strong LD. Similar phenomena have been observed in maize, with LD decay ranging from less than 1 kb in landraces 30 to more than 100 kb in elite breeding lines 31 . Third, the "founder effect" could also account for the strong LD in our association panel.  that more than 39.7% of the 63 certified low erucic acid or glucosinolate cultivars from 1985 to 1996 are directly derived from these donor parents 33 . Consequently, the "founder effect" may have caused the strong LD observed in our association panel. Furthermore, the last two explanations might also be responsible for the stronger LD observed in our association panel as compared to that of several reported B. napus populations [34][35][36] . The 56 GWAS loci in the present study cumulatively explained up to 51.1% of the phenotypic variation when using a simple additive model. Interestingly, despite the numerous significant SNPs identified, none of them explained more than 7.0% of the phenotypic variation. The significant SNPs identified by GLM and MLM explained 2.6-6.6% and 3.2-6.0% of the phenotypic variation, respectively (Table 4). Similar genetic architectures have been observed for other complex traits, such as maize leaf angle, whereby 96.0% of the 30 significant QTLs had less than a 2.5° effect 15 . In addition, the 29 large branch angle lines in the association panel have favourable alleles for 32.8 ± 5.6 (± SD) of the 56 loci associated with branch angle, whereas the 27 small branch angle lines have 17.9 ± 4.9 (± SD) favourable alleles (Supplementary Fig. S2). Accordingly, these results suggest that the observed large differences in branch angle among inbred rapeseed lines are not caused by merely a few genes with large effects, but rather by the cumulative effects of numerous QTLs having only small individual impacts on the trait.
In the present study, three methods (MLM, GLM and the A-D test) were collectively applied to dissect the genetic architecture of branch angle. Considerable GWAS loci (described in the Results) overlapped between the methods; however, we also observed variations in the power and applicable scenarios of these three methods. Compared with GLM and particularly MLM, the A-D test showed two advantages. First, because the A-D test is a nonparametric test, it is more robust, particularly for traits with abnormal phenotypic distributions and controlled by moderate effects or rare variations 24 . For example, two rare genes BnaC08g09050 and BnaC08g44370 (MAF = 0.09, 0.08, respectively), which are orthologous to well-known Arabidopsis SGR2 37 and 5PTASE13 38 , were solely detected in the A-D test. Second, population structure has profound effects on the results of GWAS, as reported in a previous study 34 . Because the A-D test does not include a correction for population structure, p-value overcorrections do not occur when the method is applied to traits that are correlated with population structure 39 . Nonetheless, MLM performed better than the A-D test for major QTLs, particularly those with common alleles 24 , such as BnaC04g31610 and BnaC07g15160 (MAF = 0.29, 0.35, respectively). Compared with GLM, MLM is more stringent because it involves familial relationships (kinship) to further reduce p-value inflation, which caused false negative results in the present study. For example, the loci harbouring BnaA07g19520, BnaA10g19550 and BnaA10g26980 failed to reach the significance threshold in MLM but not in GLM, even though the orthologues of these genes in Arabidopsis, TIR1 10 , LAZY1 40 and SGR9 28 , have well-documented roles in branch angle regulation. Because the robust A-D test and GLM may introduce more false positives, it is feasible to combine the A-D test and GLM with MLM to maximize the detection power and take notice of potential false positives.
Despite the large number of loci associated with the branch angle trait, few loci were consistently detected in all environments and a considerable number of loci were only detected in one environment (Table 4). There are three possible reasons for these results. First, because GLM and the A-D test are permissive models that potentially introduce false positives, certain loci may have represented spurious signals that reflect confounding population structure. Second, environment can influence QTL expression and its magnitude because environment represents the manifestation of complex biotic, abiotic and agronomic factors 41 . Indeed, environmental effects are evident in previous QTL mapping analyses [41][42][43] . In the present study, ANOVA analysis revealed that environment did have significant effects on branch angle; therefore, the expression of certain sensitive QTLs could be affected by environment. Third, the GWAS model chosen can also affect the results because different models are proposed based on different statistical assumptions. For example, the locus harbouring one copy of LAZY1 on C03 was detected in two environments (2012-2013 Nanjing and 2013-2014 Wuhan) by using the A-D test but not by using GLM.
Branch angle is a special gravitropic set-point angle (GSA) representing a plant architecture trait that is primarily governed by plant gravitropism; however, the precise mechanisms underlying branch angle maintenance and development remain poorly understood. Interestingly, a recent study proposed a model to address this issue 10 , showing that an auxin-dependent antigravitropic response acts antagonistically with the gravitropic response to maintain angled growth: the branch angle value is dependent on the magnitude of the antigravitropic response and is mediated via TIR1/AFB-AUX/IAA-ARF-dependent auxin signalling pathway within stem endodermal cells 10 . Intriguingly, in the present study, the orthologues of Arabidopsis auxin signalling genes, including TIR1, AFB4, AXL1, RUB1, ARF9, ARF10 and IAA13, were collectively identified as candidate genes (Supplementary Data S7). Although the precise mechanism underlying the antigravitropic response is not fully understood, this model provides a conceptual framework for understanding the mechanism responsible for the branch angle trait and highlights a new avenue for further research.

Plant materials.
A set of 530 diverse rapeseed inbred accessions, including landraces and elite varieties, was collected to construct the association panel, a subset of which was reported in a previous study of flowering time 44 . Based on the information obtained for these varieties, plants were assigned to three germplasm types: winter type (41), semi-winter type (435) and spring type (54). The origins of the plants showed that 485 accessions originated from Asia, 32 from Europe, 8 from North America and 5 from Australia (Supplementary Data S4). Remarkably, China contributed 476 accessions that originated from three rapeseed sub-regions with diverse climates, land fertilities and hydrologies, and these accessions broadly represent the major genetic diversity of the Chinese rapeseed gene pool. Experimental design and trait measurement. In 2011/2012 in Wuhan, we measured the branch angles of 30 randomly selected lines at three weeks after the final flowering stage (Supplementary Data S1). A small section from the base of the stem encompassing each branching node was photographed and analysed using Photoshop to measure the adaxial angle of the branch to the stem. Correlation analysis between the angles of branches at different positions was performed in R 44 .
The association panel was grown in the 2012/2013 and 2013/2014 growing season using a randomized complete block design with three replications on experimental farms at Changsha (N 28.22°, E 113.00°), Wuhan (N 30.52°, E 114.32°) and Nanjing (N 32.05°, E 118.78°) China. Meteorological data for the three locations are presented in Supplementary Data S8. Each line was grown in a plot with two rows and 12-15 plants in each row. The phenotypic investigation started approximately three weeks after the final flowering stage. In 2012/2013, we measured the five branches from the top of each plant, and four plants for each accession from two replicates were selected. In 2013/2014, we extended the sample size to 12.
Correlation analysis and analysis of variance (ANOVA) of branch angle for the association panel across different environments was performed in R 44 . Subsequently, an R script based on a linear model was used to obtain the broad-sense heritability and best linear unbiased prediction (BLUP) of the multi-environment phenotypes for each accession 45 . The BLUPs and individual environment data were used as the phenotypes for association analysis. Pearson's correlation coefficient between branch angle and multiple agronomic traits, including plant height, branch number, main inflorescence pod number, pod length, seed number per pod, seed weight and seed yield (only data from 2012/2013 and 2013/2014 at Changsha were available for these traits), were calculated in R 44 .
SNP genotyping, filtering and in silico mapping. Leaf tissue samples from the entire association panel were obtained from bulk of at least four individuals for each accession at the seedling stage. DNA was extracted using a modified CTAB procedure according to Murray and Thompson 46 . The DNA quality was carefully assessed prior to genotyping.
SNP genotyping was performed using the Illumina Brassica 60K Infinium ® SNP array according to the manufacturer's instructions (http://www.illumina.com/technology/infinium_hd_ assay.ilmn). SNP data were clustered and automatically called using Illumina Genome Studio genotyping software. First, accessions with a Call Rate < 0.7 were excluded and all SNPs were reclustered. Next, SNPs with Call Freq < 0.75, Minor Freq < 0.05, AA or BB frequencies < 0.03 or GenTrain Scores < 0.5 were excluded. The remaining SNPs were manually reassessed, and those that did not show three clearly defined clusters were also excluded. Because heterozygous SNPs cannot be distinguished from hemi-SNPs or false calls, heterozygous calls were treated as missing values.
Fifty base pair sequences of retained SNPs after filtration were used to perform a BLASTN 47 search against the B.napus genome database (http://www.genoscope.cns.fr/brassicanapus/). Using an e-value threshold of e −12 , SNPs corresponding to multiple loci in the genome were excluded, and only the top blast hits were retained for further analysis.
Population structure, kinship and LD decay. The filtered SNP dataset for selected accessions with a Call Rate > 0.7 was entered into STRUCTURE V2.3.3 48 . Five independent runs were performed with a K-value (the putative number of genetic groups) varying from 1 to 10, with the length of the burn-in period and the number of MCMC (Markov Chain Monte Carlo) replications after burn-in both to 100,000 iterations under the admixture model. The most likely K-value was determined using the log probability of data [LnP(D)], and the ad hoc statistic Delta K, which is based on the rate of change of LnP(D) between successive K-values. The cluster membership coefficient matrices of five independent runs from STRUCTURE were integrated to obtain a Q matrix by the CLUMPP software 49 . The proportion of phenotypic variation that contributed by population structure was calculated via the lm function in R 44 . Relative kinship coefficients (K) were calculated using the SPADeGi software package 50 . All negative values between individuals were set to 0. The linkage disequilibrium measurement parameter r 2 was used to estimate linkage disequilibrium (LD) of A and C subgenome chromosomes via TASSEL5.0 51 . When calculating LD for a specific chromosome, the LD window size was adjusted to the chromosomal SNP number to force the calculation for all marker pairs. Locally paired scatterplot smoothing in R was employed to obtain a graphical representation of LD curves.
Genome-wide association study. Trait-SNP association analysis was performed using three methods.
Both GLM and MLM were implemented in TASSEL 5.0 51 . GLM takes into account population structure as a fixed effect. On this basis, MLM incorporates kinship as a random effect to further eliminate the excess of low p-values 19 . For GLM and MLM, the significance of associations between SNPs and traits was based on the uniform threshold p ≤ 5.2 × 10 −5 (− log 10 (p) = 4.3). The A-D test for branch angle was conducted by using the R package ADGWAS 1.0 24 . The A-D test is a nonparametric test that includes no correction for population structure. A more stringent threshold was set for the A-D test with p ≤ 2.6 × 10 −5 (− log 10 (p) = 5.6).
To better understand the explanatory power of the significant SNPs, we used the SNP genotypes at candidate loci as predictor variables in multiple linear models fitted to the phenotypic variables and subsequently ran a model comparison analysis (stepwise AIC procedure implemented as the R function "stepAIC") 52 to determine the best fitting model. The adjusted R 2 of the best fitting multiple regression model was referred to as the phenotypic variation explained by the significant SNPs.
Candidate gene mining. To define regions of interest containing potential candidate genes, local LD decay was calculated within flanking regions up to 12,000 kb on either side of significant SNPs using TASSEL5.0 51 , and a cut-off value of 0.1 was used for the LD statistic r 2 . Genes within the observed LD decay were annotated using the software Blast2GO v3.3.5 with the default settings 53 . In particular, genes with GO terms for gravitropism, amyloplast, and auxin were highlighted. Using the whole genome genes as reference, GO enrichment analysis of all genes within the LD decay ranges was implemented using Fisher's Exact Test in Blast2GO v3.3.5 (false discovery rate < 0.05) 53 . Next, we performed BLASTX searches against the Arabidopsis genome to determine whether candidate SNP-tagged genome regions contain genes orthologous to Arabidopsis genes with established roles in shoot gravitropism. Additionally, we exploited the Arabidopsis "Electronic Fluorescent Pictograph" (eFP) browser 54 and microarray data from previously published gravistimulation studies [55][56][57] to further characterize candidate genes. Notably, associated SNPs that are not in or near branch angle-related genes within the LD decay (r 2 = 0.1) were considered linked to a more distant gene, and the closest one of these genes was considered the most likely candidate.